For a document di in the testing set, there are K words Wdi = {wi1, wi2, ... , wiK}, and Wdi is a subset of W. The purpose is to classify this document into category c or not c. We assume independence among each word in this document, and any word wik conditioned on c or c' follows
multinomial distribution. Therefore, according to Bayes’ Theorem, the probability that di belongs to category c is