Because $p(c \mid d_i) + p(c' \mid d_i) = 1$, we normalize the latter two quantities, which are proportional to $p(c \mid d_i)$ and $p(c' \mid d_i)$, to obtain the actual value of $p(c \mid d_i)$. If $p(c \mid d_i)$ is larger than the probability threshold $T$, then $d_i$ belongs to category $c$; otherwise, $d_i$ does not belong to category $c$. We repeat this procedure for each category. In our implementation, if for a certain document no category has a positive probability $p(c \mid d_i)$ larger than $T$, we assign the single category with the largest probability to this document. In addition, “others” is an exclusive category: a tweet is assigned to “others” only when “others” is the only category with probability larger than $T$. Section 5.4 details the choices of the threshold values $T$.
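The assignment rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in our experiments. The function names `normalize_pair` and `assign_categories`, the dictionary layout of the scores, and the use of a single threshold shared by all categories are assumptions made for the example. We also assume that the fallback to the most probable category considers every category, including “others”.

```python
def normalize_pair(score_pos, score_neg):
    """Normalize two scores proportional to p(c|d) and p(c'|d) into p(c|d)."""
    total = score_pos + score_neg
    if total <= 0:
        raise ValueError("scores must be non-negative with a positive sum")
    return score_pos / total


def assign_categories(scores, threshold, exclusive="others"):
    """Assign categories to one document.

    scores: dict mapping category -> (score_pos, score_neg), the two
            unnormalized quantities for c and its complement c'
            (hypothetical layout, chosen for this sketch).
    threshold: probability threshold T (assumed shared by all categories).
    Returns (labels, probs).
    """
    # Step 1: per-category normalization so that p(c|d) + p(c'|d) = 1.
    probs = {c: normalize_pair(pos, neg) for c, (pos, neg) in scores.items()}

    # Step 2: keep every category whose probability exceeds T.
    labels = [c for c, p in probs.items() if p > threshold]

    if not labels:
        # Step 3: nothing clears T, so fall back to the most probable category.
        labels = [max(probs, key=probs.get)]
    elif exclusive in labels and len(labels) > 1:
        # Step 4: the exclusive category survives only when it stands alone.
        labels = [c for c in labels if c != exclusive]

    return labels, probs


if __name__ == "__main__":
    example = {"sports": (3.0, 1.0), "politics": (2.0, 2.0), "others": (4.0, 1.0)}
    labels, probs = assign_categories(example, threshold=0.6)
    print(labels, probs)
```

In this made-up example, both “sports” (0.75) and “others” (0.8) exceed $T = 0.6$, so “others” is dropped and the document is labeled “sports” only.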