From the inductive content analysis stage, we had a total of 2,785 #engineeringProblems tweets annotated with six categories. We used 70 percent of the 2,785 tweets for training (1,950 tweets) and 30 percent for testing (835 tweets). Of the words that occurred more than once in the testing set, 85.5 percent (532/622) were also found in the training data set. Table 2 shows the six evaluation measures at each probability threshold value T from 0 to 1, in increments of 0.1. When no category had a positive probability value larger than T, we assigned the document the single category with the largest probability value. Thus, a probability threshold of 1 was equivalent to outputting only the single most probable category for every tweet.
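
The thresholding rule described above can be illustrated with a minimal sketch. It assumes that the classifier has already produced one probability per category for a tweet. The function name `assign_categories`, the category names, and the probability values are hypothetical and chosen only for illustration; they are not taken from our data or implementation.

```python
from typing import Dict, List


def assign_categories(probs: Dict[str, float], threshold: float) -> List[str]:
    """Return every category whose probability is larger than `threshold`.

    If no category qualifies, fall back to the single most probable one,
    so that each document always receives at least one category.
    """
    selected = [cat for cat, p in probs.items() if p > threshold]
    if not selected:
        # Fallback: the one category with the largest probability value.
        selected = [max(probs, key=probs.get)]
    return selected


# Hypothetical per-category probabilities for one tweet (illustrative only).
example = {"cat_a": 0.62, "cat_b": 0.35, "cat_c": 0.03}

# Sweep the threshold T from 0 to 1 in increments of 0.1.
thresholds = [i / 10 for i in range(11)]
labels_by_threshold = {t: assign_categories(example, t) for t in thresholds}

for t, labels in labels_by_threshold.items():
    print(f"T = {t:.1f}: {labels}")
```

In this sketch, a low threshold yields several categories per tweet, while any threshold at or above the largest probability (including T = 1) triggers the fallback and yields exactly one category.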