The binary event space of the multiple-Bernoulli model is overly simplistic, as it
does not model the number of times that a term occurs in a document. Term frequency
has been shown to be an important feature for retrieval and classification,
especially when used on long documents. When documents are very short,
it is unlikely that many terms will occur more than one time, and therefore the
multiple-Bernoulli model will be an accurate model. However, more often than