When text is modeled as a finite sequence of words, where at each point in the sequence there are t different possible words, this corresponds to assuming a multinomial distribution over words. Although there are alternatives, multinomial language models are the most common in information retrieval. One of the limitations of multinomial models that has been pointed out is that they do not describe text burstiness well, which is the observation that once a word is “pulled out of the bucket,” it tends to be pulled out repeatedly.