Using our “bucket” analogy for language models, we would need multiple
buckets to describe this process. For each document, we would have one bucket of
topics, with the number of instances of each topic depending on the distribution
of topics we had picked. For each topic, there would be another bucket containing
words, with the number of instances of the words depending on the probabilities
in the topic language model. Then, to generate a document, we first select a topic
from the topic bucket (still without looking), then go to the bucket of words for
the selected topic and pick out a word. This process is repeated for each
subsequent word in the document.
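As a minimal sketch, this two-bucket generative process can be simulated directly. The topics, words, and probabilities below are illustrative assumptions, not values from the text:

```python
import random

# Hypothetical distributions (assumptions for illustration): one bucket of
# topics for the document, and one bucket of words per topic.
topic_dist = {"sports": 0.7, "finance": 0.3}
word_dists = {
    "sports": {"game": 0.5, "team": 0.3, "score": 0.2},
    "finance": {"market": 0.6, "stock": 0.4},
}

def pick(bucket):
    """Draw one item from a {item: probability} bucket, i.e. reach in
    without looking, with frequencies matching the probabilities."""
    items = list(bucket)
    weights = [bucket[item] for item in items]
    return random.choices(items, weights=weights, k=1)[0]

def generate_document(n_words):
    """For each word position: pick a topic from the topic bucket, then
    pick a word from that topic's word bucket."""
    words = []
    for _ in range(n_words):
        topic = pick(topic_dist)               # first bucket: topics
        words.append(pick(word_dists[topic]))  # second bucket: words
    return words

print(generate_document(8))
```

Each call produces a different random document, but word frequencies across many generated documents will reflect the topic and word probabilities in the buckets.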