Documents often contain pairs of words that frequently appear together. Going back to our example, the pair "data" and "mining" may appear together several times in the same posting. Identifying such word (or word root) pairs lets us parse the document more intelligently, because the pair can be treated as a single term rather than two unrelated words. This technique is known as n-gramming, and it is not limited to 2-word pairs. You can also have 3-word terms such as "large data sets" or "interpret data models", and so on.
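As a rough illustration of the idea, here is a minimal Python sketch that slides a window of n words over a text and counts how often each word sequence occurs. The sample `posting` string and the helper names `ngrams` and `count_ngrams` are made up for this example, and the tokenizer is a deliberately simple assumption (lowercase, split on non-alphanumeric characters, no stemming to word roots).

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(text, n):
    """Lowercase the text, split it into simple word tokens, and count n-grams.

    Simplification: the window runs across sentence boundaries, so a pair
    such as ("skills", "data") is counted even though a period separates
    the two words.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(ngrams(tokens, n))


# Hypothetical job posting text, invented for illustration.
posting = (
    "We need data mining skills. Data mining of large data sets "
    "helps us interpret data models."
)

bigram_counts = count_ngrams(posting, 2)   # 2-word pairs
trigram_counts = count_ngrams(posting, 3)  # 3-word terms

print(bigram_counts[("data", "mining")])
print(trigram_counts[("large", "data", "sets")])
```

Pairs with a high count relative to the rest are candidates for treating as a single term. In practice you would probably also filter out pairs made up of very common words and apply stemming first, so that variants of the same word root are counted together.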