Language can be divided into pieces at a variety of grain sizes, ranging from sounds up
to utterances and even to documents. In this chapter, we will focus on a level that is central to much
work in computational linguistics: words. Just what are words, and how should we represent them in a
machine? These may seem like trivial questions at first, but defining and encoding words raises some
important issues.