A typical speech recognition system has five components:
1. Speech capture device - Analog-to-digital converter
2. Digital signal processor - Detects word boundaries,
scales and filters the signal, and discards portions not needed for recognition
3. Preprocessed signal storage - Processed speech
buffered for recognition algorithm
4. Reference speech patterns - Stored templates or
generative speech models for comparisons
5. Pattern matching algorithm - Measures goodness of fit
between the templates/model and the user’s speech
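
The five components can be sketched as a small pipeline. This is only an illustration under stated assumptions, not how any particular recognizer works: the samples are taken as already digitized (component 1), frame energy stands in for real acoustic features, an energy threshold stands in for word-boundary detection, and dynamic time warping (DTW) is used as one possible goodness-of-fit measure. All function names, the frame length, and the threshold are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Component 2: mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def trim_silence(energies, threshold):
    """Component 2: crude word-boundary detection.

    Keeps the span from the first to the last frame whose energy
    exceeds the (assumed) threshold.
    """
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return []
    return energies[active[0]:active[-1] + 1]


def dtw_distance(a, b):
    """Component 5: DTW distance between two feature sequences.

    Lower is a better fit; 0.0 means one sequence can be warped
    onto the other exactly.
    """
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]


def recognize(samples, templates, frame_len=4, threshold=0.01):
    """Return the label of the best-fitting template, or None if no speech.

    `templates` plays the role of component 4: a dict mapping a label
    to a stored feature sequence.
    """
    # Component 3: the processed features are buffered in a list.
    features = trim_silence(frame_energies(samples, frame_len), threshold)
    if not features:
        return None
    return min(templates, key=lambda w: dtw_distance(features, templates[w]))


# Toy usage: silence, a quiet burst, silence.
samples = [0.0] * 4 + [0.5] * 8 + [0.0] * 4
templates = {"loud": [1.0, 1.0], "soft": [0.25, 0.25]}
best = recognize(samples, templates)
```

In the toy usage the quiet burst has frame energy 0.25, so it fits the "soft" template exactly. A generative model (the other option named in component 4) would replace the DTW distance with a likelihood score.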