Different people speak differently:
accent, intonation, stress, idiom, volume, etc.
The syntax of semantically similar sentences may vary.
Background noises can interfere.
People often hesitate with fillers such as “ummm.....” and “errr.....”
Words are not enough: semantics are needed as well
requires intelligence to understand a sentence
context of the utterance often has to be known
also information about the subject and speaker
e.g. even if “Errr.... I, um, don’t like this” is recognised, it is a fairly useless piece of information on its own
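As a rough illustration of the last point, the sketch below strips hesitation fillers from an already recognised utterance. The pattern `FILLER` and the function `strip_fillers` are hypothetical names invented for this example, not part of any speech recognition system, and the filler list is an assumption that covers only a few English hesitation sounds.

```python
import re

# Assumed filler pattern (illustrative only): "um", "ummm", "er", "errr", "uh",
# optionally followed by trailing dots and a comma.
FILLER = re.compile(r"\b(?:u+m+|e+r+|u+h+)\b\.*,?\s*", re.IGNORECASE)

def strip_fillers(utterance: str) -> str:
    """Remove hesitation fillers and tidy up the remaining whitespace."""
    cleaned = FILLER.sub("", utterance)
    return re.sub(r"\s+", " ", cleaned).strip()

print(strip_fillers("Errr.... I, um, don't like this"))
# prints: I, don't like this
```

Even with the fillers removed, the result does not say what “this” refers to or who is speaking, so the context of the utterance is still needed. Note that a pattern this simple would also delete the genuine word “err”.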