Summary: When collecting usability metrics, testing 20 users typically offers a reasonably tight confidence interval.
We can define usability in terms of quality metrics, such as learning time, efficiency of use, memorability, user errors, and subjective satisfaction. Sadly, few projects collect such metrics because doing so is expensive: it requires four times as many users as simple user testing.
Many users are required because of the substantial individual differences in user performance. When you measure people, you'll always get some who are really fast and some who are really slow. Given this, you need to average these measures across a fairly large number of observations to smooth over the variability.
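To make the effect of averaging concrete, here is a minimal sketch of how the margin of error around a measured mean shrinks as more users are tested. It rests on assumptions that are not from any particular study: the spread of user performance (`ASSUMED_CV`, the standard deviation as a fraction of the mean) is a hypothetical value, the 90% confidence level is an arbitrary choice, and the function `relative_margin` uses a normal approximation, which somewhat understates the margin for very small samples.

```python
import math
from statistics import NormalDist

# Hypothetical spread of user performance: standard deviation as a
# fraction of the mean. Replace with a value from your own data.
ASSUMED_CV = 0.5


def relative_margin(n_users, cv, confidence=0.90):
    """Approximate half-width of the confidence interval for the mean,
    expressed as a fraction of the mean (normal approximation)."""
    if n_users < 2:
        raise ValueError("need at least 2 users to estimate variability")
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # The standard error of the mean shrinks with the square root of n.
    return z * cv / math.sqrt(n_users)


if __name__ == "__main__":
    for n in (5, 10, 20, 40, 80):
        margin = relative_margin(n, ASSUMED_CV)
        print(f"{n:>3} users: +/- {margin:.0%} of the mean")
```

Because the margin depends on the square root of the number of users, testing four times as many users only halves it. Under these assumptions, that is why tightening a measurement gets expensive quickly.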