In theory, the sum of squares of log returns sampled at high frequency estimates their
variance. When market microstructure noise is present but unaccounted for, however,
we show that the optimal sampling frequency is finite and derive its closed-form
expression. But even with optimal sampling, using, say, 5-min returns when transactions
are recorded every second, a vast amount of data is discarded, in contradiction
to basic statistical principles. We demonstrate that modeling the noise and using all
the data is a better solution, even if one misspecifies the noise distribution. So the
answer is: sample as often as possible.
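
As a rough illustration of the first point, the sketch below simulates one day of one-second log prices and compares the naive sum of squared returns at two sampling intervals. It is a minimal simulation under assumptions that are ours, not taken from the text: a driftless Gaussian log price with constant volatility, i.i.d. Gaussian noise added to each observation, and illustrative parameter values (1% daily volatility, noise standard deviation of 5 basis points, 23,400 observations). Under these assumptions the expected sum of squared returns over n intervals is sigma^2 T + 2 n a^2, where a^2 is the noise variance, so the noise term grows with the number of returns used. The function names are hypothetical, and the sketch does not implement the noise-modeling estimator advocated above.

```python
import math
import random


def simulate_observed_log_prices(n, sigma, noise_sd, horizon=1.0, seed=0):
    """Return n + 1 noisy log prices observed on an even grid over `horizon`.

    Assumed model (illustrative only): the latent log price is a driftless
    Gaussian random walk with volatility `sigma` per unit of time, and each
    observation adds independent Gaussian noise with std dev `noise_sd`.
    """
    rng = random.Random(seed)
    dt = horizon / n
    latent = 0.0
    observed = []
    for _ in range(n + 1):
        observed.append(latent + rng.gauss(0.0, noise_sd))
        latent += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return observed


def realized_variance(log_prices, step):
    """Sum of squared log returns, keeping every `step`-th observation."""
    sampled = log_prices[::step]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))


# Illustrative parameters (assumptions, not values from the text).
N_OBS = 23_400        # one observation per second over a 6.5-hour day
SIGMA = 0.01          # 1% volatility over the day
NOISE_SD = 0.0005     # 5 basis points of observation noise
TRUE_VARIANCE = SIGMA ** 2  # integrated variance over a horizon of 1 day

prices = simulate_observed_log_prices(N_OBS, SIGMA, NOISE_SD)
rv_1s = realized_variance(prices, step=1)      # uses every observation
rv_5min = realized_variance(prices, step=300)  # keeps one price per 5 minutes

print(f"true variance      : {TRUE_VARIANCE:.6f}")
print(f"naive RV, 1-second : {rv_1s:.6f}")
print(f"naive RV, 5-minute : {rv_5min:.6f}")
```

In this setup the one-second estimate is dominated by the noise term, while the five-minute estimate is much less biased but is computed from only 78 returns. This is the trade-off that makes the optimal frequency finite when the noise is left unmodeled.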