WHY SAMPLE?
Surveys are conducted to gather information
about a population. Sometimes the survey is
conducted as a census, where the goal is to
survey every unit in the population. However,
it is frequently impractical or impossible to
survey an entire population, perhaps owing
to either cost constraints or some other
practical constraint, such as that it may not
be possible to identify all the members of the
population.
An alternative to conducting a census is
to select a sample from the population and
survey only those sampled units. As shown
in Figure 11.1, the idea is to draw a sample
from the population and use data collected
from the sample to infer information about
the entire population. To conduct statistical
inference (i.e., to be able to make quantitative
statements about the unobserved population
parameter), the sample must be drawn in such a
fashion that one can both calculate appropriate
sample statistics and estimate their standard
errors. To do this, as will be discussed in
this chapter, one must use a probability-based
sampling methodology.
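
As a minimal sketch of what "calculate appropriate sample statistics and estimate their standard errors" can look like in practice, the following Python example draws a simple random sample and estimates a population mean. Simple random sampling is only one probability-based design, chosen here for brevity. The function name `srs_estimate`, the made-up population values, and the sample size of 400 are illustrative assumptions, not taken from the chapter.

```python
import math
import random
import statistics


def srs_estimate(population, n, rng):
    """Estimate the population mean from a simple random sample of size n.

    Returns (sample_mean, estimated_standard_error).
    """
    big_n = len(population)
    # Every unit has the same known inclusion probability, n / N.
    sample = rng.sample(population, n)
    mean = statistics.fmean(sample)
    # Sample variance (n - 1 denominator).
    s2 = statistics.variance(sample)
    # Finite population correction: sampling error shrinks to zero as n -> N.
    fpc = 1 - n / big_n
    se = math.sqrt(fpc * s2 / n)
    return mean, se


rng = random.Random(42)
# Hypothetical population of 10,000 survey values (made up for illustration).
population = [rng.gauss(50, 10) for _ in range(10_000)]

mean, se = srs_estimate(population, 400, rng)
print(f"estimate: {mean:.2f}, standard error: {se:.2f}")
print(f"approx. 95% interval: {mean - 1.96 * se:.2f} to {mean + 1.96 * se:.2f}")
```

The standard error is computable here only because the selection mechanism is known: each unit had the same chance of entering the sample. With a sample recruited by an unknown mechanism, the same arithmetic could be carried out, but the result would not be a valid measure of uncertainty.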
A survey administered to a sample can
have a number of advantages over a census,
including:
• lower cost
• less effort to administer
• better response rates
• greater accuracy.
The advantages of lower cost and less
effort are obvious: keeping all else constant,
reducing the number of surveys should cost
less and take less effort to field and analyze.
It is less obvious that a survey based on a
sample rather than a census can give better
response rates and greater accuracy. Greater
accuracy can result when the sampling error
introduced by surveying only a sample is more
than offset by a decrease in nonresponse and
other biases, perhaps due to increased
response rates. That
is, for a fixed level of effort (or funding), a
sample allows the surveying organization to
put more effort into maximizing responses
from those surveyed, perhaps via more effort
invested in survey design and pre-testing,
or perhaps via more detailed nonresponse
follow-up.
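
One common way to formalize this trade-off, which the text does not spell out, is the mean squared error decomposition. The symbols are assumptions introduced here for illustration: \hat{\theta} is the survey estimate and \theta is the population quantity being estimated.

```latex
\operatorname{MSE}(\hat{\theta})
  = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
  = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2
```

Under this view, a census has no sampling variance but may carry a large nonresponse bias, whereas a sample adds some variance. If the effort saved by sampling reduces the bias enough, the sample-based estimate has the smaller total error.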
What does all of this have to do with
Internet-based surveys? Before the Internet,
large surveys were generally expensive to
administer and hence survey professionals
gave careful thought to how to best conduct
a survey in order to maximize information
accuracy while minimizing costs. However,
as illustrated in Figure 11.2, the Internet
now provides easy access to a plethora
of inexpensive survey software, as well as
to millions of potential survey respondents,
and it has lowered other costs and barriers
to surveying. While this is good news for
survey researchers, these same factors have
also facilitated a proliferation of bad
survey-research practice.
For example, in an Internet-based survey
the marginal cost of collecting additional data
can be virtually zero. At first blush, this seems
to be an attractive argument in favour of
attempting to conduct censuses, or for simply
surveying large numbers of individuals
without regard to how the individuals are
recruited into the sample. And, in fact, these
approaches are being used more frequently
with Internet-based surveys, without much
thought being given to alternative sampling
strategies or to the potential impact such
choices have on the accuracy of the survey
results. The result is a proliferation of poorly
conducted ‘censuses’ and surveys based on
large convenience samples that are likely to
yield less accurate information than a well-conducted
survey of a smaller sample.
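
The claim that a large convenience sample can be less accurate than a small probability sample can be illustrated with a short simulation. This is only a sketch under invented assumptions: a hypothetical population of 100,000 people, about 30% of whom hold some opinion, in which opinion holders are three times as likely to opt in to an open Web survey (30% versus 10%). None of these figures come from the chapter.

```python
import random

rng = random.Random(7)

# Hypothetical population: 100,000 people, about 30% of whom hold some opinion.
N = 100_000
population = [rng.random() < 0.30 for _ in range(N)]
true_share = sum(population) / N

# Convenience "sample": people opt in, and (by assumption) those holding the
# opinion are three times as likely to respond (30% versus 10%).
convenience = [y for y in population if rng.random() < (0.30 if y else 0.10)]
convenience_share = sum(convenience) / len(convenience)

# Probability sample: a simple random sample of only 500 people,
# assuming for simplicity that all of them respond.
srs = rng.sample(population, 500)
srs_share = sum(srs) / len(srs)

print(f"true share:           {true_share:.3f}")
print(f"convenience estimate: {convenience_share:.3f} (n = {len(convenience)})")
print(f"random-sample estimate: {srs_share:.3f} (n = {len(srs)})")
```

Under these assumptions the opt-in estimate settles near 0.09 / 0.16, or roughly 56%, no matter how many people respond, because the error comes from who responds rather than from how many. The random sample of 500 is noisier but centred on the true share.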
Conducting surveys, as in all forms of data
collection, requires making compromises.
Specifically, there are almost always tradeoffs
to be made between the amount of data
that can be collected and the accuracy of
the data collected. Hence, it is critical for
researchers to have a firm grasp of the tradeoffs
they implicitly or explicitly make when
choosing a sampling method for collecting
their data.