3 The power (the probability of correctly detecting the difference as significant).
4 The design itself (how well it controls experimental error).
5 The number of replicates tested (the number of units per treatment).
Precise details of how to decide upon the appropriate size of experiment are discussed in
a variety of books, and the reader is referred to Desu (1990). However, it is worth
stressing that an over-precise experiment is just as much to be avoided as an under-precise
one. An experiment that can detect differences between treatment population
means that are too small to be of any practical (as opposed to statistical) significance is
wasteful of valuable resources that could otherwise have been put to better use.
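The point can be made concrete with a rough calculation. The sketch below is not taken from the text: it assumes a comparison of two treatment means with equal group sizes, a known common standard deviation and the usual normal approximation, and the function name and example numbers are illustrative only. It estimates the smallest true difference that a design with a given number of replicates would detect with the stated power.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(n_per_group, sd, alpha=0.05, power=0.80):
    """Approximate smallest true difference in means detectable with the
    given power, for a two-sided test on two equal-sized groups.

    Assumes a known common standard deviation and a normal approximation.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


# Hypothetical outcome measure with a standard deviation of 10 units.
for n in (8, 32, 512):
    print(n, round(detectable_difference(n, sd=10.0), 2))
```

Under these assumptions, 512 units per group would pick up a difference of under 2 units, which justifies the cost only if a difference that small matters in practice.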
Determining sample size is a compromise between available resources, power,
expected variability of the outcome measure and effect size. The first two on this list are
usually known or specified, but the latter two are not. Once you have gathered as much
information as possible (How important is the question? Has it been tackled before? etc.),
the statistician will propose some values. These should not be seen as
final, however, but rather as an opening bid in a bargaining procedure: the process is
iterative, not one-way. At the end of a consultation, the experimenter should have a
range of options; the more uncertain the inputs, the wider this range will be.
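One way to prepare for this bargaining is to tabulate the group size required under several plausible assumptions. The following is a minimal sketch, not the procedure of any particular reference: it uses the standard normal-approximation formula for comparing two means, and the function name, the 5% significance level, the 80% power and the candidate values are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate units per group needed to detect `difference` between
    two means (two-sided test, equal groups, normal approximation)."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_total * sd / difference) ** 2)


# Hypothetical candidate values for the two unknown quantities:
# the variability of the outcome and the effect size of interest.
for sd in (8.0, 10.0, 12.0):
    for difference in (5.0, 10.0):
        print(f"sd={sd:4.1f}  difference={difference:4.1f}  "
              f"n per group={n_per_group(difference, sd)}")
```

The normal approximation slightly understates the requirement for small groups; a calculation based on the t distribution would typically add a unit or two per group.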
In designing experiments, we should allow for involuntary censoring of the sample.
In other words, a study might start off with just enough units for analysis, but leave no
safety margin should any unit be withdrawn before the end of the experiment. Starting with only just
enough experimental units per group frequently leaves too few at the end to allow
meaningful statistical analysis, and allowances should be made accordingly in
establishing group size.
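A simple way to build in such an allowance is to inflate the required group size by the proportion of units expected to be lost. This is a minimal sketch under the assumption that losses are a fixed, roughly predictable fraction of each group; the function name and the figures are hypothetical.

```python
from math import ceil


def starting_group_size(n_required, expected_loss):
    """Units to start with per group so that about `n_required` remain
    when a fraction `expected_loss` is withdrawn before the end."""
    if not 0 <= expected_loss < 1:
        raise ValueError("expected_loss must be in [0, 1)")
    return ceil(n_required / (1 - expected_loss))


# Hypothetical: 16 units needed for analysis, 10% expected to be lost.
print(starting_group_size(16, 0.10))
```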
4.4 Statistical Analysis
It is important to realize from the outset that the observations are usually a sample from
the set of all possible outcomes of the experiment (the population). A sample is taken
because it is too expensive and time-consuming to take all possible measurements.
Statistics are based on the idea that the sample will be ‘typical’ in some way and that it
will enable us to make predictions about the whole population.
The data usually consist of a series of measurements on some feature of an
experimental situation or on some property of an object. The phenomenon being
investigated is usually called the variate. It is useful to distinguish between types of
observation:
1 Nominal: measurements at various unordered discrete levels; examples are blood type and
hair colour.
2 Ordinal: measurements at various ordered discrete levels; examples are clinical
observations on animals and severity of lesions.
3 Continuous: measurements which can assume a continuous uninterrupted range of
values; examples are weight and blood pressure.
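The distinction matters even for simple summaries. The sketch below uses invented observations (none of the values come from the text) to show which basic summary is conventionally regarded as meaningful for each type; the choice of summaries is a common rule of thumb, not a prescription from this chapter.

```python
from collections import Counter
from statistics import mean, median_low

# Hypothetical observations, invented for illustration.
blood_type = ["A", "O", "O", "B", "A", "O"]       # nominal: labels only
lesion_grade = [0, 1, 1, 2, 3, 1]                 # ordinal: 0 = none ... 3 = severe
weight_g = [21.4, 19.8, 22.1, 20.5, 21.0, 20.2]   # continuous

# Nominal: only counts per category (and the mode) are meaningful.
print(Counter(blood_type).most_common())

# Ordinal: order matters, so a median is meaningful; the spacing between
# grades is not assumed equal, so a mean is avoided here.
print(median_low(lesion_grade))

# Continuous: arithmetic summaries such as the mean are meaningful.
print(round(mean(weight_g), 2))
```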
Strictly, each type of response requires a different sort of statistical technique. Methods
for nominal data can be used for ordinal or continuous data by defining categories, but