While sampling uncertainty may be easily recognized as an important concept in itself, its consequences for the interpretation of observed findings are often neglected. For example, in a randomized clinical trial with patients who had a single chronic symptomatic cartilage defect on the femoral condyle⁴, 40 patients were treated with
Fig. 1. A super-population and a finite population of osteoporosis patients (symbolized by black and white dots; white dots representing some clinically meaningful characteristic; the finite population defined by the continuous frame). Two random samples of 50 patients each (the gray rectangles) have been drawn from the finite population. While the finite population has 10% white dots, sampling uncertainty is manifested in the sampled units by the varying proportion of white dots, one sample having 8% (4 of 50), the other one 16% (8 of 50). As the super-population is infinitely large it cannot be described graphically, but if the two finite samples can be assumed to be randomly sampled from the super-population, its proportion of white dots can be estimated as 8% (2.2%–19.2%) and 16% (7.2%–29.1%), respectively. The ranges within brackets are 95% confidence intervals describing the sampling uncertainty.
autologous chondrocyte implantation, and 40 were treated with microfracture. At the 5-year follow-up, there were nine failures in each of the two groups. The authors concluded that “there was no significant difference”. However, even if no difference can be observed in the sample, there may well be one in the population. Sampling of patients with individually varying failure rates may prevent a true average treatment difference from being observed in the sample. The sampling uncertainty can be described by a 95% confidence interval for the estimated true failure rate ratio. Using the data presented in the paper, the interval can be calculated as (0.3–3.2), which implies that the observed data speak only against the average failure rate with one treatment being more than about three times that with the other. A claim that the two treatments have the same failure rates would thus not have much empirical basis.
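The interval calculations discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the text: the function names are hypothetical, the exact (Clopper-Pearson) method is assumed for the proportions in Fig. 1, and a log-scale normal approximation (the Katz method) is assumed for the failure rate ratio. The text does not say how the interval (0.3–3.2) was obtained, so the ratio interval from this sketch need not match it.

```python
import math
from statistics import NormalDist


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve_increasing(g, target, tol=1e-12):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, level=0.95):
    """Exact confidence interval for a proportion (assumed method for Fig. 1)."""
    alpha = 1 - level
    # Lower limit: the p for which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: binom_sf(k, n, p), alpha / 2)
    # Upper limit: the p for which P(X <= k) equals alpha/2,
    # i.e. P(X >= k + 1) equals 1 - alpha/2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: binom_sf(k + 1, n, p), 1 - alpha / 2)
    return lower, upper


def risk_ratio_ci(a, n1, b, n2, level=0.95):
    """Ratio of two failure proportions with a log-scale (Katz) interval.

    Assumed method; other methods give somewhat different limits.
    """
    rr = (a / n1) / (b / n2)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


# The two samples of 50 patients from Fig. 1.
for k in (4, 8):
    lo, hi = clopper_pearson(k, 50)
    print(f"{k} of 50: {100 * lo:.1f}% to {100 * hi:.1f}%")

# Nine failures among 40 patients in each treatment group.
rr, lo, hi = risk_ratio_ci(9, 40, 9, 40)
print(f"failure rate ratio {rr:.2f} ({lo:.2f} to {hi:.2f})")
```

Under these assumptions the proportion intervals come out close to those quoted in Fig. 1, while the ratio interval is roughly 0.44 to 2.26, narrower than the one given in the text. Either interval is wide enough to be compatible with a substantial treatment difference in the population, which is the point of the example.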