Numerous situations exist in which a set of judges rates a set of objects. Common professional situations in which this occurs are certain types of athletic competitions (figure skating, diving) in which performance is measured not by the clock but by "form" and "artistry," and consumer product evaluations, such as those conducted by Consumer Reports, in which a large number of different brands of certain items (e.g., gas barbecue grills and air conditioners) are compared for performance.2 All of these situations are characterized by the fact that a truly "objective" measure of quality is missing, and thus quality can be assayed only on the basis of the (subjective) impressions of judges.
The tasting of wine is, of course, an entirely analogous situation. While there are objective predictors of the quality of wine,3 which utilize variables such as sunshine and rainfall during the growing season, they would be difficult to apply to a sample of wines representing many small vineyards exposed to identical weather conditions, such as might be the case in Burgundy, and would not in any event be able to predict the impact on wine quality of a faulty cork. Hence, wine tasting is an important example in which judges rate a set of objects.
In principle, ratings can be either "blind" or "not blind," although it may be difficult to imagine how a skating competition could be judged without the judges knowing the identities of the contestants. But whenever possible, blind ratings are preferable, because they remove one important aspect of inter-judge variation that most people would claim is irrelevant, and in fact harmful to the results, namely "brand loyalty." Thus, wine bottles are typically covered in blind tastings or wines are decanted, and identified only with code names such as A, B, etc.4 But even blind tastings do not remove all sources of unwanted variation. When we ask judges to take a position as to which wine is best, second best, and so on, we cannot control for the fact that some people like tannin more than others, or that some are offended by traces of oxidation more than others. Another source of variation is that one judge might rate a wine on the basis of how it tastes now, while another judge rates the wine on how he or she thinks the wine might taste at its peak.5
Wine tastings can generate data from which we can learn about the characteristics of both the wines and the judges. In Section 2, we concentrate on what the ratings of wines can tell us about the wines themselves, while in Section 3 we deal with what the ratings can tell us about the judges. Both sets of questions are interesting and can utilize straightforward statistical procedures.