Another way of (implicitly) standardizing the data is by using the correlation
between the objects instead of distance measures. For example, suppose a respon-
dent rated price consciousness 2 and brand loyalty 3. Now suppose a second
respondent indicated 5 and 6, whereas a third rated these variables 3 and 3. Eucli-
dean,city-block,andChebychevdistanceswouldindicatethatthefirstrespondentis
more similar to the third than to the second. Nevertheless, one could convincingly
argue that the first respondent’s ratings are more similar to the second’s, as both rate
brand loyalty higher than price consciousness. This can be accounted for by com-
puting the correlation between two vectors of values as a measure of similarity (i.e.,
high correlation coefficients indicate a high degree of similarity). Consequently,
similarity is no longer defined by means of the difference between the answer
categories but by means of the similarity of the answering profiles. Using correlation
is also a way of standardizing the data implicitly.
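The contrast between the two notions of similarity can be verified with a short computation. The following sketch (in plain Python, using only the three hypothetical respondents above) shows that all three distance measures place the first respondent closer to the third, while the answering profiles of the first and second respondents are perfectly correlated. Note that with only two variables the Pearson correlation is degenerate: it is ±1 whenever both vectors vary, and undefined for the third respondent's constant profile, so the example is purely illustrative.

```python
import math

# Ratings on (price consciousness, brand loyalty) for the three respondents
r1, r2, r3 = (2, 3), (5, 6), (3, 3)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def pearson(a, b):
    # Pearson correlation; undefined (division by zero) if a vector is constant
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# All three distance measures say respondent 1 is closer to respondent 3 ...
for dist in (euclidean, city_block, chebychev):
    assert dist(r1, r3) < dist(r1, r2)

# ... yet the answering profiles of respondents 1 and 2 agree perfectly:
# both rate brand loyalty one point higher than price consciousness
print(round(pearson(r1, r2), 6))  # 1.0
```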
Whether you use correlation or one of the distance measures depends on whether
you think the relative magnitude of the variables within an object (which favors
correlation) matters more than the relative magnitude of each variable across
objects (which favors distance). However, correlations are generally recommended
when applying clustering procedures that are susceptible to outliers, such as
complete linkage, average linkage, or centroid linkage (see next section).
Whereas the distance measures presented thus far can be used for metrically and –
in general – ordinally scaled data, applying them to nominal or binary data is
meaningless. For such data, you should instead select a similarity measure that
expresses the degree to which the variables’ values share the same category. These
so-called matching coefficients can take different forms but rely on the same allocation
scheme shown in Table 9.5.
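To illustrate how such an allocation scheme works, the following sketch (in Python, with hypothetical binary profiles) counts the four cells — both objects showing the attribute, only one of them, or neither — and computes two widely used matching coefficients, the simple matching coefficient and the Jaccard coefficient. The specific profiles are invented for illustration.

```python
# Hypothetical binary profiles for two objects (1 = attribute present)
x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]

# Allocation scheme: a = both 1, b = only x is 1, c = only y is 1, d = both 0
a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)

# The simple matching coefficient counts joint absences (d) as agreement ...
simple_matching = (a + d) / (a + b + c + d)
# ... whereas the Jaccard coefficient ignores them
jaccard = a / (a + b + c)

print(simple_matching, jaccard)  # 0.6 0.5
```

The coefficients differ only in how they treat joint absences, which is why the choice between them depends on whether a shared 0 (neither object has the attribute) should count as evidence of similarity.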