Cluster analysis is a method for grouping observations based on their similarity (statistical distance) across several variables. A thorough explanation of the theory and practice of cluster analysis is found in Green (1978). In this application, several accounting variables are selected based on their likelihood of having a strong relationship with returns to equity holders. Our procedure is as follows: principal components analysis is performed on the accounting data to obtain a set of orthogonal, maximal-variance linear combinations of the original variables. Use of the standardized principal components eliminates scale and inter-correlation effects in the data. These new representations of the data set are used in the subsequent cluster analysis. There is no data reduction: the number of principal components analyzed in the cluster analysis is the same as the number of accounting variables selected, and all of the firms in the sample are used in the cluster analysis. The clusters
are formed using Ward’s minimum variance approach, so that each merge adds the least possible within-cluster variance on the principal components. Each cluster, or portfolio, of firms can then serve as a proxy for any one of its member firms; in particular, the other firms in the target's cluster are used to estimate the equity cost of capital for the target. Clustering continues until the target is in a cluster with 11 or 12 other firms. This number was selected arbitrarily to provide a useful sample size for comparisons. The clustering algorithm itself continues until all of the firms are united in a single cluster containing the entire data set. Because the first clusters formed contain the firms whose data are statistically closest to each other, they should in theory provide the best proxies. These smaller clusters may give more statistically similar proxy portfolios, but they are also more sensitive to outliers. We also tested the results using smaller clusters of five to six firms and did not find a significant difference in outcomes.
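
The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `standardized_principal_components` and `proxy_cluster`, the `min_peers` parameter, and the synthetic data are all hypothetical, and NumPy and SciPy are assumed to be available. Because an agglomerative merge can jump past a given cluster size, the sketch stops at the first merge that leaves the target with at least 11 other firms, which is only one possible reading of the stopping rule.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def standardized_principal_components(X):
    """Return scores on ALL principal components, each scaled to unit variance.

    X is an (n_firms, n_variables) array of accounting data. No components
    are dropped, so the output has the same shape as X.
    """
    # Standardize each accounting variable to remove scale effects.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: Z = U @ diag(S) @ Vt.
    # Component scores are U * S; dividing each by its standard deviation,
    # S / sqrt(n - 1), leaves U * sqrt(n - 1).
    U, _S, _Vt = np.linalg.svd(Z, full_matrices=False)
    return U * np.sqrt(Z.shape[0] - 1)


def proxy_cluster(scores, target, min_peers=11):
    """Return the indices of the firms clustered with `target`.

    Ward's linkage is replayed merge by merge; the function stops at the
    first merge after which the target shares a cluster with at least
    `min_peers` other firms (an assumed stopping rule).
    """
    n = scores.shape[0]
    merges = linkage(scores, method="ward")
    # Every firm starts in its own cluster; merge i creates cluster n + i.
    members = {i: [i] for i in range(n)}
    for step, (a, b, _dist, _count) in enumerate(merges):
        merged = members.pop(int(a)) + members.pop(int(b))
        members[n + step] = merged
        if target in merged and len(merged) - 1 >= min_peers:
            return sorted(i for i in merged if i != target)
    return []  # fewer than min_peers other firms in the sample


# Synthetic stand-in for the accounting data: 40 firms, 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))

scores = standardized_principal_components(X)
peers = proxy_cluster(scores, target=0)
print(len(peers), peers)
```

Setting `min_peers=5` in the call to `proxy_cluster` gives the smaller proxy portfolios mentioned in the robustness check.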