Frequency Polygons: Theory and Application
In this article I investigate the theoretical properties and applications of the frequency polygon, which is constructed by connecting with straight lines the mid-bin values of a histogram. For estimating an unknown probability density function using a random sample, the frequency polygon is shown to dominate the histogram with respect to the criterion of integrated mean squared error, achieving the same rate of convergence to zero of the integrated mean squared error as non-negative kernel estimators. Data-based algorithms for constructing frequency polygons are discussed and illustrated. One is based on a histogram with bin width equal to $2.15\,s\,n^{-1/5}$, where $s$ is an estimate of the standard deviation from a sample of size $n$. Another approach is based on the method of generalized cross-validation. The bivariate frequency polygon is also investigated. Comparisons are made between the frequency polygon and other density estimators. Examples with data in one and two dimensions are presented.
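As a rough illustration of the first data-based rule, the sketch below builds a univariate frequency polygon from a histogram whose bin width is 2.15 s n^(-1/5). It is a minimal sketch under stated assumptions, not the article's own algorithm: the function name `frequency_polygon`, the choice to start the bin grid at the sample minimum, and the use of the sample standard deviation (`ddof=1`) for s are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def frequency_polygon(sample, bin_width=None):
    """Return (x, y) vertices of a frequency polygon density estimate.

    Hypothetical helper for illustration; assumes a non-constant 1-D sample.
    """
    x = np.asarray(sample, dtype=float)
    n = x.size
    if bin_width is None:
        # Bin width rule quoted in the abstract: 2.15 * s * n^(-1/5).
        s = x.std(ddof=1)  # assumption: s is the sample standard deviation
        bin_width = 2.15 * s * n ** (-1 / 5)

    # Assumption: the bin grid starts at the sample minimum.
    # floor(...) + 1 bins guarantees the last edge lies beyond the maximum.
    n_bins = int(np.floor((x.max() - x.min()) / bin_width)) + 1
    edges = x.min() + bin_width * np.arange(n_bins + 1)

    # Histogram heights on the density scale.
    counts, _ = np.histogram(x, bins=edges)
    heights = counts / (n * bin_width)

    # Connect the mid-bin values with straight lines; one empty bin is added
    # at each end so the polygon returns to zero.
    mids = edges[:-1] + bin_width / 2
    xv = np.concatenate(([mids[0] - bin_width], mids, [mids[-1] + bin_width]))
    yv = np.concatenate(([0.0], heights, [0.0]))
    return xv, yv

def evaluate(xv, yv, t):
    """Evaluate the polygon at points t by linear interpolation (zero outside)."""
    return np.interp(t, xv, yv, left=0.0, right=0.0)
```

Because the vertices are equally spaced and the end vertices are zero, the area under the polygon equals the total histogram area, so the estimate integrates to one. A call such as `xv, yv = frequency_polygon(data)` followed by `evaluate(xv, yv, t)` gives density values at arbitrary points `t`.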