We adopted a proportional stratified random sampling design to
assign samples to estimation and validation datasets, with
k-means clusters serving as the individual strata for sample selection.
Sample 95% confidence ellipses fitted to each of the two selected
samples showed that the estimation and validation datasets covered
approximately equal regions of the dimensionally reduced data space.
The similarity between estimation and validation datasets guaranteed
that statistical measures of model prediction accuracy obtained from
validation datasets would not be adversely affected by extrapolation
errors. Stratified random sampling also ensured that both samples
were uniformly dispersed across all regions of the PC (synthetic X)
space. MRPP results indicated no statistical difference (A = −0.003,
p > 0.999) between estimation and validation datasets with respect to
the original 45 LiDAR canopy height and density metrics.
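
The proportional stratified selection described above can be illustrated with a minimal sketch. It is not the code used in this study. The function name, the 50% validation fraction, the random seed, and the example cluster labels are all hypothetical, and the cluster labels are assumed to come from a k-means run performed beforehand.

```python
import random
from collections import defaultdict


def proportional_stratified_split(strata, validation_fraction=0.5, seed=0):
    """Split sample indices into estimation and validation sets.

    strata: one stratum label per sample (e.g. a k-means cluster id).
    Every stratum contributes the same fraction of its members to the
    validation set, so both sets keep the strata proportions.
    """
    rng = random.Random(seed)

    # Group sample indices by stratum label.
    members = defaultdict(list)
    for index, label in enumerate(strata):
        members[label].append(index)

    estimation, validation = [], []
    for label in sorted(members):
        indices = members[label][:]
        rng.shuffle(indices)  # simple random sampling within the stratum
        # Python's round() sends exact halves to the nearest even integer.
        n_val = round(len(indices) * validation_fraction)
        validation.extend(indices[:n_val])
        estimation.extend(indices[n_val:])
    return sorted(estimation), sorted(validation)


# Hypothetical labels: 20 samples in three strata of unequal size.
labels = [0] * 10 + [1] * 6 + [2] * 4
est, val = proportional_stratified_split(labels, validation_fraction=0.5, seed=42)
```

With these example labels, each stratum supplies half of its members to each dataset, so strata of 10, 6, and 4 samples are represented 5, 3, and 2 times in both the estimation and the validation set.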