3) Principal Component Analysis (PCA): The 67 features
extracted from the data produce a 67-dimensional space.
However, not all the features contribute equally to overall
data variance. To reduce dimensionality while retaining as
much of the data variance as possible in the lower-dimensional
space, PCA is used. The eigenvalue decomposition of the
covariance matrix yields principal components that are ranked
in order of their contribution to overall data variance.
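As an illustration of the eigenvalue decomposition described above, the following minimal sketch computes the covariance matrix of a feature matrix and ranks the resulting components by their share of total variance. The function name `pca_explained_variance`, the synthetic data, and the sample count of 200 are assumptions for illustration only; whether the features were standardized before PCA is not stated here, so the sketch only mean-centers them.

```python
import numpy as np

def pca_explained_variance(X):
    """Eigen-decompose the covariance of X (rows = samples, columns = features).

    Returns eigenvalues in descending order, the matching eigenvectors
    (as columns), and the fraction of total variance per component.
    """
    Xc = X - X.mean(axis=0)                 # mean-center each feature
    cov = np.cov(Xc, rowvar=False)          # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort to descending variance
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
    eigvecs = eigvecs[:, order]
    ratios = eigvals / eigvals.sum()
    return eigvals, eigvecs, ratios

# Synthetic stand-in data (NOT the study data): 200 samples, 67 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 67)) * np.linspace(3.0, 0.1, 67)

eigvals, eigvecs, ratios = pca_explained_variance(X)
projected = (X - X.mean(axis=0)) @ eigvecs[:, :11]  # scores on the first 11 components
```

Each column of `eigvecs` is a linear combination of the original features, so the entries of `ratios` describe the variance explained by components rather than by individual features.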
4) Feature selection: Using the results of the PCA,
features are selected for use in classification. Of the 67
features, the first 11 principal components are selected for
the first classification task of distinguishing between PD and
control. These 11 features cumulatively account for 80.2% of
the variance in the data. Each additional feature contributes
less than 2.3% of the data variance, with contributions
decreasing to 0.0%. An additional four features are
selected for the second classification task of characterizing
parkinsonian gait, for a total of 15 features. All selected
features for both classification tasks are listed in Table I.
These features include measures of statistical variability,
quantified using the standard deviation (SD). The (+) or (-)
next to each feature name indicates whether PD patients exhibit
higher or lower values for that feature, respectively, relative
to controls. These features show which components of
gait are most significant in characterizing parkinsonian gait.
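The selection step above can be sketched as a cutoff on cumulative explained variance. The stopping rule actually used is not stated here, so the threshold-based rule below is an assumption, as are the function name `n_components_for_variance` and the example variance ratios, which are invented for illustration and are not the study's values.

```python
import numpy as np

def n_components_for_variance(ratios, threshold):
    """Smallest number of leading components whose cumulative
    explained-variance ratio reaches `threshold` (e.g. 0.8).

    `ratios` must be sorted in descending order and sum to about 1.
    """
    cumulative = np.cumsum(np.asarray(ratios, dtype=float))
    # Index of the first cumulative value >= threshold, converted to a count
    count = int(np.searchsorted(cumulative, threshold, side="left")) + 1
    return min(count, len(cumulative))  # guard against thresholds above the total

# Hypothetical ratios for a 5-component example (not the study's data)
example_ratios = [0.50, 0.30, 0.10, 0.06, 0.04]
k = n_components_for_variance(example_ratios, 0.85)
```

With a rule of this kind, the number of retained components follows from the chosen threshold rather than being fixed in advance, so the marginal contribution of the first excluded component gives a quick check on whether the cutoff is reasonable.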