Step-7-Choosing components and forming a feature vector
This is where the notion of data compression and reduced
dimensionality comes into play.
If we look at the eigenvectors and eigenvalues from the
previous section, we will observe that the eigenvalues are
quite different in magnitude.
In fact, it turns out that the eigenvector with the highest
eigenvalue is the principal component of the data set. What
needs to be done now is to form a feature vector.
This is constructed by taking the eigenvectors that we want
to keep from the list of eigenvectors and forming a matrix
with these eigenvectors in the columns:
Feature Vector = (eig_1 eig_2 eig_3 ... eig_n)
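To make this step concrete, here is a minimal sketch in Python with NumPy; the sample data, the choice of k, and the variable names are assumptions for illustration and do not come from the text:

import numpy as np

# Illustrative 2-D data set (an assumption for this sketch).
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                 [1.9, 2.2], [3.1, 3.0]])
data_adjust = data - data.mean(axis=0)   # mean-adjusted data

# Eigen-decomposition of the covariance matrix, as in the previous steps.
eig_vals, eig_vecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))

# Order the components by decreasing eigenvalue; the first is then the
# principal component. Keep the k eigenvectors we want as matrix columns.
order = np.argsort(eig_vals)[::-1]
k = 1
feature_vector = eig_vecs[:, order[:k]]  # eigenvectors in the columns

Dropping the columns with the smallest eigenvalues is exactly where the compression happens: the discarded directions carry the least variance.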
Step-8-Deriving the new data set
This is the final step in PCA, and it is also the easiest.
Once we have chosen the components (eigenvectors) that we
wish to keep in our data and formed a feature vector, we
simply take the transpose of the feature vector and multiply
it on the left of the transposed original data set:
Final Data = Row Feature Vector * Row Data Adjust
where Row Feature Vector is the matrix with the
eigenvectors in the columns transposed so that the
eigenvectors are now in the rows, with the most significant
eigenvector at the top, and Row Data Adjust is the
mean-adjusted data transposed, i.e. the data items are in
each column, with each row holding a separate dimension [7].
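The multiplication itself is a single matrix product. Here is a standalone NumPy sketch of this step; the random data and variable names are again assumptions, chosen only to mirror the text's Row Feature Vector and Row Data Adjust:

import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(10, 3))           # 10 samples, 3 dimensions
data_adjust = data - data.mean(axis=0)    # mean-adjusted data

# Feature vector from Step 7: top-2 eigenvectors of the covariance matrix.
eig_vals, eig_vecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))
feature_vector = eig_vecs[:, np.argsort(eig_vals)[::-1][:2]]

row_feature_vector = feature_vector.T     # eigenvectors now in the rows
row_data_adjust = data_adjust.T           # each row holds one dimension
final_data = row_feature_vector @ row_data_adjust
print(final_data.shape)                   # (2, 10): two components per sample

Each column of final_data is one original data item expressed in terms of the chosen eigenvectors rather than the original axes.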