significant
latent variables was determined using leave-many-out
cross-validation. The model was then used to predict the class variable of
the prediction samples. Tracking the prediction error as a function
of the number of latent variables showed that the best performance
(lowest classification error) was obtained with 7 PLS latent
variables. Fig. 2A shows the first 3 PLS factors
for both training and prediction samples. Clearly,
butter samples are separated from the vegetable oil
ones. However, there is severe overlap among the vegetable oils.
The classification results for the calibration and prediction samples
are summarized in Table S3. PLS-DA failed to discriminate
between the individual types of oil. None of the Canola
and Corn samples was assigned to its original group, which
corresponds to a 100% misclassification error for these classes. The best
results were obtained for butter samples, where 85% and 89% of the
samples in the calibration and prediction sets, respectively, were
correctly assigned to their respective group. As shown in
Fig. 2B and Table S3 (Supplementary Information), no
improvement was obtained when the IR spectra were pre-processed
by extended multiplicative scatter correction (EMSC).
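
The two quantities used in this paragraph, the per-class rate of correct assignment and the number of latent variables giving the lowest cross-validation error, can be illustrated with a minimal sketch. The function names, the labels and the error values below are hypothetical and chosen only for illustration; they are not the data or the software used in this work.

```python
from collections import Counter


def per_class_rates(true_labels, predicted_labels):
    """Return, for each class, the fraction of its samples assigned correctly."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: correct[cls] / n for cls, n in totals.items()}


def best_n_latent_variables(cv_errors):
    """Pick the number of latent variables with the lowest cross-validation
    error; ties are broken in favour of the smaller model."""
    return min(cv_errors, key=lambda k: (cv_errors[k], k))


# Hypothetical class labels (illustration only, not the paper's samples).
true = ["butter"] * 4 + ["canola"] * 2 + ["corn"] * 2
pred = ["butter", "butter", "butter", "corn",
        "corn", "corn", "canola", "canola"]

rates = per_class_rates(true, pred)
errors = {cls: 1.0 - rate for cls, rate in rates.items()}

# Hypothetical cross-validation errors keyed by number of latent variables.
toy_cv_errors = {1: 0.5, 2: 0.3, 3: 0.4}
n_best = best_n_latent_variables(toy_cv_errors)
```

In this toy case no canola or corn sample is assigned to its own class, so both classes have a misclassification error of 1.0 (100%), which is the situation reported above for the Canola and Corn groups.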