number of PLS latent variables was optimized separately. Fig. S2A
in the supplementary information shows the cross-validation prediction error
for each selected interval (bars) using the optimized number
of latent variables. The prediction error of the
full-spectrum model (line) is also shown in this figure. Clearly, interval
number 17, which corresponds to the wavenumber interval
1162–984 cm⁻¹, produced better results than the other
intervals. The results obtained with this interval are close to those of the
model built on the whole spectral region. The prediction errors of the remaining
intervals were larger than that of the whole spectral region.
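
The interval scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function names (`pls_fit`, `cv_error`, `ipls`) are hypothetical, the spectrum is split into 10 equal intervals, and the cross-validation error is taken to be the misclassification rate of a PLS-DA model fitted to a dummy (one-hot) class matrix.

```python
import numpy as np

def pls_fit(X, Y, n_comp):
    """Fit a PLS2 model; return the X mean, Y mean and regression coefficients."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xk, Yk = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_comp):
        # Weight vector: dominant left singular vector of the cross-covariance X'Y.
        w = np.linalg.svd(Xk.T @ Yk, full_matrices=False)[0][:, 0]
        t = Xk @ w                    # X scores
        p = Xk.T @ t / (t @ t)        # X loadings
        q = Yk.T @ t / (t @ t)        # Y loadings
        Xk = Xk - np.outer(t, p)      # deflate X
        Yk = Yk - np.outer(t, q)      # deflate Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    B = W @ np.linalg.solve(P.T @ W, Q.T)   # B = W (P'W)^-1 Q'
    return x_mean, y_mean, B

def cv_error(X, labels, n_comp, n_folds=5):
    """Cross-validated misclassification rate of PLS-DA (assumed error measure)."""
    classes = np.unique(labels)
    Y = (labels[:, None] == classes[None, :]).astype(float)  # dummy class matrix
    folds = np.arange(len(labels)) % n_folds                 # deterministic folds
    wrong = 0
    for f in range(n_folds):
        test = folds == f
        x_mean, y_mean, B = pls_fit(X[~test], Y[~test], n_comp)
        pred = ((X[test] - x_mean) @ B + y_mean).argmax(axis=1)
        wrong += np.sum(classes[pred] != labels[test])
    return wrong / len(labels)

def ipls(X, labels, n_intervals, max_lv):
    """For each interval, optimize the number of latent variables separately."""
    results = []
    for idx in np.array_split(np.arange(X.shape[1]), n_intervals):
        errs = [cv_error(X[:, idx], labels, a) for a in range(1, max_lv + 1)]
        best = int(np.argmin(errs))
        results.append((idx[0], idx[-1], best + 1, errs[best]))
    return results

# Synthetic "spectra": 3 classes, 200 variables, only variables 160-179 informative.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)
X = rng.normal(size=(60, 200))
X[labels == 0, 160:170] += 3.0
X[labels == 1, 170:180] += 3.0

results = ipls(X, labels, n_intervals=10, max_lv=5)
full_err = min(cv_error(X, labels, a) for a in range(1, 6))
for k, (lo, hi, n_lv, err) in enumerate(results, start=1):
    print(f"interval {k:2d} (vars {lo}-{hi}): {n_lv} LVs, CV error {err:.2f}")
print(f"full spectrum: CV error {full_err:.2f}")
```

Plotting the per-interval errors as bars and `full_err` as a horizontal line gives the kind of display described for Fig. S2A.
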
Monitoring the prediction error as a function of spectral interval
shows which spectral regions are most informative for classification
of the oil samples. Fig. S2A shows that interval 17
(1162–984 cm⁻¹) is the most informative. Absorptions in this region
(the fingerprint region) include contributions from complex
interacting vibrations, which lead to a generally unique
fingerprint for each compound. However, detailed interpretation
of IR bands in this region is difficult. In addition, intervals 6 and
7, covering the spectral region 3116–2762 cm⁻¹ (associated with
the C–H, O–H and N–H vibrations), are second in
importance [29]. The distribution pattern of the oil samples in
the three-dimensional plot of PLS scores (for interval 17), which reveals
a partial discrimination between oil types, is shown in
Fig. S2B. It should be mentioned that the best iPLS-DA model was
obtained for this spectral region using 10 latent variables; however,
data of this dimensionality cannot be visualized directly. The overall performance
of iPLS-DA for classification of the oil samples using 10
PLS latent variables is summarized in Table S3. One can observe
that classification by iPLS-DA resulted in lower classification errors;
for example, all butter samples were assigned to their respective group,
so this group has zero misclassification error.
However, correct classification of the canola,
corn, olive, soya and sunflower oil samples remains a problem.
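
As an illustration of how per-class misclassification errors of the kind reported in Table S3 can be computed, the following sketch counts wrong assignments per true class. The function name `per_class_error` is hypothetical, and the labels are made up for the example; they are not the values in Table S3.

```python
from collections import Counter

def per_class_error(true_labels, predicted_labels):
    """Fraction of samples of each true class assigned to a different class."""
    totals = Counter(true_labels)
    wrong = Counter(t for t, p in zip(true_labels, predicted_labels) if t != p)
    return {c: wrong[c] / totals[c] for c in totals}

# Made-up labels for illustration only (not data from the study).
true = ["butter", "butter", "canola", "canola", "corn", "corn"]
pred = ["butter", "butter", "corn", "canola", "corn", "canola"]
class_errors = per_class_error(true, pred)
print(class_errors)
```

A class whose samples are all assigned correctly, as reported for butter, has an error of zero in this measure.
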