Fluorescence spectroscopy was investigated as a means of detecting low levels of treated wastewater impact on two source waters, using effluents from five wastewater facilities. To identify how best to interpret fluorescence excitation-emission matrices (EEMs) for detecting the presence of wastewater, several feature selection and classification methods were compared. An expert-supervised regional integration approach was applied, based on previously identified features that distinguish biologically processed organic matter, including protein-like
fluorescence and the ratio of protein-like to humic-like fluorescence. Use of nicotinamide adenine dinucleotide (NADH)-like fluorescence was found to yield higher linear correlations at low levels of wastewater presence. Parallel factor analysis (PARAFAC) was also applied as a contrasting, unsupervised multiway approach for identifying
underlying fluorescing components. A humic-like component attributed to reduced semiquinone-like structures
was found to correlate best with wastewater presence. These fluorescence features were used to classify
samples as containing low (0.1–0.5%), medium (1–2%), or high (5–15%) levels of wastewater by volume, using support vector machines
(SVMs) and logistic regression. The ability of SVMs to utilize high-dimensional input data without prior feature
selection was demonstrated through their performance when considering full unprocessed EEMs (66.7%
accuracy). The observed high classification accuracies are encouraging when considering implementation of
fluorescence spectroscopy as a water quality monitoring tool. Furthermore, classifying fluorescence data with
SVMs is a promising novel approach because it uses the high-dimensional EEMs directly.
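
To make the regional integration features concrete, the following is a minimal sketch in plain Python of how a protein-like to humic-like ratio might be computed from a single EEM. It is not the study's implementation: the window boundaries, the uniform 10 nm grid, and the synthetic EEM are all illustrative assumptions, and the function and constant names are hypothetical.

```python
def regional_integral(eem, ex_nm, em_nm, ex_range, em_range):
    """Integrate EEM intensity over a rectangular excitation/emission window.

    eem[i][j] is the intensity at excitation ex_nm[i] and emission em_nm[j].
    A uniform grid is assumed, so every cell covers the same area (nm^2).
    Window bounds are inclusive.
    """
    d_ex = ex_nm[1] - ex_nm[0]
    d_em = em_nm[1] - em_nm[0]
    total = 0.0
    for i, ex in enumerate(ex_nm):
        if not (ex_range[0] <= ex <= ex_range[1]):
            continue
        for j, em in enumerate(em_nm):
            if em_range[0] <= em <= em_range[1]:
                total += eem[i][j] * d_ex * d_em
    return total


# Hypothetical windows as ((ex_min, ex_max), (em_min, em_max)) in nm.
# These are illustrative assumptions, not the regions used in the study.
PROTEIN_LIKE = ((260, 290), (320, 360))
HUMIC_LIKE = ((320, 360), (420, 460))


def protein_to_humic_ratio(eem, ex_nm, em_nm):
    """Ratio of the protein-like to the humic-like regional integral."""
    protein = regional_integral(eem, ex_nm, em_nm, *PROTEIN_LIKE)
    humic = regional_integral(eem, ex_nm, em_nm, *HUMIC_LIKE)
    return protein / humic


# Synthetic EEM: flat background of 1.0 with an elevated protein-like region.
ex_nm = list(range(240, 401, 10))  # excitation wavelengths, 240 to 400 nm
em_nm = list(range(300, 501, 10))  # emission wavelengths, 300 to 500 nm
eem = [[1.0 for _ in em_nm] for _ in ex_nm]
for i, ex in enumerate(ex_nm):
    for j, em in enumerate(em_nm):
        if 260 <= ex <= 290 and 320 <= em <= 360:
            eem[i][j] = 3.0

protein_integral = regional_integral(eem, ex_nm, em_nm, *PROTEIN_LIKE)
humic_integral = regional_integral(eem, ex_nm, em_nm, *HUMIC_LIKE)
ratio = protein_to_humic_ratio(eem, ex_nm, em_nm)
print(protein_integral, humic_integral, ratio)
```

In a workflow like the one described, such regional integrals or their ratio would form a low-dimensional feature vector for a classifier, whereas the full-EEM approach would instead flatten the whole matrix into one long vector.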