each log10 K in a row is a multivariate expression for the row polymer in which the five vapor solvation parameters are the independent variables. In other words, a polymer is represented by a set of linear algebraic equations, each equation specified by a different vapor. The whole KT-matrix can thus be treated as multivariate data with the prospective polymers as objects and the vapors as observations. One can therefore apply data-mining clustering methods to group objects (polymers) of similar type according to some measure of similarity in relation to the vapor types. In the present context of polymer selection, if similar polymers are used as sensing materials in a sensor array, the responses generated by those sensors will carry similar information about the analyte vapors. Hence, including more than one of them in a sensor array would not help in vapor discrimination; a single polymer from each group suffices. Thus, the objective of a clustering method is to create polymer groups, or clusters, according to the similarity of their solvation interactions with the analyte vapors, while ensuring that different clusters have maximally diverse characteristics in relation to analyte interactions. One can then expect that selecting a single polymer from each cluster and using these as coating materials in a sensor array would encode optimally diverse vapor class information in the sensor array response patterns. The partition coefficient data in Tables 4(A, B) and 5 can be seen as points in a multidimensional data space, where the set of vapors in the columns defines the dimensionality and the set of K-values in each row defines a polymer as a point in this space. All the listed polymers are thus data points in a multidimensional vapor space. A number of clustering algorithms have been used in data mining to group data points of similar type while simultaneously enhancing the separation between dissimilar clusters [112].
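The view of polymers as points in vapor space can be illustrated with a minimal sketch. It assumes a small log10 K matrix with polymers in rows and vapors in columns. The numerical values are synthetic placeholders, not entries from Tables 4 or 5. The function name `pairwise_distances` and the choice of Euclidean distance as the similarity measure are also assumptions made only for illustration.

```python
import numpy as np

def pairwise_distances(log_k):
    """Euclidean distance between every pair of rows (polymers)."""
    diff = log_k[:, None, :] - log_k[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Synthetic placeholder values, NOT taken from Tables 4 or 5.
# Rows: 4 hypothetical polymers; columns: 3 hypothetical vapors.
log_k = np.array([
    [2.0, 3.1, 4.0],
    [2.1, 3.0, 4.1],  # nearly the same solvation profile as row 0
    [4.5, 1.0, 2.2],
    [0.5, 5.0, 3.3],
])

dist = pairwise_distances(log_k)

# Mask the zero self-distances, then locate the most similar pair of polymers.
masked = dist.copy()
np.fill_diagonal(masked, np.inf)
i, j = np.unravel_index(np.argmin(masked), masked.shape)
closest_pair = (int(i), int(j))
```

Here the two polymers in `closest_pair` lie close together in vapor space, so their sensors would be expected to respond almost identically. Under the argument above, only one of them would need to be retained.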
In some earlier studies [60–62], the application of principal component analysis (PCA), hierarchical clustering (HC) and fuzzy c-means (FCM) clustering methods to polymer selection was explored for several application targets in security (explosives, chemical weapon agents and narcotics) and human safety concerns (breath and drugs of abuse). In our continued efforts to develop more efficient and less cumbersome polymer selection procedures, we realized that the FCM clustering method has much greater potential than was utilized in our earlier work. To obtain the best results, one needs to find optimum values of the FCM algorithm parameters: the number of clusters c, the exponent m of the weighting function in the objective function, and the parameter ε of the stopping criterion, as explained below. In this study we find that, by tuning these parameters for specific applications, fuzzy c-means clustering can select a sufficiently diverse set of a small number of polymers for differentiating food freshness from spoilage. Milk and fish have been targeted as case studies for validation.
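The role of the three parameters c, m and ε can be made concrete with a minimal sketch of the standard (Bezdek-type) fuzzy c-means iteration. This is an illustrative implementation under stated assumptions, not the code used in this study. The function name `fuzzy_c_means`, the random initialization, the stopping test on the largest change in membership, the `max_iter` safeguard and the synthetic two-dimensional data are all assumptions. Picking the highest-membership point as the representative of each cluster is likewise only one plausible selection rule.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Standard FCM: returns cluster centers and the n-by-c membership matrix."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Random initial memberships; each row sums to 1.
    u = rng.random((n, c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        w = u ** m  # memberships weighted by the fuzzifier exponent m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        # Distance of every point to every center (floored to avoid division by zero).
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < eps  # stopping criterion (epsilon)
        u = u_new
        if converged:
            break
    return centers, u

# Synthetic data (assumed for illustration): two well-separated groups of "polymers".
x = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
centers, u = fuzzy_c_means(x, c=2, m=2.0, eps=1e-6)

labels = u.argmax(axis=1)           # hard assignment of each point
representatives = u.argmax(axis=0)  # assumed rule: highest-membership point per cluster
```

In this sketch, a larger m makes the memberships softer, a smaller ε demands tighter convergence, and c fixes how many representative polymers would be drawn from the candidate set.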