Incidence prediction
As the classification method, discriminant analysis based on a regression formula is used to separate the data into two classes. In Table 2, “SCORE”, “ATC”, and “Gene ID” in the pinpoint data are the explanatory variables, and “Incidence” is the response variable. Because the data are sparse, not all of these explanatory variables are available for every record. Therefore, 7 dataset patterns of pinpoint data are prepared so that side effect incidence can be predicted even when some of the explanatory variables are missing. A mark in Table 2 indicates that the corresponding pinpoint data exist.
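The count of 7 matches the number of non-empty combinations of three explanatory variables (2^3 - 1 = 7). The following minimal sketch illustrates that reading. It assumes the patterns in Table 2 are exactly these combinations, which the text does not state outright. The helper names `dataset_patterns` and `pattern_for` and the example record are hypothetical and serve only as illustration.

```python
from itertools import combinations

# Explanatory variables named in the text; the response variable is "Incidence".
VARIABLES = ("SCORE", "ATC", "Gene ID")


def dataset_patterns(variables=VARIABLES):
    """Return every non-empty subset of the variables, largest subsets first."""
    patterns = []
    for size in range(len(variables), 0, -1):
        patterns.extend(combinations(variables, size))
    return patterns


def pattern_for(record, variables=VARIABLES):
    """Return the tuple of variables that are present (not None) in a record."""
    return tuple(v for v in variables if record.get(v) is not None)


PATTERNS = dataset_patterns()
print(len(PATTERNS))  # 7

# Hypothetical record in which the ATC code is missing.
record = {"SCORE": 0.82, "ATC": None, "Gene ID": "1234"}
print(pattern_for(record))  # ('SCORE', 'Gene ID')
```

Under this assumption, each record would be routed to the discriminant model fitted on the pattern that matches its available variables. A record with none of the three variables yields an empty tuple, which belongs to no pattern and therefore cannot be scored.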