Prediction Results
We now focus on prediction of future episodes of depression. We first present results on the statistical significance of the behavioral features, as measured through their mean, variance, momentum, and entropy values over the one-year period of analysis (Table 5). We use independent-sample t-tests (df=474); the values of the t-statistic and the corresponding p-values are given in the table. Note that we have 188 feature variables; hence, to counteract the problem of multiple comparisons, we adopt Bonferroni correction and choose a significance level of α=0.05/188 = 2.66e-4. In Table 5, we report the features for which at least one of the mean, variance, momentum, or entropy values is statistically significant.
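The per-feature testing procedure can be sketched as follows. The group sizes and feature values below are illustrative assumptions (the source specifies only 188 features and df=474); the test and correction match the text.

```python
# Sketch of the per-feature significance tests with Bonferroni correction.
# The two group sizes (171 and 305) are hypothetical, chosen so that
# n1 + n2 - 2 = 474 matches the stated degrees of freedom.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_features = 188
alpha = 0.05 / n_features  # Bonferroni-corrected threshold, ~2.66e-4

# Hypothetical feature matrices: rows are users, columns are feature
# summaries (e.g., mean/variance/momentum/entropy of a behavioral measure).
depressed = rng.normal(0.3, 1.0, size=(171, n_features))
non_depressed = rng.normal(0.0, 1.0, size=(305, n_features))

# Independent-sample t-test for every feature column at once.
t_stat, p_val = stats.ttest_ind(depressed, non_depressed, axis=0)
significant = np.flatnonzero(p_val < alpha)
print(f"{significant.size} of {n_features} features pass alpha={alpha:.2e}")
```

Features surviving this corrected threshold are the ones reported in Table 5.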
The results align with our findings described earlier. Across the feature types, certain stylistic, engagement, and emotion measures, as well as use of depression terms and mentions of antidepressant medication, bear distinctive markers across the two classes. In general, momentum shows statistical significance across a number of measures, demonstrating that not only is the absolute degree of behavioral change important (indicated by the mean), but the trend of change over time also bears useful markers for distinguishing depressive behavior.
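The four per-feature summaries can be illustrated on a daily time series. Note that defining momentum as the slope of a least-squares linear fit is an assumption for illustration, as is the histogram-based entropy; the source does not spell out these formulas here, only that mean, variance, momentum, and entropy are computed over the year.

```python
# Sketch of the four summaries of a behavioral measure over one year.
# "Momentum" as a linear-fit slope and entropy over a 10-bin histogram
# are illustrative choices, not the paper's stated definitions.
import numpy as np

def summarize(series):
    days = np.arange(len(series))
    slope = np.polyfit(days, series, 1)[0]  # momentum: trend over time
    hist, _ = np.histogram(series, bins=10)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins before taking logs
    entropy = -np.sum(p * np.log2(p))
    return {"mean": series.mean(), "variance": series.var(),
            "momentum": slope, "entropy": entropy}

rng = np.random.default_rng(1)
daily_posts = rng.poisson(3, size=365).astype(float)  # hypothetical measure
print(summarize(daily_posts))
```

A flat series yields zero variance, zero entropy, and near-zero momentum, while a trending series yields a nonzero slope — the property the momentum summary is meant to capture.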
We now utilize our proposed classification framework to examine how well we can predict, ahead of its onset, whether or not an individual is vulnerable to depression. To understand the importance of the various feature types, we trained a number of models.
We present the results of these prediction models in Table 6. The results indicate that the best-performing model (dimension-reduced features) yields an average accuracy of ~70% on our test set and a high precision of 0.74 for the depression class. Note that a baseline marginal model, which labels all data points with the majority (non-depressed) class, would yield an accuracy of only 64%. The good performance of this classifier is also evident from the receiver operating characteristic (ROC) curves in Figure 4. The dimension-reduced feature model performs slightly better than the one that uses all features, demonstrating the utility of reducing feature redundancy.
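A minimal sketch of such a pipeline is shown below. The specific classifier (an SVM), the use of PCA for dimension reduction, and the synthetic data are all assumptions for illustration; only the 188-feature count and the ~64% majority-class baseline come from the text.

```python
# Illustrative sketch: dimension reduction followed by a binary classifier,
# evaluated with accuracy and ROC-AUC on held-out data. The data are
# synthetic stand-ins for the 188 behavioral features, with a 64%/36%
# class imbalance matching the stated majority-class baseline.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=476, n_features=188, n_informative=20,
                           weights=[0.64, 0.36], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

# Standardize, reduce to 20 components, then fit the classifier.
model = make_pipeline(StandardScaler(), PCA(n_components=20),
                      SVC(probability=True, random_state=0))
model.fit(X_tr, y_tr)

scores = model.predict_proba(X_te)[:, 1]  # scores for the depression class
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
print("ROC-AUC :", roc_auc_score(y_te, scores))
```

Sweeping a threshold over `scores` traces out the ROC curve; comparing the dimension-reduced pipeline against one fit on all 188 features is how the redundancy claim above would be checked.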
