Accuracy-parsimony analysis
Lasso regression performs variable selection and parameter estimation simultaneously, and it encourages sparse solutions, i.e. solutions that use only a small number of variables [31,38]. To study how the parsimony of the model affects its predictive accuracy, we evaluated its performance when fitted under a constraint on the maximum number of variables, n, to be included. We refer to this approach as the “constrained Lasso model”.
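A minimal sketch of one way such a constraint can be imposed: walk the Lasso regularization path and retain the least-penalized solution that includes at most n variables. Because the outcome here is a fall count, the actual model is presumably a penalized count regression (e.g. a Poisson Lasso); the ordinary Lasso, scikit-learn, and all names below are illustrative assumptions, not the authors' implementation.

# Illustrative constrained Lasso: pick, along the regularization path, the
# least-penalized solution that still uses at most n_max variables.
import numpy as np
from sklearn.linear_model import lasso_path

def constrained_lasso_coefs(X, y, n_max):
    # coefs has shape (n_features, n_alphas); alphas come in decreasing order
    alphas, coefs, _ = lasso_path(X, y, n_alphas=200)
    n_selected = np.count_nonzero(coefs, axis=0)      # variables used at each penalty
    admissible = np.flatnonzero(n_selected <= n_max)  # penalties respecting the constraint
    idx = admissible[-1]                              # smallest admissible penalty
    return alphas[idx], coefs[:, idx]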
In particular, for a given n, we trained the constrained Lasso model with the 10-fold cross-validation procedure described above. We calculated the mean number of variables actually included in the model by averaging over the 10 regression models fitted during cross-validation (one per fold). We evaluated the accuracy of the model using the AUCs for single and multiple fallers and the mean squared error (MSE), the latter calculated as the mean squared difference between the observed number of falls and the predicted fall rate μ. We repeated this analysis varying n from 1 to 40.
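A sketch of the accuracy-parsimony loop itself, reusing the constrained_lasso_coefs helper above: for each n from 1 to 40, run 10-fold cross-validation, record the mean number of variables actually selected, and score the out-of-fold predictions with AUCs for single (≥ 1 fall) and multiple (≥ 2 falls) fallers and with the MSE against the observed fall counts. The faller thresholds, the intercept handling, and the NumPy arrays X (features) and y (observed falls) are assumptions made for illustration only.

# Accuracy-parsimony analysis sketch: mean model size, AUCs and MSE
# from 10-fold cross-validation for each maximum model size n.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import roc_auc_score, mean_squared_error

def accuracy_parsimony(X, y, n_values=range(1, 41), n_splits=10, seed=0):
    results = []
    for n_max in n_values:
        n_vars, mu_pred = [], np.zeros(len(y))
        for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
            alpha, coef = constrained_lasso_coefs(X[train], y[train], n_max)
            n_vars.append(np.count_nonzero(coef))
            # lasso_path fits no intercept, so add a crude one for this sketch
            intercept = y[train].mean() - (X[train] @ coef).mean()
            mu_pred[test] = X[test] @ coef + intercept
        results.append({
            "n_max": n_max,
            "mean_vars": float(np.mean(n_vars)),            # parsimony
            "auc_single": roc_auc_score(y >= 1, mu_pred),   # single fallers
            "auc_multiple": roc_auc_score(y >= 2, mu_pred), # multiple fallers
            "mse": mean_squared_error(y, mu_pred),          # observed falls vs predicted rate
        })
    return results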