In most cases, we prefer the model that has the fewest parameters to estimate, provided that
each of the candidate models is correctly specified. This is called the most parsimonious
model of the set. The AIC does not always suggest the most parsimonious model, because
the AIC function is largely based on the log likelihood function. Davidson and MacKinnon
(2004, 676) indicate that whenever two or more models are nested, the AIC may fail to
choose the most parsimonious one, even when these models are correctly specified. By contrast,
if all the models are nonnested and only one is well specified, the AIC chooses the well-specified
model asymptotically, because this model has the largest value of the log likelihood
function.
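
The nested case can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are ours, not taken from the source: a linear regression with Gaussian errors, AIC written as -2 log L + 2k, and a larger model that adds one irrelevant regressor to a correctly specified smaller model. The function names gaussian_aic and overfit_rate, the sample size, and the coefficient values are hypothetical. Because the likelihood ratio statistic for the extra parameter is approximately chi-squared with one degree of freedom, we would expect the AIC to prefer the larger model in a nonvanishing share of samples, roughly P(chi-squared(1) > 2), or about 16 percent.

```python
import numpy as np

def gaussian_aic(y, X):
    """AIC = -2 log L + 2k for a linear model with Gaussian errors."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n  # ML estimate of the error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)  # maximized log likelihood
    k = p + 1  # regression coefficients plus the error variance
    return -2 * loglik + 2 * k

def overfit_rate(n=200, reps=2000, seed=0):
    """Share of samples in which AIC prefers the larger nested model."""
    rng = np.random.default_rng(seed)
    picks_large = 0
    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = rng.normal(size=n)  # irrelevant regressor (assumed setup)
        y = 1.0 + 0.5 * x1 + rng.normal(size=n)  # true model uses x1 only
        X_small = np.column_stack([np.ones(n), x1])
        X_large = np.column_stack([np.ones(n), x1, x2])
        if gaussian_aic(y, X_large) < gaussian_aic(y, X_small):
            picks_large += 1
    return picks_large / reps

rate = overfit_rate()
print(f"AIC chose the larger model in {rate:.1%} of samples")
```

Both models are correctly specified here, since the larger one nests the true model, yet the share of samples in which the less parsimonious model is selected does not shrink toward zero as n grows. This is consistent with the statement above for nested models.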