2.3. Selecting the optimal number of mixture components
Choosing the number of components in a mixture can be seen as a model choice problem. Varying the number of components k amounts to defining different models, much like choosing the number of lags in a dynamic model. From a different point of view, however, k can be treated as a parameter for which a posterior density can be derived. Such an approach is illustrated, for instance, by the reversible jump algorithm of Green (1995); see also the birth-and-death approach of Stephens (2000a). Both methods are discussed in Marin et al. (2005).
In a Bayesian framework, model choice relies on the evaluation of the marginal likelihood of the different models and on Bayes factors. Evaluating a marginal likelihood is difficult because it requires integrating the likelihood function with respect to the prior, and this integral does not exist if the prior is non-informative. Even when the prior is informative, the result is very often numerically unstable, so alternatives have been explored in the literature (see Kass and Raftery, 1995, for a survey). We shall illustrate three different methods that make use of the MCMC output. They correspond either to an information criterion, which penalizes a measure of fit by a measure of complexity, or to an evaluation of an MCMC predictive density. The problem is more complex for mixtures of densities because the parameters of interest and the dimension of the model are not precisely defined. The BIC criterion of Schwarz (1978), which is based on asymptotic expansions, simply takes the number of initial parameters to be 3k−1 and corresponds to: