Cluster analysis was used to identify distinct health behaviour clusters. This multivariate technique is useful for finding homogeneous subgroups within heterogeneous samples.32 The procedure followed the most recent developments in cluster analysis.33 As the number of clusters was not known a priori, Ward's agglomerative hierarchical clustering was used, as it is particularly suitable for binary data.34,35 The Ward method first treated each individual observation as its own cluster. These clusters were then successively merged, on the basis of a proximity measure and a predefined fusion algorithm, until a single large cluster remained.32 To identify robust groups of observations, the fusion process was stopped at the point where observations were as homogeneous as possible within clusters and as heterogeneous as possible between clusters.34,36 The established measures R², semi-partial R², pseudo F and pseudo t² statistics were used as criteria for deciding on the number of clusters. Finally, the root mean square standard deviation (RMSSTD) was calculated as a measure of cluster homogeneity.
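The procedure described above can be illustrated with a minimal sketch. It is not the software or code used in the study. It assumes Python with NumPy and SciPy, uses synthetic binary data, and the helper names `within_ss` and `cluster_criteria` are hypothetical. R², semi-partial R² and pseudo F are computed from within-cluster sums of squares for each candidate number of clusters. The RMSSTD shown is pooled over all clusters of a solution, which is a simplification, and the pseudo t² statistic is omitted for brevity.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree


def within_ss(X, labels):
    """Pooled within-cluster sum of squared deviations from cluster means."""
    return sum(
        ((X[labels == g] - X[labels == g].mean(axis=0)) ** 2).sum()
        for g in np.unique(labels)
    )


def cluster_criteria(X, max_clusters=6):
    """R2, semi-partial R2, pseudo F and pooled RMSSTD for k = 1..max_clusters."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Z = linkage(X, method="ward")  # Ward fusion on Euclidean distances
    total_ss = ((X - X.mean(axis=0)) ** 2).sum()

    # Within-cluster SS of every k-cluster solution, k = 1..max_clusters + 1.
    W = {
        k: within_ss(X, cut_tree(Z, n_clusters=[k])[:, 0])
        for k in range(1, max_clusters + 2)
    }

    rows = []
    for k in range(1, max_clusters + 1):
        # Pseudo F is undefined for a single cluster (or a perfect fit).
        if k > 1 and W[k] > 0:
            pseudo_f = ((total_ss - W[k]) / (k - 1)) / (W[k] / (n - k))
        else:
            pseudo_f = float("nan")
        rows.append({
            "k": k,
            "r2": 1.0 - W[k] / total_ss,
            # Loss of R2 caused by the merge that reduces k + 1 clusters to k.
            "semi_partial_r2": (W[k] - W[k + 1]) / total_ss,
            "pseudo_f": pseudo_f,
            # Pooled over clusters and variables (simplifying assumption).
            "rmsstd": np.sqrt(W[k] / (p * (n - k))),
        })
    return rows


# Synthetic binary data: two groups of 30 with opposite response profiles.
rng = np.random.default_rng(0)
probs = np.array([[0.9, 0.9, 0.8, 0.1, 0.1, 0.2],
                  [0.1, 0.2, 0.1, 0.9, 0.8, 0.9]])
X = np.vstack([rng.random((30, 6)) < row for row in probs]).astype(float)

for r in cluster_criteria(X):
    print(f"k={r['k']}  R2={r['r2']:.3f}  SPR2={r['semi_partial_r2']:.3f}  "
          f"pseudoF={r['pseudo_f']:.2f}  RMSSTD={r['rmsstd']:.3f}")
```

In a table like this, a large semi-partial R² at a given k suggests that the merge producing k clusters joined two dissimilar groups, so a solution with k + 1 clusters may be preferable. A local peak in pseudo F points in the same direction.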