Abstract
Objective: Classification tree analysis is a potentially powerful tool for investigating multilevel interactions. Within
the context of colon cancer etiology it may help identify disease pathways and evaluate important interactions of
risk factors.
Methods: We apply classification tree analysis as a statistical method to investigate interactions of risk factors for
colon cancer. We use data collected from a population-based case–control study of newly diagnosed cases of colon
cancer (N = 4403 cases and controls).
Results: Our results indicate that, as expected, there are many factors that influence colon cancer risk, and that they
interact on many levels. We find that the most important factor is the use of aspirin and/or non-steroidal
anti-inflammatory drugs (NSAIDs), with those taking these medications having lower risk. Family history appears as a
second-level modifying factor when NSAIDs are not used, whereas Western diet is the second-level factor when NSAIDs are
taken. The final tree has six levels, contains several modifying factors and correctly classifies case or control status
for 60.8% (95% CI 59.4–62.2) of all individuals.
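As a rough check on the reported accuracy, the sketch below computes a normal-approximation (Wald) confidence interval for a proportion. It assumes that the 60.8% figure is a simple proportion over all 4403 individuals and that the interval was obtained this way; the abstract does not state the method, and the function name `wald_interval` is illustrative only.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    Assumption: the abstract does not say how its interval was computed;
    this is one common choice, not necessarily the authors' method.
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error of p_hat
    return p_hat - z * se, p_hat + z * se


# Values taken from the abstract: 60.8% correctly classified, N = 4403.
low, high = wald_interval(0.608, 4403)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")  # prints "59.4% to 62.2%"
```

Under these assumptions the computed interval agrees with the reported 59.4–62.2, which is consistent with the accuracy having been evaluated on the full sample.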
Conclusions: Our results suggest that risk factors work together to determine disease risk. By accounting for
interactions between risk factors we become better able to dissect disease pathways and determine those risk factors
that increase susceptibility to disease. Our results highlight the importance of designing studies so that interactions
can be addressed.