To handle large data sets, our basic method is again to
use sampling technique. Currently we implement random
sampling only. More sophisticated sampling algorithm that
can help find decision trees as different as possible, given
the number of trees, will be explored in the future