We assume that the unknown values (in the training cases or in the observations) are missing completely at random.
To select an attribute while building the decision tree, a general splitting criterion is formally defined.
To do so, the missing value is first treated as an additional value of the attribute. Second, the splitting criterion is
computed with this additional value taken into account. Finally, a reduced weight is assigned to the attribute with missing
values. In this way, any splitting criterion defined for data without missing values can be adapted to work with missing
attribute values.
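
The three steps can be illustrated with a minimal sketch. It rests on assumptions that the text does not state: information gain stands in for the general splitting criterion, the marker `MISSING` and the function names are hypothetical, and the reduced weight is taken to be the fraction of cases whose attribute value is known. Other criteria and other weighting schemes fit the same pattern.

```python
from collections import Counter
from math import log2

MISSING = None  # hypothetical marker for an unknown attribute value


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def info_gain(values, labels):
    """Information gain of splitting `labels` by the attribute `values`.

    Steps 1 and 2: MISSING is not special-cased, so it forms its own
    branch exactly like any other attribute value.
    """
    total = len(labels)
    branches = {}
    for v, y in zip(values, labels):
        branches.setdefault(v, []).append(y)
    remainder = sum(len(ys) / total * entropy(ys) for ys in branches.values())
    return entropy(labels) - remainder


def adapted_criterion(values, labels, criterion=info_gain):
    """Step 3: scale the criterion by a reduced weight.

    Assumption (not from the text): the weight is the fraction of
    cases in which the attribute value is known.
    """
    known_fraction = sum(v is not MISSING for v in values) / len(values)
    return known_fraction * criterion(values, labels)
```

For example, with attribute values `["a", MISSING, "b", "b"]` and classes `["x", "x", "y", "y"]`, every branch is pure, so the information gain is 1 bit; under the assumed weighting the adapted score is 0.75, because one value in four is missing. Any other criterion with the same `(values, labels)` signature can be passed as `criterion`.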