a patient (questions are immediate tests) before deciding what blood tests to order (blood
tests are delayed tests).4
When all of the tests are delayed (as they are in the BUPA data in Appendix A.1), we
must decide in advance (before we see any test results) which tests are to be performed. For a
given decision tree, the total cost of tests will be the same for all cases. In situations of this
type, the problem of minimizing cost simplifies to the problem of choosing the best subset of
the set of available tests (Aha and Bankert, 1994). The sequential order of the tests is no
longer important for reducing cost.
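
As a minimal sketch of why order stops mattering when every test is delayed, the following Python fragment prices a subset of tests that are all ordered up front. The test name zeta, the full prices, and the group label "A" are hypothetical values chosen for illustration; only the $2.00 common cost and the drop of epsilon from $10.00 to $8.00 follow the example in the text.

```python
from itertools import permutations

# Hypothetical figures (assumptions, not taken from the paper's tables),
# chosen so that epsilon drops from $10.00 to $8.00 once delta is selected.
FULL_COST = {"delta": 7.0, "epsilon": 10.0, "zeta": 4.0}
GROUP = {"delta": "A", "epsilon": "A"}  # zeta shares no cost with any test
COMMON_COST = {"A": 2.0}                # saved on every later test in the group


def subset_cost(tests):
    """Total cost of ordering every test in `tests` in advance."""
    total = 0.0
    seen_groups = set()
    for test in tests:
        cost = FULL_COST[test]
        group = GROUP.get(test)
        if group is not None:
            if group in seen_groups:
                cost -= COMMON_COST[group]  # common cost was already paid
            seen_groups.add(group)
        total += cost
    return total


# Every ordering of the same subset yields the same total.
costs = {subset_cost(order) for order in permutations(FULL_COST)}
print(costs)  # {19.0}
```

Because the total depends only on which tests are in the subset, a search over subsets is sufficient in this setting.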
Let us consider a simple example to illustrate the method. Table 1 shows the test costs
for four tests. Two of the tests are immediate and two are delayed. The two delayed tests
share a common cost of $2.00. There are two classes, 0 and 1. Table 2 shows the classification
cost matrix. Figure 1 shows a decision tree. Table 3 traces the path through the tree for a
particular case and shows how the cost is calculated. The first step is to do the test at the root
of the tree (test alpha). In the second step, we encounter a delayed test (delta), so we must
calculate the cost of the entire subtree rooted at this node. Note that epsilon costs only $8.00,
since we have already selected delta, and delta and epsilon share a common cost. In the third
step, we do test epsilon, but we do not need to pay, since we already paid in the second step.
In the fourth step, we guess the class of the case. Unfortunately, we guess incorrectly, so we
pay a penalty of $50.00.