To obtain a better picture of the relation between δ and the experimental number of trials with misses, we conducted the following test. We took 100 samples (for each frequency threshold and sample size) and determined the lowered frequency threshold that would have given misses in one out of the hundred trials. Figure 2 presents these results (as points), together with lines showing the lowered thresholds with δ = 0.01 or 0.001, i.e., the thresholds corresponding to miss probabilities of 0.01 and 0.001 for a given frequent set. The frequency thresholds that would give misses in a fraction 0.01 of the cases approximate surprisingly closely the thresholds for δ = 0.01. Experiments with a wider range of sample sizes give comparable results. There are two explanations for the similarity of the values. One reason is that there are not necessarily many potential misses, i.e., not many frequent sets with frequency relatively close to the threshold. Another reason that contributes to the similarity is that the sets are not independent.
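The comparison described above can be illustrated with a minimal sketch. It rests on assumptions not stated in this section: the analytic threshold is computed here from the exact binomial tail for a set whose true frequency equals the frequency threshold, which is only one plausible way to obtain the plotted lines, and the function names `lowered_threshold` and `empirical_threshold` are hypothetical. The empirical function takes, for each trial, the smallest sample frequency observed among the truly frequent sets; a trial has a miss whenever that value falls below the lowered threshold.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def lowered_threshold(min_fr, sample_size, delta):
    """Largest threshold c / sample_size with P(X < c) <= delta.

    Assumption: X ~ Binomial(sample_size, min_fr) models the sample count
    of a set whose true frequency is exactly min_fr (0 < min_fr < 1), and
    the set is missed when its sample frequency is below the threshold.
    """
    below = 0.0  # P(X < c)
    c = 0
    while True:
        nxt = below + binom_pmf(c, sample_size, min_fr)  # P(X < c + 1)
        if nxt > delta:
            break
        below = nxt
        c += 1
    return c / sample_size


def empirical_threshold(min_sample_freqs, miss_trials=1):
    """Largest threshold at which at most `miss_trials` trials have a miss.

    min_sample_freqs[i] is the smallest sample frequency among the truly
    frequent sets in trial i; that trial has a miss if the value is below
    the threshold.
    """
    ordered = sorted(min_sample_freqs)
    return ordered[miss_trials]
```

With 100 trials and `miss_trials=1`, `empirical_threshold` gives the kind of point described in the text, which can then be compared with `lowered_threshold(min_fr, sample_size, 0.01)`.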
In the case of a possible failure, Algorithm 2 generates iteratively all new candidates and makes another pass over the database. In our experiments the number of frequent sets missed, when any were missed, was one or two for δ = 0.001, and one to 16 for δ = 0.01. The number of candidates checked on the second pass was very small compared to the total number of itemsets checked.