level. From these results, we can see that the pattern of the simulated significance levels for the test procedures becomes
stable for 1 > 10 (similar patterns are observed under other settings), so we focus on the comparison between test
procedures for 1 > 10.
For each given d-value (ratio of the time frames, $t_1/t_0$) and alternative -value, we compute the percentage of
configurations (based on 50 different 1 = 11(1)60) that are conservative (i.e., simulated significance level ≤ 0.04) or
liberal (i.e., simulated significance level ≥ 0.06) for the test procedures based on $p_j^{(A)}$, $p_j^{(U)}$, j = 1, 2, 3, and $p_L$. These
values are presented in Tables 1–4. For the sake of comparison, we also present the percentages of configurations in the
intervals (0.04, 0.05), (0.05, 0.06) and 0.05 ± SE = (0.0478, 0.0522). One should view a rejection of the null hypothesis
for a liberal test with caution, since its type I error rate exceeds the pre-chosen nominal error rate. Conservative tests
are of less concern because the type I error rate is controlled. From the first column of Tables 1, 3 and 4, we conclude
that the test procedure based on $p_1^{(A)}$ is too conservative when d > 1 and much too liberal when d < 1. Therefore, we
will not consider $p_1^{(A)}$ in the subsequent comparisons.
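The classification of configurations described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function names `binomial_se` and `summarize` are hypothetical, and the number of simulation replications (10,000) is an assumption, chosen because the binomial standard error it implies at the 0.05 level reproduces the reported interval (0.0478, 0.0522). A level exactly equal to 0.05 falls into neither open interval.

```python
from math import sqrt

NOMINAL = 0.05  # nominal significance level used in the text


def binomial_se(p, n_sim):
    """Standard error of a simulated rejection rate p based on n_sim replications."""
    return sqrt(p * (1.0 - p) / n_sim)


def summarize(levels, se):
    """Percentage of configurations falling in each category.

    levels: simulated significance levels, one per configuration
    se:     half-width of the 0.05 +/- SE interval
    """
    n = len(levels)

    def pct(pred):
        # percentage of configurations satisfying the predicate
        return 100.0 * sum(1 for x in levels if pred(x)) / n

    return {
        "conservative": pct(lambda x: x <= 0.04),
        "(0.04, 0.05)": pct(lambda x: 0.04 < x < 0.05),
        "(0.05, 0.06)": pct(lambda x: 0.05 < x < 0.06),
        "liberal": pct(lambda x: x >= 0.06),
        "within_se": pct(lambda x: NOMINAL - se < x < NOMINAL + se),
    }


if __name__ == "__main__":
    # ASSUMPTION: 10,000 replications per configuration (not stated in the text).
    se = binomial_se(NOMINAL, 10_000)
    # Made-up levels, for illustration only; not values from Tables 1-4.
    example_levels = [0.03, 0.045, 0.049, 0.051, 0.07]
    print(round(se, 4))
    print(summarize(example_levels, se))
```

In practice one such summary would be computed per test procedure and per (d, alternative) setting, giving one row of percentages per table entry.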
Our assessment is that the tests remaining under investigation may be classified into three groups: $p_2^{(A)}$ and $p_3^{(A)}$
(A-tests, say), $p_j^{(U)}$, j = 1, 2, 3 (P-tests, say) and $p_L$ (LRT). From Tables 1 to 4, the A-tests, P-tests and LRT are
considered to be robust in all the situations; therefore, we will compare the percentages of configurations in (0.04, 0.05)
and (0.05, 0.06). For d < 1, the A-tests are relatively more conservative (have higher percentages in (0.04, 0.05)) than
the P-tests, while the LRT is relatively liberal (has a higher percentage in (0.05, 0.06)). For d = 1, the LRT is relatively
liberal compared to the A-tests and P-tests. For d > 1, the A-tests have higher percentages in (0.05, 0.06) than the P-tests
and LRT. In other words, the P-tests and LRT keep the type I error rate below the desired level better than the
A-tests.
