Using Generalizability Theory to Evaluate the Applicability of a Serial Bayes Model in Estimating the Positive Predictive Value of Multiple Psychological or Medical Tests
Clarence D. Kreiter
Office of Consultation and Research in Medical Education, Department of Family Medicine, University of Iowa, Iowa City, USA.
Email: clarence-kreiter@uiowa.edu
Received May 10th, 2010; revised June 14th, 2010; accepted June 16th, 2010.
ABSTRACT
Introduction: It is a common finding that despite high levels of specificity and sensitivity, many medical tests are not highly effective in diagnosing diseases exhibiting a low prevalence within a clinical population. What is not widely known or appreciated is how the results of retesting a patient using the same or a different medical or psychological test impact the estimated probability that a patient has a particular disease. In the absence of a ‘gold standard’, special techniques are required to understand the error structure of a medical test. Generalizability theory can provide guidance as to whether a serial Bayes model accurately updates the positive predictive value of multiple test results. Methods: In order to understand how sources of error impact a test’s outcome, test results should be sampled across the testing conditions that may contribute to error. A generalizability analysis of appropriately sampled test results should allow researchers to estimate the influence of each error source as a variance component. These results can then be used to determine whether, or under what conditions, the assumption of test independence can be approximately satisfied, and whether Bayes theorem accurately updates probabilities upon retesting. Results: Four hypothetical generalizability study outcomes are displayed as variance component patterns. Each pattern has a different practical implication related to achieving independence between test results and deriving an enhanced PPV through retesting an individual patient. Discussion: The techniques demonstrated in this article can play an important role in achieving an enhanced positive predictive value in medical and psychological diagnostic testing and can help ensure greater confidence in a wide range of testing contexts.
Keywords: Generalizability Theory, Bayes, Serial Bayes Estimation, Positive Predictive Value, Psychological Testing, Serial Medical Testing
1. Introduction
When a disease’s prevalence and a medical test’s specificity and sensitivity are known, an equation based on Bayes Theorem provides useful information related to the diagnostic power of the test. It is a common finding that despite high levels of specificity and sensitivity, many medical tests are not highly effective in diagnosing diseases with a low prevalence within a clinical population [1]. Since a large number of diseases occur only in a small proportion of the population (i.e., have low prevalence), the low positive predictive value (PPV) of medical diagnostic tests is of obvious concern to physicians attempting to identify the presence of a low prevalence disease. To provide an example, suppose a physician is attempting to determine whether a patient has a disease that occurs in 1% of a defined patient population. When the test is performed on patients with the disease, it yields a positive test result indicating the presence of the disease in 90% of the patients (sensitivity equals .90). When the test is performed on patients without the disease, it correctly identifies 98% of those patients as disease free (specificity equals .98). An equation based on Bayes Theorem can be used to calculate the probability that a patient with a positive test result actually has the disease. The simple equation for calculating this probability is:
P (A | B) = P (B | A) * P (A) / P (B) (1)
Equation (1) describes the probability that a patient has the disease given a positive test result [P (A | B)], and equals the probability of a positive test result given the patient has the disease [P (B | A), the sensitivity] multiplied by the probability of the disease [P (A), the prevalence] divided by the overall probability of a positive test result within the population [P (B)]. The denominator in Equation (1), the overall prior probability of a positive test result, is derived as shown in Equation (2), where j is 1, 2… and takes on as many values as there are hypotheses. In this example problem there are just two possible hypotheses (Ho1: the patient has the disease; Ho2: the patient does not have the disease), and hence the sum is taken over just two levels. The overall probability of a positive test result is therefore the sum of the probabilities of a positive test in those with the disease (sensitivity) and without the disease (1 - specificity), each multiplied by the prevalence of that group in the population.
P (B) = Σ_j P (B | A_j) * P (A_j) (2)
Equation (3) displays the calculation using the levels of specificity, sensitivity and prevalence discussed in our example. Despite high levels of specificity and sensitivity, the patient with a positive test result has only a 31% chance of actually having the disease. This is a common and well-known type of finding related to medical testing designed to detect low prevalence diseases.
P (A | B) = .90 * .01 / ((.90 * .01) + (.02 * .99)) = .31 (3)
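The arithmetic in Equations (1) through (3) can be checked with a few lines of Python. This is a minimal sketch rather than code from the article; the function name `positive_predictive_value` and its parameter names are assumptions chosen for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(A | B): the probability of disease given one positive test."""
    # Numerator of Equation (1): P(B | A) * P(A)
    true_positive = sensitivity * prevalence
    # Second term of Equation (2): P(B | not A) * P(not A)
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Values from the worked example: prevalence 1%, sensitivity .90, specificity .98
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.90, specificity=0.98)
print(round(ppv, 2))  # 0.31
```

Raising the prevalence argument while holding sensitivity and specificity fixed shows how strongly the PPV depends on the base rate of the disease.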
What is not widely known or appreciated is how the results of retesting a patient using the same or a different test will impact the estimated probability that the patient has the disease. There is little guidance in the medical or psychological literature regarding whether or how the results from serial testing improve the ability to diagnose disease when the structure or cause of the dependence between tests is uncertain. However, it is clearly important for clinicians to understand how the PPV changes when a patient is administered a second or third medical or psychological test. When the assumption of test independence applies, a serial Bayes model may provide guidance within contexts like those presented in the example just discussed.
When probabilities from a previous Bayes calculation are used to update estimates of the prior probability [P (A)], and when independence is confirmed, we can use a Bayes serial calculation to derive the probability that a patient has the disease given a second test result. Equation (4) presents the next step in the context of our example using a Bayes serial calculation for a second consecutive positive test under the assumption that the two tests are independent. With a second positive result, the probability of having the disease goes from .31 to .95, and our confidence in the diagnosis appears to improve dramatically. It should be noted that under the assumption of independence, parallel testing may also yield an outcome similar to serial testing. So, although the focus of this paper is on sequential or serially administered tests, when time or the occasion of the test is not an important factor in determining test independence, what is reported and discussed here may also apply to parallel testing.
P (A | B) = .90 * .31 / ((.90 * .31) + (.02 * .69)) = .95 (4)
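The serial calculation in Equation (4) amounts to feeding the posterior from one test back in as the prior for the next. The sketch below illustrates that loop under the independence assumption stated above; `bayes_update` and `serial_ppv` are hypothetical helper names, not part of the article.

```python
def bayes_update(prior, sensitivity, specificity, positive=True):
    """Update the probability of disease after one test result."""
    if positive:
        like_disease, like_healthy = sensitivity, 1.0 - specificity
    else:
        like_disease, like_healthy = 1.0 - sensitivity, specificity
    numerator = like_disease * prior
    return numerator / (numerator + like_healthy * (1.0 - prior))


def serial_ppv(prevalence, sensitivity, specificity, results):
    """Apply bayes_update once per result, assuming independent test errors."""
    probability = prevalence
    for positive in results:
        probability = bayes_update(probability, sensitivity, specificity, positive)
    return probability


# Two consecutive positive results with the values from the worked example
two_positives = serial_ppv(0.01, 0.90, 0.98, [True, True])
print(round(two_positives, 2))  # 0.95
```

Because the sketch carries the unrounded posterior (.3125) into the second step, its result differs slightly in the third decimal place from a hand calculation that starts from the rounded value .31; both round to .95.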
From the outcome presented in Equation (4), it appears that the PPV of tests used to detect low prevalence diseases may be dramatically improved simply by administering the test a second or third time. However, as mentioned, such positive outcomes rely on an independence assumption that is critical to the valid application of the serial Bayes probability model and implies that the error rate for each test is independent. Therefore, to determine whether an enhancement of PPV can be achieved by retesting, it is necessary to first establish the primary source(s) of test error and whether, or under what conditions, each medical test can be regarded as independent.
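To see why the independence assumption matters, it may help to compare two extreme cases. The sketch below is an illustrative assumption rather than a model from the article: `repeat_tp` is a hypothetical probability that a diseased patient who tested positive tests positive again, and `repeat_fp` is a hypothetical probability that a disease-free patient with a false positive receives a second false positive.

```python
def ppv_two_positives(prevalence, sensitivity, specificity, repeat_tp, repeat_fp):
    """P(disease | two positive results) with explicit retest probabilities."""
    # Diseased patients who test positive on both administrations
    both_positive_diseased = prevalence * sensitivity * repeat_tp
    # Disease-free patients who test positive on both administrations
    both_positive_healthy = (1.0 - prevalence) * (1.0 - specificity) * repeat_fp
    return both_positive_diseased / (both_positive_diseased + both_positive_healthy)


# Independent errors: the retest behaves exactly like the first test
independent = ppv_two_positives(0.01, 0.90, 0.98, repeat_tp=0.90, repeat_fp=0.02)

# Fully dependent errors: the retest always reproduces the first result
dependent = ppv_two_positives(0.01, 0.90, 0.98, repeat_tp=1.0, repeat_fp=1.0)

print(round(independent, 2))  # 0.95
print(round(dependent, 2))    # 0.31
```

Under these assumptions, a retest whose errors simply repeat adds no information and the PPV stays at .31, whereas independent errors reproduce the .95 of Equation (4). Real tests would presumably fall somewhere between the two extremes.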
When a “gold standard” is available for determining the accuracy of a fallible test, establishing the independence between two test administrations is straightforward. One needs simply to calculate the specificity and sensitivity for the second test administration twice, once for the group of patients who tested positive on the first test and once for the group of patients who tested negative on the first test. If the two calculations are in close agreement, the assumption of independence is satisfied. Unfortunately, a “gold standard” method for checking test accuracy is often not available, and other procedures are required.
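The gold standard check described above can be sketched as follows. The record layout, the function name `second_test_accuracy`, and the patient counts are all hypothetical and were chosen only so that the example runs; they are not data from the article.

```python
def second_test_accuracy(records, first_result):
    """Sensitivity and specificity of test 2 among patients with a given test 1 result.

    records: iterable of (has_disease, test1_positive, test2_positive) booleans,
    where has_disease comes from the gold standard.
    """
    group = [r for r in records if r[1] == first_result]
    diseased = [r for r in group if r[0]]
    healthy = [r for r in group if not r[0]]
    sensitivity = sum(r[2] for r in diseased) / len(diseased) if diseased else None
    specificity = sum(not r[2] for r in healthy) / len(healthy) if healthy else None
    return sensitivity, specificity


# Hypothetical counts keyed by (has_disease, test1_positive, test2_positive)
counts = {
    (True, True, True): 81, (True, True, False): 9,
    (True, False, True): 9, (True, False, False): 1,
    (False, True, True): 2, (False, True, False): 98,
    (False, False, True): 98, (False, False, False): 4802,
}
records = [key for key, n in counts.items() for _ in range(n)]

after_positive = second_test_accuracy(records, first_result=True)
after_negative = second_test_accuracy(records, first_result=False)
print(after_positive, after_negative)
```

In this made-up data set the second test has the same sensitivity and specificity in both groups, which is the pattern that would support the independence assumption. How close the two pairs of estimates need to be in practice is a judgment the sketch does not address.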
Independence between test results can be achieved when clinicians randomly sample from the test-related variables that contribute to error, when each disease positive patient is equally likely to display a false negative test result, and when each disease negative patient is equally likely to display a false positive test result. Indeed, when the conditions leading to test independence are understood, the utility of testing in a low prevalence disease context can often be dramatically enhanced by a simple random replication of a testing process that samples from the variables contributing to error. To ascertain under what conditions an independence assumption is satisfied, researchers must first investigate and understand the error structure of medical or psychological test outcomes. Given the potential for dramatically enhanced diagnostic accuracy, such research is critically important in improving the utility of certain tests with low PPV.
Within many testing contexts, it is often not possible to establish the accuracy of a fallible test by comparing it to
