Classification of the ICC
There are actually six different equations for calculating the ICC, differentiated by the purpose of the reliability study, the design of the study, and the type of measurements taken. It is necessary to distinguish among these approaches, as under some conditions the results can be decidedly different. To simplify explanations, we will proceed with this discussion in the context of a reliability study with rater as the facet of interest; however, we emphasize that these applications are equally valid for studying other facets.
Models of the ICC: Random and Fixed Effects
Shrout and Fleiss describe three models of the ICC. They distinguish these models according to how the raters are chosen and assigned to subjects.
Model 1. In model 1, each subject is assessed by a different set of k raters. The raters are considered randomly chosen from a larger population of raters; that is, rater is a random effect. However, the raters for one subject are not necessarily the same raters that take measurements on another subject. Therefore, in this design there is no way to associate a particular rater with the variables being measured. The only variance that can actually be assessed is the difference among subjects. Other sources of error variance, including rater or measurement error, cannot be separated out.
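As a minimal sketch of how model 1 works out numerically, the function below computes the single-rating coefficient from a one-way analysis of variance, using the Shrout and Fleiss formula (BMS - WMS) / (BMS + (k - 1)WMS). The function name `icc_1_1` and the score table are illustrative assumptions, not part of the original text. Note that rater and error variance are pooled in the within-subjects mean square, which is why they cannot be separated in this design.

```python
def icc_1_1(ratings):
    """Single-rating ICC for model 1 (one-way random effects).

    `ratings` is a list of rows, one row per subject, each holding
    the k scores that subject received (equal k assumed for all rows).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # ratings per subject
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]

    # Between-subjects mean square (BMS), df = n - 1
    bms = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square (WMS), df = n(k - 1);
    # rater differences and random error are pooled in this term
    wms = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))

    return (bms - wms) / (bms + (k - 1) * wms)


# Illustrative scores only: 6 subjects, 4 ratings each
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_1_1(scores), 2))  # prints 0.17
```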
Model 2. Model 2 is used for assessing inter-rater reliability. In this design, each subject is assessed by the same set of raters. The raters are randomly chosen; that is, they are expected to represent the population of raters from which they were drawn, and results can be generalized to other raters with similar characteristics. Subjects are also considered to be randomly chosen from the population of individuals who would receive the measurement. Therefore, subject and rater are both random effects. This randomness may be theoretical rather than actual; that is, in practice we choose subjects and raters who we believe represent the populations of interest, as we do not have access to the entire population. But the intent of the study is to demonstrate that the measurement reliability can be applied to others.
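The following sketch shows one way the model 2 single-rating coefficient could be computed from a two-way analysis of variance, using the Shrout and Fleiss formula (BMS - EMS) / (BMS + (k - 1)EMS + k(JMS - EMS)/n). The function name `icc_2_1` and the score table are assumptions made for illustration. Because every subject is scored by the same raters, the rater mean square (JMS) can be estimated separately and enters the denominator.

```python
def icc_2_1(ratings):
    """Single-rating ICC for model 2 (two-way random effects).

    `ratings` is a list of rows (subjects); column j holds the scores
    given by rater j, so every subject is scored by the same k raters.
    """
    n = len(ratings)       # subjects
    k = len(ratings[0])    # raters
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)             # between-subjects mean square
    jms = ss_raters / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual (error) mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative scores only: 6 subjects (rows) by 4 raters (columns)
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(scores), 2))  # prints 0.29
```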
Model 3. In model 3, each subject is assessed by the same set of raters, but the raters represent the only raters of interest. In this case, there is no intention to generalize findings beyond the raters involved. In this design, rater is considered a fixed effect because the raters have been purposely, not randomly, selected. Subjects are still considered a random effect. Therefore, model 3 is a mixed model. This model is used when a researcher wants to establish that specific investigators are reliable in their data collection, but the reliability of others is not relevant. Model 3 is the appropriate statistic to measure intrarater reliability, as the measurements of a single rater cannot be generalized to other raters.
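A brief sketch of the model 3 single-rating coefficient, written directly in terms of two-way ANOVA mean squares with the Shrout and Fleiss formula (BMS - EMS) / (BMS + (k - 1)EMS). The function name `icc_3_1` and the mean-square values are hypothetical, chosen only to show the arithmetic. Because rater is treated as fixed, the between-raters mean square does not appear in the formula at all.

```python
def icc_3_1(bms, ems, k):
    """Single-rating ICC for model 3 (two-way mixed effects).

    bms: between-subjects mean square
    ems: residual (error) mean square
    k:   number of raters, all of whom score every subject
    """
    # Systematic differences among the fixed raters are not counted
    # as error, so no rater mean square enters the denominator.
    return (bms - ems) / (bms + (k - 1) * ems)


# Hypothetical mean squares from a study with k = 4 raters
print(round(icc_3_1(bms=11.24, ems=1.02, k=4), 2))  # prints 0.71
```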
Forms of the ICC: Single and Average Ratings
Each of the ICC models can be expressed in two forms, depending on whether the scores are single ratings or mean ratings. Most often, reliability studies are based on comparison of scores from individual raters. There are times, however, when the mean of several raters or ratings may be used as the unit of reliability. For instance, when measurements are unstable, it may be necessary to use the mean of several measurements as the individual's score to obtain satisfactory reliability. Using mean scores has the effect of increasing reliability estimates, as means are considered better estimates of true scores, theoretically reducing error variance.
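To illustrate how averaging raises the reliability estimate, the sketch below applies the Spearman-Brown formula, which relates the single-rating and mean-rating forms of the Shrout and Fleiss coefficients. The function name `mean_rating_icc` and the input values are assumptions chosen for illustration.

```python
def mean_rating_icc(single_icc, k):
    """Reliability of the mean of k ratings, given the single-rating ICC.

    Uses the Spearman-Brown formula: k * r / (1 + (k - 1) * r).
    """
    return k * single_icc / (1 + (k - 1) * single_icc)


# Hypothetical case: a single-rating ICC of 0.50, averaged over 3 ratings
print(mean_rating_icc(0.50, 3))  # prints 0.75
```

With k = 1 the function returns the single-rating value unchanged, and the estimate rises as more ratings are averaged.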
The six types of ICC are classified using two numbers in parentheses. The first number designates the model, and the second number signifies the form, using either a single measurement or the mean of several measurements as the unit of analysis. For example, when using single measurements in a generalization study, we would specify use of ICC(2,1). The type of ICC used should always be indicated.
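The naming scheme can be sketched as a small lookup that turns the design decisions described above into the two-number label. The function name `icc_type` and its parameters are hypothetical, and the letter k is used here as the second number for mean ratings, following the convention of Shrout and Fleiss.

```python
def icc_type(same_raters_for_all_subjects, generalize_to_other_raters,
             use_mean_of_ratings):
    """Return the ICC label, e.g. 'ICC(2,1)', for a reliability design."""
    if not same_raters_for_all_subjects:
        model = 1   # different raters per subject: one-way random effects
    elif generalize_to_other_raters:
        model = 2   # same raters, treated as a random sample of raters
    else:
        model = 3   # same raters, the only raters of interest (fixed)
    form = "k" if use_mean_of_ratings else "1"
    return f"ICC({model},{form})"


# Single measurements in a generalization study
print(icc_type(True, True, False))  # prints ICC(2,1)
```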
