The World Health Organization’s (WHO’s) International Classification of Functioning, Disability and Health (ICF)1 is gaining recognition in physical therapy and rehabilitation.2–5 The ICF provides a conceptual basis and a universal common language for understanding and describing patients’ health status, reaching beyond mortality, diseases, and medical diagnoses. The use of the ICF promotes a comprehensive, multidisciplinary, and patient-centered perspective in health care. The ICF has been applied in physical therapy and rehabilitation, especially in neurorehabilitation, to facilitate multidisciplinary team communication, to structure the rehabilitation process, to support goal setting and assessment, and to guide documentation and reporting.5–11
However, the practical application of the ICF raises important challenges. The main challenge is the length of the classification, which has more than 1,400 categories. To address this challenge, internationally agreed-on ICF Core Sets for various health conditions have been developed in a scientific, evidence-based process.12 The common standardized procedure for developing ICF Core Sets integrates evidence gathered from preliminary studies with a formal decision-making and expert consensus process. The methodological approaches in the preliminary studies include, for each health condition: (1) systematic literature reviews of outcome measurements used in clinical trials, (2) Delphi exercises capturing experts’ views, and (3) collection of empirical data from people undergoing inpatient or outpatient rehabilitation. The results of the preliminary studies are the foundation for the subsequent decision-making and consensus process with a nominal group technique. The resulting ICF Core Sets are practical tools that represent selections of categories from the whole classification. They comprehensively describe the prototypical spectrum of problems in the functioning of patients with specific health conditions. They are based on the universal language of the ICF but enhance its applicability through their manageable size.
In the context of neurorehabilitation, stroke plays a prominent role. In this field, 3 ICF Core Sets can be applied, namely, the ICF Core Set for Stroke13 and the ICF Core Sets for patients with neurological conditions in acute care hospitals14 and early postacute care rehabilitation facilities.15 These ICF Core Sets have been combined to create the Extended ICF Core Set for Stroke. It contains all ICF categories that have been selected for any of the 3 ICF Core Sets mentioned above. The Extended ICF Core Set for Stroke contains 166 categories of the ICF: 59 categories of the component “body functions,” 11 categories of the component “body structures,” 59 categories of the component “activity and participation,” and 37 categories of the component “environmental factors.”
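The combination rule described above (a category belongs to the extended set if it was selected for any of the 3 Core Sets) is a set union. The following is a minimal sketch of that rule, not the authors' procedure. The function names are hypothetical, and the short code lists are illustrative placeholders rather than the published Core Sets. The only values taken from the text are the reported component counts.

```python
from collections import Counter

# ICF category codes begin with a letter that names the component.
COMPONENT_BY_PREFIX = {
    "b": "body functions",
    "s": "body structures",
    "d": "activity and participation",
    "e": "environmental factors",
}

# Component counts reported in the text for the Extended ICF Core Set for Stroke.
REPORTED_COUNTS = {
    "body functions": 59,
    "body structures": 11,
    "activity and participation": 59,
    "environmental factors": 37,
}


def combine_core_sets(*core_sets):
    """Return the union of several Core Sets (each a collection of ICF codes)."""
    combined = set()
    for core_set in core_sets:
        combined |= set(core_set)
    return combined


def count_by_component(categories):
    """Count categories per ICF component, using the first letter of each code."""
    return Counter(COMPONENT_BY_PREFIX[code[0]] for code in categories)


# Illustrative placeholder lists only; these are NOT the published Core Sets.
stroke = {"b110", "b730", "s110", "d450", "e310"}
acute = {"b110", "b440", "d410", "e355"}
postacute = {"b730", "b280", "s110", "d450", "d510"}

extended = combine_core_sets(stroke, acute, postacute)
counts = count_by_component(extended)

# The reported component counts add up to the reported total of 166.
reported_total = sum(REPORTED_COUNTS.values())
```

Because the union removes duplicates, a category selected for more than one Core Set is counted only once in the extended set.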
A further challenge for the implementation of the ICF is the operationalization of ICF categories. The ICF comprises “qualifiers” to quantify the level of functioning or the severity of the problem in the various ICF categories. The WHO suggests that all categories of the classification be quantified with the same generic scale (Tab. 1).
According to the WHO, broad ranges of percentages are provided for situations in which calibrated assessment instruments or other standards are available to quantify the impairment, activity limitation, participation restriction, or environmental barrier or facilitator.1 However, calibrated assessment instruments based on the ICF category system are scarcely available at present. Using existing instruments along with the ICF would also require the concepts measured by the instruments and their resulting scores to be translated into corresponding ICF categories and qualifiers. Accomplishing such a translation procedure in a scientific way would demand extensive research efforts, which have not been undertaken yet.
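To illustrate how such percentage ranges could be used when a calibrated score is available, the sketch below maps a percentage of impairment to a generic qualifier. It assumes the bands printed in the WHO's ICF for the generic qualifier (0–4%, 5–24%, 25–49%, 50–95%, and 96–100%). The function name and the handling of fractional percentages at band edges are assumptions made for illustration only.

```python
# Upper bounds (exclusive) of the generic qualifier bands, assuming the WHO
# ranges 0-4%, 5-24%, 25-49%, 50-95%, 96-100%.
# Assumption: fractional values fall into the band below the next lower bound,
# so 4.9% is still "no problem" and 95.5% is still "severe problem".
QUALIFIER_BANDS = [
    (5, 0, "no problem"),
    (25, 1, "mild problem"),
    (50, 2, "moderate problem"),
    (96, 3, "severe problem"),
]


def percent_to_qualifier(percent):
    """Map a percentage of impairment (0-100) to (qualifier, label)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    for upper_bound, qualifier, label in QUALIFIER_BANDS:
        if percent < upper_bound:
            return qualifier, label
    # Everything from 96% to 100% is rated as a complete problem.
    return 4, "complete problem"
```

For example, `percent_to_qualifier(30)` returns `(2, "moderate problem")`. The sketch covers only the last step of the translation. Deriving a defensible percentage from an existing instrument's score is the part that, as noted above, still requires research.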
Therefore, the application of the ICF and the ICF Core Sets is a challenge to the user and poses the question of reliability and rater agreement when qualifiers are assigned to describe patients’ functioning and disability. So far, only a few studies have dealt with the reliability of ICF qualifiers, and the interrater reliability of the qualifiers used with the ICF Core Set for Stroke has not been studied yet. Okochi et al16 used the ICF Checklist to examine test-retest reliability in geriatric patients and found moderate overall reliability at retesting after 1 week. Reliability varied among components of the ICF (weighted kappa values of .46 for body functions and .55 for activity and participation). Van Triet et al17 studied the intertester reliability of a schedule based on the International Classification of Impairments, Disabilities, and Handicaps (ICIDH) in patients with musculoskeletal problems. The ICIDH18 is the predecessor of the ICF. Kappa values ranged from .06 to 1.00 and were higher in “disability” categories than in “impairment” categories.
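For readers unfamiliar with the statistic, the following is a minimal sketch of a weighted kappa for two sets of ordinal ratings, such as qualifiers 0 to 4. It assumes linear disagreement weights. The cited studies do not necessarily use this weighting scheme, and the function name is hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, categories):
    """Weighted kappa for two raters, assuming linear disagreement weights.

    `categories` lists the ordered scale values, e.g. [0, 1, 2, 3, 4].
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters need the same, nonzero number of ratings")
    n = len(ratings_a)
    k = len(categories)
    if k < 2:
        raise ValueError("at least 2 categories are required")
    index = {category: i for i, category in enumerate(categories)}

    # Joint proportions: observed[i][j] = share of cases rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions of each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed_disagreement / expected_disagreement
```

With these weights, a disagreement between adjacent qualifiers (for example, 1 versus 2) is penalized less than a disagreement between distant ones (for example, 0 versus 4). This is why weighted kappa is often preferred for ordinal scales.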
The study by van Triet et al,17 however, is clearly outdated because it is based on the ICIDH. The authors departed greatly from the categories of the classification and from its qualifier scale in creating their assessment schedule. Both studies16,17 were conducted with poorly specified mixed samples; thus, the results were not generalizable to the functioning ratings for patients with stroke. In addition, the investigators in both studies made arbitrary selections of various areas of functioning, not covering the full scope of the ICF, as reflected in the carefully chosen categories of the Extended ICF Core Set for Stroke. In neither of the studies did the investigators consider the full qualifier scale in their analyses. The investigators in both studies applied designs in which different raters completed their recording of patients’ functioning at different time points, mixing variation of time points with variation of raters. Thus, the type and the amount of information underlying the ratings might not have been comparable.
Although Okochi et al16 examined the influence of the experience of raters on retest reliability, rater confidence and core competence might be more proximate variables connected to reliability. Within the context of reliability, rater confidence is an important variable frequently examined in clinical research. For example, in studies dealing with the reliability of imaging techniques19,20 and behavioral observations,21,22 confidence ratings are often used as independent outcomes to demonstrate diagnostic accuracy. The results of these studies hint at a possible relationship between agreement and confidence. Confidence might serve as an explanatory factor for rater agreement. Thus, with regard to the reliability and the application of the Extended ICF Core Set for Stroke, the association of rater agreement and rater confidence is of interest.
Furthermore, in reliability studies, the experience and training of raters seemed to be highly relevant and were frequently reported.23–29 In these studies, “experience” and “training” referred not only to the handling of the specific rating instrument used, but also to the raters’ clinical experience in the field, with the concepts to be rated, and with the patient group with the given disease condition. These studies drew an equivocal picture of the relationship between rater competence and interrater reliability and suggested that the different results might have depended on the specific rating instrument examined. Thus, the role of raters’ areas of competence should be considered for any new rating tool.
Therefore, the overall objective of this investigation was to study the interrater reliability of physical therapists’ ratings of the functioning of study participants with the Extended ICF Core Set for Stroke. The specific aims were: (1) to study the agreement between 2 physical therapists in rating participants’ functioning with the Extended ICF Core Set for Stroke, (2) to explore the relationship between rater agreement and rater confidence, and (3) to explore rater agreement in relation to physical therapists’ areas of core competence.