Measurement and evaluation of competence
Development, measurement and evaluation of competence have become important issues in education and training. Various factors contribute to this: (a) the shift of focus in education from input to output, stimulated by standards-based assessment and accountability systems; (b) international comparisons of school system achievement; (c) the transition from subject to literacy orientation in education; (d) recognition of learning in non-formal and informal settings, and the vision of lifelong learning as a prominent goal of the European Union (EU). Nevertheless, an electronic literature search on vocational education and training (VET) combining the keywords ‘measurement’, ‘evaluation’ and ‘competence’ generated a negligible number of references. Better results were obtained using ‘competence’ and ‘evaluation’. This may indicate that the link between measurement, evaluation and competence is still absent in European VET research. Measurement requires deciding what is to be measured. To provide this specification, a general diagnostic framework is introduced.
It consists of three clearly differentiated levels: external conditions (e.g. situations, products), actual episodes realised by an individual (e.g. behaviour, cognitive operations, individually created information, motivation), and personal internal conditions (e.g. knowledge, skills, motives). Within this conceptualisation, measurement faces the problem that knowledge, skills and cognitive operations are not visible to outsiders. The only way to gain insight is by observing specified external conditions and/or behaviour (e.g. realised behaviour, changed situations, or created products). From such observation, specified/defined elements of the internal conditions (e.g. knowledge, skills) are inferred. The relations between the observable and the non-observable are established through interpretation rules and hypotheses. Evaluation, in this context, means judging the observed competences against defined benchmarks. Such benchmarks may be the measured knowledge, skills, actions or performance of other persons (norm-referenced), or theoretically specified types and levels of knowledge, skills, actions or performances (criterion-referenced). Evaluation and measurement together are subsumed under the term assessment.

Against the background of this general diagnostic framework, selected competence definitions in the EU are analysed. Most of them bridge all three levels of the model. Such a broad, multilevel concept of competence may generate more misunderstanding than understanding in public and scientific discussions. Therefore, two recommendations are made on how to define and assess competence: (a) an accurate description of the tasks and requirements (= external conditions); (b) a specification/characterisation of the psychological attributes a person should possess, or has built up, in a specific occupational domain.

Selected procedures for assessing competence in the EU are introduced: (a) the bilan de compétences (France); (b) the national vocational qualifications (NVQ) (UK); (c) dimensions of action competence in the German dual system; (d) assessing competences at work (the Netherlands); (e) realkompetanse (Norway); (f) recreational activities (Finland); (g) competence evaluation in continuing IT training (Germany). The analysis of these conceptions against interrelated assessment quality criteria (validity, reliability, objectivity, fairness and usability) revealed that, to date, empirically grounded findings on whether these criteria are fulfilled are scarce. Furthermore, methodological considerations indicate that self- and other-observation, even when guided by criteria and carried out by more than one external assessor, produces many errors. The validity of the approaches varies considerably when it comes to diagnosing occupational competence and occupational success in general.
The discussions often focus on the format of the tasks (closed versus open-ended). In this respect, the advantages of performance-based assessment through open-ended tasks requiring complex skills (i.e. involving a significant number of decisions) should not be overestimated. The problem is that the increased cost of evaluation is not justified by the increase in external validity. In addition, there is evidence that the key is not the format but rather the content requirements of the task. Concerning the timing of assessments (e.g. continuous, sequential or punctual), valid evidence can only be obtained through systematic empirical investigation of assessment procedures in relation to concepts of competence development. Furthermore, there are methodological reasons to supplement self- and peer assessment with standard-oriented assessment and accountability considerations. Assessment should be carried out with regard to criteria such as those proposed by the American Educational Research Association (AERA). On the basis of these findings, the following recommendations are made to correspond with the EU goals of transparency and mobility: (a) initiating a detailed, summarising review of the diverse practices of competence assessment in the EU, focusing on the methodological dimension; (b) promoting empirical investigation of the measurement quality of selected, prototypical assessment procedures practised in the EU; (c) activating conceptual and empirical research on how to define and validate competence and its development in VET; (d) advocating a VET-PISA in selected occupations or sectors under the patronage of the EU.