Record Examination (GRE®), also addresses argumentation skills but
focuses on the last stage identified by Deane and Song, namely
presenting an argument, which is appropriate given the purpose of
the GRE.
Deane and Song review the developmental literature to identify
the possible levels of a progression and begin by postulating a set of
knowledge, skills, and abilities (KSAs) that underlie the different
phases of mastering argumentation, from building appeals to building
a case. The explicitly developmental goal is to identify how
proficiency in a domain like argumentation develops, so that
assessments can serve the multiple functions that CBAL aspires to:
not just assessing students’ current standing but also promoting
learning. CBAL provides a
working prototype for the use of cognitive models for assessment
design, implementation, and analysis.
In short, ECD has been useful in designing a complex assessment
involving a learning progression. Taking multiple considerations
into account during the design and development process makes it more
likely that an assessment will yield valid scores. However, even when
a disciplined approach to design is followed, much can go wrong, and
for that reason the process of validation remains necessary, as we
discuss next.
Validity
To validate an interpretation or use of assessment results is to
evaluate whether the proposed interpretation and use of the results is
adequately supported by appropriate evidence. Validation can be
facilitated by first stating the proposed interpretation and use in
some detail, as an interpretation/use argument (IUA) that lays out
the inferences and assumptions they involve; the interpretation and
use can then be validated by evaluating the completeness and
coherence of the IUA and the plausibility of its inferences and
assumptions (Kane, 2013).
(When ECD has been used to design an assessment, the proposed
interpretation and use, and the argument to support that
interpretation, exist, at least in part, as byproducts of the design
process.)
Learning progressions provide an interpretation based on a
developmental model of performance in a discipline. Rather than
reporting results in terms of a continuous score scale, a student’s
assessment performance is reported and interpreted in terms of the
student’s standing in the learning progression, where the
achievement levels are intended to represent qualitatively different
levels of sophistication in the discipline. Alternatively, an interpretation
based on a cognitive model might describe a student’s current state
of mastery of a topic or domain in terms of their mastery or
nonmastery of each of a set of binary attributes (skills, understandings)
specified in a cognitive model.
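As a concrete illustration, such a model-based report can be represented as a binary attribute profile. The sketch below is ours, not drawn from any particular assessment, and the attribute names are hypothetical.

```python
# A minimal sketch of a binary attribute profile from a cognitive model.
# The attribute names are hypothetical, chosen only for illustration.
from typing import Dict, Set

ATTRIBUTES = ["claim_identification", "evidence_use", "counterargument"]

def attribute_profile(mastered: Set[str]) -> Dict[str, int]:
    """Report mastery (1) or nonmastery (0) of each modeled attribute."""
    return {a: int(a in mastered) for a in ATTRIBUTES}

# A student who has mastered the first two attributes but not the third:
print(attribute_profile({"claim_identification", "evidence_use"}))
# -> {'claim_identification': 1, 'evidence_use': 1, 'counterargument': 0}
```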
An assessment designed to identify students’ levels in a learning
progression would need to involve tasks that require the kinds of
performances associated with the different levels in the learning
progression. An assessment task or a part of an assessment task
associated with a particular achievement level would require the
kind of performance that students at that level should be able to
produce.
An assessment designed to provide estimates of each student’s
mastery or nonmastery of the attributes in a cognitive diagnostic
model would need to involve assessment tasks that require different
subsets of the attributes and would need to include a sufficient
number and variety of such tasks to identify the particular attributes
that each student has mastered and those that the student has not
mastered.
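In cognitive diagnostic modeling, the mapping from tasks to required attributes is commonly recorded in a Q-matrix. The sketch below uses illustrative values only, with a conjunctive (DINA-type) response rule as one common choice, to show how tasks requiring different attribute subsets make individual attributes identifiable.

```python
import numpy as np

# Q-matrix: rows = tasks, columns = attributes (1 = task requires attribute).
# Illustrative values only; a real assessment would need enough tasks of
# each kind to separate the attributes reliably.
Q = np.array([
    [1, 0, 0],  # task 1 requires attribute A only
    [0, 1, 0],  # task 2 requires attribute B only
    [0, 0, 1],  # task 3 requires attribute C only
    [1, 1, 0],  # task 4 requires A and B
    [1, 1, 1],  # task 5 requires A, B, and C
])

def ideal_responses(alpha: np.ndarray) -> np.ndarray:
    """Conjunctive (DINA-style) ideal response: a task is answerable only
    if the student has mastered every attribute the task requires."""
    return (Q @ alpha == Q.sum(axis=1)).astype(int)

# A student who has mastered A and B but not C:
print(ideal_responses(np.array([1, 1, 0])))  # -> [1 1 0 1 0]
```

Because this task set includes a single-attribute task for each attribute, every mastery pattern yields a distinct ideal-response pattern, so observed responses constrain which attributes the student has mastered.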
The IUA and the Validity Argument
As noted, a learning progression is an ordered set of levels defined
with respect to a developmental or curricular model. What are
relevant criteria for evaluating an assessment based on a learning
progression? That question becomes all the more important in light
of the shift toward assessments that are used across jurisdictions,
such as countries in the case of international assessments, or states
in the case of the U.S., where consortia are developing assessments
intended to be used across states. In international assessments the
potential for country-by-item interactions has been noted when
different languages are involved (Ercikan, 2002). Similarly, the
potential for jurisdiction-by-item interactions could be relevant if
consequential inferences are to be drawn regarding the relative
performance of the different jurisdictions.
The IUA for assessments based on a learning progression would
start with the student performances on the assessment tasks and
would end with conclusions about the student (e.g., where the
student is in the learning progression), and in applied settings, with
suggestions about what to do next.
Scoring
Given the structure, interpretation, and expected uses of
assessment results in terms of learning progressions or cognitive
diagnostic models, the scoring system would be designed to assign
each student to a particular level in the progression or to an attribute
profile, based on the requirements built into the model. The
assignment might also include some differentiation within levels to
distinguish, for example, between students who have clearly
mastered a level, students who seem to be at the level but are
somewhat inconsistent, and students who have mastered the
previous level and are beginning to develop the skills of this level.
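A minimal sketch of such an assignment rule follows; the per-level success proportions and the 0.8/0.5 cutoffs are arbitrary illustrative assumptions, not values from any operational scoring system.

```python
from typing import List, Tuple

def assign_level(level_scores: List[float]) -> Tuple[int, str]:
    """Assign a progression level with within-level differentiation.
    level_scores[k-1] = proportion of level-k requirements the student met.
    The cutoffs (0.8, 0.5) are arbitrary illustrative values."""
    level = 0
    for k, score in enumerate(level_scores, start=1):
        if score >= 0.8:     # clearly mastered this level
            level = k
        elif score >= 0.5:   # seems to be at this level, but inconsistent
            return k, "at level, somewhat inconsistent"
        else:                # mastered the previous level; this level's
            return level, "beginning next level"  # skills are just emerging
    return level, "clearly mastered"

print(assign_level([0.95, 0.85, 0.60]))  # -> (3, 'at level, somewhat inconsistent')
print(assign_level([0.90, 0.30]))        # -> (1, 'beginning next level')
```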
For the scoring procedures to make sense, they must be consistent
with the assumptions built into the model and with the structure
and content of the assessment; we have to collect appropriate data
for the estimation of the attributes used to characterize each
student’s achievement. A careful analysis of the performance domain
and of the model being adopted (e.g., using ECD) can make a strong
preliminary case for the fit between the performance domain, the
theoretical model, and the data collection procedures (de la Torre &
Minchen, this issue).
In addition, the observed relationships within the data should be
consistent with the assumptions built into the model and any
empirical predictions that can be derived from the model (van Rijn
et al., this issue). For example, the achievement levels in learning
progressions are typically strongly hierarchical in the sense that a
student who is assigned to a level in the progression should generally
be able to meet the requirements for lower levels, and should
generally not be able to meet the requirements for higher levels.
There may be some exceptions and slippage, especially for adjacent
levels, but the hierarchical structure of the learning progression
should generally hold.
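One simple empirical screen for this hierarchical structure is to compare each student's assigned level with his or her success rates at every level and flag Guttman-type reversals. The sketch below assumes a hypothetical data layout and an arbitrary 0.5 success threshold.

```python
from typing import List

def hierarchy_violations(assigned_level: int, pass_rates: List[float]) -> List[str]:
    """Flag reversals of the expected hierarchy.
    pass_rates[k-1] = the student's success rate on level-k requirements;
    the 0.5 threshold is an arbitrary illustrative choice."""
    problems = []
    for k, rate in enumerate(pass_rates, start=1):
        if k < assigned_level and rate < 0.5:
            problems.append(f"fails level {k} despite assignment to level {assigned_level}")
        elif k > assigned_level and rate >= 0.5:
            problems.append(f"meets level {k} despite assignment to level {assigned_level}")
    return problems

# Some slippage at adjacent levels is expected; widespread violations would
# call the hierarchical interpretation into question.
print(hierarchy_violations(3, [0.9, 0.8, 0.7, 0.6]))
# -> ['meets level 4 despite assignment to level 3']
```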
Van Rijn et al. (this issue) propose two criteria for evaluation. One
is whether the learning progressions can be “recovered” from test
data; the second is whether tasks that are built from a learning
progression and intended to be parallel to each other in fact behave
in that manner.
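The second criterion could be probed, for instance, by comparing difficulty estimates for task variants written to the same learning-progression specification. The proportions correct below are simulated, purely for illustration.

```python
import statistics

# Sketch of a parallel-forms check: variants of a task built to the same
# specification should show similar difficulty. Values are simulated.
variants = {"A": 0.62, "B": 0.59, "C": 0.64}  # proportion correct per variant

spread = max(variants.values()) - min(variants.values())
print(f"mean p = {statistics.mean(variants.values()):.2f}, spread = {spread:.2f}")
# A large spread would suggest the variants are not functioning in
# parallel, contrary to the design intent.
```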
Most students classified at a given level of the progression should
also stand at roughly the corresponding levels of the associated
progress variables, and this
pattern should hold across major subgroups of students (e.g., defined
by gender, race) as well as across jurisdictions. It will generally not
be possible to evaluate all such relations across all groups (e.g.,
because of small sample sizes), but where possible, the differences
should be evaluated, to ensure that the model-based interpretations
are invariant across relevant groups.
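As a rough illustrative screen, one might compare the distribution of level classifications across groups, as in the sketch below (simulated counts, hypothetical groups). A significant difference is not itself proof of non-invariance in the measurement sense, but it flags relations that warrant closer modeling, such as formal differential item functioning or multiple-group analyses.

```python
from scipy.stats import chi2_contingency

# Compare level-classification distributions across two groups.
# Counts are simulated, purely for illustration.
counts = [
    [120, 80, 40],  # group 1: students classified at levels 1, 2, 3
    [110, 85, 45],  # group 2: students classified at levels 1, 2, 3
]
chi2, p, dof, _ = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```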
Similarly, the fit of cognitive diagnostic models should be
invariant across relevant groupings of students, as well as across
jurisdictions, where the potential exists that the match between the