between real-life tasks and DHM simulations as well as
large differences in paired AAWS scores (although they
do not differ in overall mean).
In contrast, inter-rater reliability on static postures
and extra strains was higher for DHM simulations
than for real-life tasks. One reason for this outcome
might be that those types of workload (e.g., the duration
of a static posture or the degree of restricted
visibility) are more obvious in DHM simulations than
on the rather busy production line. Both experts
confirmed this interpretation. In that sense, DHM simulations
can help to detect workloads that are overlooked
in real-life assessments. This also explains why
experts assigned higher scores on static postures and
extra strains in DHM simulations as well as the differences
in paired AAWS scores for those two measures.
It can be concluded that inter-rater reliability for
DHM simulations would have been much better if ergonomists’
agreement on action forces had been as
high as it was in real life, because agreement on all other
scores (static postures, material handling, extra strains)
was good. Thus, future DHM applications should be
able to integrate information on action forces, which
could be taken from CAD data of assembly parts like
clips (i.e., pressing forces) or tools like screwdrivers
(i.e., torques). Furthermore, these results also show
that some inter-rater differences should be expected
when DHM simulations are being assessed manually
with paper-and-pencil methods.
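The text does not state which statistic was used to quantify inter-rater reliability here. As a hypothetical illustration of how agreement between two observers on ordinal risk scores can be quantified, the following sketch computes an unweighted Cohen's kappa on invented ratings (not the study's data):

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters' categorical scores."""
    assert len(rater1) == len(rater2)
    n = len(rater1)
    # Observed agreement: fraction of items both raters scored identically.
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Expected chance agreement from each rater's marginal score frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_expected = sum(c1[c] * c2[c] for c in set(rater1) | set(rater2)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical ordinal risk scores from two observers (illustrative only)
obs1 = [1, 2, 2, 3, 1, 2, 3, 3]
obs2 = [1, 2, 3, 3, 1, 2, 2, 3]
print(round(cohens_kappa(obs1, obs2), 3))  # → 0.619
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters when raters assign some score categories far more often than others.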
Inter-rater differences were also found when comparing
objective ergonomics risk assessment and subjective
RPE. The results showed that ergonomists’
AAWS scores and workers’ Borg RPE scores
correlated moderately positively for real-life tasks and
for DHM simulations. This finding demonstrates that
AAWS risk assessments reflected individually
experienced workload to some extent, which is an indication
of good criterion validity, because correlations
between observational methods and subjective measures
are usually rather low (Barriera-Viruet et al., 2006). The
positive relationship between AAWS scores and workers’
RPE scores was clearly lower, however, for Observer
1 than for Observer 2, which again underlines
that inter-rater differences have to be expected when
DHM simulations (but also real-life tasks) are being
assessed with paper-and-pencil methods.
Besides such inter-rater differences, there might be
some additional reasons for discrepancies between
real-life assessments and DHM simulations specifically for this study. First, the 50th percentile male