The abstracted data (mean age, type of stroke, numbers of patients in
experimental and control groups, days of treatment, average length of
daily treatment in minutes, mean difference in change scores in
ADL, and standard deviation [SD] of ADL scores in experimental
and control groups at baseline) were entered into Excel for Windows.
The formal statistical methods used to test the results of different
trials have been described elsewhere.3 The effect size gi (Hedges’ g)
for individual studies was established by calculating the difference
between means of the experimental and control groups divided by
the average population SDi.14 If necessary, means and SDi were
requested from the respective authors. Otherwise, point estimates
were obtained from the graphs of included articles by recording the
bitmap coordinates after scanning the graphs into Microsoft Paint.
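
The effect-size steps this section describes (pooling the baseline SDs, computing Hedges' g, correcting for small samples, weighting by precision, and combining studies under a fixed effects model) can be illustrated with a minimal sketch. It is not the authors' spreadsheet implementation. The class and function names are hypothetical, and the correction factor, variance formula, and Q-statistic are the commonly used Hedges and Olkin expressions, assumed here because the text does not state the exact formulas.

```python
import math
from dataclasses import dataclass


@dataclass
class Trial:
    """Abstracted data for one trial (field names are hypothetical)."""
    n_exp: int        # patients in the experimental group
    n_ctl: int        # patients in the control group
    mean_diff: float  # difference in mean change scores (experimental minus control)
    sd_exp: float     # baseline SD, experimental group
    sd_ctl: float     # baseline SD, control group


def pooled_sd(t: Trial) -> float:
    """Pool the baseline SDs of both groups, weighted by degrees of freedom."""
    num = (t.n_exp - 1) * t.sd_exp ** 2 + (t.n_ctl - 1) * t.sd_ctl ** 2
    return math.sqrt(num / (t.n_exp + t.n_ctl - 2))


def unbiased_g(t: Trial) -> float:
    """Hedges' g with the usual small-sample correction (assumed form)."""
    g = t.mean_diff / pooled_sd(t)
    n = t.n_exp + t.n_ctl
    return g * (1 - 3 / (4 * n - 9))


def variance_g(t: Trial, g_u: float) -> float:
    """Approximate sampling variance of the corrected effect size (assumed form)."""
    n = t.n_exp + t.n_ctl
    return n / (t.n_exp * t.n_ctl) + g_u ** 2 / (2 * n)


def summary_effect_size(trials):
    """Fixed effects summary: returns (SES, variance of SES, Q-statistic)."""
    gs = [unbiased_g(t) for t in trials]
    # Inverse-variance weights: larger trials have smaller variance, so more weight.
    ws = [1 / variance_g(t, g) for t, g in zip(trials, gs)]
    total = sum(ws)
    ses = sum(w * g for w, g in zip(ws, gs)) / total
    var_ses = 1 / total
    # Q is compared against a chi-square distribution with len(trials) - 1 df.
    q = sum(w * (g - ses) ** 2 for w, g in zip(ws, gs))
    return ses, var_ses, q
```

Under these assumptions, the significance of the SES could be judged from the ratio of the SES to the square root of its variance, and Q would be referred to a chi-square distribution with one fewer degrees of freedom than the number of studies.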
To estimate SDi for gi, baseline SDs of control and experimental
groups were pooled (eg, Hedges, 1985). Because the gi tend to
overestimate the population effect size in studies with a small number of
patients, a correction was implemented to obtain an unbiased estimate,
gu. The impact of sample size was addressed by estimating a weighting
factor wi for each study, giving more weight to effect sizes from
studies with larger samples, which have smaller variances. Subsequently,
gu of individual studies were averaged to obtain a weighted SES
(Tˇ). Finally, the wi of each study were combined to estimate the variance
of the SES.15 When information about point estimates and standard
errors was lacking, the original authors were consulted. The effect size
gu for individual studies was computed for degree of disability in
day-to-day activities, walking speed, and dexterity. In addition, SESs
expressed as number of standard deviation units (SDUs) were calculated
for studies comparing effects of different intensities in rehabilitation in
the chronic stage of stroke (>6 months after onset), and those initiated
within 6 months of stroke. The fixed effects model was used to decide
whether an SES was statistically significant. The homogeneity (or
heterogeneity) test statistic (Q-statistic) of each set of effect sizes was
examined to determine whether studies shared a common effect size