Appendix B. Regression Methodology for Chinese Price Adjustments
We estimate separate regressions for the following nine commodity groups within consumption:
food and non-alcoholic beverages; clothing and footwear; gross rent, water, fuel and power;
medical and health services; transport; recreation; education; restaurants; other goods and
services. No adjustment was made for three other commodity groups, which had incomplete data:
alcoholic beverages and tobacco; household furnishings; and communication. The sample
consisted of 22 developing countries in the Asia-Pacific region. Country coverage and the
procedures adopted by the Asian Development Bank are detailed in their final report (Asian
Development Bank, 2007). China is not included in any of the regressions since its inclusion
may result in biased regression coefficients if the Chinese prices are understated in the first
place.
Two regression specifications were used, a linear and a log-linear model:
\[ \text{Linear: } Y_{ij} = a + b X_{ij} + u_{ij}; \qquad \text{Log-linear: } \ln Y_{ij} = a + b \ln X_{ij} + u_{ij}, \]
where $Y_{ij}$ is the price level of the i-th commodity group in country j, defined as the ratio of
the PPP for the i-th commodity group in country j to the market exchange rate for the
currency of that country. Separate regressions were run for each commodity group i. We
consider four alternative regressor variables leading to four different model specifications, as
described in the text. In addition, we estimate two sets of models, one including Fiji and
another without Fiji. As Fiji is an island country where most consumption items are imported,
the price level in Fiji is considerably higher than we would expect from countries at a similar
level of development. Further, Fiji price data suffered from a strong urban bias in the
collection of ICP prices.
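As an illustration of the estimation step, the following is a minimal sketch of fitting both functional forms by ordinary least squares for a single commodity group, once with and once without Fiji. It assumes NumPy; the function names (`fit_ols`, `fit_specifications`), the synthetic data, and the position of Fiji in the arrays are hypothetical and are not taken from the ICP data or from the estimation code actually used.

```python
import numpy as np

def fit_ols(x, y):
    """OLS fit of y = a + b*x; returns (a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # intercept column plus regressor
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])

def fit_specifications(x, y):
    """Fit the linear and log-linear forms for one commodity group."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return {
        "linear": fit_ols(x, y),
        "loglinear": fit_ols(np.log(x), np.log(y)),
    }

# Synthetic illustration only: 22 hypothetical countries, not real price data.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 20.0, size=22)
y = 0.3 * x ** 0.4 * np.exp(rng.normal(0.0, 0.05, size=22))

is_fiji = np.zeros(22, dtype=bool)
is_fiji[0] = True   # hypothetical position of Fiji in the sample
y[0] *= 1.5         # mimic an unusually high price level for that country

with_fiji = fit_specifications(x, y)
without_fiji = fit_specifications(x[~is_fiji], y[~is_fiji])
```

Repeating this for each of the four candidate regressors gives the 4 x 2 x 2 = 16 combinations per commodity group.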
Taking account of all the combinations of alternative X variables, linear versus log-linear
models and inclusion/exclusion of Fiji, we estimate a total of 16 alternative specifications.25 To
be able to compare the performance of linear models against log-linear models, we do not use
the conventional correlation coefficient but rather an alternative $R^2$ measure, defined as the
squared correlation between observed and predicted Y values. Based on $R^2$, the log-linear
specification with a dummy variable for Fiji dominates the linear specification in all cases, so we
only report results for the former. Estimated coefficients and the $R^2$ values for models 1–4
estimated for the nine commodity groups are presented for the log-linear model in Table B1.
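The comparison measure can be sketched as follows. This is an illustrative example only: it assumes that predictions from the log-linear model are converted back to levels by simple exponentiation (with no retransformation correction) before being correlated with observed Y, which the text does not state explicitly. The names `r2_levels` and `compare_forms` and the data are hypothetical.

```python
import numpy as np

def r2_levels(y_obs, y_pred):
    """Squared Pearson correlation between observed and predicted Y, in levels."""
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    return float(r ** 2)

def compare_forms(x, y):
    """Return the levels-based R^2 for the linear and log-linear fits."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit with degree 1 returns (slope, intercept), in that order.
    b_lin, a_lin = np.polyfit(x, y, 1)
    b_log, a_log = np.polyfit(np.log(x), np.log(y), 1)
    pred_lin = a_lin + b_lin * x
    pred_log = np.exp(a_log + b_log * np.log(x))  # assumed back-transformation
    return {
        "linear": r2_levels(y, pred_lin),
        "loglinear": r2_levels(y, pred_log),
    }

# Synthetic data following an exact power law, for illustration only.
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = x ** 2
scores = compare_forms(x, y)
```

Because both scores are computed against Y in levels, they are on a common scale, unlike the ordinary regression R^2 of the two models, whose dependent variables (Y and ln Y) differ.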
Conceptually, our preferred models are models 1 and 4, followed by models 2 and 3. The
models offer satisfactory fits for most of the commodity groups except for transport. Table B2
shows the actual and estimated price indexes for different commodity groups derived using
models 1–4.
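For concreteness, an estimated price level for a country left out of the regression sample (as China is here) could be obtained along the following lines. This is a sketch under stated assumptions, not the procedure used to construct Table B2: the function name `predict_excluded`, the simple exponentiation of the fitted log value, and all numbers are hypothetical.

```python
import numpy as np

def predict_excluded(x_sample, y_sample, x_new):
    """Fit ln Y = a + b ln X on the sample, then predict Y at x_new."""
    x_sample = np.asarray(x_sample, dtype=float)
    y_sample = np.asarray(y_sample, dtype=float)
    # Degree-1 polyfit returns (slope, intercept).
    b, a = np.polyfit(np.log(x_sample), np.log(y_sample), 1)
    return float(np.exp(a + b * np.log(x_new)))

# Hypothetical sample and excluded-country regressor value, illustration only.
x_sample = np.array([1.0, 2.0, 4.0, 8.0])
y_sample = 3.0 * np.sqrt(x_sample)
estimated = predict_excluded(x_sample, y_sample, x_new=16.0)

# Ratio of a (hypothetical) reported price level to the estimated one.
reported = 9.0
ratio = reported / estimated
```

A ratio below one would indicate a reported price level lower than the level predicted from the other countries in the sample.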