Zhang and Lu (2012) proposed five different methods to estimate bias in RF for regression. Their methods estimate
residuals and add them to the predicted values to correct bias. Our method is similar in that it also uses estimated
residuals, but it goes further: we fit a linear model with the estimated residuals as the response variable (Y) and the
predicted values as the explanatory variable (X), and then either rotate the fitted line to the horizontal or find the
rotation angle that minimizes bias.
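As a rough illustration of the idea, the sketch below fits a line to residuals against predicted values and then adds the fitted residual back to each prediction, which flattens the residual trend to the horizontal. This is only one possible reading of the rotation step and is not the exact procedure developed in this paper. The function names `fit_line` and `correct_bias` and the toy data are hypothetical, and the residuals here are simply observed minus predicted values, standing in for the estimated residuals.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def correct_bias(predicted, residuals):
    """Add the fitted residual trend back to each prediction.

    Assumption: residual = observed - predicted, so adding the fitted
    residual moves each prediction toward the observed value.
    """
    intercept, slope = fit_line(predicted, residuals)
    return [p + intercept + slope * p for p in predicted]


# Hypothetical toy data: predictions shrunk toward the mean, a pattern
# often seen in RF regression.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 2.5, 3.0, 3.5, 4.0]
residuals = [o - p for o, p in zip(observed, predicted)]

corrected = correct_bias(predicted, residuals)

# After correction, the residuals no longer trend with the predictions.
new_residuals = [o - c for o, c in zip(observed, corrected)]
new_intercept, new_slope = fit_line(corrected, new_residuals)
```

In this toy example the bias is exactly linear in the predicted values, so the correction recovers the observed values; with real data the residual trend is only approximately linear and the correction is partial.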
The randomForest package in R (Liaw & Wiener, 2009) offers a bias correction method based on simple linear regression
(SLR). It fits an SLR with the observed values as the response variable (Y) and the predicted values as the explanatory
variable (X). Note that this built-in bias correction uses the fitted values from out-of-bag samples. We will later show
that fitting the SLR with all data offers better performance than fitting it only with out-of-bag samples
in real data applications. The predicted values from out-of-bag samples are computed as follows. Suppose we have
B bootstrap samples. The tree bagging method averages the predicted values from the B trees. Suppose we want to
compute the predicted value for the first observation. Because the samples are drawn by bootstrap, some of them may
not include the first observation. Let us say that K(