3.3.3. Data imputation
If any missing data points are identified, imputation techniques
may be applied to fill them, particularly for the parameters
vital to reliability modeling. In most situations, simple
imputation techniques, such as mean imputation, using information
from related observations, or adding indicator variables for
missingness of categorical or continuous variables, are adequate
to fill in the missing data. For example, mean imputation may be
the easiest approach: each missing
value is replaced with the mean of the historically observed values
for that variable. However, this strategy should be used only with
proper justification since it may severely distort the distribution
for this variable, leading to complications with summary measures,
most notably underestimation of the standard deviation. Moreover,
mean imputation tends to distort relationships between
variables by “pulling” estimates of the correlation toward zero.
The analyst should use this strategy only if the standard deviation
of the corresponding variable does not play a critical role in the risk
modeling. Refer to Gelman and Hill (2006) for advanced imputation
techniques.
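
The effect of mean imputation on the standard deviation and on correlations can be illustrated with a minimal sketch. The data below are synthetic, and the variable names, the 30% missingness rate, the linear relationship between x and y, and the helper functions mean_impute and pearson are all assumptions made for illustration; none of them comes from a specific reliability dataset or library. The sketch also builds a missingness indicator alongside the imputed values.

```python
import math
import random
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed entries.

    Returns the filled list and a 0/1 missingness indicator per entry.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot mean-impute a variable with no observed values")
    fill = statistics.fmean(observed)
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic, hypothetical data: y depends linearly on x plus noise.
rng = random.Random(0)
n_total = 200
x = [rng.gauss(50.0, 10.0) for _ in range(n_total)]
y = [0.8 * xi + rng.gauss(0.0, 5.0) for xi in x]

# Remove about 30% of x completely at random (an assumption of this sketch).
x_missing = [None if rng.random() < 0.3 else xi for xi in x]

x_filled, x_was_missing = mean_impute(x_missing)

# Complete-case statistics use only the rows where x was observed.
x_obs = [xi for xi in x_missing if xi is not None]
y_obs = [yi for xi, yi in zip(x_missing, y) if xi is not None]

sd_observed = statistics.stdev(x_obs)
sd_imputed = statistics.stdev(x_filled)
r_observed = pearson(x_obs, y_obs)
r_imputed = pearson(x_filled, y)

print(f"missing values:       {sum(x_was_missing)} of {n_total}")
print(f"std dev (observed):   {sd_observed:.3f}")
print(f"std dev (imputed):    {sd_imputed:.3f}")
print(f"corr x,y (observed):  {r_observed:.3f}")
print(f"corr x,y (imputed):   {r_imputed:.3f}")
```

In this sketch the imputed values sit exactly at the observed mean, so they add nothing to the sum of squared deviations while increasing the sample size. With n observed values out of N in total, the sample standard deviation therefore shrinks by a factor of sqrt((n - 1)/(N - 1)), and the estimated correlation with y moves toward zero. The indicator list x_was_missing is one simple way to carry the missingness information forward into a later model.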