However, because participants are not randomly selected into BVIS, the simple difference between the
observed outcomes (agricultural income) of participants and non-participants is not a valid measure
of the impact of BVIS. The two groups may differ systematically in both observed and unobserved
characteristics. Failing to control for these differences is likely to bias the impact estimates,
because part of the observed income gap would reflect the groups' characteristics rather than
participation itself.
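The selection problem described above can be illustrated with a small simulation. This is only a sketch on synthetic data, not the study's data or estimation procedure: the function name `simulate_naive_gap`, the variable `ability` (a stand-in for any unobserved characteristic), and all numerical values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_naive_gap(n=20000, true_effect=2.0, seed=0):
    """Compare a naive difference in means with a known, simulated effect.

    All quantities are hypothetical; nothing here comes from the BVIS data.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical unobserved characteristic (e.g. farming ability).
    ability = rng.normal(size=n)

    # Non-random selection: higher ability raises the chance of participating.
    p_participate = 1.0 / (1.0 + np.exp(-1.5 * ability))
    participates = rng.random(n) < p_participate

    # Income depends on ability and on participation (the true effect).
    income = 10.0 + 3.0 * ability + true_effect * participates + rng.normal(size=n)

    # Naive estimate: difference in mean income between the two groups.
    naive_gap = income[participates].mean() - income[~participates].mean()
    return float(naive_gap), true_effect


naive_gap, true_effect = simulate_naive_gap()
print(f"naive difference in means: {naive_gap:.2f}")
print(f"true simulated effect:     {true_effect:.2f}")
```

In this synthetic setting the naive gap overstates the simulated effect, because participants also have higher average ability. The direction and size of the bias in real data depend on how selection actually operates.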
The missing data problem is solved by establishing counterfactual outcomes for participants based
on the observed covariates of non-participants, expressed as E(y0i | X, j = 0). Similarly, the treated
outcome for participants is constructed from their observed covariates, expressed as E(y1i | X, j = 1). However,
estimation of E(Y) conditional on observed covariates leads to a dimensionality problem (Heinrich et al.,
2010). In other words, as the number of covariates increases, it becomes cumbersome to identify the
observations which are similar (in observed characteristics) to each other between participants and
non-participants. A solution is to use the conditional probability of participation given the observed
covariates, commonly referred to as the propensity score (Rosenbaum and Rubin, 1983). The income
differential estimated from equations (2a) and (2b) based on propensity scores is therefore as follows: