Fortunately, for many applications the DP optimal policy can be computed with modest computational effort. In this paper we restrict attention to this class of DPs. Typically, the transition probability of the underlying Markov process is estimated from historical data and is, therefore, subject to statistical errors. In current practice, these errors are ignored and the optimal policy is computed assuming that the estimate is, indeed, the true transition probability. However, the DP optimal policy is quite sensitive to perturbations in the transition probability, and ignoring the estimation errors can lead to serious degradation in performance (Nilim and El Ghaoui, 2002; Tsitsiklis et al., 2002). Degradation in performance due to estimation errors in parameters has also been observed in other contexts (Ben-Tal and Nemirovski, 1997; Goldfarb and Iyengar, 2003). Therefore, there is a need to develop DP models that explicitly account for the effect of estimation errors.
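To make this sensitivity concrete, the following minimal sketch (in Python, with all numbers hypothetical and not taken from the cited works) computes the optimal policy of a small two-state, two-action discounted MDP from an estimated transition matrix and then measures the loss incurred when that policy is evaluated under the true transition matrix.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Optimal value function and greedy policy for a discounted MDP.
    P[a] is the transition matrix under action a; R[a] the reward vector."""
    n = R.shape[1]
    V = np.zeros(n)
    while True:
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

def policy_value(policy, P, R, gamma=0.9):
    """Exact value of a fixed policy: solve (I - gamma * P_pi) V = R_pi."""
    n = R.shape[1]
    P_pi = np.array([P[policy[s]][s] for s in range(n)])
    R_pi = np.array([R[policy[s]][s] for s in range(n)])
    return np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)

# Hypothetical two-state, two-action instance (all numbers illustrative).
R = np.array([[1.0, 0.0],    # rewards under action 0
              [0.8, 0.2]])   # rewards under action 1
P_true = np.array([[[0.70, 0.30], [0.40, 0.60]],   # true transitions, action 0
                   [[0.95, 0.05], [0.50, 0.50]]])  # true transitions, action 1
P_est  = np.array([[[0.80, 0.20], [0.40, 0.60]],   # estimated transitions,
                   [[0.90, 0.10], [0.50, 0.50]]])  # perturbed by statistical error

_, pi_nominal = value_iteration(P_est, R)        # policy computed from the estimate
V_opt, _ = value_iteration(P_true, R)            # benchmark: optimal value under the truth
V_nominal = policy_value(pi_nominal, P_true, R)  # actual performance of the nominal policy
print("performance loss per state:", V_opt - V_nominal)
```

In this toy instance the greedy policy computed from the estimated transition matrix happens to differ from the policy that is optimal under the true one, so the printed loss is positive; DP models that account for estimation errors aim to limit exactly this kind of loss.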