Whereas a reward function indicates what is good in an immediate sense,
a value function species what is good in the long run. Roughly speaking, the
value of a state is the total amount of reward an agent can expect to accumulate
over the future, starting from that state. Whereas rewards determine the
immediate, intrinsic desirability of environmental states, values indicate the
long-term desirability of states after taking into account the states that are
likely to follow, and the rewards available in those states