Below, we summarize how the RL-model parameters are related to the history dependence of choice; a portion of these results
was obtained in the present study. First, the learning rate αL
largely
controls how the weights for past outcomes are balanced, i.e., how heavily the model weights recent outcomes relative to outcomes in the more distant past. In the F-Q model (αL = αF), the learning rate does not influence the total weight (the sum of the
regression coefficients for reward history). However, we demonstrated that if the learning rate and the forgetting rate differ, then
the total weight can be a decreasing function of the learning rate.
This finding implies that increasing αL does not necessarily lead
to an increase in the cumulative effect of the recent reward history. Therefore, the value of the learning rate should be interpreted with caution.
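To make this concrete, the following is a minimal simulation sketch (not code from the study): choices are generated by a forgetting Q-learning agent in a two-armed bandit and then regressed on lagged rewards, so that the per-lag coefficients and their sum can be inspected directly. The task settings, parameter values, signed-reward coding of the regressors, and function names are illustrative assumptions.

```python
# Illustrative sketch only; the bandit setup and parameter values are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def simulate_fq(alpha_L, alpha_F, beta, kappa, n_trials=10_000, p_reward=(0.4, 0.6)):
    """Simulate a two-armed bandit with a forgetting Q-learning agent."""
    q = np.zeros(2)
    choices = np.zeros(n_trials, dtype=int)
    rewards = np.zeros(n_trials)
    for t in range(n_trials):
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))  # softmax probability of choosing option 1
        c = int(rng.random() < p1)
        r = float(rng.random() < p_reward[c])
        choices[t], rewards[t] = c, r
        q[c] += alpha_L * (kappa * r - q[c])               # update the chosen option
        q[1 - c] *= 1.0 - alpha_F                          # forget the unchosen option
    return choices, rewards

def reward_history_weights(choices, rewards, n_lags=10):
    """Logistic regression of choice on signed lagged rewards (+r when option 1 was chosen, -r otherwise)."""
    signed = rewards * (2 * choices - 1)
    X = np.column_stack([signed[n_lags - k - 1 : len(signed) - k - 1] for k in range(n_lags)])
    y = choices[n_lags:]
    return LogisticRegression(C=1e6, max_iter=1000).fit(X, y).coef_.ravel()  # index 0 = most recent lag

for alpha in (0.2, 0.6):
    w = reward_history_weights(*simulate_fq(alpha, alpha, beta=2.0, kappa=1.0))
    print(f"alpha_L = alpha_F = {alpha}: weights {np.round(w, 2)}, total {w.sum():.2f}")
```

Under αL = αF, increasing the learning rate steepens the decay of the per-lag weights while leaving their sum roughly unchanged, in line with the summary above.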
The inverse temperature β and the outcome value κ have essentially the same effect on the history dependence unless the outcome value differs across outcome types. These parameters uniformly and multiplicatively scale the weights for past events; thus, the summed influence of the past reward history is a monotonically increasing function of these parameters.
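As a rough check of this multiplicative effect, the functions defined in the sketch above can be reused (same illustrative assumptions): doubling β should roughly double every per-lag weight and hence their sum.

```python
# Reuses simulate_fq and reward_history_weights from the sketch above (illustrative values).
for beta in (1.0, 2.0):
    w = reward_history_weights(*simulate_fq(alpha_L=0.3, alpha_F=0.3, beta=beta, kappa=1.0))
    print(f"beta = {beta}: weights {np.round(w, 2)}, total {w.sum():.2f}")
```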
The residual choice-autocorrelation factor Ci(t) has an additive effect on the dependence on choice history. In the general case, this factor may also modulate the dependence on reward history.
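To illustrate this last point, the sketch below extends the earlier one by adding a simple decaying choice trace to the decision variable (weight phi, trace update rate tau; this particular form is an assumption for illustration, not necessarily the Ci(t) term used in the study) and by including lagged-choice regressors; the trace term should appear mainly as an additive contribution to the choice-history weights.

```python
# Illustrative extension; the choice-trace form is an assumption.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def simulate_fq_with_trace(alpha, beta, kappa, phi, tau=0.5,
                           n_trials=10_000, p_reward=(0.4, 0.6)):
    """Forgetting Q-learning agent with an assumed additive choice-trace (perseveration) term."""
    q, c_trace = np.zeros(2), np.zeros(2)
    choices = np.zeros(n_trials, dtype=int)
    rewards = np.zeros(n_trials)
    for t in range(n_trials):
        drive = beta * (q[1] - q[0]) + phi * (c_trace[1] - c_trace[0])
        c = int(rng.random() < 1.0 / (1.0 + np.exp(-drive)))
        r = float(rng.random() < p_reward[c])
        choices[t], rewards[t] = c, r
        q[c] += alpha * (kappa * r - q[c])                 # learn the chosen option
        q[1 - c] *= 1.0 - alpha                            # forget the unchosen option
        c_trace += tau * (np.eye(2)[c] - c_trace)          # decaying trace of past choices
    return choices, rewards

def full_history_weights(choices, rewards, n_lags=10):
    """Logistic regression on both lagged signed rewards and lagged signed choices."""
    signed_r = rewards * (2 * choices - 1)
    signed_c = (2 * choices - 1).astype(float)
    lag = lambda x, k: x[n_lags - k - 1 : len(x) - k - 1]
    X = np.column_stack([lag(signed_r, k) for k in range(n_lags)]
                        + [lag(signed_c, k) for k in range(n_lags)])
    coef = LogisticRegression(C=1e6, max_iter=1000).fit(X, choices[n_lags:]).coef_.ravel()
    return coef[:n_lags], coef[n_lags:]                    # reward-history, choice-history weights

rw, cw = full_history_weights(*simulate_fq_with_trace(alpha=0.3, beta=2.0, kappa=1.0, phi=1.0))
print("reward-history weights:", np.round(rw, 2))
print("choice-history weights:", np.round(cw, 2))
```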