The effects of the reinforcement schedule

Our analytical calculation demonstrated that for the special case where the forgetting rate αF equals the learning rate αL, the regression coefficients are determined independently of the task structure, i.e., the reinforcement schedule. For the general case where αL ≠ αF, however, the reinforcement schedule may affect the influence of the previous reward history, because the impact of a past reward depends on the number of times the same option is chosen after that reward is given, as shown in Eq. (21). To examine this effect, we first conducted a simulation varying the reward probability of the optimal option (pr) from 0.5 to 0.9 (with the reward probability of the non-optimal option being 1 − pr). Fig. 4(A) shows the regression coefficients obtained from the simulation. The closer pr was to 0.5 (i.e., the more difficult it was to discriminate the optimal option), the smaller the decay of the regression coefficients, although the effect was weak. This result is explained as follows. When the difference between the reward probabilities of the two options is small, the difference between the two action values tends to be small; thus, the model is more likely to switch its choice. Consequently, the same option is repeated fewer times in a row, which leads to a smaller decay of the influence of the reward history.
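The mechanism described above can be illustrated with a minimal simulation sketch. The code below assumes a simple forgetting Q-learning agent on a two-armed bandit: the chosen option's value is updated toward the obtained reward at rate αL, the unchosen option's value decays toward zero at rate αF, and choices are made by a softmax with inverse temperature β. The specific parameter values and the switch-rate summary statistic are illustrative assumptions, not the paper's exact simulation settings; the sketch only demonstrates that a smaller gap in reward probabilities yields more frequent choice switches (and hence fewer consecutive repetitions of the same option).

```python
import math
import random


def switch_rate(pr, alpha_L=0.3, alpha_F=0.1, beta=3.0,
                n_trials=5000, seed=0):
    """Fraction of trials on which the agent switches options.

    Forgetting Q-learning on a two-armed bandit: option 0 is rewarded
    with probability pr, option 1 with probability 1 - pr.
    (Illustrative parameter values, not those of the paper.)
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]
    reward_prob = [pr, 1.0 - pr]
    prev_choice, switches = None, 0
    for _ in range(n_trials):
        # Softmax choice probability for option 0.
        p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p0 else 1
        reward = 1.0 if rng.random() < reward_prob[choice] else 0.0
        # Learning update for the chosen option (rate alpha_L).
        q[choice] += alpha_L * (reward - q[choice])
        # Forgetting: the unchosen value decays toward zero (rate alpha_F).
        q[1 - choice] += alpha_F * (0.0 - q[1 - choice])
        if prev_choice is not None and choice != prev_choice:
            switches += 1
        prev_choice = choice
    return switches / (n_trials - 1)


if __name__ == "__main__":
    # When pr is near 0.5, the action values stay close together and the
    # agent switches more often than when pr = 0.9.
    print(f"pr=0.5: switch rate = {switch_rate(0.5):.3f}")
    print(f"pr=0.9: switch rate = {switch_rate(0.9):.3f}")
```

With these settings, the switch rate is visibly higher at pr = 0.5 than at pr = 0.9, so runs of repeated choices after a reward are shorter, consistent with the weaker decay of the regression coefficients reported in Fig. 4(A).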