Fig. 1. Simulation and analytical results for the special case in which the learning rate (αL) and the forgetting rate (αF) are identical (F-Q model). The two parameters
were varied together, constrained to the same value (αL = αF). (A) The regression coefficients for the reward history (top) and the choice history (bottom). The solid line represents
the coefficients estimated for the logistic regression model fitted to simulated data generated by the Q-learning models. The squares represent the analytical predictions
obtained using Eq. (17). (B) The sum of the regression coefficients for the reward history (top) and the choice history (bottom) as a function of the history length
included in the regression model (Mr = Mc). (C) Scatter plot of the choice probabilities for the current trial (P(a(t) = 1)) predicted by the Q-learning model and by
the regression model for varying learning rates (with identical forgetting and learning rates). For (A) and (C), the history lengths Mr and Mc were both set to 10. The other
parameters were κ = 1 and β = 3.0.