3.3. Standard Q-learning model (with the forgetting rate αF = 0)
The standard Q-learning model, in which the action value of the
unchosen option is not updated, is obtained by setting αF = 0
in Eq. (9). With this setting,
\[
\alpha^{*}_{t,i} = \alpha_L \, \delta_{t,i}
\]
and Eq. (11) becomes