4.5. The effects of the choice-autocorrelation factor
We have seen that the standard Q-learning model exhibits a
(negative) dependency on choice history, as well as on reward
history. However, it is difficult to control the dependence on choice
history explicitly by tuning a model parameter. As described above,
human choice behavior has a property called choice perseverance,
which is a tendency to repeat recently made choices (Akaishi et al.,
2014; Gershman et al., 2009; Huys et al., 2011). A straightforward
way to represent choice perseverance in the Q-learning model is
simply to add a residual choice-autocorrelation factor to the action
values when computing the choice probability, as in Eq. (22). Here,
we investigated the effect of the choice-autocorrelation factor on
history dependence using a probabilistic learning task (with pr = 0.7).
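As a rough illustration, the following sketch simulates a Q-learning agent with a choice-autocorrelation term added to the action values inside the softmax. The parameter names (alpha, beta, phi, tau), the reward probabilities of the two options, and the decaying choice-trace update are assumptions for illustration; the exact functional form of Eq. (22) in the text may differ.

# Minimal sketch (not the paper's exact model): Q-learning with a
# choice-autocorrelation (perseverance) factor in a two-armed
# probabilistic learning task. Parameter names and the trace update
# are assumptions; phi > 0 induces perseverance, phi < 0 switching.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_trials=1000, alpha=0.3, beta=2.0, phi=1.0, tau=1.0,
             p_reward=(0.7, 0.3)):
    Q = np.zeros(2)          # action values
    C = np.zeros(2)          # choice trace (autocorrelation factor)
    choices = np.empty(n_trials, dtype=int)
    rewards = np.empty(n_trials)
    for t in range(n_trials):
        # Choice probability: softmax over value plus weighted trace
        logits = beta * Q + phi * C
        p1 = 1.0 / (1.0 + np.exp(-(logits[1] - logits[0])))
        a = int(rng.random() < p1)
        r = float(rng.random() < p_reward[a])
        # Standard Q-learning value update
        Q[a] += alpha * (r - Q[a])
        # Choice trace decays toward the indicator of the current choice;
        # with tau = 1 it reduces to a residual factor marking only the
        # most recent choice
        C += tau * (np.eye(2)[a] - C)
        choices[t], rewards[t] = a, r
    return choices, rewards

choices, rewards = simulate(phi=1.0)
stay = np.mean(choices[1:] == choices[:-1])
print(f"P(repeat previous choice) = {stay:.3f}")

Raising phi in this sketch increases the probability of repeating the previous choice independently of reward history, which is the property that is hard to control in the standard model by tuning the learning rate or inverse temperature alone.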
