The rule with the highest firing strength is used to compute the system output. Once the reward and the new state are observed, the Q-value function is updated using the standard SARSA formula. The complete fuzzy-SARSA algorithm is summarized in Fig. 10.
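As an illustration of the two steps just described, the following is a minimal Python sketch, not the exact procedure of Fig. 10. It assumes that the Q-values are stored in a table indexed by (rule, action), that the winning rule plays the role of the state in the update, and that actions are chosen epsilon-greedily. The function names, the table layout, the exploration policy, and the default values of `alpha` (learning rate) and `gamma` (discount factor) are all hypothetical choices made for this example.

```python
import random


def winning_rule(firing_strengths):
    """Return the index of the rule with the highest firing strength."""
    return max(range(len(firing_strengths)), key=lambda i: firing_strengths[i])


def choose_action(q_table, rule, epsilon, rng=random):
    """Pick an action for one rule.

    Epsilon-greedy exploration is an assumption of this sketch; the
    text above does not specify the exploration policy.
    """
    n_actions = len(q_table[rule])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_table[rule][a])


def sarsa_update(q_table, rule, action, reward, next_rule, next_action,
                 alpha=0.1, gamma=0.9):
    """Standard SARSA update: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a)).

    Here s and s' are the winning rules before and after the transition
    (an assumed mapping from fuzzy rules to SARSA states).
    """
    td_target = reward + gamma * q_table[next_rule][next_action]
    q_table[rule][action] += alpha * (td_target - q_table[rule][action])
    return q_table[rule][action]
```

In one control step under these assumptions, the firing strengths of all rules are evaluated for the current input, `winning_rule` selects the active rule, `choose_action` selects its output, and `sarsa_update` is called as soon as the reward and the next winning rule and action are available.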