This way, the agent knows never to return to that node, and the learner can see that the agent has learned that this path cannot lead, at least not without redundancy, to the path it is searching for. This process repeats until the third and always-final case is reached, in which the tank agent knows the final path and victoriously exits the maze at the destination node.
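The marking-and-repeating behaviour described above can be sketched as a depth-first search that records dead ends. This is a minimal illustration, not the tank agent's actual implementation: the function name `solve_maze`, the adjacency-dictionary maze representation, and the `dead_ends` set are all assumptions introduced here.

```python
def solve_maze(graph, start, goal):
    """Depth-first search that marks dead-end nodes so they are never revisited.

    `graph` is a hypothetical adjacency dictionary: node -> list of neighbours.
    Returns (path, dead_ends); path is None if the goal is unreachable.
    """
    path = [start]      # current route from the start to the agent's position
    dead_ends = set()   # nodes the agent has ruled out

    while path:
        node = path[-1]

        # Final case: the agent stands on the destination node and exits.
        if node == goal:
            return path, dead_ends

        # Neighbours that are neither on the current route nor ruled out.
        options = [n for n in graph[node]
                   if n not in path and n not in dead_ends]

        if options:
            # Move forward to the first unexplored neighbour.
            path.append(options[0])
        else:
            # Nothing left to try: mark the node and back up one step.
            dead_ends.add(node)
            path.pop()

    return None, dead_ends


# Hypothetical five-node maze: A branches to B and C; D and E are leaves.
maze = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B"],
    "E": ["C"],
}
route, ruled_out = solve_maze(maze, "A", "E")
```

In this example the agent first explores the branch through `B` and `D`, marks both as dead ends while backing up, and then reaches `E` through `C`. Keeping the marked nodes in a separate set mirrors the idea that a learner can see which parts of the maze the agent has already ruled out.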