Only the player in possession of the ball takes actions based on previous experience. That player uses a reinforcement learning approach, the Q-Learning algorithm, which allows the simulated agent to select the best action while it has the ball.
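As an illustration of how such an agent might choose and learn actions, the following is a minimal sketch of tabular Q-Learning with epsilon-greedy selection. The action set (`pass`, `dribble`, `shoot`), the state labels, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are assumptions made for this example; the text does not specify them, and the class name `BallHolderQLearner` is hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical action set for the player holding the ball (an assumption).
ACTIONS = ["pass", "dribble", "shoot"]


class BallHolderQLearner:
    """Tabular Q-Learning sketch for the player in possession of the ball."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Q-table: (state, action) -> estimated value, defaulting to 0.0.
        self.q = defaultdict(float)

    def best_action(self, state):
        """Return the action with the highest Q-value in this state."""
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def select_action(self, state):
        """Epsilon-greedy: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return self.best_action(state)

    def update(self, state, action, reward, next_state):
        """Standard Q-Learning update:
        Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        """
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


if __name__ == "__main__":
    learner = BallHolderQLearner(alpha=0.5, gamma=0.9, epsilon=0.0, seed=0)
    # One hypothetical experience: shooting near the goal earned a reward of 1.
    learner.update("near_goal", "shoot", 1.0, "kickoff")
    print(learner.select_action("near_goal"))
```

In this sketch only the ball holder would call `select_action` and `update`; how the other players behave, and how states and rewards are actually encoded in the simulation, is outside what the example assumes.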