I'm running simulations on two agents: a random agent and a probabilistic agent. The world they run in is the Wumpus World, where the agent is dropped into a 4x4 grid in which each cell has a 20% chance of being a pit and there is exactly one wumpus and one gold. In my simulation the gold and the wumpus are never placed on a pit cell, and none of the three can be at the starting cell (0, 0). I also require the game to be solvable: there must be a path from (0, 0) to the gold. The agent has no prior knowledge of the world and must navigate cell by cell, using the percepts (Stench, Breeze, Glitter) to work out where each hazard is. A stench means the wumpus is in one of the adjacent cells, and a breeze means at least one adjacent cell is a pit.
The random agent navigates by choosing a random adjacent cell and moving there, repeating until it has either grabbed the gold and left or died. The probabilistic agent uses Monte Carlo sampling to randomly generate n possible worlds that are consistent with the percepts collected so far. It uses these worlds to estimate the Bayesian (posterior) probability that each cell contains a pit or the wumpus, and it picks the cell that is least likely to be a pit.
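To make the probabilistic agent's sampling step concrete, here is a minimal sketch of how it could be done with rejection sampling. This is an illustration under assumptions, not my exact code: the names `sample_world`, `consistent`, and `estimate_risks` are hypothetical, percepts are stored as a dict mapping each visited cell to a `(breeze, stench)` pair, and the gold placement and the solvability check are left out for brevity.

```python
import random

SIZE = 4          # 4x4 grid
PIT_PROB = 0.2    # each non-start cell is a pit with probability 0.2
START = (0, 0)

def neighbors(cell):
    """Orthogonally adjacent cells that lie inside the grid."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in candidates if 0 <= a < SIZE and 0 <= b < SIZE]

def sample_world(rng):
    """Draw one world: a set of pits plus a wumpus on a non-pit, non-start cell."""
    cells = [(x, y) for x in range(SIZE) for y in range(SIZE) if (x, y) != START]
    pits = {c for c in cells if rng.random() < PIT_PROB}
    free = [c for c in cells if c not in pits]
    if not free:
        return None  # every cell is a pit, so the wumpus cannot be placed
    return pits, rng.choice(free)

def consistent(world, percepts):
    """True if the world would have produced exactly the observed percepts."""
    pits, wumpus = world
    for cell, (breeze, stench) in percepts.items():
        if cell in pits or cell == wumpus:
            return False  # the agent survived this cell, so it holds no hazard
        if breeze != any(n in pits for n in neighbors(cell)):
            return False
        if stench != (wumpus in neighbors(cell)):
            return False
    return True

def estimate_risks(percepts, n_samples=20000, seed=0):
    """Estimate (P(pit), P(wumpus)) for each unvisited cell next to a visited one."""
    rng = random.Random(seed)
    visited = set(percepts)
    frontier = sorted({n for c in visited for n in neighbors(c)} - visited)
    pit_hits = dict.fromkeys(frontier, 0)
    wumpus_hits = dict.fromkeys(frontier, 0)
    accepted = 0
    for _ in range(n_samples):
        world = sample_world(rng)
        if world is None or not consistent(world, percepts):
            continue  # reject worlds that contradict the evidence
        accepted += 1
        pits, wumpus = world
        for cell in frontier:
            pit_hits[cell] += cell in pits
            wumpus_hits[cell] += cell == wumpus
    if accepted == 0:
        raise ValueError("no sampled world matched the percepts; raise n_samples")
    return {c: (pit_hits[c] / accepted, wumpus_hits[c] / accepted) for c in frontier}

if __name__ == "__main__":
    # No percepts at the start; a breeze (but no stench) after stepping to (1, 0).
    seen = {(0, 0): (False, False), (1, 0): (True, False)}
    risks = estimate_risks(seen)
    safest = min(risks, key=lambda c: risks[c][0])
    print(safest, risks[safest])
```

Rejection sampling is simple but wasteful: as more percepts accumulate, fewer sampled worlds survive the consistency check, so n may need to grow to keep the estimates stable.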
For the pu