I'm running simulations of two agents, a random agent and a probabilistic agent, in the Wumpus World. The agent is dropped into a 4x4 grid in which each cell has a 20% chance of being a pit, and there is exactly one wumpus and one gold. In my simulation the gold and the wumpus are never placed on a pit cell, and none of the three (pit, wumpus, gold) can be at the starting cell (0, 0). I also require the game to be solvable: there must be a path from (0, 0) to the gold. The agent starts with no knowledge of the world and must explore cell by cell, using the percepts (Stench, Breeze, Glitter) to work out where each hazard is. A stench means the wumpus is in one of the adjacent cells, and a breeze means at least one adjacent cell is a pit.

The random agent navigates by repeatedly choosing a random adjacent cell and moving there until it has either picked up the gold and left or died. The probabilistic agent uses Monte Carlo sampling: it randomly generates n possible worlds that are consistent with the evidence (the percepts collected so far), uses them to estimate the posterior probability that each cell contains a pit or the wumpus, and moves to the cell that is least likely to be a pit.
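To make the probabilistic agent's sampling step concrete, here is a minimal sketch of it using rejection sampling. It is a simplification, not my full simulation: it models pits only (no wumpus, gold, or solvability check), and the names `neighbors`, `sample_pits`, `consistent`, and `estimate_pit_probs` are hypothetical helpers made up for this illustration.

```python
import random

SIZE = 4        # 4x4 grid
PIT_PROB = 0.2  # prior chance that an unvisited cell is a pit

def neighbors(cell):
    """Orthogonally adjacent cells that lie inside the grid."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in candidates if 0 <= a < SIZE and 0 <= b < SIZE]

def sample_pits(known_safe, rng):
    """Draw one pit layout; cells known to be safe never get a pit."""
    return {
        (x, y)
        for x in range(SIZE)
        for y in range(SIZE)
        if (x, y) not in known_safe and rng.random() < PIT_PROB
    }

def consistent(pits, breeze):
    """True if the layout reproduces every observed breeze / no-breeze percept."""
    return all(
        any(n in pits for n in neighbors(cell)) == felt
        for cell, felt in breeze.items()
    )

def estimate_pit_probs(breeze, n_worlds=20000, seed=0):
    """Estimate P(pit | percepts) for each frontier cell by rejection sampling.

    `breeze` maps each visited cell to True/False (breeze felt or not).
    Assumption: pits only; the wumpus and gold are ignored in this sketch.
    """
    rng = random.Random(seed)
    known = set(breeze)  # visited cells are known to be pit-free
    frontier = sorted({n for c in known for n in neighbors(c)} - known)
    counts = dict.fromkeys(frontier, 0)
    accepted = 0
    for _ in range(n_worlds):
        pits = sample_pits(known, rng)
        if not consistent(pits, breeze):
            continue  # discard worlds that contradict the percepts
        accepted += 1
        for cell in frontier:
            counts[cell] += cell in pits
    if accepted == 0:
        raise ValueError("no sampled world matched the percepts")
    return {cell: counts[cell] / accepted for cell in frontier}
```

For example, `estimate_pit_probs({(0, 0): False, (1, 0): True, (0, 1): False})` returns a pit probability of 1.0 for (2, 0) and 0.0 for (1, 1) and (0, 2), because the missing breeze at (0, 1) rules out its neighbors and leaves (2, 0) as the only explanation for the breeze at (1, 0). The agent would then move to whichever frontier cell has the lowest estimate.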
For this simulation I am comparing the efficiency of the random agent and the probabilistic agent. The simulation runs only a single step: the agent is placed in a random, legal position where the percepts of two adjacent cells are also known to it. Using this knowledge, it guesses which of the cells adjacent to the three known cells (its own cell and those two neighbors) is safe to move to. To evaluate the agents, I need the expected probability that each one chooses a safe cell. Combinatorics is not my strong suit, so how do I compute that expected probability for each agent over all possible configurations of the world?
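One way I could imagine setting this up is by brute-force enumeration rather than a closed-form formula. The sketch below is only that, under loud assumptions: pits only (no wumpus, gold, or solvability constraint), independent pit priors, a random agent that picks uniformly among the frontier cells, and a probabilistic agent that always picks the frontier cell with the highest exact posterior probability of being safe. The function names are hypothetical. Only frontier cells are enumerated, since pits farther away cannot affect the percepts at the known cells.

```python
from collections import defaultdict
from itertools import product
from math import prod

def neighbors(cell, size):
    """Orthogonally adjacent cells that lie inside the grid."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in candidates if 0 <= a < size and 0 <= b < size]

def expected_success(known, pit_prob=0.2, size=4):
    """Return (random_agent, best_agent) probabilities of stepping on a safe cell.

    `known` lists the visited, pit-free cells. Assumption: pits only.
    """
    known_set = set(known)
    frontier = sorted({n for k in known for n in neighbors(k, size)} - known_set)

    # Group every pit layout on the frontier by the breeze pattern it produces.
    groups = defaultdict(list)
    for bits in product([False, True], repeat=len(frontier)):
        pits = {c for c, is_pit in zip(frontier, bits) if is_pit}
        weight = prod(pit_prob if is_pit else 1 - pit_prob for is_pit in bits)
        percepts = tuple(any(n in pits for n in neighbors(k, size)) for k in known)
        groups[percepts].append((weight, pits))

    random_total = 0.0
    best_total = 0.0
    for layouts in groups.values():
        # Joint probability of "this breeze pattern and cell c is safe".
        safe_mass = [sum(w for w, pits in layouts if c not in pits) for c in frontier]
        random_total += sum(safe_mass) / len(frontier)  # uniform choice
        best_total += max(safe_mass)                    # picks the safest cell
    return random_total, best_total
```

For the known cells `[(0, 0), (1, 0), (0, 1)]` this gives about 0.8 for the random agent and about 0.928 for the probabilistic agent. The 0.8 is no coincidence under these assumptions: a uniformly random frontier cell is a pit with the prior probability of 20%, whatever the percepts say. Averaging the result over every legal placement of the three known cells would give a single expected value per agent.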