Let us start thinking of $x$ as the model state, which accumulates our knowledge. $p(x)$ is the prior distribution: our knowledge from previous observations. $p(x \mid y^o)$ is the posterior distribution, after adding the information from the observation $y^o$. $p(y^o \mid x)$ is the probability density of getting the observation $y^o$, given our previous knowledge as represented by $x$. Note that this is a density in $y$-space. Regarded as a function of $x$, $p(y^o \mid x)$ is not a probability density (its integral over $x$ is not necessarily one); it is called the likelihood function for $x$.
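
The difference between a density in y-space and a likelihood in x can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: a scalar state, a hypothetical observation model y = 2x + Gaussian noise with standard deviation 0.5, a standard normal prior, and an observed value of 1.0. The names `obs_density`, `integrate`, `y_o`, and `x_fixed` are illustrative only.

```python
import math


def gaussian_pdf(z, mean, std):
    """Density of a normal distribution with the given mean and std at z."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def obs_density(y, x, obs_std=0.5):
    """p(y | x) for the assumed toy observation model y = 2*x + noise."""
    return gaussian_pdf(y, 2.0 * x, obs_std)


def integrate(f, lo, hi, n=20001):
    """Trapezoidal rule on a uniform grid of n points over [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n - 1):
        total += f(lo + i * h)
    return total * h


y_o = 1.0      # assumed observed value
x_fixed = 0.3  # an arbitrary fixed state

# For fixed x, p(y | x) is a density in y-space: it integrates to one.
integral_over_y = integrate(lambda y: obs_density(y, x_fixed), -20.0, 20.0)

# For fixed y_o, the same expression viewed as a function of x is the
# likelihood. With y = 2*x its integral over x is 1/2, not one.
integral_over_x = integrate(lambda x: obs_density(y_o, x), -20.0, 20.0)


def prior(x):
    """Assumed standard normal prior p(x)."""
    return gaussian_pdf(x, 0.0, 1.0)


# Normalising likelihood * prior by its integral over x gives the posterior,
# which is a proper density in x again.
evidence = integrate(lambda x: obs_density(y_o, x) * prior(x), -20.0, 20.0)


def posterior(x):
    """p(x | y_o) obtained by normalising likelihood * prior."""
    return obs_density(y_o, x) * prior(x) / evidence


posterior_integral = integrate(posterior, -20.0, 20.0)

print(f"integral of p(y | x) over y:   {integral_over_y:.3f}")
print(f"integral of p(y_o | x) over x: {integral_over_x:.3f}")
print(f"integral of posterior over x:  {posterior_integral:.3f}")
```

In this toy setup the factor of 2 in the observation model is what makes the likelihood integrate to 0.5 rather than 1. A different observation model would give a different value, which is why the likelihood is not treated as a density in x.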