In these notes, we will talk about a different flavor of learning algorithms, known as
Bayesian methods. Unlike classical learning algorithm, Bayesian algorithms do not attempt
to identify “best-fit” models of the data (or similarly, make “best guess” predictions
for new test inputs). Instead, they compute a posterior distribution over models (or similarly,
compute posterior predictive distributions for new test inputs). These distributions provide
a useful way to quantify our uncertainty in model estimates, and to exploit our knowledge
of this uncertainty in order to make more robust predictions on new test points