Following Helmholtz, we view the human perceptual system as a statistical
inference engine whose function is to infer the probable causes
of sensory input. We show that a device of this kind can learn how to
perform these inferences without requiring a teacher to label each sensory
input vector with its underlying causes. A recognition model is used
to infer a probability distribution over the underlying causes from the
sensory input, and a separate generative model, which is also learned, is
used to train the recognition model (Zemel 1994; Hinton and Zemel 1994;
Zemel and Hinton 1995).