Discovering the structure inherent in a set of patterns is a fundamental
aim of statistical inference or learning. One fruitful approach is to
build a parameterized stochastic generative model, independent draws
from which are likely to produce the patterns. For all but the simplest
generative models, each pattern can be generated in exponentially
many ways. It is thus intractable to adjust the parameters to maximize
the probability of the observed patterns. We describe a way of finessing
this combinatorial explosion by maximizing an easily computed
lower bound on the probability of the observations. Our method can
be viewed as a form of hierarchical self-supervised learning that may
relate to the function of bottom-up and top-down cortical processing
pathways.
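
The bound in question is the standard variational (Jensen) lower bound; a sketch, with symbols chosen here for illustration rather than taken from the abstract: let d be an observed pattern, \alpha range over the hidden ways of generating d, \theta denote the generative parameters, and Q(\alpha \mid d) be any tractable distribution over explanations. Then

\[
\log P(d \mid \theta)
  \;=\; \log \sum_{\alpha} P(\alpha, d \mid \theta)
  \;\ge\; \sum_{\alpha} Q(\alpha \mid d)\,
          \log \frac{P(\alpha, d \mid \theta)}{Q(\alpha \mid d)},
\]

where the gap equals the Kullback-Leibler divergence between Q(\alpha \mid d) and the true posterior P(\alpha \mid d, \theta), so the bound is tight when Q matches the posterior. Because the right-hand side is an expectation under Q, it can be estimated by sampling explanations \alpha from Q rather than enumerating the exponentially many ways of generating d, which is what makes the bound easily computed.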