One way to detect the largest flat area of the distribution is to represent a web page as a sequence of bits,
where $b_n = 1$ indicates that the $n$th token is a tag,
and $b_n = 0$ otherwise.
Certain tags that are mostly used to format text, such as font changes, headings, and table tags, are ignored (i.e., are represented by a 0 bit).
The detection of the main content can then be viewed as an optimization problem: we look for positions i and j that jointly maximize the number of tags before i and after j and the number of non-tag tokens between i and j.
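
As a rough illustration of this formulation, the following Python sketch encodes a page as bits and then searches every pair (i, j) by brute force. It is only a sketch under stated assumptions: the function names `to_bits` and `find_content_span`, the regex-based tokenizer, and the contents of `IGNORED_TAGS` are hypothetical choices made for this example, and the search is a simple quadratic scan rather than any particular published algorithm.

```python
import re

# Assumption: an illustrative list of formatting tags that are treated as
# non-tags (bit 0). The text names font changes, headings, and table tags;
# the exact membership of this set is a guess made for the example.
IGNORED_TAGS = {
    "b", "i", "em", "strong", "font",
    "h1", "h2", "h3", "h4", "h5", "h6",
    "table", "tr", "td", "th",
}

# A token is either a whole tag or a run of non-space text characters.
TOKEN_RE = re.compile(r"<[^>]+>|[^<\s]+")
TAG_NAME_RE = re.compile(r"</?\s*([a-zA-Z][a-zA-Z0-9]*)")


def to_bits(html):
    """Return one bit per token: 1 for a (non-ignored) tag, 0 otherwise."""
    bits = []
    for token in TOKEN_RE.findall(html):
        match = TAG_NAME_RE.match(token)
        is_tag = match is not None and match.group(1).lower() not in IGNORED_TAGS
        bits.append(1 if is_tag else 0)
    return bits


def find_content_span(bits):
    """Return (score, i, j) maximizing tags outside [i, j] plus non-tags inside.

    Returns None for an empty sequence. Ties keep the first pair found.
    """
    n = len(bits)
    # prefix[k] holds the number of tag bits among the first k tokens.
    prefix = [0]
    for b in bits:
        prefix.append(prefix[-1] + b)

    best = None
    for i in range(n):
        for j in range(i, n):
            tags_before = prefix[i]
            tags_inside = prefix[j + 1] - prefix[i]
            non_tags_inside = (j - i + 1) - tags_inside
            tags_after = prefix[n] - prefix[j + 1]
            score = tags_before + non_tags_inside + tags_after
            if best is None or score > best[0]:
                best = (score, i, j)
    return best
```

For example, `find_content_span(to_bits(page))` returns the score together with the token positions that bound the candidate content region. The prefix sums make each candidate pair cheap to score, but the scan still examines every pair, so a long page would call for a more efficient search.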
Finding these values of i and j corresponds to maximizing the following objective function: