The heuristic rules used in our implemented algorithm may be summarized as follows: a component whose ID has a meaningful spelling is assigned a relatively higher score; a component whose ID contains too many numbers (>= 4) is assigned a relatively lower score; a component whose ID contains “middle” receives a relatively higher score than one whose ID contains “top” or “bottom”; a component whose ID contains “footer”, “left”, or “right” is assigned a relatively lower score; and a component whose ID contains advertisement-oriented words is assigned a relatively lower score.
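The rules above can be illustrated with a minimal sketch. It is not the implemented algorithm: the function name `score_component_id`, the word lists, and every numeric weight are hypothetical, since the text specifies only the direction of each adjustment. The sketch also assumes that "too many numbers" means four or more digit characters, and that a "meaningful" ID is one containing a word from a small illustrative vocabulary.

```python
import re

# Hypothetical word lists and weights; the text gives only the direction of each rule.
MEANINGFUL_WORDS = {"content", "main", "article", "body", "post", "text"}
AD_WORDS = {"ad", "ads", "advert", "advertisement", "banner", "sponsor", "promo"}
LOW_POSITION_WORDS = ("footer", "left", "right")


def tokenize(component_id):
    """Split an ID such as 'mainContent_2' into lowercase alphabetic tokens."""
    # Put a space at lowercase-to-uppercase boundaries, then split on non-letters.
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", component_id)
    return [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if t]


def score_component_id(component_id):
    """Return a relative score for a component based only on its ID string."""
    lowered = component_id.lower()
    tokens = set(tokenize(component_id))
    score = 0.0

    # Rule 1: a meaningful spelling raises the score (assumed vocabulary lookup).
    if tokens & MEANINGFUL_WORDS:
        score += 1.0

    # Rule 2: too many numbers lowers the score (assumed: >= 4 digit characters).
    if sum(ch.isdigit() for ch in component_id) >= 4:
        score -= 1.0

    # Rule 3: "middle" ranks above "top" and "bottom", which stay neutral here.
    if "middle" in lowered:
        score += 0.5

    # Rule 4: "footer", "left", or "right" lowers the score.
    if any(word in lowered for word in LOW_POSITION_WORDS):
        score -= 1.0

    # Rule 5: advertisement-oriented words lower the score. Whole tokens are
    # matched so that an ID like "header" is not penalized for containing "ad".
    if tokens & AD_WORDS:
        score -= 1.0

    return score


if __name__ == "__main__":
    for cid in ["mainContent", "div12345", "middleColumn", "topColumn", "footer_ad"]:
        print(cid, score_component_id(cid))
```

The rules are applied additively so that an ID matching several of them, such as `footer_ad`, accumulates penalties. Whether the original algorithm combines the rules this way is not stated, so the combination scheme is also an assumption.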