The first step, edge detection, is the technique of identifying
sharp changes in an image. Fundamentally, it works by detecting
discontinuities in image brightness. We specifically use the
Canny edge detection algorithm, a classical yet generally well-performing
edge detection algorithm. In the second step we
compute contours of images, using the computed edges, to
obtain object boundaries. Since buttons typically have a convex
shape and a large enough area so that a user can easily tap
on them, we ignore non-convex contours and those whose
area falls below a threshold parameter. Numerous contours
such as those arising out of text or the non-convex or open
contours in embedded images are eliminated in this step. For
the remaining contours, we compute the bounding boxes, or
the smallest rectangles that would contain those contours. The
center of each bounding box then gives a point where a tap can
be made to simulate a button click.
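
The contour filtering and tap-point computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used here: contours are assumed to be already extracted from the edge map and given as closed, non-self-intersecting polygons (ordered lists of vertices), and the names `tap_points` and `min_area` are hypothetical. In practice, an image-processing library such as OpenCV offers equivalent routines (`cv2.Canny`, `cv2.findContours`, `cv2.isContourConvex`, `cv2.contourArea`, `cv2.boundingRect`).

```python
from typing import List, Tuple

Point = Tuple[float, float]
Contour = List[Point]  # assumed: closed, simple polygon with vertices in order


def polygon_area(contour: Contour) -> float:
    """Absolute enclosed area, computed with the shoelace formula."""
    n = len(contour)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def is_convex(contour: Contour) -> bool:
    """True if every turn along the closed polygon has the same orientation.

    This test is only valid for simple (non-self-intersecting) polygons.
    """
    n = len(contour)
    if n < 3:
        return False
    sign = 0
    for i in range(n):
        ax, ay = contour[i]
        bx, by = contour[(i + 1) % n]
        cx, cy = contour[(i + 2) % n]
        # z-component of the cross product of consecutive edge vectors
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            current = 1 if cross > 0 else -1
            if sign == 0:
                sign = current
            elif current != sign:
                return False
    return sign != 0  # reject degenerate (fully collinear) contours


def bounding_box(contour: Contour) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned rectangle containing the contour."""
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return min(xs), min(ys), max(xs), max(ys)


def tap_points(contours: List[Contour], min_area: float) -> List[Point]:
    """Keep convex, sufficiently large contours; return bounding-box centers."""
    points = []
    for contour in contours:
        if not is_convex(contour) or polygon_area(contour) < min_area:
            continue  # e.g. text glyphs or irregular shapes in images
        x0, y0, x1, y1 = bounding_box(contour)
        points.append(((x0 + x1) / 2.0, (y0 + y1) / 2.0))
    return points
```

For example, given a large rectangle, a tiny square, and an L-shaped polygon, only the rectangle would survive the filter, and its bounding-box center would be returned as the tap location. The value of `min_area` is a tunable assumption; the text does not specify it.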