Deep Neural Networks (DNNs) have recently shown outstanding performance on
image classification tasks [14]. In this paper we go one step further and address
the problem of object detection using DNNs, that is not only classifying but also
precisely localizing objects of various classes. We present a simple and yet powerful
formulation of object detection as a regression problem to object bounding
box masks. We define a multi-scale inference procedure which is able to produce
high-resolution object detections at a low cost by a few network applications.
State-of-the-art performance of the approach is shown on Pascal VOC