In this work we leverage the expressivity of DNNs for object detector. We show that the simple
formulation of detection as DNN-base object mask regression can yield strong results when applied
using a multi-scale course-to-fine procedure. These results come at some computational cost at
training time – one needs to train a network per object type and mask type. As a future work we aim
at reducing the cost by using a single network to detect objects of different classes and thus expand
to a larger number of classes