Obtaining accurate, large-scale, and dense 3D reconstructions
of environments in (or close to) real-time is a
core problem in 3D Computer Vision. Recently, multiple
solutions to this problem have been proposed that run interactively
and rely on active depth sensors, e.g., Microsoft’s
Kinect or the depth camera integrated in Google’s Project
Tango devices. In such systems, the user walks around with
a hand-held device and reconstructs the scene, adding data
directly where it is needed. However, active sensors are
usually restricted to indoor use, since strong background
illumination from the sun overwhelms their emitted signal,
and they have a limited depth range. This creates a strong
need for passive, image-based solutions that overcome these
limitations.