to be highly non-convex as there can be sub-groups freely
moving in the unoccupied space.
Clustering can be performed by fitting Gaussian Mixture
Models to motion vectors directly [11] or by mapping motion
vectors to tailored subspaces [3]. Clusters of coherently
moving people can also be isolated in a crowd [4], [12]
by temporally correlating tracklets, generated for example as
Kanade-Lucas-Tomasi short tracks [13], so as to automatically
highlight groups moving together. However, this approach
may omit smaller groups whose motion is only weakly coherent. Crowd
motion can also be characterized in terms of coherent or
intersecting motion via Deep Networks [14]. However, this
method requires supervision and does not spatially localize
activities. Deep Networks can also infer clusters of features
in an unsupervised manner by casting clustering as a
dimensionality reduction problem, as auto-encoders do [15].
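
To make the tracklet-correlation idea mentioned above concrete, the following minimal sketch groups tracklets whose frame-to-frame velocities stay aligned over time. It is not the method of [4] or [12]: the mean cosine similarity measure, the 0.9 threshold, the union-find grouping, and all function names are assumptions chosen for illustration, and the tracklets are taken to be equal-rate lists of (x, y) positions.

```python
from itertools import combinations
from math import sqrt


def velocities(track):
    """Frame-to-frame displacement vectors of a tracklet [(x, y), ...]."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(track, track[1:])]


def motion_similarity(track_a, track_b):
    """Mean cosine similarity between the velocities of two tracklets."""
    sims = []
    for (ax, ay), (bx, by) in zip(velocities(track_a), velocities(track_b)):
        norm = sqrt(ax * ax + ay * ay) * sqrt(bx * bx + by * by)
        if norm > 0:  # skip frames where either tracklet is stationary
            sims.append((ax * bx + ay * by) / norm)
    return sum(sims) / len(sims) if sims else 0.0


def group_tracklets(tracks, threshold=0.9):
    """Merge tracklets whose motion similarity reaches the (assumed) threshold."""
    parent = list(range(len(tracks)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(tracks)), 2):
        if motion_similarity(tracks[i], tracks[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(tracks)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: two tracklets moving right together, one moving down.
tracks = [
    [(0, 0), (1, 0), (2, 0), (3, 0)],
    [(0, 1), (1, 1), (2, 1.1), (3, 1.1)],
    [(5, 5), (5, 4), (5, 3), (5, 2)],
]
print(group_tracklets(tracks))  # [[0, 1], [2]]
```

This sketch compares motion direction only and ignores spatial proximity. A fixed threshold of this kind also hints at the limitation noted above: a small group whose members move with only loosely aligned velocities would fall below it and be left ungrouped.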