• Crowdsourcing, which is used to collect data, features and
metadata in order to enrich the semantics of existing data.
• Text analytics, which aims to analyze large text collections
(email, web pages, etc.) to extract information. It is used
for topic modeling, question answering, etc.
Some proposals emphasize that these techniques rely on a
generalized picture of the underlying knowledge. By design,
they fail to capture the subtleties of the processes that
produce the data [33,34]. Moreover, these techniques can
perform poorly on very large datasets. This is the case, for
example, with learning-based techniques: the training data
may not fit in memory, or a fast-growing number of features
may lead to long execution times. Sengamedu [35]
presents several scalable methods that can be applied to
machine learning (Random Projections, Stochastic Gradient
Descent and MinClosed sequences). Trends in big data
analytics are summarized in [31]. They mainly concern the
visualization of multi-form, multi-source and real-time data.
Moreover, the size of the data limits in-memory processing.
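
As an illustration of how two of the scalable methods named above address the memory and feature-count problems, the following minimal sketch combines a Gaussian random projection (to shrink the feature space) with stochastic gradient descent over a stream of examples (so the full training set never has to reside in memory). It is not taken from [35]: the function names, the synthetic data generator, the learning rates and the dimensions are all assumptions chosen for illustration.

```python
import random


def random_projection_matrix(n_features, n_components, seed=0):
    """Build a Gaussian random matrix (n_components x n_features).

    Entries are scaled by 1/sqrt(n_components), the usual choice so that
    squared norms are preserved in expectation.
    """
    rng = random.Random(seed)
    scale = 1.0 / (n_components ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
            for _ in range(n_components)]


def project(matrix, x):
    """Map a high-dimensional vector x to the lower-dimensional space."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def stream_examples(true_w, n_examples, seed=1):
    """Hypothetical data source: yields one (x, y) pair at a time.

    It stands in for a dataset too large to load at once; y is a
    noise-free linear function of x.
    """
    rng = random.Random(seed)
    for _ in range(n_examples):
        x = [rng.gauss(0.0, 1.0) for _ in true_w]
        y = sum(w * v for w, v in zip(true_w, x))
        yield x, y


def sgd_linear_regression(examples, n_dims, lr=0.01):
    """Fit least-squares weights with one SGD update per streamed example."""
    w = [0.0] * n_dims
    for x, y in examples:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def sgd_on_projected_stream(true_w, n_components, n_examples, lr=0.005):
    """Project each streamed example before the SGD update."""
    matrix = random_projection_matrix(len(true_w), n_components)
    projected = ((project(matrix, x), y)
                 for x, y in stream_examples(true_w, n_examples))
    return sgd_linear_regression(projected, n_components, lr=lr)


if __name__ == "__main__":
    rng = random.Random(42)
    high_dim_w = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    # Only 5 weights are learned instead of 20.
    print(sgd_on_projected_stream(high_dim_w, n_components=5, n_examples=2000))
```

Because both the stream and the projected stream are generators, memory use depends on the number of features rather than on the number of examples. The projection trades some accuracy for a smaller model, so the reduced dimension would have to be tuned for a real workload.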