Modern society generates huge amounts of information every day, especially in digital form, which complicates its storage, processing and analysis. Big data can be defined as data sets whose scale exceeds the capabilities of existing database management tools for data collection, storage, management and analysis [1]. Although the most common trait of big data is Volume, it is typically characterized by three Vs (Volume, Variety and Velocity).
In addition, big data can be classified taking into account the data type:
1. Structured (data are organized into a predefined data schema).
2. Semi-structured (data do not require a schema definition, but they include metadata).
3. Unstructured (data are stored in an unstructured form without any defined data schema).
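As an illustration of this classification, the following minimal Python sketch loads one small sample of each data type. All sample values, field names and variable names are hypothetical and were chosen only for this example; the sketch is not tied to any particular big data platform.

```python
import csv
import io
import json

# Structured: every row follows a schema fixed in advance (here, the CSV header).
structured_source = "user_id,age,country\n1,34,ES\n2,27,FR\n"
rows = list(csv.DictReader(io.StringIO(structured_source)))

# Semi-structured: no schema is declared up front, but each record carries
# its own metadata (the key names), so records may differ in shape.
semi_structured_source = (
    '[{"user": "ana", "tags": ["a", "b"]},'
    ' {"user": "luc", "location": {"city": "Lyon"}}]'
)
records = json.loads(semi_structured_source)
field_sets = [set(record.keys()) for record in records]

# Unstructured: free text with no schema or embedded metadata; any structure
# (here, a naive list of word tokens) has to be derived by processing.
unstructured_source = "Great phone, but the battery drains far too fast."
tokens = unstructured_source.lower().replace(",", "").replace(".", "").split()

print(rows[0]["country"])   # a field can be addressed directly by its schema name
print(field_sets)           # the two records expose different fields
print(len(tokens))          # number of word tokens extracted from the text
```

In the structured case the fields are known before the data are read, in the semi-structured case they are discovered from each record, and in the unstructured case they must be extracted by further processing such as text mining.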
Big data techniques are applicable in many areas, including data mining, predictive analytics, geoanalysis, natural language processing and pattern recognition. Moreover, the volume of social media data is growing explosively. Consequently, web-based applications such as social computing, Internet text and documents, and Internet search indexing frequently encounter big data. For this reason, techniques such as social network analysis and text mining deserve serious consideration.