Big data management
Data processing is generally understood as the gathering, processing, and management of data to produce “new” information for
end users [3]. Its key challenges have long concerned the storage, transport, and processing of high-throughput data. Big Data
goes further: to these challenges we must add
ambiguity, uncertainty, and variety [3]. Consequently, these requirements imply an additional step where data are cleaned,
tagged, classified and formatted [3,14]. Karmasphere⁵ currently splits Big Data analysis into four steps: Acquisition or
Access, Assembly or Organization, Analysis, and Action or Decision.
These steps are therefore referred to as the “4 A’s”. The Computing
Community Consortium [14], like [3], divides the Organization step into an Extraction/Cleaning step and an Integration
step.
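
To make the staged view concrete, the following is a minimal sketch of the four steps as a chain of functions, with the Organization step split into Extraction/Cleaning and Integration as described above. It is an illustration only, not an implementation from Karmasphere, [3] or [14]: the names (Record, acquire, extract_and_clean, integrate, analyze, act, run_pipeline), the dictionary-based record format, the per-source mean used as the analysis, and the threshold rule used as the decision are all assumptions chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Record:
    """Hypothetical container for one raw data item and its tags."""
    source: str
    payload: dict[str, Any]
    tags: list[str] = field(default_factory=list)


def acquire(sources: dict[str, list[dict[str, Any]]]) -> list[Record]:
    """Acquisition/Access: collect raw rows from every source."""
    return [Record(name, dict(row)) for name, rows in sources.items() for row in rows]


def extract_and_clean(records: list[Record]) -> list[Record]:
    """Organization, part 1: drop incomplete rows, normalise keys, tag the rest."""
    cleaned = []
    for rec in records:
        if any(value is None for value in rec.payload.values()):
            continue  # assumed cleaning rule: discard rows with missing values
        payload = {key.strip().lower(): value for key, value in rec.payload.items()}
        cleaned.append(Record(rec.source, payload, tags=["cleaned"]))
    return cleaned


def integrate(records: list[Record]) -> list[dict[str, Any]]:
    """Organization, part 2: bring all records into one common flat format."""
    return [{"source": rec.source, **rec.payload} for rec in records]


def analyze(rows: list[dict[str, Any]]) -> dict[str, float]:
    """Analysis: here, simply the mean of the 'value' field per source."""
    grouped: dict[str, list[float]] = {}
    for row in rows:
        grouped.setdefault(row["source"], []).append(row["value"])
    return {source: sum(vals) / len(vals) for source, vals in grouped.items()}


def act(summary: dict[str, float], threshold: float) -> list[str]:
    """Action/Decision: report the sources whose mean exceeds the threshold."""
    return sorted(source for source, mean in summary.items() if mean > threshold)


def run_pipeline(sources: dict[str, list[dict[str, Any]]], threshold: float) -> list[str]:
    """Chain the steps in the order given in the text."""
    return act(analyze(integrate(extract_and_clean(acquire(sources)))), threshold)
```

For instance, run_pipeline could be called with a dictionary that maps source names to lists of rows, each row holding a numeric "value" field; rows with missing values are discarded during cleaning before the per-source means are compared with the threshold.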