Basically, data processing is seen as the gathering, processing and management of data in order to produce “new” information for end users [3]. Over time, the key challenges have been the storage, transportation and processing of high-throughput data. Big Data challenges differ in that ambiguity, uncertainty and variety must be added to this list [3]. Consequently, these requirements imply an additional step in which data are cleaned, tagged, classified and formatted [3,14]. Karmasphere⁵ currently splits Big Data analysis into four steps: Acquisition or Access, Assembly or Organization, Analysis, and Action or Decision.
These steps are therefore referred to as the “4 A’s”. The Computing Community Consortium [14], like [3], divides the organization step into an Extraction/Cleaning step and an Integration step.
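
The decomposition above can be illustrated with a minimal sketch. It is not taken from Karmasphere or from [3,14]: the function names, the sample records and the alert threshold are all hypothetical and serve only to show how the four steps chain together, with the organization step split into cleaning and integration.

```python
from collections import Counter

def acquire():
    # Acquisition/Access: hypothetical raw records, some messy or missing.
    return [" Error ", "ok", "", "ERROR", "ok ", None, "warn"]

def extract_and_clean(raw):
    # Extraction/Cleaning: drop missing or empty values, normalize whitespace and case.
    return [r.strip().lower() for r in raw if r and r.strip()]

def integrate(cleaned):
    # Integration: tag and format each record so later steps share one schema.
    return [{"status": s, "is_problem": s in {"error", "warn"}} for s in cleaned]

def assemble(raw):
    # Assembly/Organization: cleaning followed by integration.
    return integrate(extract_and_clean(raw))

def analyze(records):
    # Analysis: count how often each status occurs.
    return Counter(r["status"] for r in records)

def act(counts, threshold=2):
    # Action/Decision: hypothetical rule that raises an alert on repeated errors.
    return "alert" if counts["error"] >= threshold else "no action"

def run_pipeline():
    return act(analyze(assemble(acquire())))

if __name__ == "__main__":
    print(run_pipeline())
```

In this toy setting each step consumes only the output of the previous one, which is the property that allows the organization step to be refined into two sub-steps without affecting acquisition, analysis or decision.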