Table 1 – Four types of data integration strategies described by K. Krishnan in [3] with their
main characteristics, pros and cons.
Data-driven integration
- Categorization of data by type (transactional, analytical, semi-structured, unstructured)
- Pros: infrastructure can be adapted to each category, and likewise to each workload type (w.r.t. volume of data and latency)
- Cons: possibly several different integration efforts on the same architecture

External integration
- Big Data and classic warehouse in two platforms
- A data bus for connection
- Pros: each platform can scale, overload is reduced, modularity, etc.
- Cons: complexity of the data bus architecture can degrade performance over time, poor metadata handling

Integration-driven approach
- Combining Big Data and existing warehouse platforms
- A Hadoop/NoSQL connector links them
- Pros: each platform can scale, overload is distributed, modularity, good metadata handling
- Cons: the connector is the Achilles' heel, complexity of data integration

Big Data appliances
- A black box from vendors with three layers (Big Data, RDBMS and integration)
- Pros: scalable and modular custom configuration for users (organizations)
- Cons: custom configuration by vendors can change frequently and can be a source of heavy maintenance
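
As a rough illustration of the data-driven strategy in Table 1, the sketch below routes datasets to an infrastructure tier according to their data category. The four category names come from the table; the tier names, the `ROUTES` mapping, and the `route` and `plan` functions are hypothetical and are not taken from [3].

```python
from enum import Enum


class Category(Enum):
    # The four data categories listed in Table 1.
    TRANSACTIONAL = "transactional"
    ANALYTICAL = "analytical"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


# Hypothetical mapping from data category to infrastructure tier.
# The tier names are placeholders chosen for illustration only.
ROUTES = {
    Category.TRANSACTIONAL: "rdbms",
    Category.ANALYTICAL: "warehouse",
    Category.SEMI_STRUCTURED: "nosql",
    Category.UNSTRUCTURED: "hadoop",
}


def route(category: Category) -> str:
    """Return the (hypothetical) tier adapted to the given data category."""
    return ROUTES[category]


def plan(datasets: dict[str, Category]) -> dict[str, list[str]]:
    """Group dataset names by the tier each one is routed to."""
    tiers: dict[str, list[str]] = {}
    for name, category in datasets.items():
        tiers.setdefault(route(category), []).append(name)
    return tiers
```

Each tier in such a layout may need its own integration effort, which is the drawback the table notes for this strategy.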