Abstract— The paradigm for processing huge datasets has shifted from centralized to distributed architectures. As enterprises accumulated large volumes of data, they found that it could not be processed by any of the existing centralized architecture solutions. Beyond time constraints, enterprises faced problems of efficiency, performance, and elevated infrastructure cost when processing data in a centralized environment.
With the help of distributed architectures, these large organizations were able to overcome the problem of extracting relevant information from a huge data dump. One of the most widely used open-source tools for harnessing a distributed architecture to solve such data processing problems is Apache Hadoop. Using Apache Hadoop's components, such as data clusters, the MapReduce programming model, and distributed processing, we resolve various location-based complex data problems and feed the relevant information back into the system, thereby improving the user experience.