RR: HBase gives you real-time access to data stored in Hadoop. By building on top of Hadoop, we are leveraging the scale that Yahoo has pioneered and that other companies have helped to improve. When we want to retrieve a range of data, we first ask a directory service which machine hosts that range, kind of like consulting a library card catalog. We then ask that machine for the data, which it retrieves and sends back to us. But even though we contact a directory, the system isn’t strictly centralized. There’s not one machine handling every request, which means that if parts of the system go down, we don’t lose access to the entire set of data at the same time. We can also get data faster because we’re spreading the load over multiple machines, taking advantage of parallelism.
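
The lookup RR describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not HBase's actual client API: the `Directory`, `Server`, and `read_range` names, the server names, and the sample rows are all hypothetical, and the sketch assumes each requested range lives entirely on one server.

```python
from bisect import bisect_right


class Directory:
    """Hypothetical directory service: maps key ranges to the server hosting them."""

    def __init__(self, regions):
        # regions: (start_key, server_name) pairs; each range runs from its
        # start key up to (not including) the next start key.
        self._regions = sorted(regions)
        self._starts = [start for start, _ in self._regions]

    def locate(self, key):
        # Find the last region whose start key is <= key.
        i = bisect_right(self._starts, key) - 1
        if i < 0:
            raise KeyError(f"no server hosts key {key!r}")
        return self._regions[i][1]


class Server:
    """Hypothetical data server holding one slice of the sorted key space."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = dict(rows)
        self.up = True

    def scan(self, start, stop):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        # Return rows with start <= key < stop, in key order.
        return [(k, v) for k, v in sorted(self.rows.items()) if start <= k < stop]


def read_range(directory, servers, start, stop):
    # Step 1: ask the directory which machine hosts the range.
    name = directory.locate(start)
    # Step 2: ask that machine for the data.
    return servers[name].scan(start, stop)


if __name__ == "__main__":
    servers = {
        "server-a": Server("server-a", {"apple": 1, "banana": 2, "mango": 3}),
        "server-b": Server("server-b", {"nectarine": 4, "pear": 5, "plum": 6}),
    }
    directory = Directory([("", "server-a"), ("n", "server-b")])

    print(read_range(directory, servers, "a", "c"))
    servers["server-a"].up = False  # one machine fails...
    print(read_range(directory, servers, "p", "q"))  # ...the other still answers
```

Because the directory only hands out locations and each server answers for its own range, taking `server-a` down in this sketch blocks reads of its keys only; requests for keys on `server-b` still succeed.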