So what does my sketch of the HBase innards really say? You can see that HBase basically handles two kinds of files.
One type is used for the write-ahead log and the other for the actual data storage. The files are primarily handled by the HRegionServers, but in certain scenarios even the HMaster has to perform low-level file operations.
You may also notice that the files are in fact divided up into smaller blocks when stored within the Hadoop Distributed Filesystem (HDFS). The block size is also one of the settings you can tune so the system handles larger or smaller data better. (More on that later.)
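To make the block splitting concrete, here is a minimal arithmetic sketch. The function name `split_into_blocks` and the 64 MB block size are assumptions chosen purely for illustration; the actual block size depends on how your HDFS cluster is configured.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes occupies.

    Every block is full except possibly the last one, which only holds
    the remaining bytes.
    """
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


# A hypothetical 200 MB storage file with an assumed 64 MB block size.
blocks = split_into_blocks(200 * MB, 64 * MB)
print(len(blocks), [size // MB for size in blocks])
```

A larger block size means fewer blocks per file, which is why this setting matters once the storage files grow.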
The general flow is that a new client that wants to find a particular row key first contacts the ZooKeeper quorum (a separate cluster of ZooKeeper nodes). From ZooKeeper it retrieves the name of the server (i.e., host) that hosts the -ROOT- region. With that information it can query that server to get the server that hosts the .META. table. Both of these details are cached and only looked up once. Lastly, it can query the .META. server and retrieve the server that has the row the client is looking for.
Once it has been told where the row resides (i.e., in what region), it caches this information and directly contacts the HRegionServer hosting that region. That way, over time the client builds up a fairly complete picture of where to get rows without needing to query the .META. server again.
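The lookup and caching flow described above can be sketched as a toy model. This is not the HBase client API: the `Client` class, the host names, the region boundaries, and the in-memory dictionaries standing in for ZooKeeper, -ROOT-, and .META. are all hypothetical, chosen only to show which step is cached when.

```python
# Stand-in for ZooKeeper: knows which server hosts the -ROOT- region.
ZOOKEEPER = {"root-region-server": "rs1.example.com"}

# Stand-in for -ROOT-: tells us which server hosts the .META. table.
ROOT_REGION = {"rs1.example.com": {".META.": "rs2.example.com"}}

# Stand-in for .META.: regions as (start key, end key, hosting server).
# An end key of None means the region is unbounded at the top.
META_TABLE = {
    "rs2.example.com": [
        ("", "m", "rs3.example.com"),
        ("m", None, "rs4.example.com"),
    ],
}


def contains(region, row_key):
    """True if row_key falls inside the region's [start, end) key range."""
    start, end, _server = region
    return start <= row_key and (end is None or row_key < end)


class Client:
    def __init__(self):
        self.root_server = None    # cached after the first ZooKeeper lookup
        self.meta_server = None    # cached after the first -ROOT- lookup
        self.cached_regions = []   # regions already learned from .META.
        self.lookups = []          # trace of remote lookups, for illustration

    def locate(self, row_key):
        """Return the server hosting row_key, using caches where possible."""
        if self.root_server is None:
            self.lookups.append("zookeeper")
            self.root_server = ZOOKEEPER["root-region-server"]
        if self.meta_server is None:
            self.lookups.append("-ROOT-")
            self.meta_server = ROOT_REGION[self.root_server][".META."]
        for region in self.cached_regions:
            if contains(region, row_key):
                return region[2]   # cache hit: no .META. query needed
        self.lookups.append(".META.")
        for region in META_TABLE[self.meta_server]:
            if contains(region, row_key):
                self.cached_regions.append(region)
                return region[2]
        raise KeyError(row_key)


client = Client()
print(client.locate("apple"), client.lookups)
print(client.locate("banana"), client.lookups)
```

In this sketch the second call adds nothing to the lookup trace, because "banana" falls into the region that was cached while locating "apple".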
Note: When you start HBase, the HMaster is responsible for assigning the regions to the HRegionServers. This also includes the "special" -ROOT- and .META. tables.
Next, when the HRegionServer opens the region, it creates a corresponding HRegion object. As the HRegion is "opened," it sets up a Store instance for each HColumnFamily of the table, as defined by the user beforehand. Each of the Store instances can in turn have one or more StoreFile instances, which are lightweight wrappers around the actual storage file, called HFile. An HRegion also has a MemStore and an HLog instance. Now let's have a look at how they work together, as well as where there are exceptions to the rule.
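As a rough picture of that containment hierarchy, here is a minimal sketch using plain data classes. The class names mirror the ones in the text, but the fields, the `open_region` method, and the example region name, column families, and file path are hypothetical; the MemStore and HLog are left out for brevity. This is not HBase's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class StoreFile:
    """Lightweight wrapper around one storage file (an HFile)."""
    hfile_path: str


@dataclass
class Store:
    """One Store per column family; holds zero or more StoreFiles."""
    column_family: str
    store_files: list = field(default_factory=list)


@dataclass
class HRegion:
    name: str
    stores: dict = field(default_factory=dict)

    @classmethod
    def open(cls, name, column_families):
        """Opening a region sets up one Store per column family."""
        region = cls(name)
        for family in column_families:
            region.stores[family] = Store(family)
        return region


@dataclass
class HRegionServer:
    host: str
    regions: dict = field(default_factory=dict)

    def open_region(self, name, column_families):
        """Create the HRegion object and keep track of it on this server."""
        region = HRegion.open(name, column_families)
        self.regions[name] = region
        return region


server = HRegionServer("rs3.example.com")
region = server.open_region("usertable,,1", ["info", "stats"])
region.stores["info"].store_files.append(StoreFile("/hypothetical/path/hfile1"))
print(sorted(region.stores), len(region.stores["info"].store_files))
```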