Hello Gang,

Over the past few months, I've been dedicating my spare time to reviewing the DataNode project. I've had several patches accepted over the past year, but I've found the code base very challenging to work with. That motivated me to create a new DataNode, backed by Spring Boot.
I also implemented the DataTransfer protocol as a Netty-backed service. I learned a lot about the protocol and took a bunch of notes that, if implemented, could streamline the protocol quite a bit and make it much more Netty (async+protobuf) friendly.

The other motivation is that I wanted a way to better utilize all of the drives in the node. In a standard cluster installation, the OS gets two disks dedicated to it (RAID-1). In a cluster with 500 nodes, that is 1,000 drives dedicated to the OS, with much of that space wasted. I have designed this DataNode to make better use of that space by storing the block metadata on these primary drives as well, freeing up more space on the data drives for block data.

I have elected to use LevelDB to store the block metadata. This turns out to be quite handy because it can also store all the other metadata a DataNode generates: DataNode UUID, volume metadata, namespace info, and the rest. Having a single metadata repository greatly simplifies the design, and since it is on a RAID drive, it can be assumed to be always available (if the OS drives are both dead, the entire node will be dead anyway).

It could also remove a lot of work from the NameNode. For example, the volume location of each block is tracked in LevelDB, so it is not necessary for the NameNode to track volumes. The only information the NameNode needs to track is the DataNode URI, block pool ID, block ID, and generation stamp. The client can request the block from the DataNode without concern for the specific volume it is on.

What I've put together is currently at version 0.0.0.5. It works (tested on a three-node cluster with terasort/teragen/teravalidate), but it is pretty rough and does not implement even 20% of the functionality of the reference Apache DataNode. Also note that without a detailed reference guide to the protocol involved, I've had to do a bunch of reverse engineering.
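To make the block-to-volume lookup described above a bit more concrete, here is a minimal Java sketch. It is not code from springdn: the class name, method names, key layout, and example values are all hypothetical, and a plain in-memory TreeMap stands in for LevelDB so the example runs without extra dependencies.

```java
import java.util.TreeMap;

// Hypothetical sketch only: a TreeMap stands in for the LevelDB store.
class BlockMetadataSketch {

    // Maps a block key to the volume (mount point) holding the block data.
    private final TreeMap<String, String> store = new TreeMap<>();

    // Assumed key layout: the three identifiers a client would present.
    static String key(String blockPoolId, long blockId, long generationStamp) {
        return blockPoolId + "/" + blockId + "/" + generationStamp;
    }

    // Called when the DataNode finishes writing a block to a volume.
    void recordBlock(String blockPoolId, long blockId, long generationStamp, String volume) {
        store.put(key(blockPoolId, blockId, generationStamp), volume);
    }

    // Returns the volume for the block, or null if this DataNode does not have it.
    String findVolume(String blockPoolId, long blockId, long generationStamp) {
        return store.get(key(blockPoolId, blockId, generationStamp));
    }

    public static void main(String[] args) {
        BlockMetadataSketch metadata = new BlockMetadataSketch();
        metadata.recordBlock("BP-example", 1073741825L, 1001L, "/data/3");

        // The caller never names a volume; the DataNode resolves it locally.
        System.out.println(metadata.findVolume("BP-example", 1073741825L, 1001L));
    }
}
```

The point of the sketch is only that the volume is resolved entirely on the DataNode side, so nothing upstream has to know about it.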
Since the protocol handling is reverse engineered, this DataNode may be communicating in a way that the cluster doesn't reject while still using bogus values for some of the fields.

Please check it out:
https://github.com/belugabehr/springdn

Thanks!