Hi,

I'm currently trying to build a cache layer that should sit "on top" of the
datanode. Essentially, the namenode should know the port number of the
cache layer instead of that of the datanode (since the namenode then relays
this information to the default HDFS client). All of the communication
between the datanode and the namenode currently flows through my cache
layer (including heartbeats, etc.).
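
To make the setup concrete, here is a minimal sketch of the kind of relay
described above. It is an illustration only, not the actual cache layer:
it assumes plain TCP forwarding, does no caching, and knows nothing about
the HDFS wire protocol. The names `pipe` and `start_relay` and the ports in
the usage note are hypothetical.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # one side went away; stop relaying in this direction
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def start_relay(listen_port, target_host, target_port):
    """Listen on listen_port (0 = any free port) and forward every
    accepted connection to (target_host, target_port).

    Returns the listening socket so the caller can read the bound port
    or close it to stop accepting new connections.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", listen_port))
    server.listen()

    def accept_loop():
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                return  # listening socket was closed
            upstream = socket.create_connection((target_host, target_port))
            # One thread per direction so both sides can talk at once.
            threading.Thread(target=pipe, args=(client, upstream),
                             daemon=True).start()
            threading.Thread(target=pipe, args=(upstream, client),
                             daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

For example, `start_relay(60010, "127.0.0.1", 50010)` would accept
connections on port 60010 and pass them through to a service on port 50010
(both port numbers are placeholders). The problem described here is that
the namenode still advertises the real datanode port rather than the
relay's port.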

*First question*: is there a way to tell the namenode where a datanode
should be? Any way to trick it into thinking that the datanode is on a port
number where it actually isn't? As far as I can tell, the port number is
obtained from the DatanodeId object; can this be set in the configuration
so that the port number derived is that of the cache layer?

I spent quite a bit of time on the above question and I could not find any
sort of configuration option that would let me do that. So, I delved into
the HDFS source code and tracked down the DatanodeRegistration class.
However, I can't work out *how* the NameNode determines the Datanode's
port number, or whether I could somehow change the packets to reflect the
port number of the cache layer. *Second question*: how does the namenode
figure out a newly-registered Datanode's port number?

Thank you,

Dhaivat
