Hi Dhaivat,

I did a good chunk of the design and implementation of HDFS-4949, so if
you could post a longer writeup of your envisioned use cases and
implementation, I'd definitely be interested in taking a look.
It's also good to note that HDFS-4949 is only the foundation for a whole
slew of potential enhancements. We're planning to add some form of
automatic cache replacement, which as a first step could just be an
external policy that manages your static caching directives. It should
also already be possible to integrate a job scheduler with HDFS-4949,
since it both exposes the cache state of the cluster and allows a
scheduler to prefetch data into RAM. Finally, we're also thinking about
caching at finer granularities, e.g. block or sub-block rather than
file-level caching, which is nice for apps that only read regions of a
file.

Best,
Andrew

On Mon, Dec 23, 2013 at 9:57 PM, Dhaivat Pandya <dhaivatpan...@gmail.com> wrote:

> Hi Harsh,
>
> Thanks a lot for the response. As it turns out, I figured out the
> registration mechanism this evening and how the sourceId is relayed
> to the NN.
>
> As for your question about the cache layer: it is a similar basic
> concept as the plan mentioned, but the technical details differ
> significantly. First of all, instead of having the user tell the
> namenode to perform caching (as it seems from the proposal on JIRA),
> there is a distributed caching algorithm that decides what files will
> be cached. Secondly, I am implementing a hook-in with the job
> scheduler that arranges jobs according to what files are cached at a
> given point in time (and also allows files to be cached based on what
> jobs are to be run).
>
> Also, the cache layer does a bit of metadata caching; the numbers on
> it are not all in, but thus far, some of the *metadata* caching
> surprisingly gives a pretty nice reduction in response time.
>
> Any thoughts on the cache layer would be greatly appreciated.
> Thanks,
>
> Dhaivat
>
> On Mon, Dec 23, 2013 at 11:46 PM, Harsh J <ha...@cloudera.com> wrote:
>
> > Hi,
> >
> > On Mon, Dec 23, 2013 at 9:41 AM, Dhaivat Pandya
> > <dhaivatpan...@gmail.com> wrote:
> > > Hi,
> > >
> > > I'm currently trying to build a cache layer that should sit "on
> > > top" of the datanode. Essentially, the namenode should know the
> > > port number of the cache layer instead of that of the datanode
> > > (since the namenode then relays this information to the default
> > > HDFS client). All of the communication between the datanode and
> > > the namenode currently flows through my cache layer (including
> > > heartbeats, etc.)
> >
> > Curious Q: what does your cache layer aim to do, btw? If it's a
> > data cache, have you checked out the design being implemented
> > currently by https://issues.apache.org/jira/browse/HDFS-4949?
> >
> > > *First question*: is there a way to tell the namenode where a
> > > datanode should be? Any way to trick it into thinking that the
> > > datanode is on a port number where it actually isn't? As far as
> > > I can tell, the port number is obtained from the DatanodeID
> > > object; can this be set in the configuration so that the port
> > > number derived is that of the cache layer?
> >
> > The NN receives a DN host and port from the DN directly. The DN
> > sends it whatever it's running on. See
> >
> > https://github.com/apache/hadoop-common/blob/release-2.2.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L690
> >
> > > I spent quite a bit of time on the above question and I could
> > > not find any sort of configuration option that would let me do
> > > that. So, I delved into the HDFS source code and tracked down
> > > the DatanodeRegistration class. However, I can't seem to find
> > > out *how* the NameNode figures out the Datanode's port number,
> > > or if I could somehow change the packets to reflect the port
> > > number of the cache layer?
> >
> > See
> >
> > https://github.com/apache/hadoop-common/blob/release-2.2.0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java#L690
> >
> > (as above) for how the DN emits it. And no, IMO, that ("packet
> > changes") is not the right way to go about it if you're planning
> > an overhaul. It's easier and more supportable to make proper code
> > changes instead.
> >
> > > *Second question:* how does the namenode figure out a
> > > newly-registered Datanode's port number?
> >
> > Same as before. Registration sends the service addresses (so the
> > NN may use them for sending to clients), beyond which the DN's
> > heartbeats are mere client-like connections to the NN, carried out
> > on regular ephemeral ports.
> >
> > --
> > Harsh J
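The "static caching directives" Andrew describes are managed through the
`cacheadmin` CLI that HDFS-4949 introduced (it shipped in Hadoop 2.3.0).
A minimal sketch against a running cluster; the pool and path names here
are made up for illustration, not taken from this thread:

```shell
# Create a cache pool to group directives under (pool name is hypothetical)
hdfs cacheadmin -addPool working-set

# Add a static directive: ask DataNodes to pin this path's blocks in memory
hdfs cacheadmin -addDirective -path /user/dhaivat/hot-data -pool working-set

# A job scheduler could poll the cluster's cache state like this before
# deciding where (or when) to place a job, as Andrew suggests
hdfs cacheadmin -listDirectives -pool working-set
```

An external replacement policy of the kind Andrew mentions would then
just add and remove directives like these over time.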
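On the registration question in the quoted thread: the DataNode reports
the address it is itself configured to bind, so there is no separate
"advertised port" knob that could be pointed at a proxy. A sketch of the
relevant hdfs-site.xml entry (the key name is the real Hadoop 2.x
property; the value shown is the stock default, not something from this
thread):

```xml
<!-- hdfs-site.xml: the DataNode binds this address and reports the same
     port to the NameNode during registration. Because there is no
     separate "advertised address" setting, a cache-layer proxy cannot be
     interposed via configuration alone; it needs code changes, as Harsh
     suggests. -->
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50010</value>
</property>
```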