On Thu, May 7, 2009 at 1:04 PM, Foss User <[email protected]> wrote:
> On Fri, May 8, 2009 at 1:20 AM, Raghu Angadi <[email protected]> wrote:
> >
> > Philip Zeyliger wrote:
> >>
> >> It's over TCP/IP, in a custom protocol. See DataXceiver.java. My sense is
> >> that it's a custom protocol because Hadoop's IPC mechanism isn't optimized
> >> for large messages.
> >
> > yes, and job classes are not distributed using this. It is a very simple
> > protocol used to read and write raw data to DataNodes.
>
> How are the job class files or jar files distributed then?
I believe the JobClient does write the job's files to HDFS (with a replication factor of mapred.submit.replication) as part of job submission, and writing to HDFS does use this interface. (It also triggers other uses of the interface: the data nodes stream copies of the blocks to each other, I believe.)

What may be confusing is that the job configuration and the like is passed to the JobTracker separately, via IPC.

-- Philip
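
For what it's worth, mapred.submit.replication is an ordinary configuration property, so the replication used for the submitted job files can be tuned. A minimal sketch of setting it, either cluster-wide or per job (the value 10 is, to my recollection, Hadoop's usual default for this property; not taken from this thread):

```
<!-- In mapred-site.xml, or in the per-job configuration:
     replication factor for the job jar and related files
     that the JobClient writes to HDFS at submission time.
     A higher value spreads the jar across more data nodes,
     so many tasks can fetch it without hammering one node. -->
<property>
  <name>mapred.submit.replication</name>
  <value>10</value>
</property>
```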
