Hi Jay,

Actually, my question on seeing that code is why it's hardcoded to 2 rather than targets.length. The pipeline length is supposed to be the number of datanodes in the pipeline, so this might be a bug.
Regarding the timeout, it makes sense to boost the timeout based on the length of the pipeline. Longer pipelines can experience more delays, since data needs to flow down and then get ack'd back up.

Best,
Andrew

On Tue, Apr 23, 2013 at 6:25 AM, Jay Vyas <jayunit...@gmail.com> wrote:
> Hi guys: I noticed that in the call to createSocketForPipeline, there is a
> hardcoded length of "2".
>
> //from
> sock = createSocketForPipeline(src, 2, dfsClient);
>
> This cascades down to the "getDataNodeReadTimeout" method, resulting in a
> multiplier of 2.
>
> //from DFSClient.java
> int getDatanodeReadTimeout(int numNodes) {
>   return dfsClientConf.socketTimeout > 0 ?
>       (HdfsServerConstants.READ_TIMEOUT_EXTENSION * numNodes +
>       dfsClientConf.socketTimeout) : 0;
> }
>
> I wonder why the pipeline length is "2" as opposed to "1" ? It seems that
> transferring a single block should have a pipeline length of 1?
>
> ///for example: in the createBlockOutputStream method, we have
> s = createSocketForPipeline(nodes[0], nodes.length, dfsClient);
>
> Is the "2", then, just used to add some cushion to the timeout? Or is
> something expected to be happening during a block transfer which makes the
> pipeline a 2 node, rather than 1 node one?
>
> Maybe I'm misunderstanding something about the way the pipeline works so
> thanks for helping and apologies if this question is a little silly.
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
>
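
To make the effect of the pipeline length concrete, here is a minimal standalone sketch of the timeout formula quoted above. It is not the DFSClient code itself: the class name and the two constant values (a 5 second extension per node and a 60 second base socket timeout) are assumptions chosen for illustration, so check them against your own Hadoop version and configuration.

```java
// Sketch only: mirrors the getDatanodeReadTimeout formula quoted in this thread.
// The class name and both constant values are assumptions for illustration.
class PipelineTimeoutSketch {
    // Assumed stand-in for HdfsServerConstants.READ_TIMEOUT_EXTENSION (milliseconds).
    static final int READ_TIMEOUT_EXTENSION_MS = 5 * 1000;
    // Assumed stand-in for dfsClientConf.socketTimeout (milliseconds).
    static final int SOCKET_TIMEOUT_MS = 60 * 1000;

    // Same shape as the quoted method: a per-node extension added to the base
    // timeout, or 0 when the base timeout is not positive.
    static int datanodeReadTimeout(int socketTimeoutMs, int numNodes) {
        return socketTimeoutMs > 0
            ? READ_TIMEOUT_EXTENSION_MS * numNodes + socketTimeoutMs
            : 0;
    }

    public static void main(String[] args) {
        // Compare pipeline lengths 1, 2 (the hardcoded value), and 3.
        for (int numNodes = 1; numNodes <= 3; numNodes++) {
            System.out.println("numNodes=" + numNodes + " -> "
                + datanodeReadTimeout(SOCKET_TIMEOUT_MS, numNodes) + " ms");
        }
    }
}
```

Under these assumed values, each extra node in the pipeline adds one extension to the read timeout, so passing 2 instead of 1 only lengthens the timeout by a single extension. That is consistent with the "cushion" reading in the question, though it does not settle whether the hardcoded 2 is intentional.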