I guess you are right. It doesn't affect correctness, or even performance. The only part I am not sure about, performance-wise, is whether we can safely assume that all the status messages and the seqno ack will travel in one TCP packet. Given that there probably won't be long chains of datanodes, the total ack message size should be well under the system buffer size, so this should be true. (Here I assume that even though datanodes use non-blocking socketchannel writes/reads, each individual socketchannel.write is still buffered at the system level before it goes out to the network.)
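
To put a rough number on the size argument, here is a back-of-the-envelope sketch. The wire layout is an assumption made for illustration only (one 8-byte seqno followed by one 2-byte status per datanode in the pipeline), and so is the 1460-byte segment payload, which is the usual figure for Ethernet with a 1500-byte MTU. The class and method names are hypothetical and not taken from the datanode code.

```java
public class AckSizeEstimate {
    // Assumed layout (not confirmed against the datanode source):
    // one long seqno, then one short status per datanode in the pipeline.
    static final int SEQNO_BYTES = Long.BYTES;    // 8
    static final int STATUS_BYTES = Short.BYTES;  // 2

    // Assumed TCP payload per segment: 1500-byte MTU minus 40 bytes of
    // IPv4 and TCP headers, with no TCP options.
    static final int ASSUMED_MSS = 1460;

    /** Total bytes of one ack as seen by the most upstream node. */
    public static int ackBytes(int pipelineLength) {
        if (pipelineLength < 1) {
            throw new IllegalArgumentException("pipeline needs at least one datanode");
        }
        return SEQNO_BYTES + STATUS_BYTES * pipelineLength;
    }

    /** True if the whole ack is no larger than one assumed segment payload. */
    public static boolean fitsInOneSegment(int pipelineLength) {
        return ackBytes(pipelineLength) <= ASSUMED_MSS;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.printf("%d datanode(s): %d bytes, fits in one segment: %b%n",
                    n, ackBytes(n), fitsInOneSegment(n));
        }
    }
}
```

Under these assumptions a three-node pipeline produces a 14-byte ack, far below one segment. Size alone does not settle the question, though: whether the seqno and the statuses actually leave in the same packet also depends on whether they are written in one call and on TCP settings such as Nagle's algorithm.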

-Bin

From: Dhruba Borthakur <dhr...@gmail.com>
To: common-dev@hadoop.apache.org
Date: Mon, 2 Nov 2009 22:31:03 -0800
Subject: Re: datanode ack behavior for block receive

Hi Bin,

I think that your observation is correct. The act of sending a SUCCESS status ack can be avoided by intelligently looking at the seqno. However, my opinion is that returning the extra bit of information is not impacting performance/correctness at all, do you agree?

thanks,
dhruba

On Mon, Nov 2, 2009 at 12:39 PM, B. X. <bxi...@gmail.com> wrote:
> Hi All,
>
> I observed that there are two kinds of ack'ing going on when a
> datanode receives a data block packet: 1. ack by sending the sequence
> number of the received block to upstream datanode; 2. also send
> operation status (e.g. SUCCESS, ERROR);
>
> The seqno is chained, that is, a node will not ack the seqno unless
> it received the same seqno from downstream, or a -2 is sent to
> indicate not receiving anything from downstream datanodes.
> The status is forwarded, with the number of such messages increased by
> one traveling upstream.
>
> My question is why the seqno ack mechanism alone is not sufficient
> in this case. Are status acks really needed?
>
> -Bin