Hi Harsh,

I did mean 0.18 - sorry about the typo.

I read through the BlockSender.sendChunks method once again and noticed
that I wasn't reading the checksum byte array correctly in my code.
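In case it helps anyone else who lands on this thread, here is a rough sketch of how I'm reading a packet now. The field order and names (packetLen, dataLen, and so on) are only my reading of BlockSender.sendChunks in 0.18, so treat the layout as an assumption rather than a documented wire format; the fake packet builder is just there to make the parser demonstrable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of reading one DataNode read-response packet, assuming the field
// order I see in BlockSender.sendChunks (Hadoop 0.18): packetLen, offset,
// seqno, last-packet flag, dataLen, then checksum bytes, then data.
public class PacketSketch {

    // Build a fake single-packet response so the parser below can be run
    // without a live DataNode (values here are arbitrary test data).
    static byte[] buildFakePacket(int dataLen, int bytesPerChecksum,
                                  int checksumSize) throws IOException {
        int chunks = (dataLen + bytesPerChecksum - 1) / bytesPerChecksum;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(4 + chunks * checksumSize + dataLen); // packetLen
        out.writeLong(0L);          // offset in block
        out.writeLong(1L);          // sequence number
        out.writeBoolean(true);     // last packet in block
        out.writeInt(dataLen);      // dataLen: the actual payload length
        out.write(new byte[chunks * checksumSize]); // checksum bytes
        out.write(new byte[dataLen]);               // payload
        return buf.toByteArray();
    }

    // Parse the header, consume the checksum bytes, then read exactly
    // dataLen payload bytes. Consuming the checksums before the payload
    // was the step I had missed.
    static int[] parse(byte[] packet, int bytesPerChecksum,
                       int checksumSize) throws IOException {
        DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(packet));
        int packetLen = in.readInt();
        long offset = in.readLong();
        long seqno = in.readLong();
        boolean last = in.readBoolean();
        int dataLen = in.readInt();
        int chunks = (dataLen + bytesPerChecksum - 1) / bytesPerChecksum;
        in.readFully(new byte[chunks * checksumSize]); // skip checksums
        byte[] payload = new byte[dataLen];
        in.readFully(payload);                         // the real data
        return new int[] { packetLen, dataLen };
    }

    public static void main(String[] args) throws IOException {
        byte[] p = buildFakePacket(100, 512, 4);
        int[] r = parse(p, 512, 4);
        System.out.println("packetLen=" + r[0] + " dataLen=" + r[1]);
    }
}
```

The point being: dataLen is the payload length, while packetLen also covers the checksum bytes, so reading packetLen bytes of "data" (or reading dataLen without skipping the checksums) leaves stray bytes on the stream.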

Thanks for the help,

Dhaivat Pandya



On Sun, Apr 6, 2014 at 8:59 PM, Harsh J <ha...@cloudera.com> wrote:

> There's been no Apache Hadoop release versioned v1.8 historically, nor
> is one upcoming. Do you mean 0.18?
>
> Either way, can you point to the specific code lines in BlockSender
> which have you confused? The sendBlock and sendPacket methods would
> interest you I assume, but they appear to be well constructed/named
> internally and commented in a few important spots.
>
> On Mon, Apr 7, 2014 at 6:39 AM, Dhaivat Pandya <dhaivatpan...@gmail.com>
> wrote:
> > Hi,
> >
> > I'm trying to figure out how data is transferred between client and
> > DataNode in Hadoop v1.8.
> >
> > This is my understanding so far:
> >
> > The client first fires an OP_READ_BLOCK request. The DataNode responds
> > with a status code, checksum header, chunk offset, packet length,
> > sequence number, the last-packet boolean, the length, and the data (in
> > that order).
> >
> > However, I'm running into an issue. First of all, which of these lengths
> > describes the length of the data? I tried both PacketLength and Length,
> > but both seem to leave data on the stream (I tried to "cat" a file with
> > the numbers 1-1000 in it).
> >
> > Also, how does the DataNode signal the start of another packet? After
> > "Length" bytes have been read, I assumed the header would be repeated,
> > but this is not the case (I'm not getting sane values for any of the
> > header fields).
> >
> > I've looked through the DataXceiver, BlockSender, DFSClient
> > (RemoteBlockReader) classes but I still can't quite grasp how this data
> > transfer is conducted.
> >
> > Any help would be appreciated,
> >
> > Dhaivat Pandya
>
>
>
> --
> Harsh J
>
