If you want to read replicas from a specific DN after determining the block bounds via getFileBlockLocations, you could abuse the rack-locality infrastructure: supply a dummy topology script so that the NN orders replicas with your preferred DNs first, and the client tries them first. This won't guarantee a read from a specific DN, and it's a terrible idea in a multi-tenant/production cluster, but if you have a very specific goal in mind or want to learn more about the storage layer it may be an interesting exercise.
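To make that concrete, a topology script is just an executable the NN runs (configured via net.topology.script.file.name) that maps DN addresses to rack paths. A minimal dummy script might look like the sketch below; the IPs, hostname, and rack names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical topology script (IPs/hostnames below are made up).
# The NameNode passes one or more DN IPs/hostnames as arguments and
# expects one rack path per line, in the same order. Putting your
# preferred DNs on the same "rack" as the client host biases the NN's
# replica ordering toward them.
resolve_rack() {
  for host in "$@"; do
    case "$host" in
      10.0.0.5|10.0.0.6|client-host.example.com) echo "/rack-preferred" ;;
      *) echo "/rack-default" ;;
    esac
  done
}
resolve_rack "$@"
```

Again, this only changes the *ordering* the NN returns; the client can still fail over to any replica.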
On Mon, Apr 23, 2018 at 9:14 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:
> Hi,
>
> Perhaps I missed something in the question. FileSystem#getFileBlockLocations
> followed by open, seek to start of target block, read. This will let you
> read the contents of a specific block using public APIs.
>
> On 4/23/18, 5:26 PM, "Daniel Templeton" <dan...@cloudera.com> wrote:
>> I'm not aware of a way to work with blocks using the public APIs. The
>> easiest way to do it is probably to retrieve the block IDs and then go
>> grab those blocks from the data nodes' local file systems directly.
>>
>> Daniel
>>
>> On 4/23/18 9:05 AM, Thodoris Zois wrote:
>>> Hello list,
>>>
>>> I have a file on HDFS that is divided into 10 blocks (partitions).
>>>
>>> Is there any way to retrieve data from a specific block? (e.g: using
>>> the blockID).
>>>
>>> Except that, is there any option to write the contents of each block
>>> (or of one block) into separate files?
>>>
>>> Thank you very much,
>>> Thodoris
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
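Arpit's open/seek/read suggestion, including Thodoris's "one file per block" case, can be sketched as follows. The helper below is plain Java (class and method names are mine, not Hadoop APIs) that computes each block's (offset, length) for a file written with a fixed block size; the actual Hadoop client calls are shown in comments since they need a live cluster:

```java
import java.util.Arrays;

// Sketch: compute per-block byte ranges of an HDFS file so that each
// block can be read (open + seek + bounded read) and written out to its
// own local file. BlockRanges/blockRange are illustrative names only.
public class BlockRanges {

    // Returns {startOffset, length} of block i for a file of the given
    // length written with a fixed block size (the common case; for files
    // with mixed block sizes, use getFileBlockLocations instead).
    public static long[] blockRange(long fileLength, long blockSize, int i) {
        long start = i * blockSize;
        if (start >= fileLength) {
            throw new IllegalArgumentException("block " + i + " is past EOF");
        }
        long len = Math.min(blockSize, fileLength - start);
        return new long[] { start, len };
    }

    public static void main(String[] args) {
        // A 25-byte file with a 10-byte block size has blocks of 10, 10, 5,
        // so block 2 starts at offset 20 and is 5 bytes long.
        System.out.println(Arrays.toString(blockRange(25L, 10L, 2)));
        // prints [20, 5]

        // With Hadoop's client API, the per-block loop would look roughly like:
        //   FileSystem fs = FileSystem.get(conf);
        //   FileStatus status = fs.getFileStatus(path);
        //   BlockLocation[] blocks =
        //       fs.getFileBlockLocations(status, 0, status.getLen());
        //   try (FSDataInputStream in = fs.open(path)) {
        //       in.seek(blocks[i].getOffset());
        //       // read blocks[i].getLength() bytes, write to a local file
        //   }
    }
}
```

Note this reads block *contents* by byte range through the public API; it never touches block IDs or DN-local files the way Daniel's direct approach does.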