Thanks, man. These small files will be frequently accessed; our idea is that by caching their block locations from the NameNode, we may relieve the load on the NameNode and perhaps also save some time on network communication.
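A minimal sketch of the client-side location cache described above. The real lookup would be an RPC to the NameNode (e.g. via `FileSystem.getFileBlockLocations` in the Hadoop API); here `fetchFromNameNode` is a hypothetical stand-in so the sketch stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a client-side block-location cache: ask the NameNode only on a
// cache miss, so repeated opens of the same hot small file cost no RPC.
public class BlockLocationCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    // Supplier of the authoritative answer; in a real client this would wrap
    // the NameNode RPC. Hypothetical placeholder here.
    private final Function<String, List<String>> fetchFromNameNode;

    public BlockLocationCache(Function<String, List<String>> fetchFromNameNode) {
        this.fetchFromNameNode = fetchFromNameNode;
    }

    // Return cached datanode hosts for a path, consulting the NameNode only on a miss.
    public List<String> locationsFor(String path) {
        return cache.computeIfAbsent(path, fetchFromNameNode);
    }

    // Drop a stale entry, e.g. after a read against a cached datanode fails.
    public void invalidate(String path) {
        cache.remove(path);
    }
}
```

Since each file here is smaller than one block, a single cached entry per file suffices; the cache must still be invalidated on read failures, because blocks can be re-replicated to other datanodes behind the client's back.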
On Tue, Sep 21, 2010 at 1:10 AM, Allen Wittenauer <awittena...@linkedin.com> wrote:
>
> On Sep 19, 2010, at 7:57 PM, steven zhuang wrote:
>
>> hi, all,
>>         I have sent this mail to the common user list before; I am
>> duplicating it here to seek more help from the experts.
>
>         You'll likely have more luck on hdfs-dev.
>
>>         I am wondering why seek(long) is disabled in HDFS.BlockReader?
>>         Can I use skip(long) to replace this seek(long)?
>>
>>         I have a bunch of small files, each less than a block in size.
>
>         In other words, using Hadoop against recommended best practices. :)
>
>>         In my program, given the file/block information, I will try to
>> start a process on each datanode and read from HDFS directly through a
>> socket connection to the datanode.
>
>         Again, using Hadoop against best practices.
>
>>         The read requires a seek OP on the file, because the file I use
>> is a TFile, which requires the underlying class to be seekable.
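On the original question of replacing seek(long) with skip(long): skip can only move forward, so a backward seek still requires reopening the stream from the datanode, and a single skip() call may skip fewer bytes than requested, so it has to be looped. A minimal sketch against a plain java.io.InputStream (not the BlockReader API itself):

```java
import java.io.IOException;
import java.io.InputStream;

// Sketch of emulating a forward seek with InputStream.skip(). skip() is
// allowed to skip fewer bytes than asked, so it must be called in a loop;
// when it returns 0 we probe with read() to distinguish "slow stream" from
// end-of-stream.
public final class SkipSeek {
    public static void seekForward(InputStream in, long bytes) throws IOException {
        long remaining = bytes;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() >= 0) {
                remaining--;  // skip() made no progress but data remains: consume one byte
            } else {
                throw new IOException("EOF before skipping " + bytes + " bytes");
            }
        }
    }
}
```

For a TFile reader this covers forward repositioning within a block; true random access backwards would still mean reopening the connection at the earlier offset, which is presumably why the underlying class is required to be genuinely seekable.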