Thank you, Harsh. I appreciate it.
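In case it helps the archives: here is roughly what my reader loop looks like with the workaround applied. This is only a minimal sketch, not the exact HdfsReader.java attached to my original mail; the class name, buffer size, and sleep intervals are made up for illustration. It follows the same pattern as the Tail.java loop you pointed to: when read() returns -1, close the stream, reopen the file, and seek back to the last good offset.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Tail-style reader sketch: an already-open stream does not see data
    // that was hflush()ed/sync()ed after it was opened, so on -1 we close,
    // reopen, and seek back to where we left off.
    public class TailStyleReader {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path(args[0]);

            // Wait until the writer has created the file.
            while (!fs.exists(path)) {
                Thread.sleep(1000);
            }

            long offset = 0;
            byte[] buf = new byte[64 * 1024];
            while (true) {
                FSDataInputStream in = fs.open(path);
                try {
                    in.seek(offset);
                    int n;
                    while ((n = in.read(buf)) > 0) {
                        offset += n;
                        // ... process buf[0..n) here ...
                    }
                } finally {
                    in.close();
                }
                // -1 means "no more data visible to THIS stream", not an
                // error; sleep, then reopen to pick up newly flushed bytes.
                Thread.sleep(5000);
            }
        }
    }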
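And the writer side, again only a sketch under the same caveats. I call hflush() here, which as far as I understand is what the deprecated sync() forwards to on Hadoop 2 / CDH4; my original sources called sync().

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Writer sketch: append an 8 MB chunk, flush it so that newly opened
    // readers can see the new file length, sleep 5 seconds, and stop once
    // the file has grown to 100 MB.
    public class ChunkWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path(args[0]);

            byte[] chunk = new byte[8 * 1024 * 1024]; // 8 MB per write
            long limit = 100L * 1024 * 1024;          // stop at 100 MB

            FSDataOutputStream out = fs.create(path);
            try {
                for (long written = 0; written < limit; written += chunk.length) {
                    out.write(chunk);
                    out.hflush(); // on CDH3 this was sync()
                    Thread.sleep(5000);
                }
            } finally {
                out.close();
            }
        }
    }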
2012/12/20 Harsh J <ha...@cloudera.com>

> Hi Christoph,
>
> If you use sync/hflush/hsync, the new length of the data is only seen by a
> new reader, not by an existing reader. The "workaround" you've found is
> exactly how we've implemented the "fs -tail <file>" utility. See the code
> for that at
> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Tail.java?view=markup
> (note the looping at ~74).
>
> On Thu, Dec 20, 2012 at 5:51 PM, Christoph Rupp <ch...@crupp.de> wrote:
> > Hi,
> >
> > I am experiencing an unexpected situation where FSDataInputStream.read()
> > returns -1 while reading data from a file that another process is still
> > appending to. According to the documentation, read() should never return
> > -1 but should throw exceptions on errors. In addition, there is more
> > data available, and read() definitely should not fail.
> >
> > The problem gets worse because the FSDataInputStream is not able to
> > recover from this. Once it returns -1, it will always return -1, even if
> > the file continues growing.
> >
> > If, at the same time, other Java processes read other HDFS files, they
> > will also return -1 immediately after opening the file. It smells like
> > this error gets propagated to other client processes as well.
> >
> > I found a workaround: close the FSDataInputStream, open it again, and
> > then seek to the previous position. Reading then works fine.
> >
> > Another problem that I have seen is that the FSDataInputStream returns
> > -1 when reaching EOF. It never returns 0 (which I would expect at EOF).
> >
> > I use CDH 4.1.2, but I also saw this with CDH 3u5. I have attached
> > samples to reproduce this.
> >
> > My cluster consists of 4 machines: 1 namenode and 3 datanodes. I run my
> > tests on the namenode machine. There are no other HDFS users, and the
> > load generated by my tests is fairly low, I would say.
> >
> > One process writes to 6 files simultaneously, with a 5-second sleep
> > between writes. It uses an FSDataOutputStream, and after writing data it
> > calls sync(). Each write() appends 8 MB; the process stops when the file
> > grows to 100 MB.
> >
> > Six processes read files; each process reads one file. At first each
> > reader loops until the file exists. Once it does, the reader opens the
> > FSDataInputStream and starts reading. Usually the first process returns
> > the first 8 MB of the file before it starts returning -1, but the other
> > processes immediately return -1 without reading any data. I start the 6
> > reader processes before I start the writer.
> >
> > Search HdfsReader.java for "WORKAROUND" and remove the comments; this
> > will reopen the FSDataInputStream after -1 is returned, and then
> > everything works.
> >
> > Sources are attached.
> >
> > This is a very basic scenario, and I wonder if I'm doing anything wrong
> > or if I found an HDFS bug.
> >
> > bye
> > Christoph
>
>
> --
> Harsh J