On 2 May 2013 09:28, Todd Lipcon <t...@cloudera.com> wrote:
> Hi Brad,
>
> The reasoning is that the NameNode locking is somewhat coarse-grained. In
> older versions of Hadoop, before it worked this way, we found that listing
> large directories (e.g. with 100k+ files) could end up holding the
> NameNode's lock for quite a long period of time and starving other clients.
>
> Additionally, I believe there is a second API that does the "on-demand"
> fetching of the next set of files from the listing as well, no?
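The "on-demand" API referred to here is presumably the RemoteIterator
returned by FileSystem#listLocatedStatus() (or listFiles()). A minimal
sketch of how a client consumes it, assuming a Hadoop 2.x FileSystem:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;

    public class PagedListing {
      public static void main(String[] args) throws Exception {
        Path dir = new Path(args[0]);
        FileSystem fs = dir.getFileSystem(new Configuration());

        // Against HDFS the iterator fetches the directory entries from
        // the NameNode in batches as the client consumes them, so no
        // single RPC holds the namespace lock for an entire 100k+ entry
        // listing.
        RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(dir);
        while (it.hasNext()) {
          LocatedFileStatus status = it.next();
          System.out.println(status.getPath() + "\t" + status.getLen());
        }
      }
    }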
HDFS v2: this is the only incompatible change between the v1 and v2
FileSystem classes. It's chatty over the long haul and hangs against Amazon
S3 (s3://), an issue for which there's a patch to replicate, but not yet
fix, the problem: https://issues.apache.org/jira/browse/HADOOP-9410

Good locally, but I think it needs test coverage for all the other
filesystem clients that ship with Hadoop.

FWIW, blobstores do tend to only support paged lists of their blobs, so the
same build-up-as-you-go-along process works there too.

We should spell out in the documentation: "changes that occur to the
filesystem during the generation of this list MAY not be reflected in the
result, and so MAY result in a partially incomplete or inconsistent view".

-Steve
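To illustrate what that caveat means for callers, here is a hypothetical
helper (countEntries is not an existing API) that treats the directory
vanishing mid-listing as a partial result rather than an error, again
assuming the Hadoop 2.x FileSystem API:

    import java.io.FileNotFoundException;
    import java.io.IOException;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;

    public class TolerantListing {
      // Counts directory entries while tolerating concurrent changes:
      // entries added or removed during iteration MAY or MAY not be
      // reflected in the count.
      public static long countEntries(FileSystem fs, Path dir)
          throws IOException {
        long count = 0;
        try {
          RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(dir);
          while (it.hasNext()) {
            it.next();
            count++;
          }
        } catch (FileNotFoundException e) {
          // The directory was deleted part-way through the paged
          // listing; return the partial count rather than failing,
          // per the proposed documentation wording above.
        }
        return count;
      }
    }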