I'd just like to add that I've been reviewing HADOOP-9361 for Steve and think it's pretty close, but would appreciate a look from the broader HDFS audience.
The spec is pretty explicit about just being an attempt to document HDFS' current behavior, so there's no danger of the text locking us into certain things. That said, implementors of non-HDFS filesystems are likely to read these documents before digging through the HDFS code (which Steve has kindly done for us), so it'd be good if we generally agreed on the ideas within.

On Fri, Jun 13, 2014 at 12:30 PM, Steve Loughran <ste...@hortonworks.com> wrote:

> Something I've been doing as a spare-time activity is trying to specify
> what filesystems are expected to do -using HDFS as the reference, being
> rigorous about expectations (size of file in ls == size of data in file,
> seek(0) always works, seek(-ve) always fails, close() idempotent), slightly
> tuning the "other" filesystems to match HDFS, and writing the tests.
>
> https://issues.apache.org/jira/browse/HADOOP-9361
>
> There are a lot more lower-level tests than the JUnit 3.x
> FileSystemContractTests -the tests are driven by XML declarations -here are
> the ones for the hadoop-common filesystems
>
> https://github.com/steveloughran/hadoop-trunk/tree/stevel/HADOOP-9361-filesystem-contract/hadoop-common-project/hadoop-common/src/test/resources/contract
>
> Tests that work with remote filesystems, object stores &c also rely on a
> resource file test/resources/contract-test.xml. By marking that as
> svnignore & gitignore, people can test against s3 without editing
> core-site.xml and potentially accidentally checking in private credentials.
> The tests are designed to downgrade to skips if the credentials are missing
> -so you can see in the junit reports which were not run.
>
> All the markdown documents are rendered on github
>
> https://github.com/steveloughran/hadoop-trunk/tree/stevel/HADOOP-9361-filesystem-contract/hadoop-common-project/hadoop-common/src/site/markdown/filesystem
>
> Can people review this?
>
> The biggest change is in s3n, fixing a recent regression in Hadoop 2.4
> that triggers NPEs on a seek(0).
> This is why we need the new tests and the
> patch ASAP. Even if the specification is incomplete, I'd like it in there
> with the tests so that we can have stricter documentation than what's there
> today -with tests to match
>
> -steve
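
For anyone who wants something concrete to argue with, here is a minimal sketch of the expectations listed in Steve's message (listed size equals size of the data, seek(0) works, a negative seek fails, close() is idempotent). It is written in Python against a toy in-memory stream purely for illustration; `ContractStream` and `check_contract` are hypothetical names, the choice of `OSError` for a failed seek is an assumption, and none of this is the HADOOP-9361 test code or any Hadoop API.

```python
import io


class ContractStream:
    """Toy in-memory seekable stream (hypothetical, not a Hadoop class)."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)
        self._closed = False

    def seek(self, pos: int) -> None:
        if self._closed:
            raise OSError("stream is closed")
        if pos < 0:
            # Assumption: a negative offset is reported as an I/O error.
            raise OSError("cannot seek to a negative offset")
        self._buf.seek(pos)

    def read(self, n: int = -1) -> bytes:
        if self._closed:
            raise OSError("stream is closed")
        return self._buf.read(n)

    def close(self) -> None:
        # Safe to call repeatedly; BytesIO.close() is itself a no-op
        # when the buffer is already closed.
        self._closed = True
        self._buf.close()


def check_contract(open_stream, listed_length: int) -> list:
    """Return the names of violated expectations (empty list = all hold)."""
    failures = []
    stream = open_stream()

    # Size reported by a listing must equal the size of the data read back.
    if len(stream.read()) != listed_length:
        failures.append("length mismatch")

    # seek(0) must always work.
    try:
        stream.seek(0)
    except OSError:
        failures.append("seek(0) failed")

    # seek(negative) must always fail.
    try:
        stream.seek(-1)
        failures.append("seek(-1) succeeded")
    except OSError:
        pass

    # close() must be idempotent.
    try:
        stream.close()
        stream.close()
    except Exception:
        failures.append("close() not idempotent")

    return failures
```

Calling `check_contract(lambda: ContractStream(b"hello"), 5)` returns an empty list, while passing a listed length that disagrees with the data reports a length mismatch. A real contract test would run the same kind of checks against each filesystem implementation rather than a toy stream.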