MapReduce performs some very demanding file operations at job submission time, which makes it an important integration test for any Hadoop-compatible FileSystem.
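
To make the question below concrete, here is a rough sketch of the kind of end-to-end check I have in mind. This is not anything from the Hadoop tree itself: it starts a MiniDFSCluster (real Hadoop test code, from the hadoop-hdfs test jar), seeds input on it, and runs a small WordCount job against it using the local MapReduce runner. The class names WordCountOnMiniDfs, TokenMapper, and SumReducer are made up for the illustration.

    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.MiniDFSCluster;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Hypothetical test driver, not part of the Hadoop source tree.
    public class WordCountOnMiniDfs {

      public static class TokenMapper
          extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
          StringTokenizer it = new StringTokenizer(value.toString());
          while (it.hasMoreTokens()) {
            word.set(it.nextToken());
            ctx.write(word, ONE);
          }
        }
      }

      public static class SumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          ctx.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Run the MR job in-process; the FileSystem under test is the mini HDFS.
        conf.set("mapreduce.framework.name", "local");

        boolean ok = false;
        MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
        try {
          FileSystem fs = cluster.getFileSystem();

          // Seed some input on the FileSystem under test.
          Path in = new Path("/wc/in/part-0");
          try (Writer w = new OutputStreamWriter(fs.create(in),
                                                 StandardCharsets.UTF_8)) {
            w.write("hello hdfs hello mapreduce\n");
          }

          Job job = Job.getInstance(conf, "wordcount-integration");
          job.setJarByClass(WordCountOnMiniDfs.class);
          job.setMapperClass(TokenMapper.class);
          job.setReducerClass(SumReducer.class);
          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(IntWritable.class);
          FileInputFormat.addInputPath(job, new Path("/wc/in"));
          FileOutputFormat.setOutputPath(job, new Path("/wc/out"));

          // Submission and completion exercise the FileSystem: staging
          // directories, input splits, the output committer, and so on.
          ok = job.waitForCompletion(true);
        } finally {
          cluster.shutdown();
        }
        System.exit(ok ? 0 : 1);
      }
    }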
How are MapReduce integration tests implemented in the Hadoop HDFS source tree? I'm also interested in how S3, Swift, and the other Hadoop FileSystem implementations handle integration testing, if there are any clues; I haven't seen much of this in the source code.

To distill my question: how are new HDFS patches tested against the MapReduce job flow? Are there standalone vanilla MapReduce jobs (Sorting, WordCount, etc.) that run as part of the HDFS build, or is there a separate HDFS-MapReduce integration repository?

-- Jay Vyas
http://jayunit100.blogspot.com