On 13 April 2013 21:46, Jay Vyas <jayunit...@gmail.com> wrote:

> MapReduce has some very demanding file operations at job submission
> time, and is an important integration test for any Hadoop-compliant
> FileSystem.
>
> How are MapReduce integration tests implemented in the Hadoop HDFS source
> tree? (I'm also interested in how S3, Swift, and other Hadoop filesystem
> implementations implement integration tests, if there are any clues.) I
> haven't seen much of this in the source code.
>
There are some functional tests for S3 that are run by hand; when the Swift
support goes in, its tests will be skipped unless the file
test/resources/auth-keys.xml containing the login details is present (a rough
sketch of that skip is at the end of this mail).

We'd like to add more of this in Bigtop, which is far enough downstream that
you can include things like Pig, Hive and HBase tests, using different
filesystems as source, destination and intermediate storage; there's a sketch
of that kind of cross-filesystem job at the end of this mail too. These tests
should also allow the option of creating big files, many files in a single
directory, and deep directory trees: scale problems.

> To distill my question: how are new HDFS patches tested against the
> MapReduce job flow? Are there some standalone vanilla MapReduce jobs
> (Sorting, WordCount, etc.) that run as part of the HDFS build, or is
> there an HDFS-MapReduce integration repository?
>

The -common, HDFS and -mapred source trees are in a combined repo, so this is
implicit: you just run "mvn clean test" at the root. Or have Jenkins do it and
send email when you break things.

steve
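
For reference, roughly what that conditional skip looks like. This is only a
sketch, not what will actually land in the tree (the class name and the
resource lookup are mine); the idea is just JUnit's Assume, so the test skips
rather than fails for anyone without Swift credentials on the test classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.junit.Assume;
    import org.junit.Before;
    import org.junit.Test;

    public class SwiftFileSystemContractTest {

      private Configuration conf;

      @Before
      public void setUp() {
        // Skip (rather than fail) when test/resources/auth-keys.xml is not
        // on the classpath, so the default build stays green for everyone
        // who hasn't supplied Swift credentials.
        Assume.assumeNotNull(
            getClass().getClassLoader().getResource("auth-keys.xml"));
        conf = new Configuration();
        // Pull in the service URL, username, API key, etc. from the file.
        conf.addResource("auth-keys.xml");
      }

      @Test
      public void testFilesystemRoundTrip() throws Exception {
        // Real tests would create, rename and delete files against the
        // remote filesystem using the credentials loaded above.
      }
    }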
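
And by jobs that use different filesystems as source and destination I mean
nothing more exotic than the stock WordCount driver taking URIs as arguments.
The class below and the example paths are made up, but any hdfs://, s3n:// or
swift:// URI would do once the relevant filesystem is on the classpath:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

    public class CrossFilesystemWordCount {

      // Classic word-count mapper: emit (token, 1) for every token in a line.
      public static class TokenizerMapper
          extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cross-fs wordcount");
        job.setJarByClass(CrossFilesystemWordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Source and destination can live on entirely different filesystems,
        // e.g. hdfs://namenode/input and swift://container.service/output.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }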