Hello all,

As part of a side project, I've been interested in HDFS benchmarking, particularly of the Namenode. To get started, I tried to track down a number of different benchmarks and collect a few observations about each. I've put together a list here:
http://epaulson.github.io/HadoopInternals/benchmarks.html

The benchmarks I included were:

- DFSIO
- DFSIO-e
- NNBench and NNBenchWithoutMR
- S-Live
- LoadGenerator
- NNThroughputBenchmark
- TestEditLog
- MStress, from Quantcast
- Ohio State Microbenchmarks
- SWIM

(I also wrote a bit about what else I'd like to see in a NN benchmark.)

I'd appreciate any corrections, feedback, and pointers to code that I missed!

Thanks!

-Erik