On Wed, Jul 6, 2011 at 9:00 AM, Eric Payne <er...@yahoo-inc.com> wrote:
> I will attempt to recreate the tests on 20.203.
>
> Currently, I'm comparing trunk against branches/MR-279/, and trunk is
> many times slower. I have run several tests (45 or 50) with different
> variables, and they all seem to be slower on trunk.
>
> Just for example, in one test here are my findings:
>
> Operation                        Trunk    branches/MR-279/
> -------------------------------  -----    ----------------
> Average operations per second:   24       200
> Average open execution time:     41ms     5ms
> Average deletion time:           43ms     5ms
> Average creation time:           47ms     9ms
> Average write close time:        658ms    100ms
>
> Seems pretty bad.

Which test case is it that you're running? Something that's easy for
others to reproduce? How many concurrent threads access the NN?

If you jstack the NN, do you see some particular lock causing lots of
contention?

-Todd
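For reference, a minimal sketch of the kind of NN-only metadata stress test
being discussed: several client threads hammering create/open/delete against
the NameNode and reporting average latency per operation. This is not Eric's
actual harness (his used MiniDFSCluster with simulated datanodes); the class
name, thread count, and paths below are assumptions made purely for
illustration.

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NNMetadataStress {
  public static void main(String[] args) throws Exception {
    final int threads = args.length > 0 ? Integer.parseInt(args[0]) : 8;
    final int opsPerThread = 500;
    final Configuration conf = new Configuration();  // uses the configured default FS
    final FileSystem fs = FileSystem.get(conf);
    final AtomicLong totalNanos = new AtomicLong();
    final AtomicLong totalOps = new AtomicLong();

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      final int id = t;
      pool.submit(new Callable<Void>() {
        public Void call() throws Exception {
          for (int i = 0; i < opsPerThread; i++) {
            Path p = new Path("/stress/t" + id + "/f" + i);
            long start = System.nanoTime();
            FSDataOutputStream out = fs.create(p, true);  // file creation
            out.close();                                  // write close (complete)
            fs.open(p).close();                           // open
            fs.delete(p, false);                          // deletion
            totalNanos.addAndGet(System.nanoTime() - start);
            totalOps.addAndGet(4);
          }
          return null;
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    System.out.printf("average latency per metadata op: %.2f ms%n",
        totalNanos.get() / 1e6 / totalOps.get());
  }
}

While something like this runs, a few jstack snapshots of the NameNode
process should show whether the IPC handler threads are mostly piling up
behind one particular monitor or read/write lock.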
> > -----Original Message-----
> > From: Todd Lipcon [mailto:t...@cloudera.com]
> > Sent: Wednesday, July 06, 2011 10:26 AM
> > To: hdfs-dev@hadoop.apache.org
> > Subject: Re: HDFS on trunk is now quite slow
> >
> > On Wed, Jul 6, 2011 at 6:54 AM, Eric Payne <er...@yahoo-inc.com> wrote:
> >
> > > Thanks Todd.
> > >
> > > Yes, the stress test is NN-only. The simulated datanodes (using
> > > MiniDFSCluster) don't read or write actual data, only log the metadata.
> > >
> > > So, it sounds like the slowdown on the NN is to be expected, correct?
> > > The race condition I was experiencing before is no longer happening,
> > > so the benefit of correct locking has resulted in an acceptable
> > > slowdown on the namenode. Is that correct?
> >
> > How does the slowdown compare to 0.20.203, for example? We may have made
> > the locking _too_ coarse -- i.e., overcompensated for the bug.
> >
> > -Todd
> >
> > > Thanks,
> > > -Eric
> > >
> > > > -----Original Message-----
> > > > From: Todd Lipcon [mailto:t...@cloudera.com]
> > > > Sent: Friday, July 01, 2011 7:49 PM
> > > > To: hdfs-dev@hadoop.apache.org
> > > > Subject: Re: HDFS on trunk is now quite slow
> > > >
> > > > My guess is HDFS-988 caused the slowdown by coarsening some locking
> > > > that was previously incorrect. Your stress test is NN-only (metadata
> > > > ops), not an I/O benchmark, right? I/O should be faster in trunk
> > > > than ever before.
> > > >
> > > > -Todd
> > > >
> > > > On Fri, Jul 1, 2011 at 8:23 AM, Eric Payne <eric.payne1...@yahoo.com>
> > > > wrote:
> > > >
> > > > > Hi gang,
> > > > >
> > > > > I ran some stress tests on the latest HDFS trunk yesterday, and the
> > > > > performance is a lot slower (sometimes 10 times slower) than the
> > > > > HDFS in MR-279. The HDFS in MR-279 is slightly behind trunk. The
> > > > > stability of HDFS trunk seems to be better than HDFS MR-279, but
> > > > > I'm not sure whether the slowness is just avoiding the race
> > > > > conditions or whether they are actually fixed in trunk.
> > > > >
> > > > > At this point, I'm not sure what is causing this performance
> > > > > disparity. I notice that Block management has recently undergone
> > > > > significant changes in trunk. It has some new locking and it is now
> > > > > in its own package. Could this be part of the cause?
> > > > >
> > > > > Thanks,
> > > > > -Eric
> > > >
> > > > --
> > > > Todd Lipcon
> > > > Software Engineer, Cloudera
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera

-- 
Todd Lipcon
Software Engineer, Cloudera
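On the lock-coarseness question raised in the quoted thread (HDFS-988 and
whether the new NameNode locking is now too coarse), the toy class below is
only an illustration. It is not FSNamesystem and not the actual HDFS-988
change, and every name in it is invented; it just shows why taking one
exclusive write lock for every operation serializes read-only calls that a
read/write split would let run in parallel.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy namesystem-like class; purely illustrative, not HDFS code.
class ToyNamesystem {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, Long> files = new HashMap<String, Long>();

  // "Too coarse": even a read-only lookup takes the exclusive write lock,
  // so concurrent lookups from many handler threads run one at a time.
  Long getFileLengthCoarse(String path) {
    lock.writeLock().lock();
    try {
      return files.get(path);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Read/write split: read-only operations share the read lock and can
  // proceed in parallel; only mutations need the write lock.
  Long getFileLength(String path) {
    lock.readLock().lock();
    try {
      return files.get(path);
    } finally {
      lock.readLock().unlock();
    }
  }

  void createFile(String path) {
    lock.writeLock().lock();
    try {
      files.put(path, 0L);
    } finally {
      lock.writeLock().unlock();
    }
  }
}

If the benchmark's op mix is dominated by read-only calls such as open, the
coarse variant serializes what could be concurrent readers; if it is mostly
mutations, both variants behave about the same, which is one more reason the
thread count and op mix of the test matter when comparing trunk to MR-279.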