I wrote a script to record the tpstats output every 5 seconds. Here is the output from just before the JVM OOM:
Pool Name                    Active   Pending    Completed
FILEUTILS-DELETE-POOL             0         0          280
STREAM-STAGE                      0         0            0
RESPONSE-STAGE                    0         0       245573
ROW-READ-STAGE                    0         0            0
LB-OPERATIONS                     0         0            0
MESSAGE-DESERIALIZER-POOL         1  14290091     65943291
GMFD                              0         0        26670
LB-TARGET                         0         0            0
CONSISTENCY-MANAGER               0         0            0
ROW-MUTATION-STAGE               32      3349     63897493
MESSAGE-STREAMING-POOL            0         0            3
LOAD-BALANCER-STAGE               0         0            0
FLUSH-SORTER-POOL                 0         0            0
MEMTABLE-POST-FLUSHER             0         0          420
FLUSH-WRITER-POOL                 0         0          420
AE-SERVICE-STAGE                  1         1            4
HINTED-HANDOFF-POOL               0         0           52

On Tue, Apr 27, 2010 at 10:53 AM, Chris Goffinet <goffi...@digg.com> wrote:

> I'll work on doing more tests around this. In 0.5 we used a different data
> structure that required polling. But this does seem problematic.
>
> -Chris
>
> On Apr 26, 2010, at 7:04 PM, Eric Yu wrote:
>
> I have the same problem here, and I analyzed the hprof file with MAT; as
> you said, LinkedBlockingQueue used 2.6 GB.
> I think Cassandra's thread pools should limit their queue size.
>
> cassandra 0.6.1
>
> java version
> $ java -version
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
>
> iostat
> $ iostat -x -l 1
> Device:  rrqm/s   wrqm/s     r/s    w/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
> sda       81.00  8175.00  224.00  17.00  23984.00  2728.00    221.68      1.01   1.86   0.76  18.20
>
> tpstats (of course, this node is still alive)
> $ ./nodetool -host localhost tpstats
> Pool Name                    Active   Pending    Completed
> FILEUTILS-DELETE-POOL             0         0         1281
> STREAM-STAGE                      0         0            0
> RESPONSE-STAGE                    0         0    473617241
> ROW-READ-STAGE                    0         0            0
> LB-OPERATIONS                     0         0            0
> MESSAGE-DESERIALIZER-POOL         0         0    718355184
> GMFD                              0         0       132509
> LB-TARGET                         0         0            0
> CONSISTENCY-MANAGER               0         0            0
> ROW-MUTATION-STAGE                0         0    293735704
> MESSAGE-STREAMING-POOL            0         0            6
> LOAD-BALANCER-STAGE               0         0            0
> FLUSH-SORTER-POOL                 0         0            0
> MEMTABLE-POST-FLUSHER             0         0         1870
> FLUSH-WRITER-POOL                 0         0         1870
> AE-SERVICE-STAGE                  0         0            5
> HINTED-HANDOFF-POOL               0         0           21
>
>
> On Tue, Apr 27, 2010 at 3:32 AM, Chris Goffinet <goffi...@digg.com> wrote:
>
>> Upgrade to b20 of Sun's version of the JVM. This OOM might be related to
>> LinkedBlockingQueue issues that were fixed.
>>
>> -Chris
>>
>>
>> 2010/4/26 Roland Hänel <rol...@haenel.me>
>>
>>> Cassandra Version 0.6.1
>>> OpenJDK Server VM (build 14.0-b16, mixed mode)
>>> Import speed is about 10 MB/s for the full cluster; if a compaction is
>>> going on, the individual node is I/O-limited.
>>> tpstats: caught me, didn't know this. I will set up a test and try to
>>> catch a node during the critical time.
>>>
>>> Thanks,
>>> Roland
>>>
>>>
>>> 2010/4/26 Chris Goffinet <goffi...@digg.com>
>>>
>>>> Which version of Cassandra?
>>>> Which version of the Java JVM are you using?
>>>> What do your I/O stats look like when bulk importing?
>>>> When you run `nodeprobe -host XXXX tpstats`, is any thread pool backing
>>>> up during the import?
>>>>
>>>> -Chris
>>>>
>>>>
>>>> 2010/4/26 Roland Hänel <rol...@haenel.me>
>>>>
>>>>> I have a cluster of 5 machines building a Cassandra datastore, and I
>>>>> load bulk data into it using the Java Thrift API. The first ~250 GB runs
>>>>> fine; then one of the nodes starts to throw OutOfMemory exceptions. I'm not
>>>>> using any row or index caches, and since I only have 5 CFs and some 2.5 GB
>>>>> of RAM allocated to the JVM (-Xmx2500M), in theory that shouldn't happen.
>>>>> All inserts are done with consistency level ALL.
>>>>>
>>>>> I hope with this I have avoided all the 'usual dummy errors' that lead
>>>>> to OOMs.
>>>>>
>>>>> I have begun to troubleshoot the issue with JMX; however, it's
>>>>> difficult to catch the JVM at the right moment, because it runs well for
>>>>> several hours before this happens.
>>>>>
>>>>> One thing comes to mind; maybe one of the experts could confirm or
>>>>> reject this idea for me: is it possible that when one machine slows down a
>>>>> little bit (for example because a big compaction is going on), the memtables
>>>>> don't get flushed to disk as fast as they are building up under the
>>>>> continuing bulk import? That would result in a downward spiral: the system
>>>>> gets slower and slower on disk I/O, but since more and more data keeps
>>>>> arriving over Thrift, finally OOM.
>>>>>
>>>>> I'm using the "periodic" commit log sync; maybe this too could create
>>>>> a situation where the commit log writer is too slow to keep up with the
>>>>> data intake, resulting in ever-growing memory usage?
>>>>>
>>>>> Maybe these thoughts are just bullshit. Let me know if so... ;-)
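
PS: To illustrate the point about limiting the queue size, here is a minimal sketch of my own (not Cassandra's actual executor code; the class name BoundedStageExecutor and the numbers 32 and 10000 are made up for the example). A stage backed by an unbounded LinkedBlockingQueue lets a producer that outruns the consumer pile up millions of pending tasks on the heap, which looks a lot like the 14290091 pending entries in MESSAGE-DESERIALIZER-POOL above. With a bounded queue plus a rejection handler that blocks the submitter, the producer gets back-pressure instead of the node running out of memory:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustration only: a stage executor whose work queue is capped.
public class BoundedStageExecutor extends ThreadPoolExecutor {

    public BoundedStageExecutor(int threads, int maxPending) {
        super(threads, threads, 60, TimeUnit.SECONDS,
              new ArrayBlockingQueue<Runnable>(maxPending),
              new BlockOnFullQueue());
    }

    // Rejection policy that applies back-pressure: put() blocks the
    // submitting thread until the stage has drained some of its backlog,
    // instead of letting pending tasks grow without bound.
    private static class BlockOnFullQueue implements RejectedExecutionHandler {
        public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
            try {
                executor.getQueue().put(task);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted while waiting for queue space", e);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedStageExecutor stage = new BoundedStageExecutor(32, 10000);
        // The producer outruns the consumers, but pending work is capped at
        // 10000 entries; execute() simply stalls once the queue is full.
        for (int i = 0; i < 200000; i++) {
            stage.execute(new Runnable() {
                public void run() {
                    try { Thread.sleep(1); } catch (InterruptedException ignored) { }
                }
            });
        }
        stage.shutdown();
        stage.awaitTermination(10, TimeUnit.MINUTES);
    }
}

Of course this only moves the stall onto whatever is feeding the stage (ultimately the Thrift clients doing the bulk import), but the node stays alive; whether blocking is acceptable for a stage like the message deserializer is a separate question.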