> Has anyone else seen nodes "hang" for several seconds like this? I'm
> not sure if this is a Java VM issue (e.g. garbage collection) or something
Since garbage collection is logged (if you're running with default settings etc.), any multi-second GCs should be discoverable in that log, so to test that hypothesis I'd check there first. Cassandra itself logs GCs, but you can also turn on the JVM's own GC logging with e.g. "-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps".

> I'm also interested in comparing notes with anyone else that has been doing
> read/write throughput benchmarks with Cassandra.

I did some batch write testing to see how it scaled, up to about 200 million rows and 200 GB. I had occasional spikes in latency that were due to disk writes being flushed by the OS. However, this was probably exacerbated in my case by the fact that I was on ZFS/FreeBSD, and ZFS (in my humble opinion, and at least on FreeBSD) consistently flushes writes too late for me and ends up blocking applications even when there is bandwidth to spare.

In my case I "eliminated" the issue for the purpose of my test by having a stupid while loop simply do "sync" every handful of seconds, to avoid accumulating too much data in the cache (see the PS below).

While I expect this to be less of a problem for other setups, it's possible this is what you're seeing, for example if the operating system is blocking writes to the commit log (are you running with periodic fsync or batch-wise fsync?).

-- 
/ Peter Schuller
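
PS: For concreteness, the "sync every handful of seconds" workaround was nothing more elaborate than a shell one-liner roughly like the following (the 5-second interval is just an example; use whatever keeps the OS write-back cache from growing too large):

    # keep flushing dirty pages so the OS never accumulates a huge backlog
    while true; do sync; sleep 5; done

It's a blunt instrument, but for the purpose of the benchmark it kept the latency spikes from the delayed flushes out of the numbers.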