> After a memtable flush, you see minimum cpu and maximum read
> throughput both in term of disk and cassandra records read.
> As memtable increase in size, cpu goes up and read drops.
> If this is because of memtable or GC performance issue, this is the
> big question.
>
> As each memtable is just 128MB when flushed, I don't really expect GC
> problem or caching issues.

A memtable is basically just a ConcurrentSkipListMap. Unless you are somehow triggering some kind of degenerate case in the CSLM itself, which seems unlikely, the only common circumstance where filling the memtable should cause a very significant performance drop is if you're running really close to the heap size, so that GC work increases asymptotically as the memtable grows. But that doesn't seem to be the case here.

I don't know, maybe I missed something in your original post, but I'm not sure what to suggest that I haven't already without further information/hands-on experimentation/observation.

But running with verbose GC as I mentioned should at least be a good start (-Xloggc:path/to/gclog -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps).

-- 
/ Peter Schuller
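
To separate the two hypotheses above (skip list cost versus GC pressure) outside of Cassandra, something like the following rough sketch might help. It is not Cassandra's memtable code: the class name MemtableGcSketch, its methods, and the entry counts and value sizes are all made up for illustration. It fills a plain ConcurrentSkipListMap to increasing sizes, times random reads against it, and reports how much collector time the JVM accumulated meanwhile, using the standard GarbageCollectorMXBean interface.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a memtable: a bare ConcurrentSkipListMap.
class MemtableGcSketch {

    // Sum of the time (ms) all collectors report since JVM start.
    // getCollectionTime() returns -1 when a collector does not support it.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Build a skip list with `entries` keys, each mapped to a value of `valueBytes` bytes.
    static ConcurrentSkipListMap<Long, byte[]> fill(int entries, int valueBytes) {
        ConcurrentSkipListMap<Long, byte[]> map = new ConcurrentSkipListMap<>();
        for (long i = 0; i < entries; i++) {
            map.put(i, new byte[valueBytes]);
        }
        return map;
    }

    // Average wall-clock nanoseconds per get() over `reads` scattered lookups.
    static double avgReadNanos(ConcurrentSkipListMap<Long, byte[]> map, int reads) {
        long n = map.size();
        if (n == 0 || reads <= 0) {
            return 0.0;
        }
        long hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < reads; i++) {
            long key = (i * 2654435761L) % n; // cheap way to scatter keys over [0, n)
            if (map.get(key) != null) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (hits != reads) {
            throw new IllegalStateException("unexpected miss"); // also keeps the loop from being optimized away
        }
        return (double) elapsed / reads;
    }

    public static void main(String[] args) {
        int[] sizes = {50_000, 100_000, 200_000, 400_000}; // arbitrary illustrative sizes
        for (int size : sizes) {
            long gcBefore = totalGcMillis();
            ConcurrentSkipListMap<Long, byte[]> map = fill(size, 64);
            double ns = avgReadNanos(map, 200_000);
            long gcDelta = totalGcMillis() - gcBefore;
            System.out.printf("entries=%d avgReadNs=%.1f gcMillis=%d%n", size, ns, gcDelta);
        }
    }
}
```

If the per-read time grows only slowly with the entry count while the GC time jumps once the heap is made deliberately small (via -Xmx), that would point at heap pressure rather than the skip list itself. This is a crude measurement with no warm-up, so treat the numbers as indicative at best.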