Did you look at compaction activity?

On Mon, Jul 12, 2010 at 9:31 AM, Olivier Rosello <orose...@corp.free.fr> wrote:
>> > But in Cassandra output log :
>> > r...@cassandra-2:~# tail -f /var/log/cassandra/output.log
>> > INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 reclaimed leaving 1684169392 used; max is 6563430400
>> > INFO 15:32:09,875 GC for ConcurrentMarkSweep: 1363 ms, 4296991416 reclaimed leaving 1684201560 used; max is 6563430400
>> > INFO 15:32:14,370 GC for ConcurrentMarkSweep: 1341 ms, 4295467880 reclaimed leaving 1684879440 used; max is 6563430400
>> > INFO 15:32:18,906 GC for ConcurrentMarkSweep: 1343 ms, 4296386408 reclaimed leaving 1685489208 used; max is 6563430400
>> > INFO 15:32:23,564 GC for ConcurrentMarkSweep: 1511 ms, 4296407088 reclaimed leaving 1685488744 used; max is 6563430400
>> > INFO 15:32:28,068 GC for ConcurrentMarkSweep: 1347 ms, 4295383216 reclaimed leaving 1686469448 used; max is 6563430400
>> > INFO 15:32:32,617 GC for ConcurrentMarkSweep: 1376 ms, 4295689192 reclaimed leaving 1687908304 used; max is 6563430400
>> > INFO 15:32:37,283 GC for ConcurrentMarkSweep: 1468 ms, 4296056176 reclaimed leaving 1687916880 used; max is 6563430400
>> > INFO 15:32:41,811 GC for ConcurrentMarkSweep: 1358 ms, 4296412232 reclaimed leaving 1688437064 used; max is 6563430400
>> > INFO 15:32:46,436 GC for ConcurrentMarkSweep: 1368 ms, 4296105472 reclaimed leaving 1691050032 used; max is 6563430400
>> > INFO 15:32:51,180 GC for ConcurrentMarkSweep: 1545 ms, 4297439832 reclaimed leaving 1691033816 used; max is 6563430400
>> > INFO 15:32:55,703 GC for ConcurrentMarkSweep: 1379 ms, 4295491928 reclaimed leaving 1692891456 used; max is 6563430400
>> > INFO 15:33:00,328 GC for ConcurrentMarkSweep: 1378 ms, 4296657208 reclaimed leaving 1694981528 used; max is 6563430400
>>
>> Note that those are ConcurrentMarkSweep GCs rather than ParNew GCs, so
>> they should be running concurrently with the application and should not
>> correlate to 1.3 second pauses for the application.
>
> When I see this behaviour (ConcurrentMarkSweep, high CPU...), Cassandra is
> running but there have been no writes and no reads for hours (I stopped
> reads and writes when the behaviour started).
>
> Even after wiping the data on all nodes, the behaviour started to happen
> again after some hours of writing... :-(
>
>> As for the discrepancy between nodes, are all nodes handling a similar
>> amount of traffic? I briefly checked your original post and you said
>> you're doing TimeUUID insertions. I don't remember offhand, and a
>> quick google didn't tell me, whether there is something special about
>> the TimeUUID type that would prevent it - but normally if you're using
>> an OrderedPartitioner you may simply be writing all your data to a
>> single node for token space division reasons and the fact that
>> timestamps are highly ordered.
>
> In theory, yes. But in fact, this behaviour happens first on the heavier
> nodes (those holding the largest amount of data).
>
>> How big a latency are we talking about in the cases where you're
>> timing out (i.e., what's the timeout)? Were the timeouts on reads,
>> writes or both?
>
> They are TimeOutExceptions on writes (using C++ code -> thrift -> cassandra).
> This cluster is used 99% for writes.
>
> How could I get/measure latency?
>
> Olivier
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
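
For anyone who wants to put numbers on the GC lines quoted above, here is a rough Python sketch that parses them and reports how often ConcurrentMarkSweep cycles occur and how long each one takes. It is not from the thread: the function names `parse_gc_lines` and `summarize` are made up for illustration, the regex assumes exactly the line format shown in the quoted output.log, and it assumes all lines fall on the same day (no midnight wrap). The ratio it prints is only an indication of how much of the wall-clock time a CMS cycle is in progress, not a measure of application pauses.

```python
import re
from datetime import datetime

# Matches lines of the form quoted in the thread, e.g.
# "INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600
#  reclaimed leaving 1684169392 used; max is 6563430400"
GC_LINE = re.compile(
    r"INFO\s+(\d{2}:\d{2}:\d{2},\d{3})\s+GC for (\w+):\s+(\d+) ms,\s+"
    r"(\d+)\s+reclaimed\s+leaving\s+(\d+)\s+used;\s+max is\s+(\d+)"
)


def parse_gc_lines(lines):
    """Return one dict per GC log line; lines that do not match are skipped."""
    events = []
    for line in lines:
        m = GC_LINE.search(line)
        if m is None:
            continue
        events.append({
            # Time of day only; the log lines carry no date.
            "time": datetime.strptime(m.group(1), "%H:%M:%S,%f"),
            "collector": m.group(2),
            "duration_ms": int(m.group(3)),
            "reclaimed": int(m.group(4)),
            "used": int(m.group(5)),
            "max": int(m.group(6)),
        })
    return events


def summarize(events):
    """Mean spacing between logged GCs and mean reported GC duration."""
    if len(events) < 2:
        raise ValueError("need at least two GC events")
    intervals = [
        (b["time"] - a["time"]).total_seconds()
        for a, b in zip(events, events[1:])
    ]
    mean_interval_s = sum(intervals) / len(intervals)
    mean_duration_ms = sum(e["duration_ms"] for e in events) / len(events)
    return {
        "mean_interval_s": mean_interval_s,
        "mean_duration_ms": mean_duration_ms,
        # Rough share of wall-clock time during which a cycle is running.
        "gc_time_ratio": (mean_duration_ms / 1000.0) / mean_interval_s,
    }


if __name__ == "__main__":
    import sys
    # Usage (path taken from the thread): python gc_stats.py < /var/log/cassandra/output.log
    print(summarize(parse_gc_lines(sys.stdin)))
```

On the first three quoted lines this gives a mean spacing of about 4.49 s between cycles with a mean reported duration of about 1354 ms, i.e. the collector is cycling back to back even though the node is idle.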