>
> Ah, I always keep assuming random partitioning since it is a very common
> case (just to be sure: unless you specifically want the ordering
> despite the downsides, you generally want to default to the random
> partitioner).
>
Yes, I'm working on geographical data, so everything is keyed by a
derivation of the z-order curve. Reads scan linear ranges of those keys. So
although there's data in a lot of places, writes & reads happen much more
often in only a subset of them, hence my need to rebalance by query load
rather than by storage load. I'm going to have to think about how to do
that..
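
For context, the derivation is along these lines (a minimal sketch in Java,
assuming lat/lon scaled onto a 32-bit grid and interleaved into a 64-bit
Morton code; the class name and the exact scaling are illustrative, not my
actual code):

public final class ZOrderKey {

    // Interleave the low 32 bits of x and y into a 64-bit Morton code.
    static long interleave(long x, long y) {
        long key = 0L;
        for (int i = 0; i < 32; i++) {
            key |= ((x >>> i) & 1L) << (2 * i);       // x bits -> even positions
            key |= ((y >>> i) & 1L) << (2 * i + 1);   // y bits -> odd positions
        }
        return key;
    }

    // Scale degrees onto the unsigned 32-bit range, then interleave.
    public static long fromLatLon(double lat, double lon) {
        long x = (long) ((lon + 180.0) / 360.0 * 4294967295.0);
        long y = (long) ((lat +  90.0) / 180.0 * 4294967295.0);
        return interleave(x, y);
    }
}

With keys like this, a bounding-box query turns into a handful of contiguous
key ranges, which is why the range scans end up concentrated on whichever
nodes own the "hot" part of the curve.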


> > When I get a log like these, there always is a "cluster-freeze" during
> > the preceding minute. By "cluster-freeze", I mean that a couple of
> > nodes go to 0% utilization (no cpu, no system, no io)
> A hypothesis here is that your workload is causing problems for a node
> (for example, sudden spikes in memory allocation causing full GC
> fallbacks that take time), and both the readers and the writers get
> "stuck" on requests to those nodes (once a sufficient number of
> requests happen to be destined to those). The result would be that all
> other nodes are no longer seeing traffic because the clients aren't
> making progress.
>
I did open 10 windows on my screen to view iostat & vmstat in parallel on
all nodes. It's hard to be definitive, but I did see moments where the
cluster was "freezing" and the CPU was at 100% for a couple of seconds on
one of the nodes. It didn't happen every time, but that may be because my
processes were backing off while failing over to another node.

So could sending a batch of 90 counter mutations produce such a spike? It
looks a little small to me but, yes, capping the batches at 30-40 elements
has eliminated the problem. I looked at the system.log on the node on which
this happened and I only see a ParNew collection taking place at that time,
with heap usage low.
INFO [ScheduledTasks:1] 2011-12-11 15:48:14,641 GCInspector.java (line 122)
GC for ParNew: 334 ms for 1 collections, 4729419512 used; max is 16838033408
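
For the record, the capping itself is nothing fancy, roughly this (a sketch;
BatchCapper and BatchSender are placeholder names, and the send callback
stands in for whatever the client library actually uses to submit a batch):

import java.util.List;

public final class BatchCapper {

    // Callback that actually submits one capped batch to the cluster;
    // a placeholder for whatever the client library exposes.
    public interface BatchSender<M> {
        void send(List<M> batch);
    }

    // Send the mutations in slices of at most maxBatchSize (30-40 in my case)
    // instead of one batch of 90, to avoid the allocation spike on one node.
    public static <M> void sendInChunks(List<M> mutations, int maxBatchSize,
                                        BatchSender<M> sender) {
        for (int start = 0; start < mutations.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, mutations.size());
            sender.send(mutations.subList(start, end));
        }
    }
}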



> I would first eliminate or confirm any GC hypothesis by running all
> nodes with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -XX:+PrintGCDateStamps.
>
Is full GC not being logged through GCInspector with the defaults?


> If you can see this happen sufficiently often to
> manually/interactively "wait for it", I suggest something as simple as
> firing up top + iostat for each host, keeping them on the screen
> at the same time, and looking for what happens when you see this again.
> If the problem is fallback to full GC for example, the affected nodes
> should be churning 100% CPU (one core) for several seconds (assuming a
> large heap). If there is a sudden burst of disk I/O that is causing a
> hiccup (e.g. dirty buffer flushing by linux) this should be visibly
> correlated with 'iostat -x -k 1'.
>
Some CPU correlation in some cases (on one node).
No iostat correlation, ever.


Thanks
Philippe
