Hello Aaron,
It's probably the over-optimistic number of concurrent compactors that
was tripping the system.
I don't entirely understand what the correlation is here; perhaps the
compactors were overloading the neighboring nodes and causing
timeouts. I tuned the concurrency down and after a while things seem
to have settled down. Thanks for the suggestion.
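(For anyone reading this later: the knob in question is
concurrent_compactors in cassandra.yaml; the value below is only an
illustration, pick whatever suits the hardware. The node needs a
restart to pick up yaml changes.)

    # cassandra.yaml
    concurrent_compactors: 2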
Maxim
On 4/19/2012 4:13 PM, aaron morton wrote:
1150 pending tasks, and is not
making progress.
Not all pending tasks reported by nodetool compactionstats actually
run. By the time they get a chance to run, the files they were going
to work on may have already been compacted.
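If you want to see whether the node is making progress, re-running the
command every so often is enough, e.g.:

    watch -n 30 nodetool -h <host> compactionstats

If the pending count trends down over time the node is catching up,
even if slowly.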
Given that repair tests at double the phi threshold, it may not make
much difference.
Did other nodes notice it was dead? Was there anything in the log
that showed it was under duress (GC or dropped message logs)?
Is the compaction a consequence of repair? (The streaming stage can
result in compactions.) Or do you think the node is just behind on
compactions?
If you feel compaction is hurting the node, consider
setting concurrent_compactors in the yaml to 2.
You can also isolate the node from updates using nodetool
disablegossip and disablethrift, and then turn off the IO limiter
with nodetool setcompactionthroughput 0.
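Roughly, the sequence would look something like this (the host is a
placeholder; remember to re-enable gossip/thrift and restore the
throughput cap once the node has caught up):

    # cassandra.yaml (needs a restart to take effect)
    concurrent_compactors: 2

    # stop the node taking traffic while it catches up
    nodetool -h <host> disablegossip
    nodetool -h <host> disablethrift

    # remove the compaction IO cap (0 = unthrottled)
    nodetool -h <host> setcompactionthroughput 0

    # when done, put things back
    nodetool -h <host> enablethrift
    nodetool -h <host> enablegossip
    nodetool -h <host> setcompactionthroughput 16   # 16 MB/s is the yaml default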
Hope that helps.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 20/04/2012, at 12:29 AM, Maxim Potekhin wrote:
Hello Aaron,
how should I go about fixing that? Also, after a repeated attempt to
compact, the node again goes into "building secondary index" with 1150
pending tasks, and is not making progress. I suspected a disk system
failure, but this needs to be confirmed.
So basically, do I need to tune the phi threshold up? The thing is,
there was no heavy load on the cluster at all.
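(For reference, I take it the setting in question is
phi_convict_threshold in cassandra.yaml; the value below is only an
example of what raising it might look like.)

    # cassandra.yaml -- default is 8; higher means slower to convict a node
    phi_convict_threshold: 12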
Thanks
Maxim
On 4/19/2012 7:06 AM, aaron morton wrote:
At some point the gossip system on the node this log is from decided
that 130.199.185.195 was DOWN. This was based on how often the node
was gossiping to the cluster.
The active repair session was informed. To avoid failing the job
unnecessarily, it checked whether the errant node's phi value was
twice the configured phi_convict_threshold. It was, and the repair was
killed.
Take a look at the logs on 130.199.185.195 and see if anything was
happening on the node at the same time. It could be GC or an
overloaded node (it would log about dropped messages).
Perhaps other nodes also saw 130.199.185.195 as down? It only needed
to be down for a few seconds.
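Something along these lines should surface both (adjust the log path
for your install; the exact wording of the log lines can vary between
versions):

    # on 130.199.185.195: GC pauses and dropped messages
    grep -iE 'GCInspector|dropped' /var/log/cassandra/system.log

    # on the other nodes: did gossip mark it down?
    grep -i '130.199.185.195' /var/log/cassandra/system.log | grep -iE 'dead|down'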
Hope that helps.
-----------------
Aaron Morton