I'm still having tons of problems with repairs and compactions: nodes get declared dead in each other's log files, although they were online at all times.

This leads to the same problematic behavior: once again the repair fails, and the cluster becomes unusable since there is no space left to compact away all the garbage that remains.

There are no signs of GC in the logs; I had increased the logging verbosity.

The first piece of log is from node #2 (IP address ending in 194), where a repair was attempted and ran for a while; the second one, after it, is from node #1 (IP address ending in 192).

#2 =======================================================================================================================
INFO [HintedHandoff:1] 2011-12-07 19:00:37,134 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.193
INFO [GossipTasks:1] 2011-12-07 19:00:37,136 Gossiper.java (line 697) InetAddress /130.199.185.193 is now dead.
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,136 GCInspector.java (line 122) GC for ParNew: 190691 ms for 1 collections, 4752799432 used; max is 8178892800
ERROR [GossipTasks:1] 2011-12-07 19:00:37,136 AntiEntropyService.java (line 711) Problem during repair session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.193 died
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,136 StatusLogger.java (line 50) Pool Name                    Active   Pending   Blocked
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,138 StatusLogger.java (line 65) ReadStage                         0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,139 StatusLogger.java (line 65) RequestResponseStage              0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,140 StatusLogger.java (line 65) ReadRepairStage                   0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,140 StatusLogger.java (line 65) MutationStage                     0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,141 StatusLogger.java (line 65) ReplicateOnWriteStage             0         0         0
INFO [GossipStage:1] 2011-12-07 19:00:37,141 Gossiper.java (line 683) InetAddress /130.199.185.193 is now UP
INFO [GossipStage:1] 2011-12-07 19:00:37,142 Gossiper.java (line 683) InetAddress /130.199.185.195 is now UP
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,142 StatusLogger.java (line 65) GossipStage                       1       335         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,142 StatusLogger.java (line 65) AntiEntropyStage                  0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) MigrationStage                    0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) StreamStage                       0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) MemtablePostFlusher               0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) FlushWriter                       0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) MiscStage                         0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) FlushSorter                       0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) InternalResponseStage             0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) AntiEntropySessions               1         1         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 69) CompactionManager               n/a         3
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 81) MessagingService                n/a       0,0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 85) ColumnFamily                Memtable ops,data  Row cache size/cap  Key cache size/cap
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 88) system.NodeIdInfo                       0,0                 0/0                 0/1
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.IndexInfo                        0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.LocationInfo                     0,0                 0/0                 1/1
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.Migrations                       0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.HintsColumnFamily                0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.Schema                           0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,147 StatusLogger.java (line 88) PANDA.indexRegistry                     8,0                 0/0            0/200000
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,147 StatusLogger.java (line 88) PANDA.jobs                              0,0                 0/0       200000/200000
ERROR [AntiEntropySessions:5] 2011-12-07 19:00:37,155 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Problem during repair session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.193 died
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Problem during repair session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.193 died
        at org.apache.cassandra.service.AntiEntropyService$RepairSession.failedNode(AntiEntropyService.java:712)
        at org.apache.cassandra.service.AntiEntropyService$RepairSession.convict(AntiEntropyService.java:749)
        at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:155)
        at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:527)
        at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:57)

#1 =================================================================================================================================================
INFO [FlushWriter:5] 2011-12-07 17:36:34,694 Memtable.java (line 254) Completed flushing /data/cassandra_data/data/PANDA/jobs.date-g-5836-Data.db (75743 bytes)
INFO [Thread-13] 2011-12-07 17:36:34,695 ColumnFamilyStore.java (line 390) Index build of 12+DATE,47+12+DATE,60,71+2+12,71+DATE,DATE, complete
INFO [Thread-13] 2011-12-07 17:36:34,695 StreamInSession.java (line 167) Finished streaming session 8671541101513055 from /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:54:36,874 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:54:39,800 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 18:55:29,858 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 18:55:29,860 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:55:37,886 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:56:02,038 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 18:56:14,026 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 18:56:14,027 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:57:01,899 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:57:24,648 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [GossipTasks:1] 2011-12-07 18:57:58,907 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [HintedHandoff:1] 2011-12-07 18:58:16,741 HintedHandOffManager.java (line 314) Endpoint /130.199.185.194 died before hint delivery, aborting
INFO [GossipStage:1] 2011-12-07 19:00:37,136 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 19:01:13,000 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 19:01:13,001 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.194



On 12/4/2011 7:17 PM, Peter Schuller wrote:
As a side effect of the failed repair (so it seems) the disk usage on the
affected node prevents compaction from working. It still works on
the remaining nodes (we have 3 total).
Is there a way to scrub the extraneous data?
This is one of the reasons why killing an in-process repair is a bad thing :(

If you do not have enough disk space for any kind of compaction to
work, then no, unfortunately there is no easy way to get rid of the
data.

You can go to the extra trouble of moving the entire node to some
other machine (e.g. one firewalled from the cluster) with more disk,
running compaction there, and then "moving it back", but that is kind
of painful to do. Another option is to decommission the node and
replace it. However, be aware that (1) this leaves the ring with less
capacity for a while, and (2) when you decommission, the data streamed
from that node to the others will be artificially inflated by the
repair, so there is some risk of "infecting" the other nodes with a
large data set.
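
To make that first option a bit more concrete, here is a rough sketch
of what it could look like (only a sketch: the data path is taken from
the log above, "sparehost" is just a placeholder, and the nodetool
invocations assume a 0.8-era install):

    # On the affected node: flush everything and stop Cassandra, then copy the data off.
    $ nodetool -h localhost drain
    $ rsync -a /data/cassandra_data/data/ sparehost:/data/cassandra_data/data/
    # On sparehost: start Cassandra with the same config, firewalled from the
    # real cluster, and force a major compaction of the big keyspace.
    $ nodetool -h sparehost compact PANDA
    # Stop it, rsync the now-smaller data directory back, and restart the real node.

    # The decommission-and-replace route is simpler, but has the caveats above:
    $ nodetool -h <node> decommission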

I should mention that if you have no traffic running against the
cluster, one way is to just remove all the data and then run repair
afterwards. But that implies you're trusting that (1) no reads are
going to the cluster (else you might serve reads based on missing
data) and (2) you are comfortable with losing the data on the node.
(2) might be okay if you're e.g. writing at QUORUM at all times and
have RF >= 3 (basically, this is as if the node had been lost to
hardware breakage).
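
A minimal sketch of that wipe-and-repair route, assuming the data
directory from the log above and that the node keeps its place in the
ring (e.g. the system keyspace is left alone, or initial_token is set
in cassandra.yaml):

    # Stop Cassandra on the affected node first.
    $ rm -rf /data/cassandra_data/data/PANDA/*
    # Keeping the system keyspace preserves the node's token and schema.
    # Start Cassandra again (the node comes up empty for that keyspace),
    # then pull the data back from the other replicas:
    $ nodetool -h localhost repair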

A faster way to reconstruct the node would be to delete the data from
your keyspaces (except the system keyspace), start the node (now
missing data), and run 'nodetool rebuild' after
https://issues.apache.org/jira/browse/CASSANDRA-3483 is done. The
patch attached to that ticket should work for 0.8.6 I suspect (but no
guarantees). This also assumes you have no reads running against the
cluster.
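
The rebuild variant would look almost the same as the sketch above,
with only the last step swapped, and it assumes the CASSANDRA-3483
patch has actually been applied to your build (per the ticket, stock
0.8.x does not ship 'nodetool rebuild'):

    # Stop the node, remove /data/cassandra_data/data/PANDA/*, keep the
    # system keyspace, start the node again, then stream the data back in:
    $ nodetool -h localhost rebuild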

