I'm still having tons of problems with repairs and compactions: nodes get declared dead in each other's log files, although they were online at all times.

This leads to the same problematic behavior: once again the repair fails, and the cluster becomes unusable since there is no space left to compact away all the garbage that remains.

There are no signs of GC in the logs; I had increased the logging verbosity.

The first piece of log is from node #2 (IP address ending in 194), where a repair was attempted and ran for a while; the second one, after it, is from node #1 (IP address ending in 192).

#2 =======================================================================================================================
INFO [HintedHandoff:1] 2011-12-07 19:00:37,134 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.193
INFO [GossipTasks:1] 2011-12-07 19:00:37,136 Gossiper.java (line 697) InetAddress /130.199.185.193 is now dead.
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,136 GCInspector.java (line 122) GC for ParNew: 190691 ms for 1 collections, 4752799432 used; max is 8178892800
ERROR [GossipTasks:1] 2011-12-07 19:00:37,136 AntiEntropyService.java (line 711) Problem during repair session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.193 died
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,136 StatusLogger.java (line 50) Pool Name                    Active   Pending   Blocked
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,138 StatusLogger.java (line 65) ReadStage                         0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,139 StatusLogger.java (line 65) RequestResponseStage              0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,140 StatusLogger.java (line 65) ReadRepairStage                   0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,140 StatusLogger.java (line 65) MutationStage                     0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,141 StatusLogger.java (line 65) ReplicateOnWriteStage             0         0         0
INFO [GossipStage:1] 2011-12-07 19:00:37,141 Gossiper.java (line 683) InetAddress /130.199.185.193 is now UP
INFO [GossipStage:1] 2011-12-07 19:00:37,142 Gossiper.java (line 683) InetAddress /130.199.185.195 is now UP
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,142 StatusLogger.java (line 65) GossipStage                       1       335         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,142 StatusLogger.java (line 65) AntiEntropyStage                  0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) MigrationStage                    0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) StreamStage                       0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) MemtablePostFlusher               0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java (line 65) FlushWriter                       0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) MiscStage                         0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) FlushSorter                       0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) InternalResponseStage             0         0         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java (line 65) AntiEntropySessions               1         1         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 69) CompactionManager               n/a         3
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 81) MessagingService                n/a       0,0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 85) ColumnFamily                Memtable ops,data  Row cache size/cap  Key cache size/cap
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java (line 88) system.NodeIdInfo                       0,0                 0/0                 0/1
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.IndexInfo                        0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.LocationInfo                     0,0                 0/0                 1/1
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.Migrations                       0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.HintsColumnFamily                0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java (line 88) system.Schema                           0,0                 0/0                 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,147 StatusLogger.java (line 88) PANDA.indexRegistry                     8,0                 0/0            0/200000
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,147 StatusLogger.java (line 88) PANDA.jobs                              0,0                 0/0       200000/200000
ERROR [AntiEntropySessions:5] 2011-12-07 19:00:37,155 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Problem during repair session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.193 died
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Problem during repair session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.193 died
        at org.apache.cassandra.service.AntiEntropyService$RepairSession.failedNode(AntiEntropyService.java:712)
        at org.apache.cassandra.service.AntiEntropyService$RepairSession.convict(AntiEntropyService.java:749)
        at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:155)
        at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:527)
        at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:57)

#1 =================================================================================================================================================
INFO [FlushWriter:5] 2011-12-07 17:36:34,694 Memtable.java (line 254) Completed flushing /data/cassandra_data/data/PANDA/jobs.date-g-5836-Data.db (75743 bytes)
INFO [Thread-13] 2011-12-07 17:36:34,695 ColumnFamilyStore.java (line 390) Index build of 12+DATE,47+12+DATE,60,71+2+12,71+DATE,DATE, complete
INFO [Thread-13] 2011-12-07 17:36:34,695 StreamInSession.java (line 167) Finished streaming session 8671541101513055 from /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:54:36,874 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:54:39,800 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 18:55:29,858 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 18:55:29,860 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:55:37,886 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:56:02,038 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 18:56:14,026 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 18:56:14,027 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:57:01,899 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:57:24,648 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [GossipTasks:1] 2011-12-07 18:57:58,907 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [HintedHandoff:1] 2011-12-07 18:58:16,741 HintedHandOffManager.java (line 314) Endpoint /130.199.185.194 died before hint delivery, aborting
INFO [GossipStage:1] 2011-12-07 19:00:37,136 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 19:01:13,000 HintedHandOffManager.java (line 323) Started hinted handoff for endpoint /130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 19:01:13,001 HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows to endpoint /130.199.185.194



On 12/4/2011 7:17 PM, Peter Schuller wrote:
As a side effect of the failed repair (so it seems) the disk usage on the
affected node prevents compaction from working. It still works on
the remaining nodes (we have 3 total).
Is there a way to scrub the extraneous data?
This is one of the reasons why killing an in-process repair is a bad thing :(

If you do not have enough disk space for any kind of compaction to
work, then no, unfortunately there is no easy way to get rid of the
data.

You can go to the extra trouble of moving the entire node to some
other machine (e.g. one firewalled from the cluster) with more disk,
running compaction there, and then "moving it back", but that is kind
of painful to do. Another option is to decommission the node and
replace it. However, be aware that (1) this leaves the ring with less
capacity for a while, and (2) when you decommission, the data streamed
from that node to the others will be artificially inflated by the
repair, so there is some risk of "infecting" the other nodes with a
large data set.
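
To make that first option a bit more concrete, here is a rough sketch
of what it could look like (only a sketch: the data path is taken from
the log above, "sparehost" is just a placeholder, and the nodetool
invocations assume a 0.8-era install):

    # On the affected node: flush everything and stop Cassandra, then copy the data off.
    $ nodetool -h localhost drain
    $ rsync -a /data/cassandra_data/data/ sparehost:/data/cassandra_data/data/
    # On sparehost: start Cassandra with the same config, firewalled from the
    # real cluster, and force a major compaction of the big keyspace.
    $ nodetool -h sparehost compact PANDA
    # Stop it, rsync the now-smaller data directory back, and restart the real node.

    # The decommission-and-replace route is simpler, but has the caveats above:
    $ nodetool -h <node> decommission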

I should mention that if you have no traffic running against the
cluster, one way is to just remove all the data and then run repair
afterwards. But that implies you're trusting that (1) no reads are
going to the cluster (else you might serve reads based on missing
data) and (2) you are comfortable with losing the data on the node.
(2) might be okay if you're e.g. writing at QUORUM at all times and
have RF >= 3 (basically, this is as if the node had been lost to
hardware breakage).
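
A minimal sketch of that wipe-and-repair route, assuming the data
directory from the log above and that the node keeps its place in the
ring (e.g. the system keyspace is left alone, or initial_token is set
in cassandra.yaml):

    # Stop Cassandra on the affected node first.
    $ rm -rf /data/cassandra_data/data/PANDA/*
    # Keeping the system keyspace preserves the node's token and schema.
    # Start Cassandra again (the node comes up empty for that keyspace),
    # then pull the data back from the other replicas:
    $ nodetool -h localhost repair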

A faster way to reconstruct the node would be to delete the data from
your keyspaces (except the system keyspace), start the node (now
missing data), and run 'nodetool rebuild' after
https://issues.apache.org/jira/browse/CASSANDRA-3483 is done. The
patch attached to that ticket should work for 0.8.6 I suspect (but no
guarantees). This also assumes you have no reads running against the
cluster.
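
The rebuild variant would look almost the same as the sketch above,
with only the last step swapped, and it assumes the CASSANDRA-3483
patch has actually been applied to your build (per the ticket, stock
0.8.x does not ship 'nodetool rebuild'):

    # Stop the node, remove /data/cassandra_data/data/PANDA/*, keep the
    # system keyspace, start the node again, then stream the data back in:
    $ nodetool -h localhost rebuild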

