I'm still having lots of problems with repairs and compactions: nodes
are declared dead in each other's log files even though they were
online the whole time. The consequences are the same as before: once
again repair fails, and the cluster becomes unusable because there is
no space left to compact away the leftover garbage. I had increased
log verbosity, and I see no signs of GC in the logs.
The first log excerpt is from node #2 (IP address ending in 194),
where a repair was attempted and ran for a while; the second one,
after that, is from node #1 (IP address ending in 192).
#2
=======================================================================================================================
INFO [HintedHandoff:1] 2011-12-07 19:00:37,134
HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows
to endpoint /130.199.185.193
INFO [GossipTasks:1] 2011-12-07 19:00:37,136 Gossiper.java (line 697)
InetAddress /130.199.185.193 is now dead.
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,136 GCInspector.java (line
122) GC for ParNew: 190691 ms for 1 collections, 4752799432 used; max is
8178892800
ERROR [GossipTasks:1] 2011-12-07 19:00:37,136 AntiEntropyService.java
(line 711) Problem during repair session
manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint /130.199.185.
193 died
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,136 StatusLogger.java
(line 50) Pool Name Active Pending Blocked
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,138 StatusLogger.java
(line 65) ReadStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,139 StatusLogger.java
(line 65) RequestResponseStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,140 StatusLogger.java
(line 65) ReadRepairStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,140 StatusLogger.java
(line 65) MutationStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,141 StatusLogger.java
(line 65) ReplicateOnWriteStage 0 0 0
INFO [GossipStage:1] 2011-12-07 19:00:37,141 Gossiper.java (line 683)
InetAddress /130.199.185.193 is now UP
INFO [GossipStage:1] 2011-12-07 19:00:37,142 Gossiper.java (line 683)
InetAddress /130.199.185.195 is now UP
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,142 StatusLogger.java
(line 65) GossipStage 1 335 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,142 StatusLogger.java
(line 65) AntiEntropyStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java
(line 65) MigrationStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java
(line 65) StreamStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java
(line 65) MemtablePostFlusher 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,143 StatusLogger.java
(line 65) FlushWriter 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java
(line 65) MiscStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java
(line 65) FlushSorter 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java
(line 65) InternalResponseStage 0 0 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,144 StatusLogger.java
(line 65) AntiEntropySessions 1 1 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java
(line 65) HintedHandoff 1 2 0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java
(line 69) CompactionManager n/a 3
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java
(line 81) MessagingService n/a 0,0
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java
(line 85) ColumnFamily Memtable ops,data Row cache
size/cap Key cache size/cap
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,145 StatusLogger.java
(line 88) system.NodeIdInfo 0,0
0/0 0/1
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java
(line 88) system.IndexInfo 0,0
0/0 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java
(line 88) system.LocationInfo 0,0
0/0 1/1
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java
(line 88) system.Migrations 0,0
0/0 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java
(line 88) system.HintsColumnFamily 0,0
0/0 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,146 StatusLogger.java
(line 88) system.Schema 0,0
0/0 2/2
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,147 StatusLogger.java
(line 88) PANDA.indexRegistry 8,0
0/0 0/200000
INFO [ScheduledTasks:1] 2011-12-07 19:00:37,147 StatusLogger.java
(line 88) PANDA.jobs 0,0
0/0 200000/200000
ERROR [AntiEntropySessions:5] 2011-12-07 19:00:37,155
AbstractCassandraDaemon.java (line 139) Fatal exception in thread
Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Problem during repair
session manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint
/130.199.185.193 died
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Problem during repair session
manual-repair-e6264aa6-bc19-487e-8de1-3b6c1e5c735a, endpoint
/130.199.185.193 died
at
org.apache.cassandra.service.AntiEntropyService$RepairSession.failedNode(AntiEntropyService.java:712)
at
org.apache.cassandra.service.AntiEntropyService$RepairSession.convict(AntiEntropyService.java:749)
at
org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:155)
at
org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:527)
at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:57)
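For what it's worth, the GCInspector line above reports a ParNew pause of 190691 ms at the very moment gossip convicts the peers. A rough sketch of why a pause that long is fatal, using an approximation of Cassandra's simplified accrual failure detector (the formula and the one-second heartbeat interval below are my assumptions, not taken from these logs):

```python
import math

def phi(time_since_heartbeat_ms, mean_interval_ms):
    """Approximation of Cassandra's simplified accrual failure
    detector: phi grows linearly with the silence interval,
    scaled by 1/ln(10)."""
    return time_since_heartbeat_ms / (mean_interval_ms * math.log(10))

PHI_CONVICT_THRESHOLD = 8   # Cassandra default (cassandra.yaml)
MEAN_HEARTBEAT_MS = 1000    # gossip rounds run roughly once per second

pause_ms = 190691           # the ParNew pause from the log above
print(phi(pause_ms, MEAN_HEARTBEAT_MS))  # ~82.8, far past the threshold
```

With heartbeats arriving roughly once per second, a 190-second stop-the-world pause pushes phi an order of magnitude past the default phi_convict_threshold of 8, so the "is now dead" / "is now UP" pairs are exactly what such a pause looks like from the failure detector's point of view.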
#1
=================================================================================================================================================
INFO [FlushWriter:5] 2011-12-07 17:36:34,694 Memtable.java (line 254)
Completed flushing
/data/cassandra_data/data/PANDA/jobs.date-g-5836-Data.db (75743 bytes)
INFO [Thread-13] 2011-12-07 17:36:34,695 ColumnFamilyStore.java (line
390) Index build of 12+DATE,47+12+DATE,60,71+2+12,71+DATE,DATE, complete
INFO [Thread-13] 2011-12-07 17:36:34,695 StreamInSession.java (line
167) Finished streaming session 8671541101513055 from /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:54:36,874 Gossiper.java (line 697)
InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:54:39,800 Gossiper.java (line 683)
InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 18:55:29,858
HintedHandOffManager.java (line 323) Started hinted handoff for endpoint
/130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 18:55:29,860
HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows
to endpoint /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:55:37,886 Gossiper.java (line 697)
InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:56:02,038 Gossiper.java (line 683)
InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 18:56:14,026
HintedHandOffManager.java (line 323) Started hinted handoff for endpoint
/130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 18:56:14,027
HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows
to endpoint /130.199.185.194
INFO [GossipTasks:1] 2011-12-07 18:57:01,899 Gossiper.java (line 697)
InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:57:24,648 Gossiper.java (line 683)
InetAddress /130.199.185.194 is now UP
INFO [GossipTasks:1] 2011-12-07 18:57:58,907 Gossiper.java (line 697)
InetAddress /130.199.185.194 is now dead.
INFO [HintedHandoff:1] 2011-12-07 18:58:16,741
HintedHandOffManager.java (line 314) Endpoint /130.199.185.194 died
before hint delivery, aborting
INFO [GossipStage:1] 2011-12-07 19:00:37,136 Gossiper.java (line 683)
InetAddress /130.199.185.194 is now UP
INFO [HintedHandoff:1] 2011-12-07 19:01:13,000
HintedHandOffManager.java (line 323) Started hinted handoff for endpoint
/130.199.185.194
INFO [HintedHandoff:1] 2011-12-07 19:01:13,001
HintedHandOffManager.java (line 379) Finished hinted handoff of 0 rows
to endpoint /130.199.185.194
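The dead/UP flapping in node #1's log can be counted mechanically; a quick sketch (the sample lines are copied from the excerpt above, the regex is mine):

```python
import re

# Gossiper state changes for .194 as logged by node #1, 18:54-19:00
log = """\
INFO [GossipTasks:1] 2011-12-07 18:54:36,874 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:54:39,800 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [GossipTasks:1] 2011-12-07 18:55:37,886 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:56:02,038 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [GossipTasks:1] 2011-12-07 18:57:01,899 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 18:57:24,648 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
INFO [GossipTasks:1] 2011-12-07 18:57:58,907 Gossiper.java (line 697) InetAddress /130.199.185.194 is now dead.
INFO [GossipStage:1] 2011-12-07 19:00:37,136 Gossiper.java (line 683) InetAddress /130.199.185.194 is now UP
"""

transitions = re.findall(r"(\S+) is now (dead|UP)", log)
# 8 state changes for a single endpoint in about six minutes
print(len(transitions), sum(1 for _, s in transitions if s == "dead"))
```

Four dead/UP round-trips for one endpoint in six minutes is the flapping pattern that keeps aborting hint delivery and repair sessions.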
On 12/4/2011 7:17 PM, Peter Schuller wrote:
As a side effect of the failed repair (or so it seems), the disk usage
on the affected node prevents compaction from working. Compaction still
works on the remaining nodes (we have 3 in total).
Is there a way to scrub the extraneous data?
This is one of the reasons why killing an in-process repair is a bad thing :(
If you do not have enough disk space for any kind of compaction to
work, then no, unfortunately there is no easy way to get rid of the
data.
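A minimal sketch of the kind of free-space check this implies, assuming the worst case where a (major) compaction transiently needs roughly as much free space as the live SSTables it rewrites (the path and the headroom ratio are illustrative, not from this thread):

```python
import os
import shutil

def can_compact(data_dir, headroom=1.0):
    """Rough pre-flight check: require at least headroom * (bytes of
    SSTables under data_dir) of free disk, since compaction keeps the
    old files around until the rewritten ones are complete."""
    used = sum(
        os.path.getsize(os.path.join(root, f))
        for root, _, files in os.walk(data_dir)
        for f in files
    )
    free = shutil.disk_usage(data_dir).free
    return free >= headroom * used

# e.g. can_compact("/data/cassandra_data/data/PANDA")
```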
You can go to extra trouble such as moving the entire node to some
other machine (e.g. firewalled from the cluster) with more disk and
run compaction there and then "move it back", but that is kind of
painful to do. Another option is to decommission the node and replace
it. However, be aware that (1) that leaves the ring with less capacity
for a while, and (2) when you decommission, the data you stream from
that node to others would be artificially inflated due to the repair
so there is some risk of "infecting" the other nodes with a large data
set.
I should mention that if you have no traffic running against the
cluster, one way is to just remove all the data and then run repair
afterwards. But that implies that you're trusting that (1) no reads
are going to the cluster (else you might serve reads based on missing
data) and (2) that you are comfortable with loss of the data on the
node. (2) might be okay if you're e.g. writing at QUORUM at all times
and have RF >= 3 (basically, this is as if the node had been
lost due to hardware breakage).
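The QUORUM / RF >= 3 condition in (2) is just the usual replica arithmetic; a worked check (my sketch, not from the mail):

```python
def quorum(rf):
    # A quorum is a strict majority of the replicas
    return rf // 2 + 1

def survives_one_node_wipe(rf):
    # Writing at QUORUM keeps succeeding as long as quorum(rf)
    # replicas remain, so wiping one node is tolerable when
    # rf - 1 >= quorum(rf).
    return rf - 1 >= quorum(rf)

print(quorum(3), survives_one_node_wipe(3))  # 2 True
print(quorum(2), survives_one_node_wipe(2))  # 2 False
```

This is why the advice hinges on RF >= 3: at RF = 2 a wiped node leaves only one replica, below quorum, and QUORUM operations start failing.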
A faster way to reconstruct the node would be to delete the data from
your keyspaces (except the system keyspace), start the node (now
missing data), and run 'nodetool rebuild' after
https://issues.apache.org/jira/browse/CASSANDRA-3483 is done. The
patch attached to that ticket should work for 0.8.6 I suspect (but no
guarantees). This also assumes you have no reads running against the
cluster.