Yes, as per the code, you cannot delete hints for endpoints which are not part of the ring:
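The effect of that one-line membership guard can be illustrated with a standalone toy sketch. This is not Cassandra code: the class, the plain sets standing in for TokenMetadata/StorageService, and all method names other than deleteHintsForEndpoint are hypothetical, chosen only to show why the delete silently does nothing for a non-member endpoint.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the ring-membership guard: hints can only be deleted
// for endpoints that the token metadata currently knows as ring members.
public class HintDeleteGuard {
    private final Set<String> ringMembers = new HashSet<>();
    private final Set<String> endpointsWithHints = new HashSet<>();

    void join(String endpoint)      { ringMembers.add(endpoint); }
    void storeHint(String endpoint) { endpointsWithHints.add(endpoint); }

    // Mirrors the shape of: if (!...getTokenMetadata().isMember(endpoint)) return;
    boolean deleteHintsForEndpoint(String endpoint) {
        if (!ringMembers.contains(endpoint)) {
            return false; // refused: endpoint is not a ring member
        }
        return endpointsWithHints.remove(endpoint);
    }

    public static void main(String[] args) {
        HintDeleteGuard g = new HintDeleteGuard();
        g.join("10.0.0.1");
        g.storeHint("10.0.0.1");
        g.storeHint("10.0.0.99"); // hints for an endpoint that never joined
        System.out.println(g.deleteHintsForEndpoint("10.0.0.1"));  // true: member, hints removed
        System.out.println(g.deleteHintsForEndpoint("10.0.0.99")); // false: not a ring member
    }
}
```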
    if (!StorageService.instance.getTokenMetadata().isMember(endpoint)) return;

On Mon, Jan 20, 2014 at 12:34 PM, Allan C <alla...@gmail.com> wrote:

> There are 3 other nodes that have a mild case. This one node is worse by
> an order of magnitude. deleteHintsForEndpoint fails with the same error
> on any of the affected nodes.
>
> -Allan
>
> On January 20, 2014 at 12:24:33 PM, sankalp kohli
> (kohlisank...@gmail.com) wrote:
>
> Is this happening in one node or all? Did you try to delete the hints
> via JMX on other nodes?
>
> On Mon, Jan 20, 2014 at 12:18 PM, Allan C <alla...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm hitting a very odd issue with HintedHandoff on 1 node in my 12-node
>> cluster running 1.2.13. Somehow it's holding a large number of hints
>> for tokens that have never been part of the cluster. I'm pretty sure
>> this is causing a bunch of memory pressure that's making the node go
>> down.
>>
>> I'd like to find out whether I can just reset by deleting the hints CF,
>> or whether there's actually important data in there. I'm tempted to
>> clear the CF and hope that fixes it, but a few nodes have been up and
>> down (especially this one) since my last repair, and I worry that I
>> won't be able to get through a full repair given the node's current
>> problems.
>>
>> Here's what I see so far:
>>
>> * listEndpointsPendingHints returns a list of about 20 tokens that are
>> not part of the ring and have never been part of it. I'm not using
>> vnodes, fwiw. deleteHintsForEndpoint doesn't work; it tells me that
>> there's no host for the token.
>>
>> * The hints CF is oddly large:
>>
>>   Column Family: hints
>>   SSTable count: 260
>>   Space used (live): 124904685
>>   Space used (total): 124904685
>>   SSTable Compression Ratio: 0.394676439667606
>>   Number of Keys (estimate): 66560
>>   Memtable Columns Count: 0
>>   Memtable Data Size: 0
>>   Memtable Switch Count: 14
>>   Read Count: 113
>>   Read Latency: 757.123 ms.
>>   Write Count: 987
>>   Write Latency: 0.044 ms.
>>   Pending Tasks: 0
>>   Bloom Filter False Positives: 10
>>   Bloom Filter False Ratio: 0.00209
>>   Bloom Filter Space Used: 6528
>>   Compacted row minimum size: 36
>>   Compacted row maximum size: 107964792
>>   Compacted row mean size: 787505
>>   Average live cells per slice (last five minutes): 0.0
>>
>> * I get this assertion in the logs often:
>>
>> ERROR [CompactionExecutor:81] 2014-01-20 12:31:22,652
>> CassandraDaemon.java (line 191) Exception in thread
>> Thread[CompactionExecutor:81,1,main]
>> java.lang.AssertionError: originally calculated column size of 71868452
>> but now it is 71869026
>>   at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
>>   at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
>>   at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
>>   at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>   at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>>   at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>>   at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
>>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>   at java.lang.Thread.run(Thread.java:662)
>> ERROR [HintedHandoff:52] 2014-01-20 12:31:22,652
>> CassandraDaemon.java (line 191) Exception in thread
>> Thread[HintedHandoff:52,1,main]
>> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
>> java.lang.AssertionError: originally calculated column size of 71868452
>> but now it is 71869026
>>   at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:436)
>>   at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>   at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)
>>   at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:502)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>   at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.util.concurrent.ExecutionException:
>> java.lang.AssertionError: originally calculated column size of 71868452
>> but now it is 71869026
>>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>   at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:432)
>>   ... 6 more
>> Caused by: java.lang.AssertionError: originally calculated column size
>> of 71868452 but now it is 71869026
>>   at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
>>   at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
>>   at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
>>   at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>   at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>>   at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>>   at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
>>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>   ... 3 more
>>
>> * I see a similar error when I try to compact the hints CF, even when I
>> set in_memory_compaction_limit_in_mb as high as 1024.
>>
>> This started after I brought up a few new nodes last week and then
>> decommissioned them a few days later. The adding and decommissioning
>> appeared to go uneventfully.
>>
>> If anyone has seen anything like this or can give me some hints on how
>> to determine whether the hints can be deleted, I'd greatly appreciate
>> it.
>>
>> -Allan
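For what it's worth, the AssertionError above comes out of LazilyCompactedRow.write, which serializes a row in two passes: one pass to compute the expected serialized size, and a second pass that writes the data and asserts the written size matches. The toy sketch below shows only that general two-pass shape and why the assertion fires if the data seen by the two passes differs; the class and method names are hypothetical, and this is not the actual Cassandra implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy two-pass writer: pass 1 sizes the row, pass 2 serializes it and
// asserts the sizes agree. If the data changes (or is read differently)
// between the passes, the mismatch surfaces as an error of the same
// shape as the one in the logs above.
public class TwoPassWriter {
    static long computeSize(List<String> columns) {
        long size = 0;
        for (String c : columns)
            size += c.getBytes(StandardCharsets.UTF_8).length;
        return size;
    }

    static long write(List<String> columns, long expectedSize) {
        long written = 0;
        for (String c : columns)
            written += c.getBytes(StandardCharsets.UTF_8).length;
        if (written != expectedSize) {
            throw new AssertionError("originally calculated column size of "
                    + expectedSize + " but now it is " + written);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> row = new ArrayList<>(List.of("colA", "colB"));
        long expected = computeSize(row);   // pass 1: 8 bytes
        row.set(1, "colB-mutated");         // row changes between passes
        try {
            write(row, expected);           // pass 2: now 16 bytes
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```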