Hi Rob,
I checked tpstats and there are no dropped mutations (though I only checked after 
restarting the affected nodes, which resets the counters). If the problem occurs 
again, I will check tpstats again. Is there any stat that shows failed hints? 
(One check I could try is sketched after the output below.) The only abnormality 
I see is one blocked flush writer (All time blocked = 1).

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         955265         0                 0
ReadStage                         0         0        3287825         0                 0
RequestResponseStage              0         0        3520467         0                 0
ReadRepairStage                   0         0         155949         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
MiscStage                         0         0              0         0                 0
HintedHandoff                     0         0            161         0                 0
FlushWriter                       0         0          55053         0                 1
MemoryMeter                       0         0          55561         0                 0
GossipStage                       0         0         276346         0                 0
CacheCleanupExecutor              0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
CompactionExecutor                0         0         587882         0                 0
ValidationExecutor                0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
PendingRangeCalculator            0         0            502         0                 0
MemtablePostFlusher               0         0          56747         0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
MUTATION                     0
COUNTER_MUTATION             0
BINARY                       0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
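
On the failed-hints question, one check I could try next time, assuming we are 
on a 2.0/2.1-style node where undelivered hints are stored in the system.hints 
table (I have not run this yet):

    # A count that stays non-zero long after every node is back up
    # suggests hints that were stored but never successfully delivered;
    # the HintedHandoff pool above only counts replay tasks, not failures.
    echo "SELECT count(*) FROM system.hints;" | cqlsh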



From: Robert Coli <rc...@eventbrite.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Friday, November 13, 2015 at 5:57 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Deletes Reappeared even when nodes are not down
Subject: Re: Deletes Reappeared even when nodes are not down

On Fri, Nov 13, 2015 at 1:47 PM, Peddi, Praveen <pe...@amazon.com> wrote:
We do not currently run repairs because we know our deployment time for each 
Cassandra node is very short. I do understand we have to run repairs, but would 
repair even be in the picture here when no nodes in the cluster were down for 
the last 2 weeks?

The only mechanism Cassandra provides that *ensures* that data doesn't undelete 
itself after gc_grace_seconds is periodic repair.
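
Concretely, "periodic" means each node's primary ranges get repaired at least 
once every gc_grace_seconds. A minimal sketch, assuming the default 
gc_grace_seconds of 864000 (10 days); the schedule and log path are 
hypothetical:

    # Hypothetical crontab entry: repair this node's primary ranges
    # weekly, well inside the default 10-day gc_grace_seconds.
    # Stagger the day/hour across nodes to spread the load.
    0 3 * * 0  nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1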

To expand slightly on what rustyrazorblade says down-thread, you might have:

1) dropped a mutation
2) stored a hint
3) failed to deliver that hint

If that hint was a DELETE, you will unmask the deleted data once 
gc_grace_seconds has passed and the tombstone has been compacted away on other 
nodes.
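
A hedged walk-through of that sequence, on a hypothetical RF=3 table demo.t 
(names made up, gc_grace_seconds left at its default):

    # 1) The delete succeeds at QUORUM on replicas A and B; replica C
    #    misses it and the coordinator stores a hint for C.
    echo "CONSISTENCY QUORUM; DELETE FROM demo.t WHERE id = 1;" | cqlsh

    # 2) The hint is never delivered. More than gc_grace_seconds pass,
    #    and A and B compact the tombstone away; C still holds the row.

    # 3) A QUORUM read that touches C now sees a live row and no
    #    tombstone anywhere, so read repair copies the "deleted" row
    #    back to the other replicas:
    echo "CONSISTENCY QUORUM; SELECT * FROM demo.t WHERE id = 1;" | cqlsh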

=Rob
