Hi Jeff,

I appreciate your detailed explanation :)



> Expired data gets purged on compaction as long as it doesn’t overlap with other live data. The overlap requirement can be difficult to reason about, but it’s meant to ensure correctness: if you write a value with TTL 180 days and then another value with TTL 1 day, you don’t want to remove the TTL-1 value until you’ve also removed the TTL-180 value, since doing so would lead to data being resurrected.

I understand that the TTL setting sometimes does not take effect as we expect, especially when we alter the TTL value afterwards, because of Cassandra’s safeguards for data consistency (avoiding resurrected data). Is my understanding correct?


I am also thinking of trying the sstablesplit utility to get Cassandra to run minor compaction on one of the SSTables, which is the oldest and very large, and which I would like to compact.

Do you think my plan will work as expected?
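For reference, here is roughly what I intend to run. This is only a sketch: the file path is a placeholder, and I understand that sstablesplit is an offline tool, so the node has to be stopped first.

----
# Stop writes cleanly and shut the node down before touching SSTable files
nodetool drain
sudo service cassandra stop   # or however the node is managed in your environment

# Split the single very large SSTable into ~50 MB pieces so that STCS minor
# compaction will consider the pieces again (path and filename are placeholders)
sstablesplit --size 50 /var/lib/cassandra/data/my_ks/my_table-*/mc-1234-big-Data.db

sudo service cassandra start
----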




Regards,
Takashima

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Monday, December 11, 2017 3:36 PM
To: user@cassandra.apache.org
Subject: Re: Tombstoned data seems to remain after compaction

Replies inline


On Dec 10, 2017, at 9:59 PM, "tak...@fujitsu.com" <tak...@fujitsu.com> wrote:
Hi Jeff,



> Are all of your writes TTL’d in this table?
Yes. We set the TTL to 180 days at first, and then altered it to just 1 day because we noticed the first TTL setting was too long.
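To be concrete, the change we made was along these lines (keyspace and table names are placeholders, and this assumes the TTL was applied via the table-level default rather than per-write USING TTL):

----
# Reduce the table-level default TTL from 180 days (15552000 s) to 1 day (86400 s)
cqlsh -e "ALTER TABLE my_ks.my_table WITH default_time_to_live = 86400;"
----

As I understand it, this only affects newly written data; rows already written with the 180-day TTL keep their original expiration.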


Ok this is different - Kurt’s answer is true when you issue explicit deletes. 
Expiring data is slightly different.

Expired data gets purged on compaction as long as it doesn’t overlap with other live data. The overlap requirement can be difficult to reason about, but it’s meant to ensure correctness: if you write a value with TTL 180 days and then another value with TTL 1 day, you don’t want to remove the TTL-1 value until you’ve also removed the TTL-180 value, since doing so would lead to data being resurrected.

This is the primary reason that TTL’d data doesn’t get cleaned up when people expect it to.
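To make the scenario concrete (keyspace, table and values below are just placeholders):

----
# Written first; expires after 180 days (15552000 seconds)
cqlsh -e "INSERT INTO my_ks.my_table (pk, val) VALUES (1, 'old') USING TTL 15552000;"

# Written later to the same cell; expires after 1 day (86400 seconds)
cqlsh -e "INSERT INTO my_ks.my_table (pk, val) VALUES (1, 'new') USING TTL 86400;"
----

If compaction dropped the expired 'new' cell from its SSTable while the still-live 'old' cell sat in an overlapping SSTable, reads would start returning 'old' again - that's the resurrection the overlap check is there to prevent.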






> Which compaction strategy are you using?
We use Size Tiered Compaction Strategy.



LCS would compact more aggressively and try to minimize overlaps.

TWCS is designed for expiring data and tries to group data by time window for more efficient expiration.

You would likely benefit from changing to either of those - but you’ll want to try it on a single node first to confirm (you should be able to find videos online about using JMX to change the compaction strategy of a single node).
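Once you’ve confirmed the behaviour on one node, the cluster-wide switch to TWCS would look roughly like this (the window unit and size below are just examples - pick them to match how your data expires):

----
cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {'class': 'TimeWindowCompactionStrategy', 'compaction_window_unit': 'DAYS', 'compaction_window_size': 1};"
----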



> Are you asking these questions because you’re running out of space faster than you expect and you’d like to expire data faster?
You’re right. We want to know the reason, and how to purge that old data sooner if possible.
I would also like to understand why the old records reported by the sstablemetadata command still persist in the SSTable data file.


Not to self-promote too much, but I’ve given a few talks on running time-series Cassandra clusters. These slides
https://www.slideshare.net/mobile/JeffJirsa1/using-time-window-compaction-strategy-for-time-series-workloads
(in video form here: https://m.youtube.com/watch?v=PWtekUWCIaw ) may be useful.



By the way, I’m sorry, but please let me ask the question again.
Here is an excerpt of the sstablemetadata output below.

Does the “Estimated tombstone drop times” section mean that the SSTable contains tombstones for records that should expire at the time shown in the first column? And might the data itself exist in other SSTables?

(excerpt)
----
Estimated tombstone drop times:%n
1510934467:      2475 * 2017.11.18
1510965112:       135
1510983500:       225
1511003962:       105
1511021113:      2280
1511037818:        30
1511055563:       120
----
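(The date in my annotation above comes from converting the first column, which is a Unix epoch timestamp, for example:)

----
date -u -d @1510934467
# Fri Nov 17 16:01:07 UTC 2017   (i.e. 2017.11.18 in JST)
----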





Regards,
Takashima

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Monday, December 11, 2017 2:35 PM
To: user@cassandra.apache.org
Subject: Re: Tombstoned data seems to remain after compaction

Mutations read during boot won’t go into the memtable unless the mutation is in the commitlog (which usually means fairly recent writes - the commitlog is a fixed size).

Are all of your writes TTL’d in this table?
Which compaction strategy are you using?
Are you asking these questions because you’re running out of space faster than 
you expect and you’d like to expire data faster?


--
Jeff Jirsa


On Dec 10, 2017, at 9:30 PM, "tak...@fujitsu.com" <tak...@fujitsu.com> wrote:
Hi Kurt,


Thanks for your reply!

“””
The tombstone needs to compact with every SSTable that contains data for the 
corresponding tombstone.
“””

Let me explain my understanding with an example:

1. A record is inserted with a 180-day TTL (very long).

2. The record is saved to SSTable (A) when the server restarts, or on some event like that.

3. After 180 days pass, the Cassandra process reads SSTable (A) during its boot process (or on read access?) and puts a tombstone for the record in *memory*.

4. The tombstone in *memory* is saved to SSTable (B) the next time the server is rebooted.

The procedure above splits the record itself and its tombstone into separate SSTables.

Is my understanding correct?
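As a side note, I have been checking the on-disk state roughly like this (the path is a placeholder, and I assume a flush, not only a restart, writes the memtable out as a new SSTable):

----
# Write the current memtable out as a new SSTable
nodetool flush my_ks my_table

# Check the estimated droppable tombstones of a data file
sstablemetadata /var/lib/cassandra/data/my_ks/my_table-*/mc-1234-big-Data.db | grep -i droppable
----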



Regards,
Takashima


From: kurt greaves [mailto:k...@instaclustr.com]
Sent: Monday, December 11, 2017 1:46 PM
To: User <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: Re: Tombstoned data seems to remain after compaction

The tombstone needs to compact with every SSTable that contains data for the corresponding tombstone. For example, the tombstone may be in that SSTable, but some of the data the tombstone covers may be in another SSTable. Only once all SSTables that contain relevant data have been compacted with the SSTable containing the tombstone can the tombstone be removed.
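If you need to force all of a table's SSTables to be compacted together in one go (accepting the STCS downside of ending up with one very large SSTable), a major compaction does that - a rough sketch:

----
# Major compaction: compacts every SSTable of the table together, so tombstones
# older than gc_grace_seconds that cover data in any of them can finally be dropped
nodetool compact my_ks my_table
----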

On 11 December 2017 at 01:08, tak...@fujitsu.com <tak...@fujitsu.com> wrote:
Hi All,


I'm using Size Tiered Compaction Strategy with the default gc grace period of 10 days.

The sstablemetadata command shows the following Estimated tombstone drop times after a minor compaction on 9th Dec, 2017.

(excerpt)
Estimated tombstone drop times:%n
1510934467:      2475 * 2017.11.18
1510965112:       135
1510983500:       225
1511003962:       105
1511021113:      2280
1511037818:        30
1511055563:       120
1511075445:       165


From the output above, I think the SSTable contains records that should have been deleted on 18th Nov, 2017. Is my understanding correct?

If so, could someone tell me why that expired data remains after compaction?




Regards,
Takashima

----------------------------------------------------------------------
Toshiaki Takashima
Toyama Fujitsu Limited
+810764553131, ext. 7260292355

----------------------------------------------------------------------


