Do you have a repro that you can share with us? If so, please file a jira and 
we'll take a look.

> On Mar 18, 2022, at 12:15 PM, James Brown <jbr...@easypost.com> wrote:
> 
> This is in 4.0.3; yes, it's after running nodetool snapshot that we're 
> seeing the sstables change.
> 
> James Brown
> Infrastructure Architect @ easypost.com
> 
> On 2022-03-18 at 12:06:00, Jeff Jirsa <jji...@gmail.com> wrote:
>> This is nodetool snapshot, yes? 3.11 or 4.0?
>> 
>> In versions prior to 3.0, sstables were written with -tmp- in the name and 
>> renamed when complete, so an sstable definitely never changed once it had 
>> its final file name. With the new transaction log mechanism, we use one 
>> name plus a transaction log to track what's in flight and what's not; so if 
>> the snapshot system is including sstables that are still being written 
>> (from flush, from compaction, or from streaming), those aren't final and 
>> should be skipped.
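>> 
>> As a rough sketch of what "in flight" means on disk (the txn log naming 
>> below is from memory, so treat the exact filenames as an assumption and 
>> check your own data directories), the pending sstables are listed in the 
>> table's transaction logs:
>> 
>>     # transaction logs sit alongside the sstables in the table directory;
>>     # each one records the sstables an in-flight operation is writing (ADD)
>>     # and the ones it will replace (REMOVE)
>>     ls data/<keyspace>/<table>-*/*_txn_*.log
>>     # ADD entries name sstable prefixes that are not yet final
>>     grep 'ADD' data/<keyspace>/<table>-*/*_txn_*.log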
>> 
>> On Fri, Mar 18, 2022 at 11:46 AM James Brown <jbr...@easypost.com> wrote:
>> We use the boring combo of Cassandra snapshots + tar to back up our 
>> Cassandra nodes; every once in a while, we'll notice tar failing with the 
>> following:
>> 
>> tar: data/addresses/addresses-eb0196100b7d11ec852b1541747d640a/snapshots/backup20220318183708/nb-167-big-Data.db:
>> file changed as we read it
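>> 
>> (For context, the flow is roughly the following; the exact tag and paths 
>> are illustrative rather than our real scripts:)
>> 
>>     # take a named snapshot (hardlinks the current sstables), then archive it
>>     nodetool snapshot -t backup20220318183708 addresses
>>     tar -czf backup20220318183708.tar.gz \
>>         data/*/*/snapshots/backup20220318183708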
>> 
>> I find this a bit perplexing; what would cause an sstable inside a snapshot 
>> to change? The only thing I can think of is an incremental repair changing 
>> the "repaired_at" flag on the sstable, but it seems like that should 
>> "un-share" the hardlinked sstable rather than running the risk of mutating a 
>> snapshot.
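>> 
>> A quick way to check the sharing I mean (illustrative commands, not our 
>> actual tooling): if the snapshot entry still has the same inode as the 
>> live sstable, an in-place write to the live file is visible in the 
>> snapshot too, which is exactly what tar is complaining about.
>> 
>>     # %h = hardlink count, %i = inode; matching inodes (and link count > 1)
>>     # mean the live sstable and the snapshot copy are the same file on disk
>>     stat -c '%h %i %n' \
>>         data/addresses/addresses-*/nb-167-big-Data.db \
>>         data/addresses/addresses-*/snapshots/backup20220318183708/nb-167-big-Data.db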
>> 
>> James Brown
>> Cassandra admin @ easypost.com
