Hi all,

Was there any further progress made on this? Did a Jira get created?
I have been debugging our backup scripts and seem to have hit the same problem. As far as I can work out so far, it happens when a new snapshot is created while the old snapshot is being tarred. I get a similar message:

/bin/tar: var/lib/cassandra/backup/keyspacename/tablename-4eec3b01aba811e896342351775ccc66/snapshots/csbackup_2022-03-22T14\\:04\\:05/nb-523601-big-Data.db: file changed as we read it

Thanks
Paul

> On 19 Mar 2022, at 02:41, Dinesh Joshi <djo...@apache.org> wrote:
>
> Do you have a repro that you can share with us? If so, please file a jira and we'll take a look.
>
>> On Mar 18, 2022, at 12:15 PM, James Brown <jbr...@easypost.com> wrote:
>>
>> This is in 4.0.3, after running nodetool snapshot, that we're seeing sstables change, yes.
>>
>> James Brown
>> Infrastructure Architect @ easypost.com
>>
>> On 2022-03-18 at 12:06:00, Jeff Jirsa <jji...@gmail.com> wrote:
>>> This is nodetool snapshot, yes? 3.11 or 4.0?
>>>
>>> In versions prior to 3.0, sstables would be written with -tmp- in the name, then renamed when complete, so an sstable definitely never changed once it had the final file name. With the new transaction log mechanism, we use one name and a transaction log to note what's in flight and what's not, so if the snapshot system is including sstables being written (from flush, from compaction, or from streaming), those aren't final and should be skipped.
>>>
>>> On Fri, Mar 18, 2022 at 11:46 AM James Brown <jbr...@easypost.com> wrote:
>>> We use the boring combo of cassandra snapshots + tar to back up our cassandra nodes; every once in a while, we'll notice tar failing with the following:
>>>
>>> tar: data/addresses/addresses-eb0196100b7d11ec852b1541747d640a/snapshots/backup20220318183708/nb-167-big-Data.db: file changed as we read it
>>>
>>> I find this a bit perplexing; what would cause an sstable inside a snapshot to change? The only thing I can think of is an incremental repair changing the "repaired_at" flag on the sstable, but it seems like that should "un-share" the hardlinked sstable rather than running the risk of mutating a snapshot.
>>>
>>> James Brown
>>> Cassandra admin @ easypost.com
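For anyone else whose backup scripts trip over this: GNU tar distinguishes "some files differ" (exit status 1, which is what "file changed as we read it" produces) from fatal errors (exit status 2). Until the underlying cause is fixed, one workaround is to treat status 1 as a soft warning instead of a hard failure. The sketch below assumes GNU tar; `archive_snapshot` and the paths are hypothetical names, not anything from Cassandra itself:

```shell
#!/usr/bin/env bash
# Sketch: wrap tar so that GNU tar's exit status 1 ("some files differ",
# which covers "file changed as we read it") is logged but does not abort
# the backup, while any other nonzero status still fails.
# archive_snapshot and its arguments are hypothetical, for illustration.

archive_snapshot() {
  local src_dir="$1" out_file="$2"
  tar -czf "$out_file" -C "$src_dir" .
  local status=$?
  case "$status" in
    0) return 0 ;;                                       # clean archive
    1) echo "warning: files changed while archiving $src_dir" >&2
       return 0 ;;                                       # soft failure: keep the tarball
    *) echo "error: tar failed (status $status)" >&2
       return "$status" ;;                               # real failure (e.g. status 2)
  esac
}
```

In a snapshot-based backup, `src_dir` would point at the snapshot directory created by `nodetool snapshot -t <tag>`, so the archive covers only the hardlinked snapshot files rather than the live data directory. This is a workaround for the symptom only; as discussed above, a snapshot's sstables changing underneath tar still looks like a bug worth a Jira.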