Oh, thanks Elliott for the explanation!  I had no idea about that little tidbit 
concerning ctime.   Now it all makes sense!

- Max
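
For anyone who wants to see the ctime behaviour from Elliott's explanation (quoted below) first-hand, here is a minimal Python sketch. It is only an illustration under assumptions: a POSIX filesystem that supports hard links, and made-up file names (original-Data.db, snapshot-Data.db) standing in for a live sstable and its snapshot hard link. It does not touch Cassandra or tar.

```python
import os
import tempfile
import time


def unlink_effect(pause_seconds: float = 1.1) -> dict:
    """Create a file with two hard links, remove one, report what changed on the other."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical names: stand-ins for the live sstable and the snapshot link.
        original = os.path.join(workdir, "original-Data.db")
        snapshot = os.path.join(workdir, "snapshot-Data.db")

        with open(original, "wb") as handle:
            handle.write(b"sstable contents")
        os.link(original, snapshot)   # second name for the same inode

        before = os.stat(snapshot)
        time.sleep(pause_seconds)     # let coarse-grained timestamps tick over
        os.remove(original)           # analogous to compaction deleting the sstable
        after = os.stat(snapshot)

        return {
            "links_before": before.st_nlink,
            "links_after": after.st_nlink,
            "mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
            "ctime_changed": after.st_ctime_ns != before.st_ctime_ns,
            "size_changed": after.st_size != before.st_size,
        }


result = unlink_effect()
print(result)
```

On a typical Linux filesystem I would expect the link count to drop from 2 to 1 and only ctime to change, while mtime and size stay the same. That matches a tool that compares ctime before and after reading a file and reports a change even though the contents are untouched.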

> On May 28, 2018, at 10:24 pm, Elliott Sims <elli...@backblaze.com> wrote:
> 
> Unix timestamps are a bit odd.  "mtime/Modify" tracks changes to file 
> contents, "ctime/Change" (sometimes mistaken for "create") tracks file 
> metadata changes, and a link count change is a metadata change.  Treating a 
> ctime change as a file change seems like an odd decision on the part of 
> GNU tar, but presumably there's a good reason for it.
> 
> When the original sstable is compacted away, it's removed and therefore the 
> link count on the snapshot file is decremented.  The file's contents haven't 
> changed so mtime is identical, but ctime does get updated.  BSDtar doesn't 
> seem to interpret link count changes as a file change, so it's pretty 
> effective as a workaround.
> 
> 
> 
> On Fri, May 25, 2018 at 8:00 PM, Max C <mc_cassan...@core43.com> wrote:
> I looked at the source code for GNU tar, and it looks for a change in the 
> create time or (more likely) a change in the size.
> 
> This seems very strange to me — I would think that creating a snapshot would 
> cause a flush and then once the SSTables are written, hardlinks would be 
> created and the SSTables wouldn't be written to after that.
> 
> Our solution is to wait 5 minutes and retry the tar if an error occurs.  This 
> isn't ideal - but it's the best I could come up with.  :-/
> 
> Thanks Jeff & others for your responses.
> 
> - Max
> 
>> On May 25, 2018, at 5:05pm, Elliott Sims <elli...@backblaze.com> wrote:
>> 
>> I've run across this problem before - it seems like GNU tar interprets 
>> changes in the link count as changes to the file, so if the file gets 
>> compacted mid-backup it freaks out even if the file contents are unchanged.  
>> I worked around it by just using bsdtar instead.
>> 
>> On Thu, May 24, 2018 at 6:08 AM, Nitan Kainth <nitankai...@gmail.com> wrote:
>> Jeff,
>> 
>> Shouldn't a snapshot capture a consistent state of the sstables? A -tmp 
>> file shouldn't affect the backup operation, right?
>> 
>> 
>> Regards,
>> Nitan K.
>> Cassandra and Oracle Architect/SME
>> Datastax Certified Cassandra expert
>> Oracle 10g Certified
>> 
>> On Wed, May 23, 2018 at 6:26 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>> In versions before 3.0, sstables were written with a -tmp filename and 
>> copied/moved to the final filename when complete. This changed in 3.0: we 
>> write into the file with the final name, and have a journal/log to let us 
>> know when it's done/final/live.
>> 
>> Therefore, you can no longer just watch for a -Data.db file to be created 
>> and upload it - you have to watch the log to make sure the file is no 
>> longer being written.
>> 
>> 
>> On Wed, May 23, 2018 at 2:18 PM, Max C. <mc_cassan...@core43.com> wrote:
>> Hi Everyone,
>> 
>> We’ve noticed a few times in the last few weeks that when we’re doing 
>> backups, tar has complained with messages like this:
>> 
>> tar: /var/lib/cassandra/data/mars/test_instances_by_test_id-6a9440a04cc111e8878675f1041d7e1c/snapshots/backup_20180523_024502/mb-63-big-Data.db: file changed as we read it
>> 
>> Any idea what might be causing this?
>> 
>> We’re running Cassandra 3.0.8 on RHEL 7.  Here’s rough pseudocode of our 
>> backup process:
>> 
>> <cronjob set to fire same script at same time on all nodes>
>> SNAPSHOT_NAME=backup_YYYYMMDD_HHMMSS
>> nodetool snapshot -t $SNAPSHOT_NAME
>> 
>> for each keyspace
>> - dump schema to "schema.cql"
>> - tar -czf /file_server/backup_${HOSTNAME}_${KEYSPACE}_YYYYMMDD_HHMMSS.tgz 
>> schema.cql /var/lib/cassandra/data/$KEYSPACE/*/snapshots/$SNAPSHOT_NAME
>> 
>> nodetool clearsnapshot -t $SNAPSHOT_NAME
>> 
>> Thanks.
>> 
>> - Max