Hi,

On Wed, Oct 09, 2024 at 10:57:12AM +0200, Michel Verdier wrote:
> On 2024-10-08, Andy Smith wrote:
>
> > When you have hundreds of millions of files in rsnapshot it really
> > starts to hurt because every backup run involves:
> >
> > - Deleting the oldest tree of files;
>
> rsnapshot can rename it apart and delete it after backup is done. Thus
> involving only the backup system
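(For anyone following along, I read that suggestion as roughly the
following; the paths and snapshot names are made up for illustration:)

```shell
set -e
# Hypothetical snapshot root for illustration only.
backups=$(mktemp -d)
mkdir -p "$backups/daily.6"

# Rename the oldest snapshot aside. A rename is cheap, so the next
# backup run is not blocked waiting for a large recursive delete.
mv "$backups/daily.6" "$backups/daily.6.delete"

# ... the backup run would happen here ...

# Then delete the old tree afterwards, outside the critical path.
rm -rf "$backups/daily.6.delete"
```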
Yes, but this is still a necessary part of each backup cycle. You can't
do another backup run while that job is still outstanding, and the load
it puts on the system is still there regardless of the timing within
the backup procedure.

> > - Walking the entire tree of the most recent backup once to cp -l it
> >   and then;
>
> rsnapshot only renames directories when rotating backups then does
> rsync with hard links to the newest

Okay, yes: when you set link_dest to 1 in rsnapshot.conf then rsync
will do that bit during its run, but having to hard link a directory
tree of 5 million files is still not speedy. Other backup designs do
not do this because they don't need to take any form of copy of what is
already there. The point is that this step is "compare and hard link if
unchanged" whereas usually it is "compare and do nothing if unchanged".

> rsync uses metadata so it also depends on the filesystem. Some are
> quicker. I think metadata is quite like the index used by other backup
> systems.

The big difference is that to read the metadata of a tree of files in
the filesystem you have to walk all of the inodes, which is a lot of
small random access. 70 years of database theory has tried to make
queries efficient: minimise random access, maximise cache locality, and
so on. Otherwise all databases would just be filesystems!

Like I say, I like and use rsnapshot in some places, but speed and
resource efficiency are not its winning points.

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting

