Hi Ludovic,

Ludovic Courtès <l...@gnu.org> writes:

[...]

> Would be nice to have a section in the Cookbook on setting up mirror via
> several methods, be it nar-herder or rsync.

Agreed.  That's something I've wanted for a while, and had even started
working on.

>> Searching through the mailing lists, I think there's an option to mirror
>> substitutes using rsync and serve them via 'guix publish'.  Is it
>> possible now?  If so, how to set up this on a foreign distro?

[...]

> We could consider setting up a rsync module on ci.guix (aka. berlin) if
> that helps.

The main issue I had seen (though it seems it'd affect most rsync
mirroring schemes) is that the files could be updated while an rsync
client is syncing them, which could perhaps lead to corrupted or
incomplete nars on a mirror in the worst case?

The way I had envisioned for this to work correctly, with some help from
Btrfs snapshotting, was:

1) Have the rsync daemon run in a containerized process (that's done),
serving immutable snapshots (to avoid the in-flight changing that would
seem problematic).

2) Btrfs would take snapshots regularly, and when a new snapshot is
available, a script would change e.g. the /srv/publish/substitutes mount
point to use it, which would be shared by rsync.

The tricky part is that it would require using the renameat2
syscall to change the actively used mount point, which we'd need to add
support for in Guile, or else using the renameat2 command, from the
eponymous package that I had packaged.  Copying a reply from Zygo from
the #btrfs channel, who had helped me devise such a scheme, it would
require having the:

[..] old subvol on /.../mount/point, new subvol on /.../newmount/point,
then cd /... and renameat2 mount newmount RENAME_EXCHANGE
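
For illustration, here is a minimal Python sketch of that exchange.  It
is not taken from the actual setup: the exchange() helper is a
hypothetical name, and it assumes Linux with a glibc recent enough to
export renameat2 (2.28 or later).  It only shows the atomic swap of two
plain directories; whether it behaves the same with subvolumes mounted
underneath is something I have not verified here.

```python
import ctypes
import os

AT_FDCWD = -100            # resolve relative paths against the cwd
RENAME_EXCHANGE = 1 << 1   # atomically swap the two paths

# Load the C library of the running process; assumes it exports
# renameat2 (glibc >= 2.28).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.renameat2.argtypes = [ctypes.c_int, ctypes.c_char_p,
                            ctypes.c_int, ctypes.c_char_p,
                            ctypes.c_uint]
_libc.renameat2.restype = ctypes.c_int


def exchange(path_a, path_b):
    """Atomically swap PATH_A and PATH_B; both must already exist."""
    ret = _libc.renameat2(AT_FDCWD, os.fsencode(path_a),
                          AT_FDCWD, os.fsencode(path_b),
                          RENAME_EXCHANGE)
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path_a)
```

After 'cd /...', calling exchange("mount", "newmount") would correspond
to the command Zygo describes; both directories must live on the same
file system.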

Ideally we wouldn't want to slow down the head node with rsync requests,
especially since the cache of nars is huge (tens of terabytes at this
time), so...

3) I had a WIP script for offloading the snapshots to a 2nd machine
using 'btrfs send', and steps 1 and 2 would run on that second machine
instead.  That's currently impeded by the fact that the snapshots have
become too big to fit on that second machine.  We'd have to stop
producing lzip-compressed nars and maybe even shorten retention from 6
months to 4 or 3 and see how manageable the size gets.  Anything above
10 TiB is unwieldy, I'd say.  Ideally something more like 5 TiB would be
manageable and leave some space for growth on hydra-129 (machine B) at
the MDC.

That sounds complicated, but the nice thing is that it's only
complicated *on our end*.  Users would rsync the substitutes like they
rsync a Debian repository.

Perhaps rsync already has code for dealing with partial updates while a
transfer occurs, and my ideal but complex scheme above can be
simplified to just: rsync at will?  I'd be interested to know what
experienced system admins have to say about this.

-- 
Thanks,
Maxim
