Hi Ludovic,

Ludovic Courtès <l...@gnu.org> writes:

[...]

> Would be nice to have a section in the Cookbook on setting up mirror via
> several methods, be it nar-herder or rsync.

Agreed.  That's something I've wanted for a while, and had even started
working on.

>> Searching through the mailing lists, I think there's an option to mirror
>> substitutes using rsync and serve them via 'guix publish'.  Is it
>> possible now?  If so, how to set up this on a foreign distro?

[...]

> We could consider setting up a rsync module on ci.guix (aka. berlin) if
> that helps.

The main issue I had seen (though it seems it would affect most rsync
mirroring schemes) is that the files could be updated while an rsync
client is syncing them, which could perhaps lead to corrupted or
incomplete nars on a mirror in the worst case?

The way I had envisioned for this to work correctly, with some help from
Btrfs snapshotting, was:

1) Have the rsync daemon run in a containerized process (that's done),
   serving immutable snapshots (to avoid the in-flight changes that
   would seem problematic).

2) Btrfs would take snapshots regularly, and when a new snapshot is
   available, a script would change e.g. the /srv/publish/substitutes
   mount point to use it, which would be shared by rsync.  The tricky
   part is that it would require using the renameat2 syscall to change
   the actively used mount point, which we'd need to add support for in
   Guile, or use the renameat2 command, from the eponymous package that
   I had packaged.  Copying a reply from Zygo from the #btrfs channel,
   who had helped me devise such a scheme, it would require having the:

     [..] old subvol on /.../mount/point, new subvol on
     /.../newmount/point, then cd /... and renameat2 mount newmount
     RENAME_EXCHANGE

Ideally we wouldn't want to slow down the head node with rsync requests,
especially since the cache of nars is huge (tens of terabytes at this
time), so...

3) I had a WIP script for offloading the snapshots to a 2nd machine
   using 'btrfs send', and steps 1 and 2 would run on that second
   machine instead.
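
To make the swap in step 2 more concrete, here is a minimal Python sketch
of an atomic exchange of two directories with renameat2 and
RENAME_EXCHANGE.  It is only an illustration, not the script I had in
mind for berlin: the function name `exchange` is made up, and it assumes
Linux with a libc that exports a renameat2 wrapper (glibc 2.28 or later
does).  As in Zygo's recipe, the paths swapped would be the parent
directories of the mount points rather than the mount points themselves.

```python
import ctypes
import os

# Constants from the Linux headers (<fcntl.h> and <linux/fs.h>).
AT_FDCWD = -100            # resolve relative paths against the current directory
RENAME_EXCHANGE = 1 << 1   # atomically swap the two paths

# Assumption: the C library of the running process exports renameat2.
_libc = ctypes.CDLL(None, use_errno=True)


def exchange(path_a, path_b):
    """Atomically swap path_a and path_b; both must already exist."""
    try:
        renameat2 = _libc.renameat2
    except AttributeError:
        raise NotImplementedError("this libc does not export renameat2")
    renameat2.argtypes = [ctypes.c_int, ctypes.c_char_p,
                          ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    renameat2.restype = ctypes.c_int
    ret = renameat2(AT_FDCWD, os.fsencode(path_a),
                    AT_FDCWD, os.fsencode(path_b), RENAME_EXCHANGE)
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path_a, None, path_b)
```

With the layout quoted above, the call would be made from the common
parent directory as exchange("mount", "newmount"); both names must be on
the same file system, and that file system has to support
RENAME_EXCHANGE.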

The offloading in step 3 is currently impeded by the fact that the
snapshots have become too big to fit on machine 2.  We'd have to stop
producing lzip-compressed nars and maybe even shorten retention from 6
months to 4 or 3 and see how manageable the size gets.  Anything above
10 TiB is unwieldy, I'd say.  Ideally something more like 5 TiB would be
manageable and leave some space for growth on hydra-129 (machine B) at
the MDC.

That sounds complicated, but the nice thing is that it's only
complicated *on our end*.  Users would rsync the substitutes like they
rsync a Debian repository.

Perhaps rsync already has code for dealing with partial updates while a
transfer occurs, and my ideal but complex scheme above can be simplified
to just: rsync at will?  I'd be interested to know what experienced
system admins have to say about this.

-- 
Thanks,
Maxim