Hi,

Ludovic Courtès <l...@gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.courno...@gmail.com> writes:
>
>>> We could consider setting up a rsync module on ci.guix (aka. berlin) if
>>> that helps.
>>
>> The main issue I had seen (but it seems that'd affect most rsync
>> mirroring scheme) is that the files could be updated while an rsync
>> client is syncing them, which could perhaps lead to corrupted/incomplete
>> nars on a mirror in the worst case?
>
> I don’t think that’s the case because ‘guix publish’ creates narinfos
> and nars atomically (see ‘bake-narinfo+nar’ and ‘compress-nar’).
>
> That is, the worst that can happen is that the rsync client copies
> temporary files corresponding to incomplete nars or narinfos, but these
> files are not going to be served.

OK!  I guess we could try the simplest path first and complicate it only
if issues come up.  There is already an rsync service exposing
/var/cache/guix/publish on berlin:

--8<---------------cut here---------------start------------->8---
        (rsync-module
         (name "substitutes")
         (file-name "/var/cache/guix/publish"))
--8<---------------cut here---------------end--------------->8---

So it'd be a matter of opening the rsync port in Berlin's firewall/MDC
infra.  I'd still have preferred to have this done on hydra-guix-129, to
avoid adding IO load to Berlin, which is critical to keep GC times low.
But given the current size of the publish cache, hydra-guix-129 lacks
the storage capacity for it: it only has a < 10 TiB SAN slice available,
and the SSDs are now used as extra storage for the data service, IIRC.

I guess clients would want to add rsync options to transfer only the
newer items, and only up to some maximum amount (e.g. 5 TiB), as
mirroring > 25 TiB or similar probably won't be practical for many
setups.  Full mirrors would also incur lots of IO on Berlin, which is
another reason for keeping the nars cache size in check.
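
As far as I know rsync has no option to cap the total amount
transferred, so a client would have to build the file list itself.
Here is a rough, untested-against-berlin sketch of that selection step
in Python.  The function name, the (path, mtime, size) tuple layout and
the 5 TiB figure are all just illustrative assumptions; the listing
could come from something like 'rsync --list-only', and the resulting
paths could be fed back to rsync via '--files-from'.

```python
from typing import Iterable, List, Tuple

TIB = 1024 ** 4  # bytes in one tebibyte

# Hypothetical entry layout: (relative path, mtime in seconds, size in bytes).
Entry = Tuple[str, float, int]


def select_newest_within_budget(entries: Iterable[Entry],
                                budget_bytes: int) -> List[str]:
    """Return the paths of the newest files whose sizes fit in the budget.

    Files are considered newest first; selection stops at the first file
    that would exceed the budget, so the result is always a contiguous
    "newest N files" prefix.
    """
    selected: List[str] = []
    used = 0
    for path, _mtime, size in sorted(entries, key=lambda e: e[1],
                                     reverse=True):
        if used + size > budget_bytes:
            break
        selected.append(path)
        used += size
    return selected


if __name__ == "__main__":
    # Made-up listing, only to show the call shape.
    listing = [
        ("old.nar", 100.0, 2 * TIB),
        ("new.nar", 300.0, 3 * TIB),
        ("mid.nar", 200.0, 2 * TIB),
    ]
    for path in select_newest_within_budget(listing, 5 * TIB):
        print(path)
```

Note that this treats every file independently; a real mirror would
probably want to keep each narinfo together with the nar it refers to.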

-- 
Thanks,
Maxim
