Reviving an old thread...

Tobias Geerinckx-Rice <m...@tobias.gr> writes:

>> IMO, the best solution is to *never* generate nars on Hydra in response
>> to client requests, but rather to have the build slaves pack and
>> compress the nars, copy them to Hydra, and then serve them as static
>> files using nginx.
>
> A true mirror at last! Do we have the disc space for that?
>
> And could Hydra actually handle compressing *everything*, without an
> infinitely growing back-log? I don't have access to any statistics, but
> I'm guessing that a fair number of package+versions are never actually
> requested, and hence never compressed. This would change that.

Actually, IIUC, the build slaves are _already_ compressing everything,
and they always have. They compress the build outputs for transmission
back to the master machine. In the current framework, the master
machine immediately decompresses them upon receipt, and this compression
and decompression is considered an internal detail of the network
transport.

Currently, the master machine stores all build outputs uncompressed in
/gnu/store, and then later recompresses them for transmission to users
and other build slaves. The needless decompression and recompression is
a tremendous amount of wasted work on our master machine. That it's all
stored uncompressed is also a significant waste of disk space, which
leads to significant additional costs during garbage collection.

Essentially, my proposal is for the build slaves to be modified to
prepare the compressed NARs in a form suitable for delivery to end users
(and other build slaves) with minimal processing by our master node.
The master node would be significantly modified to receive, store, and
forward NARs explicitly, without ever decompressing them.

As far as I can tell, this would mean strictly less work to do and less
data to store for every machine and in every case.

Ludovic has pointed out that we cannot do this because Hydra must add
its digital signature, and that this digital signature is stored within
the compressed NAR. Therefore, we cannot avoid having the master
machine decompress and recompress every NAR that is delivered to users.

In my opinion, we should change the way we sign NARs. Signatures should
be external to the NARs, not internal. Not only would this allow us to
decentralize production of our NARs, but more importantly, it would
enable a community of independent builders to add their signatures to a
common pool of NARs.

Having a common pool of NARs enables us to store these NARs in a shared
distribution network without duplication. We cannot even have a common
pool of NARs if they contain build-farm-specific data such as
signatures.

Thoughts?

      Mark
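
P.S. To make the external-signature idea a bit more concrete, here is a
rough Python sketch. It is only an illustration, not anything Hydra or
Guix does today. The names `NarPool`, `nar_digest` and `sign_detached`
are made up for this example, and HMAC with a per-builder secret stands
in for a real public-key signature (an assumption made purely to stay
within the standard library). The point it tries to show is that when
signatures live outside the NAR, the compressed bytes are stored once
and any number of builders can attach their own signature to them.

```python
import hashlib
import hmac


def nar_digest(compressed_nar: bytes) -> str:
    """Content hash of the compressed NAR; used as its key in the pool."""
    return hashlib.sha256(compressed_nar).hexdigest()


def sign_detached(compressed_nar: bytes, builder_key: bytes) -> str:
    """Detached signature over the NAR's digest.

    HMAC is a stand-in here (assumption): a real deployment would use a
    public-key signature so clients can verify without a shared secret.
    """
    digest = nar_digest(compressed_nar).encode()
    return hmac.new(builder_key, digest, hashlib.sha256).hexdigest()


class NarPool:
    """Hypothetical shared pool: NAR bytes stored once, signatures beside them."""

    def __init__(self):
        self.nars = {}        # digest -> compressed NAR bytes
        self.signatures = {}  # digest -> {builder name: signature}

    def publish(self, compressed_nar: bytes, builder: str, builder_key: bytes) -> str:
        digest = nar_digest(compressed_nar)
        # The NAR itself is never modified or decompressed; store it once.
        self.nars.setdefault(digest, compressed_nar)
        # Each builder adds its own signature next to the shared NAR.
        sig = sign_detached(compressed_nar, builder_key)
        self.signatures.setdefault(digest, {})[builder] = sig
        return digest

    def verify(self, digest: str, builder: str, builder_key: bytes) -> bool:
        expected = sign_detached(self.nars[digest], builder_key)
        actual = self.signatures[digest].get(builder, "")
        return hmac.compare_digest(expected, actual)


pool = NarPool()
nar = b"pretend these are the bytes of a compressed NAR"
d1 = pool.publish(nar, "build-farm-a", b"key-a")
d2 = pool.publish(nar, "independent-builder", b"key-b")
```

Both builders end up pointing at the same stored object, which only
works if they produce bit-identical compressed NARs (that is, the build
and the compression are reproducible). With a signature embedded in
each NAR, the two copies would differ and could not be deduplicated.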