Hi Ludovic,

Ludovic Courtès <ludovic.cour...@inria.fr> writes:

> This job is disassembling all the .tar.gz files packages refer to,
> using the recently-added ‘etc/disarchive-manifest.scm’ file:
>
>   https://ci.guix.gnu.org/jobset/disarchive
>
> It has just succeeded for the first time. :-)

Fantastic!  I feel bad that I left you holding the bag on this one,
though.  Sorry.  I’ve been a little adrift this summer.  Thanks for
picking it up!

> Where to go from here?  Timothy Sample had already set up a
> Disarchive database at <https://disarchive.ngyro.com>, which
> (guix download) uses as a fallback; I’m not sure exactly how it’s
> populated.

Basically the same way as what you are doing now.  I have many Cuirass
jobs, and I use the build outputs mechanism (mentioned by Mathieu
elsewhere in this thread).  I don’t have a “disarchive-collection”
job, so I have to use the Cuirass API to dig through the recent build
outputs and find new results.  This happens from a cron job, which
uploads each new result to my server.  (The query step is sketched at
the end of this message.)

One simple but satisfying thing that I do is serve the files
compressed.  That is, they are compressed on disk, and nginx just
passes them along (using the “gzip_static” module).  Because of
Disarchive’s verbose and repetitive output format, this makes for a
huge reduction in storage requirements.  (That’s sketched below,
too.)

> The goal here would be for the Guix project to set up infrastructure
> populating a database automatically and creating backups, possibly
> via SWH (we’ll have to discuss it with them).
>
> A plan we can already deploy would be:
>
>   1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
>   2. On berlin, add an mcron job that periodically copies the output
>      of the latest “disarchive-collection” build to a directory, say
>      /srv/disarchive.  Thus, the database would accumulate tarball
>      metadata over time.
>
>   3. Add an nginx route so that /srv/disarchive is served at
>      https://disarchive.guix.gnu.org.
>
>   4. Add disarchive.guix.gnu.org to (guix download).
>
> How does that sound?  Thoughts?

This is great!  To make things concrete, I’ve appended rough sketches
of possible takes on steps 2–4 below as well.

I can offer some past metadata, too.  Specifically, I have ~14000
files that I generated while digging into SWH coverage.  (That’s a
project I’d like to return to, but I’m still trying to get my head
back in the game and pick up where I left off.)
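Here are those sketches.  First, the query step of my cron job looks
roughly like this.  It’s simplified and from memory:
“/api/latestbuilds” is a real Cuirass endpoint, but the JSON field
names below are assumptions, so check the Cuirass manual.

  (use-modules (json)              ;guile-json
               (srfi srfi-1)       ;filter-map
               (ice-9 receive)
               (web client)
               (web uri))

  (define (latest-build-outputs cuirass jobset)
    "Return outputs of recent successful JOBSET builds at CUIRASS."
    (receive (response body)
        (http-get (string->uri
                   (string-append cuirass
                                  "/api/latestbuilds?nr=50&jobset="
                                  jobset)))
      (let ((builds (json-string->scm
                     (if (string? body) body (utf8->string body)))))
        (filter-map (lambda (build)
                      ;; A “buildstatus” of zero means success; these
                      ;; field names are from memory.
                      (and (equal? 0 (assoc-ref build "buildstatus"))
                           (assoc-ref build "buildoutputs")))
                    (vector->list builds)))))

Each new result then gets compressed and uploaded.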
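The nginx side is a single directive.  On a machine managed with Guix
System (berlin is; mine is not), it might look like the following,
which is also more or less step 3 of your plan.  It assumes the nginx
package is built with ngx_http_gzip_static_module:

  (use-modules (gnu services) (gnu services web))

  (service nginx-service-type
           (nginx-configuration
            (server-blocks
             (list (nginx-server-configuration
                    (server-name '("disarchive.guix.gnu.org"))
                    (root "/srv/disarchive")
                    ;; For a request for “foo”, send the precompressed
                    ;; “foo.gz” sitting next to it instead of
                    ;; compressing on the fly.
                    (raw-content '("gzip_static on;")))))))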
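Step 2 might be an mcron job along these lines.  Here
“/path/to/latest-output” is a placeholder: I don’t know offhand the
best way to resolve the latest successful “disarchive-collection”
output on berlin.

  (use-modules (gnu services) (gnu services mcron) (guix gexp))

  ;; Accumulate tarball metadata in /srv/disarchive over time.
  (simple-service 'disarchive-database mcron-service-type
                  (list #~(job "15 * * * *"   ;once an hour
                               "rsync -a /path/to/latest-output/ /srv/disarchive/")))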
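And step 4, if I remember the variable name right, is a one-line
change to ‘%disarchive-mirrors’ in guix/download.scm:

  (define %disarchive-mirrors
    '("https://disarchive.guix.gnu.org"
      "https://disarchive.ngyro.com"))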
--
Tim