Re: Preservation of Guix Report

Ludovic Courtès Fri, 29 Oct 2021 07:21:24 -0700

Hello!

Timothy Sample <samp...@ngyro.com> writes:


> Ludovic Courtès <l...@gnu.org> writes:

[...]

>> This is truly awesome!  (Did you manage to grab all that info with the
>> default rate limit?!)
>
> Yes, but I have another trick.  The “known” endpoint [1].  If you
> already know the SWHIDs you want to check, you can check 1,000 per call.
> With the anonymous rate limit, I can check 120,000 every hour, which is
> plenty.
>
> [1] https://docs.softwareheritage.org/devel/swh-web/uri-scheme-api.html#get--api-1-content-known-(sha1)[,(sha1),%20...,(sha1)]-

Oh, smart.
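
For the record, the arithmetic behind that can be sketched as below.  This
is only an illustration: the batch size of 1,000 comes from the message
above, while the figure of 120 calls per hour is an assumption inferred
from 120,000 / 1,000, and the helper names are made up.  No request is
actually sent.

```python
from typing import Iterator, List

BATCH_SIZE = 1000      # SWHIDs per call, as stated in the message above
CALLS_PER_HOUR = 120   # assumption: inferred from 120,000 / 1,000


def batches(swhids: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most SIZE identifiers."""
    for start in range(0, len(swhids), size):
        yield swhids[start:start + size]


def hours_needed(total: int, size: int = BATCH_SIZE,
                 calls_per_hour: int = CALLS_PER_HOUR) -> int:
    """Return how many rate-limit windows are needed to check TOTAL IDs."""
    calls = -(-total // size)            # ceiling division
    return -(-calls // calls_per_hour)   # ceiling division again
```

With these numbers, 120,000 identifiers fit in a single hour and one more
identifier spills into a second hour.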

>> Some of our <git-reference> refer to tags, not commits.  How do you
>> determine whether they’re saved?
>
> The short answer is “elbow grease”.  Basically, I’m taking a “work
> harder, not smarter” approach.  :p  I go out and obtain the source,
> verify it with Guix’s hash, and then compute the SWHID.  This is another
> thing we could move to the CI infrastructure, but I think there might be
> some hiccoughs.  For git-references, I believe we can’t just compute the
> ID after the download derivation – we would have to change the download
> derivation itself.  Maybe add an ‘swhid’ output?  It’s a little more
> complicated than just throwing up some scripts, anyway.

Just like we have ‘etc/disarchive-manifest.scm’, we could have a thing
that computes the SWHID of all the ‘git-fetch’ origins, for instance,
using the Disarchive code.  Would that help?

That would allow us to maintain a mapping from nar hash to swh:dir hash.
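
To make the idea concrete, here is a rough sketch of what computing a
swh:dir identifier involves.  It assumes that directory identifiers are
computed like Git tree hashes and content identifiers like Git blob
hashes, and it only handles regular, non-executable files and
sub-directories held in memory (no symlinks, no executable bit).  It is
not the Disarchive code, and the function names are hypothetical.

```python
import hashlib
from typing import Dict, Union

# A directory is a dict mapping names to file contents or sub-directories.
Tree = Dict[str, Union[bytes, "Tree"]]


def git_object_id(kind: str, payload: bytes) -> bytes:
    """Return the raw SHA-1 of a Git object of the given kind."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).digest()


def content_swhid(data: bytes) -> str:
    """Return the swh:1:cnt identifier of a file's contents."""
    return "swh:1:cnt:" + git_object_id("blob", data).hex()


def _tree_id(tree: Tree) -> bytes:
    entries = []
    for name, value in tree.items():
        raw_name = name.encode()
        if isinstance(value, dict):
            # Git sorts directories as if their name ended with "/".
            mode, oid, key = b"40000", _tree_id(value), raw_name + b"/"
        else:
            mode, oid, key = b"100644", git_object_id("blob", value), raw_name
        entries.append((key, mode + b" " + raw_name + b"\0" + oid))
    payload = b"".join(entry for _, entry in sorted(entries))
    return git_object_id("tree", payload)


def directory_swhid(tree: Tree) -> str:
    """Return the swh:1:dir identifier of an in-memory directory."""
    return "swh:1:dir:" + _tree_id(tree).hex()
```

The mapping itself could then be a plain table keyed by the nar hash,
with the result of ‘directory_swhid’ as the value.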

>> ‘guix lint -c archival’ uses ‘lookup-origin-revision’, which is a good
>> approximation, but it’s not 100% reliable because tags can be modified
>> and that procedure only tells you that a same-named tag was found, not
>> that it’s the commit you were expecting.  (And really, we should stop
>> referring to tags.)
>
> Like zimoun said elsewhere in this thread, having an explicit mapping
> from Guix hash to SWHID will improve reliability quite a bit.  It’s hard
> to get to 100%, though!  With the reports, we will eventually be able to
> check everything.  However, there’s still a small possibility of bugs
> and false positives.  Ultimately, I’m hoping the reports will help
> detect small problems (some specific source is missing) and guide our
> efforts on big problems (xz support in Disarchive or support for more
> version control systems, etc.).

Definitely, thumbs up!
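
To illustrate the tag problem mentioned above, here is a toy sketch.  It
does not use ‘lookup-origin-revision’ or any real API: the snapshot is
just a hypothetical table from tag names to commit IDs, and the commit
IDs are invented.

```python
from typing import Dict


def tag_exists(snapshot: Dict[str, str], tag: str) -> bool:
    """Name-only check: is there a tag with this name in the snapshot?"""
    return tag in snapshot


def tag_matches(snapshot: Dict[str, str], tag: str, expected: str) -> bool:
    """Stricter check: does the tag point at the commit we expected?"""
    return snapshot.get(tag) == expected


# Hypothetical data: upstream moved "v1.0" after we recorded its commit.
archived = {"v1.0": "bbbb222"}
recorded_commit = "aaaa111"

name_only = tag_exists(archived, "v1.0")                     # True
strict = tag_matches(archived, "v1.0", recorded_commit)      # False
```

The name-only check succeeds even though the archived commit is not the
one we built from, which is why referring to commits is more reliable.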

Ludo’.
