Ludovic Courtès <l...@gnu.org> writes:

> Hi Tomas,
>
> Somehow nobody acted over this bug report for a long time.  Terrible.
>
> wolf <w...@wolfsden.cz> skribis:
>
>>   There is an assumption made by Guix regarding guile-git, which is not
>>   true.  The problem is demonstrated using my fork, since that is where
>>   I encountered it first, but official Guix will hit the same problem
>>   sooner or later.  I will also provide an independent repository for
>>   the verification.
>>
>>   Guix made a design decision to compare commit objects using eq?, based
>>   on the assumption that guile-git will return the same object for the
>>   same commit.  However that assumption is wrong and can lead to fun
>>   issues like:
>>
>>   ,----
>>   | scheme@(guile-user)> (use-modules (git) (guix git))
>>   | scheme@(guile-user)> (define %repo (repository-open "/tmp/my-fork"))
>>   | scheme@(guile-user)> (define %hash 
>> "d51135e8477118dc63a7e5462355cd27e884f4fb")
>>   | scheme@(guile-user)> (commit-relation
>>   |  (commit-lookup %repo (string->oid %hash))
>>   |  (commit-lookup %repo (string->oid %hash)))
>>   | $5 = unrelated
>>   `----
>
> Ouch.
>
> ‘tests/commit.scm’ in Guile-Git checks that behavior, but maybe it’s
> just a lucky case.
>
> Two questions/comments:
>
>   1. Come up with a similar test (for Guile-Git) where commits are *not*
>      ‘eq?’? (It it enough to create a commit with a log that’s beyond
>      4 KiB?)

Seems to be the case:

--8<---------------cut here---------------start------------->8---
#!/bin/sh
set -eu

d=/tmp/ttest

rm -rf "$d"
mkdir -p "$d"
cd "$d"

git init -q
git commit -q --allow-empty -m "$(seq 0 8192)"
h=$(git rev-parse HEAD)

d="$d" h="$h" guile -c '
  (use-modules (git) (guix git))
  (define %repo (repository-open (getenv "d")))
  (pk (commit-relation (commit-lookup %repo (string->oid (getenv "h")))
                       (commit-lookup %repo (string->oid (getenv "h")))))
'
--8<---------------cut here---------------end--------------->8---

--8<---------------cut here---------------start------------->8---
$ /tmp/x.sh

;;; (unrelated)
--8<---------------cut here---------------end--------------->8---

IIRC from all the back then, the 4kB is a (default) limit for commit
being cache-able.  It technically is on all of commit data, not just the
commit message, but the message is the easiest to influence.

>
>   2. Since Guile-Git has been pretending to provide that eq?-ness
>      guarantee, I’m tempted to fix the problem in Guile-Git, by having a
>      per-repository lookup table (a weak-value hash table mapping OIDs
>      to commits).
>
> How does that sound?

I did not know guile-git was supposed to guarantee it.  I assumed it is
just a binding to libgit2, so I have expected it to have the same
semantics.

I think I would just fix this in Guix, and drop the promise on
guile-git's side.  I am obviously unsure whether people rely on it or
not, but given it does not work I am tempted to guess.  I also took a
quick look, and I did not find this promise actually being documented
anywhere.

The patch on Guix side is pretty simple[0] (ignore the licensing, I have
no problem changing the license if you decide to used it).

However I do see the appeal in being able to use eq?.  The correct
action probably depends on in what direction you want to take guile-git.
Should it stay just a bindings wrapper, or should it provide extra
features and guarantees?

But then again, I do not have much experience with weak tables, and
guile-git internals, so maybe I overestimate the complexity.  I would be
scared I miss some paths that return commits though.

This probably was not very useful answer. :)

Tomas

0: https://git.wolfsden.cz/guix/tree/etc/0001-git-Fix-usage-of-guile-git.patch

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

Attachment: signature.asc
Description: PGP signature

Reply via email to