On Sun, Feb 21, 2016 at 12:17:19PM +0300, Alex Kost wrote:
> Leo Famulari (2016-02-21 07:35 +0300) wrote:
>
> > On Thu, Jan 21, 2016 at 10:05:36PM +0100, Ludovic Courtès wrote:
> [...]
> >> I prefer 7!  This is how Git usually truncates SHA1s, so it can’t be wrong.
> >
> > I stumbled across this email earlier, which reminded me of this
> > discussion about hash lengths:
> > https://lkml.org/lkml/2010/10/28/287
> >
> > There are currently 13 7-character hash collisions in Guix's git repo:
> >
> > $ git rev-list --objects --all | cut -c1-7 | sort | uniq -dc
> >       2 0d2b24c
> >       2 11e0632
> >       2 1f3ab8d
> >       2 229bd6c
> >       2 7c4a7b7
> >       2 9ff8b63
> >       2 aa27b56
> >       2 c10c562
> >       2 d96cdce
> >       2 dab4329
> >       2 dc27d1c
> >       2 ea119a2
> >       2 f56cc27
>
> Hm, when I tried "git rev-list --objects --all" I got some ridiculous
> number of lines (I pressed C-c C-c after about 78000 lines).  Does this
> command really do what you wanted?  (I'm sorry I didn't RTFM well enough
> to understand what it does).

It lists the objects in the repository, so not just commits.

I'm not presenting this as evidence that something is wrong with our
repo, just that 7 characters is not enough to unambiguously refer to
things in git repos of projects our size. For example, with `git show`.

It's more of an informational link than a call to action. Although I am
updating all of our uses of git-reference to use the method we agreed
upon upthread, before any of those upstream repos grow too large for the
identifiers we are currently using.

>
> I'm not sure if the following command is correct to find such
> collisions, but it gives nothing (i.e., no collisions):
>
>   git log --oneline | cut -c1-7 | sort | uniq -dc

Indeed, we only have about 10000 commits. 6 characters is where we get
collisions in our log.

>
> --
> Alex
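
For anyone who wants to repeat the comparison at other prefix lengths, here
is a rough sketch of the two pipelines above folded into one helper. The
function name `ambiguous_prefixes` is made up for illustration; it only
assumes POSIX `cut`, `sort` and `uniq`, and that full object IDs arrive on
stdin, one per line, optionally followed by a space and a path as in the
`git rev-list --objects` output.

```sh
# Hypothetical helper (not part of git or Guix): read object IDs on stdin
# and print every prefix of length $1 shared by more than one distinct ID.
ambiguous_prefixes() {
    len="$1"
    cut -d' ' -f1 |        # drop the path that "rev-list --objects" appends
        sort -u |          # make sure each full ID is counted only once
        cut -c1-"$len" |   # truncate every ID to the requested length
        sort |
        uniq -d            # keep only prefixes that occur more than once
}
```

Feeding it `git rev-list --all` looks at commits only, while
`git rev-list --objects --all` covers every object, which should reflect the
difference between the two results discussed above:

    git rev-list --all | ambiguous_prefixes 7 | wc -l
    git rev-list --objects --all | ambiguous_prefixes 7 | wc -l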