Hi Guix,

Maxim Cournoyer <maxim.courno...@gmail.com> writes:

> Maybe a GCD proposing modernizing what we put in our git commit
> messages could be devised; there are lots of good guidelines out
> there, such as those used by the Linux kernel and Git projects.

I emphatically support this. I have wanted to bring it up here but I’m
currently a pretty infrequent contributor (at least partially due to
this topic) and didn’t feel it was my place to introduce discussions
like this, so I left it on my backlog for now. Maybe it’s expanding the
scope of the thread here, but it certainly fits the subject line if not
the original proposal for docs improvements 😅. So I’ll dump my thoughts
here. I don’t know if this warrants the ceremony of a GCD (I kind of
think it shouldn’t) but would be happy to contribute to one if folks
think so.

As a project we have already demonstrated a desire to improve the
learning curve for new contributors with the Codeberg migration. But
there is lower hanging fruit to pick in the “first-time contributor
experience” (FTCX?). The GNU ChangeLog format in particular feels really
dated coming from other projects, even email-based ones like Linux and
Git. IMO the Linux guidelines at
<https://www.kernel.org/doc/html/latest/process/submitting-patches.html>
are quite good, but pretty much anything would be an improvement.

Here’s a recent commit I made to the fzf package with a decent commit
message. Sharlatan appreciated it, anyway
(<https://codeberg.org/guix/guix/pulls/757#issuecomment-5529149>):
<https://codeberg.org/guix/guix/commit/09f191efbed13bbe833a1b3eccd92c58e8c0c65d>

In the message I establish the context for why the change is necessary
and briefly describe how the change addresses the issue (search for
‘install it’). The GNU ChangeLog portion is just a restatement of the
subject to fit the format but doesn’t actually add any information.

As Efraim mentions elsewhere in the thread, reasons for changes often
belong in code comments. But this is not always the case. Comments are
necessary when the code is not intuitive and other contributors are
liable to break the code again when reading it. But this fzf fix is a
good example of the other case, where the new code is equally intuitive,
so the rationale is only necessary for folks digging through git blame.
It’s unlikely someone is going to revert to the old filenames, so we
don’t have to weigh down the package with a long comment about the
intricacies of Bash completion.

To find some examples where I think the GNU ChangeLog format actively
harms commit message information density (which is pretty much the most
important thing about commit messages and is why the imperative mood
convention exists), I ran `git log --grep '[Ll]ikewise'`. I hope it goes
without saying, but I don’t mean to criticize the writing styles of any
of these commits. They were chosen rather haphazardly (it’s not hard to
find examples). The problems I am discussing are caused entirely by the
format imposing itself upon commit authors.

<https://codeberg.org/guix/guix/commit/8d419976b21d61a1b3c6d919c593cacdf16a37aa>
is a pretty typical example. Anything in a commit message saying “New
variable” or “New file”, or worse, “likewise”, is unnecessary. Git
associates the commit message with the diff, so manually duplicating
filenames in the commit message is noise. It’s already available in
structured form and searchable via log filtering options. It gets worse
with treewide changes like
<https://codeberg.org/guix/guix/commit/b0e7b6992f3f845e83cfbca4d700b51dba50b4d5>.
Doesn’t it feel a bit silly to scroll through this giant list of files
and then see the exact same list (assuming no mistakes were made!) in
the UI?
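
As a rough sketch of what I mean by log filtering, here is a throwaway
repository where the commit messages never mention a filename, yet the
usual queries still work. The repository, file names, and commit
subjects are made up for illustration; only the git options are real.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.org"

printf '(define-public foo 1)\n' > foo.scm
git add foo.scm
git commit -q -m "Add foo"

printf '(define-public bar 2)\n' > bar.scm
git add bar.scm
git commit -q -m "Add bar"

# Commits that touched a given file, found via the diff rather than the message.
by_path=$(git log --format=%s -- bar.scm)

# Commits whose diff added or removed a given string (the "pickaxe" search).
by_content=$(git log --format=%s -S'define-public foo')

# Files changed by the latest commit, straight from Git's own records.
changed=$(git diff-tree --no-commit-id --name-only -r HEAD)

echo "touched bar.scm:      $by_path"
echo "introduced foo:       $by_content"
echo "files in last commit: $changed"
```

If a fixed string is too narrow, `git log -G<regex>` does the same kind
of search with a regular expression over the changed lines.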

<https://codeberg.org/guix/guix/commit/ce10e2b3e93476a89cf85838fa21bb8b54268b1f>
is an example where there are more specific per-file messages, but it
still doesn’t clear the bar for information density. Phrases like “use x
instead of y” or “add comment on why x is used” restate the diff
contents, which is unnecessary when the diff comes with the message
anyway, whether in a patch or in a UI like git-show, Magit, or Codeberg.

To be fair, it’s clear why the GNU ChangeLog format is the way that it
is. It arose before Git, and even before any sort of dedicated version
control software was mainstream or taken for granted as it is now. GNU
developers were writing changes directly in ChangeLog files because
there was no out-of-band channel for associating them with particular
changes. But there is now, so we should take full advantage of it.

If you take a look at the
<https://www.gnu.org/prep/standards/html_node/Change-Logs.html> Info node,
the rationale behind continuing to use the format in the age of Git is to
support generating ChangeLog files for release tarballs. IMO in 2025
it’s reasonable to expect users who want ChangeLog-tier granularity to
simply use Git. Generated ChangeLog files for Guix would be extremely
difficult to use compared to git blame where the messages are actually
connected to their code changes by commits.

Also, I took a look at the Guix 1.4.0 tarball, and the ChangeLog file
says:

> Normally a ChangeLog is generated at “make dist” time and available in
> source tarballs.

So I think we aren’t really doing this anyway? Regardless, we have NEWS
and channel news for user-facing changes. Generated ChangeLogs would be
completely overwhelming for users.

<https://www.gnu.org/prep/standards/html_node/Change-Log-Concepts.html>
acknowledges the points I’m discussing:

> If a project uses a modern VCS to keep the change log information, as
> described in Change Logs, explicitly listing the files and functions
> that were changed is not strictly necessary, and in some cases (like
> identical mechanical changes in many places) even tedious. It is up to
> you to decide whether to allow your project’s developers to omit the
> list of changed files and functions from the log entries, and whether
> to allow such omissions under some specific conditions.

Hey, it’s up to us! They do go on to mention some benefits. I won’t
quote everything here, but it’s mainly that log filtering based on the
diff might not catch function or macro names. But we already have an
excellent local gitconfig setup that should cover this. Even if we
didn’t have that, I don’t think it’s enough to justify imposing the
boilerplate. With git blame you can always find your way to what you
need regardless.

Finally, I want to be clear that we don’t have to force developers to
stop /using/ the format if we do agree to stop /requiring/ it. If people
have Emacs integrations doing it automatically and lots of muscle memory
built up, that’s fine. I don’t think the format is so harmful that we
need to ban its use. I’m only advocating for not requiring it.

Thanks for reading if you made it this far 😅.

—Liam

PS: I’d really like to drop the double space requirement, too. Everyone
outside GNU (and some inside) has moved on from this. Making
sentence-based Emacs commands work slightly better is not reason enough
to impose this on new contributors who have no muscle memory for it. We
don’t have to about-face and start imposing single spaces, either. It’s
fine to have a mix based on individual contributor preferences for the
foreseeable future (spoiler alert, plenty of single-spaced prose slips
through to master already). Aesthetics won’t suffer appreciably, IMO.
It’s simply not worth bogging down contributions with nits like sentence
spacing. The HTML export most newcomers are reading already collapses
them, and I find claims about monospaced readability dubious given that
the rest of the developer world has managed to move on from the
typewriter convention.
