[PATCH] sequencer.c: fix detection of duplicate s-o-b

2016-03-12 Thread Willy Tarreau
Hi,

after I upgraded my machine, I switched from git 1.7.12.2 to 2.6.4
and experienced an annoying regression when dealing with stable
kernel backports.

I'm using a "dorelease" script which relies on git-cherry-pick's
ability to properly detect duplicate s-o-b to ensure that all merged
commits are properly signed in a release. Today while preparing the
last 2.6.32 release, I did a git log before pushing and found some
commits having two s-o-b lines with myself. I found that these ones
were always those containing some backporting notes between the s-o-b
lines (which we all do in stable branches to indicate what was changed
in the backport process).

I didn't feel brave enough to individually deal with each offending
patch by hand so instead I bisected the git changes and found that the
behaviour changed with commit bab4d10 ("sequencer.c: teach append_signoff
how to detect duplicate s-o-b").

The reason is that function has_conforming_footer() immediately stops
after the first non-conforming line without checking if there are
conforming lines after. But if someone added signed-off-by anywhere
after a non-conforming block, it should always be considered as part
of the footer. Thus I adjusted the logic to check till the end of the
footer and report the presence of valid rfc2822 or cherry-picked lines
after the last non-conformant one and now it correctly handles all types
of commits I had to deal with (ie: only adds s-o-b when it doesn't match
the last one and doesn't add an empty line after a conformant one). For
example, this footer :

Signed-off-by: Mike Galbraith 
[bwh: Backported to 3.2:
 - Adjust numbering in the comment
 - Adjust filename]
Signed-off-by: Ben Hutchings 
Cc: Byungchul Park 
Cc: Peter Zijlstra 
Cc: Willy Tarreau 

Used to be turned into this :

Signed-off-by: Mike Galbraith 
[bwh: Backported to 3.2:
 - Adjust numbering in the comment
 - Adjust filename]
Signed-off-by: Ben Hutchings 
Cc: Byungchul Park 
Cc: Peter Zijlstra 
    Cc: Willy Tarreau 

Signed-off-by: Willy Tarreau 

And is now properly converted to :

Signed-off-by: Mike Galbraith 
[bwh: Backported to 3.2:
 - Adjust numbering in the comment
 - Adjust filename]
Signed-off-by: Ben Hutchings 
Cc: Byungchul Park 
Cc: Peter Zijlstra 
    Cc: Willy Tarreau 
Signed-off-by: Willy Tarreau 

Also, cherry-picking the last commit above again would produce this
before :

Signed-off-by: Mike Galbraith 
[bwh: Backported to 3.2:
 - Adjust numbering in the comment
 - Adjust filename]
Signed-off-by: Ben Hutchings 
Cc: Byungchul Park 
Cc: Peter Zijlstra 
    Cc: Willy Tarreau 
Signed-off-by: Willy Tarreau 

Signed-off-by: Willy Tarreau 

And it now is properly left untouched since the last s-o-b line
is properly matched.

I'm appending the patch, please include it upstream.

Thanks!
Willy


>From be9624a0df4c649d452f898925953a81dc9163fc Mon Sep 17 00:00:00 2001
From: Willy Tarreau 
Date: Sat, 12 Mar 2016 13:35:35 +0100
Subject: sequencer.c: fix detection of duplicate s-o-b

Commit bab4d10 ("sequencer.c: teach append_signoff how to detect
duplicate s-o-b") changed the method used to detect duplicate s-o-b,
but it introduced a regression for a case where some non-compliant
information are present in the footer. In maintenance branches, it's
very common to add some elements after the signed-off and to add your
s-o-b after. This is used a lot in the stable kernel series, for
example this commit backported from 3.2 to 2.6.32 :

ALSA: usb-audio: avoid freeing umidi object twice

commit 07d86ca93db7e5cdf4743564d98292042ec21af7 upstream.

The 'umidi' object will be free'd on the error path by snd_usbmidi_free()
when tearing down the rawmidi interface. So we shouldn't try to free it
in snd_usbmidi_create() after having registered the rawmidi interface.

Found by KASAN.

Signed-off-by: Andrey Konovalov 
Acked-by: Clemens Ladisch 
Signed-off-by: Takashi Iwai 
Signed-off-by: Ben Hutchings 
[wt: file is sound/midi/usbmidi.c in 2.6.32]
Signed-off-by: Willy Tarreau 

Prior to the commit above, a cherry-pick -s would not append an extra s-o-b.
After this commit, a new line and a second s-o-b are added, making the footer
look like this :

Signed-off-by: Andrey Konovalov 
Acked-by: Clemens Ladisch 
Signed-off-by: Takashi Iwai 
Signed-off-by: Ben Hutchings 
[wt: file is sound/midi/usbmidi.c in 2.6.32]
Signed-off-by: Willy Tarreau 

Signed-off-by: Willy Tarreau 

This patch improves the parsing of the footer by considering the
presence of a valid rfc2822 line after possibly non-conformant lines.
Indeed, if someone added an s-o-b or CC after some stuff, this line
must properly be considered as part of the footer and not of the body.

Signed-off-by: Willy Tarreau 
---
 sequen

Re: [PATCH] sequencer.c: fix detection of duplicate s-o-b

2016-04-06 Thread Willy Tarreau
On Wed, Apr 06, 2016 at 07:57:01AM -0700, Junio C Hamano wrote:
> This seems to have been lost, perhaps because the top part that was
> quite long didn't look like a patch submission message or something.

Don't worry, we all know it's the submitter's responsibility to retransmit,
I apply the same principle :-)

> Git 1.7.12 is a quite ancient release and I wouldn't be surprised if
> we made the behaviour change during the period leading to v2.6 on
> purpose, but nothing immediately comes to mind. Christian (as the
> advocate for the trailer machinery) and Brandon ("git shortlog
> sequencer.c" suggests you), can you take a look?

FWIW it wad changed in 1.8.3 by commit bab4d10 ("sequencer.c: teach
append_signoff how to detect duplicate s-o-b").

The change made a lot of sense but it didn't assume that this practice
was common. And indeed I think this practice only happens in maintenance
branches where people have to make a lot of adaptations to existing
patches that they're cherry-picking. We do that a lot in stable kernels
to keep track of what we may need to revisit if we break something.

Thanks!
Willy

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sequencer.c: fix detection of duplicate s-o-b

2016-04-07 Thread Willy Tarreau
Hi Christian,

On Thu, Apr 07, 2016 at 04:06:59PM -0400, Christian Couder wrote:
> On Wed, Apr 6, 2016 at 12:37 PM, Willy Tarreau  wrote:
> > On Wed, Apr 06, 2016 at 07:57:01AM -0700, Junio C Hamano wrote:
> >> This seems to have been lost, perhaps because the top part that was
> >> quite long didn't look like a patch submission message or something.
> >
> > Don't worry, we all know it's the submitter's responsibility to retransmit,
> > I apply the same principle :-)
> >
> >> Git 1.7.12 is a quite ancient release and I wouldn't be surprised if
> >> we made the behaviour change during the period leading to v2.6 on
> >> purpose, but nothing immediately comes to mind. Christian (as the
> >> advocate for the trailer machinery) and Brandon ("git shortlog
> >> sequencer.c" suggests you), can you take a look?
> 
> Ok, I will try to have a look at that next week.
> 
> > FWIW it wad changed in 1.8.3 by commit bab4d10 ("sequencer.c: teach
> > append_signoff how to detect duplicate s-o-b").
> 
> So the change is quite old and was made before I started working on
> the trailer machinery.
> 
> > The change made a lot of sense but it didn't assume that this practice
> > was common. And indeed I think this practice only happens in maintenance
> > branches where people have to make a lot of adaptations to existing
> > patches that they're cherry-picking. We do that a lot in stable kernels
> > to keep track of what we may need to revisit if we break something.
> 
> Yeah, we know for some time, but after the above patch breakage
> happened and after I worked on interpret-trailers, that some lines
> inside [] are added by kernel people in the trailer part and that the
> trailer machinery doesn't work properly with such lines.
> 
> Anyway if you want your patch to be applied, it will probably need tests.

Thanks. I've discussed with Junio yesterday who notified me that my patch
broke some existing tests that I hadn't updated (I didn't know they existed)
namely one supposed to make the distinction between a normal footer and a
special case where the s-o-b is in fact part of the last paragraph. So I
said I'll rework the patch to only consider the parts between brackets
above a footer. Thus no need to waste your time testing the current patch
next week.

Thanks,
Willy

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] Git v2.12.0

2017-02-24 Thread Willy Tarreau
Hi Junio,

On Fri, Feb 24, 2017 at 11:28:58AM -0800, Junio C Hamano wrote:
>  * Use of an empty string that is used for 'everything matches' is
>still warned and Git asks users to use a more explicit '.' for that
>instead.  The hope is that existing users will not mind this
>change, and eventually the warning can be turned into a hard error,
>upgrading the deprecation into removal of this (mis)feature.  That
>is not scheduled to happen in the upcoming release (yet).

FWIW '.' is not equivalent to '' when it comes to grep or such commands,
you should suggest '^' or '$' instead, otherwise you'll miss empty lines
and users will continue to use '' for that purpose. BTW I do use grep ''
a lot, but only on file systems, not within git (eg: to display contents
of /sys).

Cheers,
Willy


Re: [ANNOUNCE] Git v2.12.0

2017-02-25 Thread Willy Tarreau
On Sat, Feb 25, 2017 at 12:31:21AM -0800, Junio C Hamano wrote:
> Willy Tarreau  writes:
> 
> > Hi Junio,
> >
> > On Fri, Feb 24, 2017 at 11:28:58AM -0800, Junio C Hamano wrote:
> >>  * Use of an empty string that is used for 'everything matches' is
> >>still warned and Git asks users to use a more explicit '.' for that
> >>instead.  The hope is that existing users will not mind this
> >>change, and eventually the warning can be turned into a hard error,
> >>upgrading the deprecation into removal of this (mis)feature.  That
> >>is not scheduled to happen in the upcoming release (yet).
> >
> > FWIW '.' is not equivalent to '' when it comes to grep or such commands,
> 
> I am amused and amazed ;-).  
> 
> The above is not about "grep" but was meant to describe "pathspec".
> We used to take "" as a pathspec element that means "every path
> matches", but recently started deprecating it and ask users to be
> more explicit by using "." (as a directory as a pathspec element
> matches everything inside the directory).  We are not changing the
> pattern matching done by "git grep" or "log --grep".  What is
> changing is that between the two that means the same thing:
> 
>   cd t/ && git log ""
>   cd t/ && git log .
> 
> the former is deprecated.

Ah that's fun indeed I never used it like this :-)

> I find it amusing that I have been writing the above in the draft
> release notes without realizing that I failed to say that it is
> about pathspec for quite some time, and without realizing that the
> above can be misinterpreted as if it is talking about grep patterns.
> 
> And I find it amazing that it took this long for somebody to spot
> that misleading vagueness in this description and point it out.
> 
> It should probably be updated to start like so:
> 
>   Use of an empty string as a pathspec element that is used
>   for 'everything matches' is ...

Hey it's the usual matter of perspective and context. When you know
what you do it's always obvious when you explain it while for others
something different is obvious :-)

Thanks for your clarification!
Willy


Re: email as a bona fide git transport

2019-10-16 Thread Willy Tarreau
Hi Vegard,

On Wed, Oct 16, 2019 at 12:22:54PM +0200, Vegard Nossum wrote:
> (cross-posted to git, LKML, and the kernel workflows mailing lists.)
> 
> Hi all,
> 
> I've been following Konstantin Ryabitsev's quest for better development
> and communication tools for the kernel [1][2][3], and I would like to
> propose a relatively straightforward idea which I think could bring a
> lot to the table.
> 
> Step 1:
> 
> * git send-email needs to include parent SHA1s and generally all the
>   information needed to perfectly recreate the commit when applied so
>   that all the SHA1s remain the same
> 
> * git am (or an alternative command) needs to recreate the commit
>   perfectly when applied, including applying it to the correct parent
> 
> Having these two will allow a perfect mapping between email and git;
> essentially email just becomes a transport for git. There are a lot of
> advantages to this, particularly that you have a stable way to refer to
> a patch or commit (despite it appearing on a mailing list), and there
> is no need for "changeset IDs" or whatever, since you can just use the
> git SHA1 which is unique, unambiguous, and stable.

I agree this would be great and have been missing this a number of times,
eventhough I'm aware of git-send-pack/git-receive-pack. The text format
is way more convenient for a lot of reasons. It could also help with
Greg's idea of using the commit IDs to reference bugs, as such IDs could
remain stable within a series before it is merged, and as such referenced
in subsequent commit messages. It could also be useful to avoid losing
notes related to a patch once it's merged.

> Step 3:
> 
> * Instead of describing a patchset in a separate introduction email, we
>   can create a merge commit between the parent of the first commit in
>   the series and the last and put the patchset description in the merge
>   commit [5]. This means the patchset description also gets to be part
>   of git history.
> 
>   (This would require support for git send-email/am to be able to send
>   and apply merge commits -- at least those which have the same tree as
>   one of the parents. This is _not_ yet supported in my proposed git
>   patches.)

That's a good idea, as we've all seen long series with a very detailed
description in patch 0 and much less context in subsequent patches, thus
losing the context once merged.

> * stable SHA1s means we can refer to previous versions of a patchset by
>   SHA1 rather than archive links. I propose a new changelog tag for
>   this, maybe "Previous:" or maybe even a full list of "v1:", "v2:",
>   etc. with a SHA1 or ref. Note that these SHA1s do *not* need to exist
>   in Linus's repo, but those who want can pull those branches from the
>   bot-maintained repo on git.kernel.org.

For me this mainly brings the benefit of finally having a unique identifier
for multiple iterations of a patchset. It then becomes easier to use this
identifier to designate the functional work, regardless of the number of
updates it gets. Of course it's never that black and white since such work
may itself merge multiple other patchsets but for most use cases it can
help.

Willy


Re: email as a bona fide git transport

2019-10-17 Thread Willy Tarreau
On Thu, Oct 17, 2019 at 09:54:47PM -0400, Konstantin Ryabitsev wrote:
> On Thu, Oct 17, 2019 at 06:30:29PM -0700, Greg KH wrote:
> > > It could only possibly work if nobody ever adds their own
> > > "Signed-Off-By" or
> > > any other bylines. I expect this is a deal-breaker for most maintainers.
> > 
> > Yeah it is :(
> > 
> > But, if we could just have the signature on the code change, not the
> > changelog text, that would help with that issue.
> 
> We totally should, and I even mused on how we would do that here:
> https://public-inbox.org/git/20190910121324.GA6867@pure.paranoia.local/
> 
> However, since git's PGP signatures are made for the content in the actual
> commit record (tree hash, parent, author, commit message, etc), the only way
> we could preserve them between the email and the git tree is if we never
> modify any of that data. The SOB and other trailers would have to only be
> applied to the merge commit, or migrate into commit notes.

There's also the possibility to handle this a bit like we do when adding
comments before the SOB: a PGP signature would apply to the text *before*
it only. We could then have long chains of SOB, PGP, SOB, PGP etc.

Willy


Re: email as a bona fide git transport

2019-10-19 Thread Willy Tarreau
On Fri, Oct 18, 2019 at 03:14:56PM -0400, Theodore Y. Ts'o wrote:
> On Fri, Oct 18, 2019 at 06:50:51PM +0200, Vegard Nossum wrote:
> > The problem I ran into with putting the metadata at the end was
> > detecting where the diff ends. A comment in 'git apply' suggested that
> > detecting the difference between "--" as a diff/signature separator and
> > as part of the diff is nontrivial in the sense that you need to actually
> > do some parsing and keep track of hunk sizes.
> 
> Could we cheat by having "git format-patch" add a "Diff-size" in the
> header which gives the number of lines in the diff so git am can just
> count lines to find the Trailer section?

Be careful with this, it starts like this and ends up with non-editable
patches. I'd rather have git-am use best-effort detection of the end.
Also when dealing with stable backports, I've done a lot of
"cat foo.diff >> bar.patch" to fixup some patches in which I just had
to move some parts around. Having to count lines and edit a counter
somewhere is going to become really painful.

Just my two cents,
Willy