[PATCH] sequencer.c: fix detection of duplicate s-o-b
Hi, after I upgraded my machine, I switched from git 1.7.12.2 to 2.6.4 and experienced an annoying regression when dealing with stable kernel backports. I'm using a "dorelease" script which relies on git-cherry-pick's ability to properly detect duplicate s-o-b to ensure that all merged commits are properly signed in a release. Today while preparing the last 2.6.32 release, I did a git log before pushing and found some commits having two s-o-b lines with myself. I found that these ones were always those containing some backporting notes between the s-o-b lines (which we all do in stable branches to indicate what was changed in the backport process). I didn't feel brave enough to individually deal with each offending patch by hand so instead I bisected the git changes and found that the behaviour changed with commit bab4d10 ("sequencer.c: teach append_signoff how to detect duplicate s-o-b"). The reason is that function has_conforming_footer() immediately stops after the first non-conforming line without checking if there are conforming lines after. But if someone added signed-off-by anywhere after a non-conforming block, it should always be considered as part of the footer. Thus I adjusted the logic to check till the end of the footer and report the presence of valid rfc2822 or cherry-picked lines after the last non-conformant one and now it correctly handles all types of commits I had to deal with (ie: only adds s-o-b when it doesn't match the last one and doesn't add an empty line after a conformant one). For example, this footer : Signed-off-by: Mike Galbraith [bwh: Backported to 3.2: - Adjust numbering in the comment - Adjust filename] Signed-off-by: Ben Hutchings Cc: Byungchul Park Cc: Peter Zijlstra Cc: Willy Tarreau Used to be turned into this : Signed-off-by: Mike Galbraith [bwh: Backported to 3.2: - Adjust numbering in the comment - Adjust filename] Signed-off-by: Ben Hutchings Cc: Byungchul Park Cc: Peter Zijlstra Cc: Willy Tarreau Signed-off-by: Willy Tarreau And is now properly converted to : Signed-off-by: Mike Galbraith [bwh: Backported to 3.2: - Adjust numbering in the comment - Adjust filename] Signed-off-by: Ben Hutchings Cc: Byungchul Park Cc: Peter Zijlstra Cc: Willy Tarreau Signed-off-by: Willy Tarreau Also, cherry-picking the last commit above again would produce this before : Signed-off-by: Mike Galbraith [bwh: Backported to 3.2: - Adjust numbering in the comment - Adjust filename] Signed-off-by: Ben Hutchings Cc: Byungchul Park Cc: Peter Zijlstra Cc: Willy Tarreau Signed-off-by: Willy Tarreau Signed-off-by: Willy Tarreau And it now is properly left untouched since the last s-o-b line is properly matched. I'm appending the patch, please include it upstream. Thanks! Willy >From be9624a0df4c649d452f898925953a81dc9163fc Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Sat, 12 Mar 2016 13:35:35 +0100 Subject: sequencer.c: fix detection of duplicate s-o-b Commit bab4d10 ("sequencer.c: teach append_signoff how to detect duplicate s-o-b") changed the method used to detect duplicate s-o-b, but it introduced a regression for a case where some non-compliant information are present in the footer. In maintenance branches, it's very common to add some elements after the signed-off and to add your s-o-b after. This is used a lot in the stable kernel series, for example this commit backported from 3.2 to 2.6.32 : ALSA: usb-audio: avoid freeing umidi object twice commit 07d86ca93db7e5cdf4743564d98292042ec21af7 upstream. The 'umidi' object will be free'd on the error path by snd_usbmidi_free() when tearing down the rawmidi interface. So we shouldn't try to free it in snd_usbmidi_create() after having registered the rawmidi interface. Found by KASAN. Signed-off-by: Andrey Konovalov Acked-by: Clemens Ladisch Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings [wt: file is sound/midi/usbmidi.c in 2.6.32] Signed-off-by: Willy Tarreau Prior to the commit above, a cherry-pick -s would not append an extra s-o-b. After this commit, a new line and a second s-o-b are added, making the footer look like this : Signed-off-by: Andrey Konovalov Acked-by: Clemens Ladisch Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings [wt: file is sound/midi/usbmidi.c in 2.6.32] Signed-off-by: Willy Tarreau Signed-off-by: Willy Tarreau This patch improves the parsing of the footer by considering the presence of a valid rfc2822 line after possibly non-conformant lines. Indeed, if someone added an s-o-b or CC after some stuff, this line must properly be considered as part of the footer and not of the body. Signed-off-by: Willy Tarreau --- sequen
Re: [PATCH] sequencer.c: fix detection of duplicate s-o-b
On Wed, Apr 06, 2016 at 07:57:01AM -0700, Junio C Hamano wrote: > This seems to have been lost, perhaps because the top part that was > quite long didn't look like a patch submission message or something. Don't worry, we all know it's the submitter's responsibility to retransmit, I apply the same principle :-) > Git 1.7.12 is a quite ancient release and I wouldn't be surprised if > we made the behaviour change during the period leading to v2.6 on > purpose, but nothing immediately comes to mind. Christian (as the > advocate for the trailer machinery) and Brandon ("git shortlog > sequencer.c" suggests you), can you take a look? FWIW it wad changed in 1.8.3 by commit bab4d10 ("sequencer.c: teach append_signoff how to detect duplicate s-o-b"). The change made a lot of sense but it didn't assume that this practice was common. And indeed I think this practice only happens in maintenance branches where people have to make a lot of adaptations to existing patches that they're cherry-picking. We do that a lot in stable kernels to keep track of what we may need to revisit if we break something. Thanks! Willy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] sequencer.c: fix detection of duplicate s-o-b
Hi Christian, On Thu, Apr 07, 2016 at 04:06:59PM -0400, Christian Couder wrote: > On Wed, Apr 6, 2016 at 12:37 PM, Willy Tarreau wrote: > > On Wed, Apr 06, 2016 at 07:57:01AM -0700, Junio C Hamano wrote: > >> This seems to have been lost, perhaps because the top part that was > >> quite long didn't look like a patch submission message or something. > > > > Don't worry, we all know it's the submitter's responsibility to retransmit, > > I apply the same principle :-) > > > >> Git 1.7.12 is a quite ancient release and I wouldn't be surprised if > >> we made the behaviour change during the period leading to v2.6 on > >> purpose, but nothing immediately comes to mind. Christian (as the > >> advocate for the trailer machinery) and Brandon ("git shortlog > >> sequencer.c" suggests you), can you take a look? > > Ok, I will try to have a look at that next week. > > > FWIW it wad changed in 1.8.3 by commit bab4d10 ("sequencer.c: teach > > append_signoff how to detect duplicate s-o-b"). > > So the change is quite old and was made before I started working on > the trailer machinery. > > > The change made a lot of sense but it didn't assume that this practice > > was common. And indeed I think this practice only happens in maintenance > > branches where people have to make a lot of adaptations to existing > > patches that they're cherry-picking. We do that a lot in stable kernels > > to keep track of what we may need to revisit if we break something. > > Yeah, we know for some time, but after the above patch breakage > happened and after I worked on interpret-trailers, that some lines > inside [] are added by kernel people in the trailer part and that the > trailer machinery doesn't work properly with such lines. > > Anyway if you want your patch to be applied, it will probably need tests. Thanks. I've discussed with Junio yesterday who notified me that my patch broke some existing tests that I hadn't updated (I didn't know they existed) namely one supposed to make the distinction between a normal footer and a special case where the s-o-b is in fact part of the last paragraph. So I said I'll rework the patch to only consider the parts between brackets above a footer. Thus no need to waste your time testing the current patch next week. Thanks, Willy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ANNOUNCE] Git v2.12.0
Hi Junio, On Fri, Feb 24, 2017 at 11:28:58AM -0800, Junio C Hamano wrote: > * Use of an empty string that is used for 'everything matches' is >still warned and Git asks users to use a more explicit '.' for that >instead. The hope is that existing users will not mind this >change, and eventually the warning can be turned into a hard error, >upgrading the deprecation into removal of this (mis)feature. That >is not scheduled to happen in the upcoming release (yet). FWIW '.' is not equivalent to '' when it comes to grep or such commands, you should suggest '^' or '$' instead, otherwise you'll miss empty lines and users will continue to use '' for that purpose. BTW I do use grep '' a lot, but only on file systems, not within git (eg: to display contents of /sys). Cheers, Willy
Re: [ANNOUNCE] Git v2.12.0
On Sat, Feb 25, 2017 at 12:31:21AM -0800, Junio C Hamano wrote: > Willy Tarreau writes: > > > Hi Junio, > > > > On Fri, Feb 24, 2017 at 11:28:58AM -0800, Junio C Hamano wrote: > >> * Use of an empty string that is used for 'everything matches' is > >>still warned and Git asks users to use a more explicit '.' for that > >>instead. The hope is that existing users will not mind this > >>change, and eventually the warning can be turned into a hard error, > >>upgrading the deprecation into removal of this (mis)feature. That > >>is not scheduled to happen in the upcoming release (yet). > > > > FWIW '.' is not equivalent to '' when it comes to grep or such commands, > > I am amused and amazed ;-). > > The above is not about "grep" but was meant to describe "pathspec". > We used to take "" as a pathspec element that means "every path > matches", but recently started deprecating it and ask users to be > more explicit by using "." (as a directory as a pathspec element > matches everything inside the directory). We are not changing the > pattern matching done by "git grep" or "log --grep". What is > changing is that between the two that means the same thing: > > cd t/ && git log "" > cd t/ && git log . > > the former is deprecated. Ah that's fun indeed I never used it like this :-) > I find it amusing that I have been writing the above in the draft > release notes without realizing that I failed to say that it is > about pathspec for quite some time, and without realizing that the > above can be misinterpreted as if it is talking about grep patterns. > > And I find it amazing that it took this long for somebody to spot > that misleading vagueness in this description and point it out. > > It should probably be updated to start like so: > > Use of an empty string as a pathspec element that is used > for 'everything matches' is ... Hey it's the usual matter of perspective and context. When you know what you do it's always obvious when you explain it while for others something different is obvious :-) Thanks for your clarification! Willy
Re: email as a bona fide git transport
Hi Vegard, On Wed, Oct 16, 2019 at 12:22:54PM +0200, Vegard Nossum wrote: > (cross-posted to git, LKML, and the kernel workflows mailing lists.) > > Hi all, > > I've been following Konstantin Ryabitsev's quest for better development > and communication tools for the kernel [1][2][3], and I would like to > propose a relatively straightforward idea which I think could bring a > lot to the table. > > Step 1: > > * git send-email needs to include parent SHA1s and generally all the > information needed to perfectly recreate the commit when applied so > that all the SHA1s remain the same > > * git am (or an alternative command) needs to recreate the commit > perfectly when applied, including applying it to the correct parent > > Having these two will allow a perfect mapping between email and git; > essentially email just becomes a transport for git. There are a lot of > advantages to this, particularly that you have a stable way to refer to > a patch or commit (despite it appearing on a mailing list), and there > is no need for "changeset IDs" or whatever, since you can just use the > git SHA1 which is unique, unambiguous, and stable. I agree this would be great and have been missing this a number of times, eventhough I'm aware of git-send-pack/git-receive-pack. The text format is way more convenient for a lot of reasons. It could also help with Greg's idea of using the commit IDs to reference bugs, as such IDs could remain stable within a series before it is merged, and as such referenced in subsequent commit messages. It could also be useful to avoid losing notes related to a patch once it's merged. > Step 3: > > * Instead of describing a patchset in a separate introduction email, we > can create a merge commit between the parent of the first commit in > the series and the last and put the patchset description in the merge > commit [5]. This means the patchset description also gets to be part > of git history. > > (This would require support for git send-email/am to be able to send > and apply merge commits -- at least those which have the same tree as > one of the parents. This is _not_ yet supported in my proposed git > patches.) That's a good idea, as we've all seen long series with a very detailed description in patch 0 and much less context in subsequent patches, thus losing the context once merged. > * stable SHA1s means we can refer to previous versions of a patchset by > SHA1 rather than archive links. I propose a new changelog tag for > this, maybe "Previous:" or maybe even a full list of "v1:", "v2:", > etc. with a SHA1 or ref. Note that these SHA1s do *not* need to exist > in Linus's repo, but those who want can pull those branches from the > bot-maintained repo on git.kernel.org. For me this mainly brings the benefit of finally having a unique identifier for multiple iterations of a patchset. It then becomes easier to use this identifier to designate the functional work, regardless of the number of updates it gets. Of course it's never that black and white since such work may itself merge multiple other patchsets but for most use cases it can help. Willy
Re: email as a bona fide git transport
On Thu, Oct 17, 2019 at 09:54:47PM -0400, Konstantin Ryabitsev wrote: > On Thu, Oct 17, 2019 at 06:30:29PM -0700, Greg KH wrote: > > > It could only possibly work if nobody ever adds their own > > > "Signed-Off-By" or > > > any other bylines. I expect this is a deal-breaker for most maintainers. > > > > Yeah it is :( > > > > But, if we could just have the signature on the code change, not the > > changelog text, that would help with that issue. > > We totally should, and I even mused on how we would do that here: > https://public-inbox.org/git/20190910121324.GA6867@pure.paranoia.local/ > > However, since git's PGP signatures are made for the content in the actual > commit record (tree hash, parent, author, commit message, etc), the only way > we could preserve them between the email and the git tree is if we never > modify any of that data. The SOB and other trailers would have to only be > applied to the merge commit, or migrate into commit notes. There's also the possibility to handle this a bit like we do when adding comments before the SOB: a PGP signature would apply to the text *before* it only. We could then have long chains of SOB, PGP, SOB, PGP etc. Willy
Re: email as a bona fide git transport
On Fri, Oct 18, 2019 at 03:14:56PM -0400, Theodore Y. Ts'o wrote: > On Fri, Oct 18, 2019 at 06:50:51PM +0200, Vegard Nossum wrote: > > The problem I ran into with putting the metadata at the end was > > detecting where the diff ends. A comment in 'git apply' suggested that > > detecting the difference between "--" as a diff/signature separator and > > as part of the diff is nontrivial in the sense that you need to actually > > do some parsing and keep track of hunk sizes. > > Could we cheat by having "git format-patch" add a "Diff-size" in the > header which gives the number of lines in the diff so git am can just > count lines to find the Trailer section? Be careful with this, it starts like this and ends up with non-editable patches. I'd rather have git-am use best-effort detection of the end. Also when dealing with stable backports, I've done a lot of "cat foo.diff >> bar.patch" to fixup some patches in which I just had to move some parts around. Having to count lines and edit a counter somewhere is going to become really painful. Just my two cents, Willy