I agree that we should create a good pointer for cleaning up PRs, and request (though not require) that authors do it. It's unfortunate though that squashing during a review makes things difficult to follow, so adds one more round trip.
We could consider for those PRs that make sense as a single logical commit (most, but not all, of them) simply using the "squash and merge" button even though it technically doesn't create a merge commit. On Fri, Sep 21, 2018 at 9:24 PM Daniel Oliveira <danolive...@google.com> wrote: > As a non-committer I think some automated squashing of commits sounds best > since it lightens the load of regular contributors, by not having to always > remember to squash, and lightens the load of committers so it doesn't take > as long to have your PR approved by one. > > But for now I think the second best route would be making it PR author's > responsibility to squash fixup commits. Having that expectation described > clearly in the Contributor's Guide, along with some simple step-by-step > instructions for how to do so should be enough. I mainly support this > because I've been doing the squashing myself since I saw a thread about it > here a few months ago. It's not nearly as huge a burden on me as it > probably is for committers who have to merge in many more PRs, it's very > easy to learn how to do, and it's one less barrier to having my code merged > in. > > Of course I wouldn't expect that committers wait for PR authors to squash > their fixup commits, but I think leaving a message like "For future pull > requests you should squash any small fixup commits, as described here: > <link>" should be fine. > > >> I was also thinking about the possibility of wanting to revert >> individual commits from a merge commit. The solution you propose works, >> but only if you want to revert everything. > > > Does this happen often? I might not have enough context since I'm not a > committer, but it seems to me that often the person performing a revert is > not the original author of a change and doesn't have the context or time to > pick out an individual commit to revert. > > On Wed, Sep 19, 2018 at 1:32 PM Maximilian Michels <m...@apache.org> wrote: > >> I tend to agree with you Lukasz. Of course we should try to follow the >> guide lines as much as possible but if it requires an extra back and >> forth with the PR author for a cosmetic change, it may not be worth the >> time. >> >> On 19.09.18 22:17, Lukasz Cwik wrote: >> > I have to say I'm guilty of not following the merge guidelines, >> > sometimes doing merges without rebasing/flatten commits. >> > >> > I find that it is a few extra mins of my time to fix someones PR >> history >> > if they have more then one logical commit they want to be separate and >> > it usually takes days for the PR author to do merging with the extra >> > burden as a committer to keep track of another PR and its state >> (waiting >> > for clean-up) is taxing. I really liked the idea of the mergebot (even >> > though it didn't work out in practice) because it could do all the >> > policy work on my behalf. >> > >> > Anything that reduces my overhead as a committer is useful as for the >> > 100s of PRs that I have merged, I've only had to rollback a couple so >> > I'm for Charle's suggestion which makes the rollback flow slightly more >> > complicated for a significantly easier PR merge workflow. >> > >> > On Wed, Sep 19, 2018 at 1:13 PM Charles Chen <c...@google.com >> > <mailto:c...@google.com>> wrote: >> > >> > What I mean is that if you get the first-parent commit using "git >> > log --first-parent", it will incorporate any and all fix up commits >> > so we don't need to worry about missing any. >> > >> > On Wed, Sep 19, 2018, 1:07 PM Maximilian Michels <m...@apache.org >> > <mailto:m...@apache.org>> wrote: >> > >> > Generally, +1 for isolated commits which are easy to revert. >> > >> > > I don't think it's actually harder to roll back a set of >> > commits that are merged together. >> > I think Thomas was mainly concerned about "fixup" commits to >> > land in >> > master (as part of a merge). These indeed make reverting commits >> > more >> > difficult because you have to check whether you missed a >> "fixup". >> > >> > > Ideally every commit should compile and pass tests though, >> right? >> > >> > That is definitely what we should strive for when doing a merge >> > against >> > master. >> > >> > > Perhaps the bigger issue is that we need better documentation >> > and a playbook on how to do this these common tasks in git. >> > >> > We do actually have basic documentation about this but most >> > people don't >> > read it. For example, the commit message of a Merge commit >> > should be: >> > >> > Merge pull request #xxxx: [BEAM-yyyy] Issue title >> > >> > But most merge commits don't comply with this rule :) See >> > https://beam.apache.org/contribute/committer-guide/#merging-it >> > >> > On 19.09.18 21:34, Reuven Lax wrote: >> > > Ideally every commit should compile and pass tests though, >> right? >> > > >> > > On Wed, Sep 19, 2018 at 12:15 PM Ankur Goenka >> > <goe...@google.com <mailto:goe...@google.com> >> > > <mailto:goe...@google.com <mailto:goe...@google.com>>> >> wrote: >> > > >> > > I agree with the cleanliness of the Commit history. >> > > "Fixup!", "Address comments", "Address even more >> > comments" type of >> > > comments does not convey meaningful information and are >> > not very >> > > useful. Its a good idea to squash them. >> > > However, I think its ok to keep separate commits for >> > different >> > > logical pieces of the code which make reviewing and >> > revisiting code >> > > easier. >> > > Example PR: Support X in the pipeline >> > > Commit 1: Restructuring a bunch of code without any >> > logical change. >> > > Commit 2: Changing validation logic for pipeline. >> > > Commit 3: Supporting new field "X" for pipeline. >> > > >> > > On Wed, Sep 19, 2018 at 11:27 AM Charles Chen >> > <c...@google.com <mailto:c...@google.com> >> > > <mailto:c...@google.com <mailto:c...@google.com>>> wrote: >> > > >> > > To be concrete, it is very easy to revert a commit in >> > any case: >> > > >> > > 1. First, use "git log --first-parent" to find the >> > first-parent >> > > commit corresponding to a PR merge (this is a >> > one-to-one >> > > correspondence). >> > > 2. Use "git revert -m 1 <commitid>" to revert the >> > commit; this >> > > selects the first parent as the base for a merge >> > commit (in >> > > the case where a single commit needs to be >> > reverted, just >> > > use "git revert <commitid>" without the "-m 1" >> flag). >> > > >> > > In any case, as a general good engineering practice, >> > I do agree >> > > that it is highly desirable to have small >> independent PRs >> > > instead of large jumbo PRs whenever possible. >> > > >> > > On Wed, Sep 19, 2018 at 11:20 AM Charles Chen >> > <c...@google.com <mailto:c...@google.com> >> > > <mailto:c...@google.com <mailto:c...@google.com>>> >> wrote: >> > > >> > > I don't think it's actually harder to roll back a >> > set of >> > > commits that are merged together. Git has the >> > notion of >> > > first-parent commits (you can see, for example, >> > "git log >> > > --first-parent", which filters out the >> intermediate >> > > commits). In this sense, PRs still get merged as >> > one unit >> > > and this is preserved even if intermediate >> > commits are >> > > kept. Perhaps the bigger issue is that we need >> > better >> > > documentation and a playbook on how to do this >> > these common >> > > tasks in git. >> > > >> > > On Wed, Sep 19, 2018 at 9:27 AM Thomas Weise >> > <t...@apache.org <mailto:t...@apache.org> >> > > <mailto:t...@apache.org <mailto:t...@apache.org>>> >> > wrote: >> > > >> > > Wanted to bring this up as reminder as well >> as >> > > opportunity to discuss potential changes to >> our >> > > committer guide. It has been a while since >> > last related >> > > discussion and we welcomed several new >> > committers since >> > > then. >> > > >> > > Finishing up pull requests pre-merge: >> > > >> > > >> > >> https://beam.apache.org/contribute/committer-guide/#finishing-touches >> > > >> > > PRs are worked on over time and may >> > accumulate many >> > > commits. Sometimes because scope expands, >> > sometimes just >> > > to separate independent changes but most of >> > the time the >> > > commits are just fixups that are added as >> > review progresses. >> > > It is important that the latter get squashed >> > prior to PR >> > > merge, as otherwise we lost the ability to >> > roll back >> > > changes by reverting a single commit and also >> > generally >> > > cause a lot of noise in the commit history >> > that does not >> > > help other contributors. To be clear, I refer >> > to the >> > > "Fixup!", "Address comments", "Address even >> more >> > > comments" type of entries :) >> > > >> > > I would also propose that every commit gets >> > tagged with >> > > a JIRA (except those fixups that will be >> > squashed). >> > > Having the JIRA and possibly other tags makes >> > it easier >> > > for others not involved in the PR to identify >> > changes >> > > after they were merged, for example when >> > looking at the >> > > revision history or annotated source. >> > > >> > > As for other scenarios of jumbo PRs with many >> > commits, >> > > there are probably situations where work >> > needs to be >> > > broken down into smaller units, making life >> > better for >> > > both, contributor and reviewer(s). Ideally, >> > every PR >> > > would have only one commit, but that may be a >> > bit much >> > > to mandate? Is the general expectation >> > something we need >> > > to document more clearly? >> > > >> > > Thanks, >> > > Thomas >> > > >> > >> >