> It not only makes it difficult to know what commits are associated with a
PR
> it also makes back porting / cherry-picking PRs from master to an LTS
branch really problematic


Counter-proposal: Use merge commits.

Part of git's design is the element of a commit, whose sha1 ID can be used
to track changes across branches.

When a commit is merged, its parent must change, which would invalidate the
commit. Currently, we rebase the commit to the point at which it is to be
merged, and then it is a simple fast-forward. But this effectively destroys
the commit hash, which is its ID.

The alternative I suggest is to use merge commits. Merge commits have two
parents instead of one. One commit points to the target branch, and the
other parent points to the commit to be merged. This way, no matter which
branch it is on, the commit is preserved (along with commit signatures, if
any). To cherry-pick commits from one such merge, cherry-pick the merge
commit instead, selecting the correct parent. `git cherry-pick -m 1
$mergeCommit`



I have to say first that I am aware there are probably reasons we do not
use merge commits already, and my own opinion is we should do whatever the
RM prefers, as they are doing the hard work here. Feel free to nix the idea.

Second, I also think it is fine to try and keep the PR down to 1 commit,
unless it is unreasonable to do so. (Squashing commits before the merge
seems OK.)

Finally, as I take cover from the rotten fruit I expect will be thrown my
way, I'll offer what I've found to be maybe a quick way to identify commits
that are "destructively" merged across branches:

> Essentially, there’s no easy way (as far as I can tell) to find the
upstream (master) commit IDs from such a multi-commit PR.


1) Identify a commit you want to cherry-pick (e7534aca8)

2) `git show e7534aca8` to see the changes as a diff.

3) Try to find some unique string somewhere in this diff and search
branches for it. ("www.connectfail502.test")

[~/trafficserver]
∴ git log --all --format='%h' -S"HOST = 'www.connectfail502.test'" | while
read hash; do git branch -r --contains "$hash" | sed "s;$; ($hash);"; done
|grep apache/|sort
  apache/7.1.x (835c0b113)
  apache/8.0.x (e7534aca8)
  apache/c++17 (e7534aca8)
  apache/connect_error_race (e7534aca8)
  apache/content_length (e7534aca8)
  apache/master (e7534aca8)
  apache/nexthop (e7534aca8)
  apache/origin_throttling (e7534aca8)
  apache/quic-07 (e7534aca8)
  apache/quic-11 (e7534aca8)
  apache/quic-12 (e7534aca8)
  apache/quic-latest (e7534aca8)
  apache/rm_f (e7534aca8)

Here I see the original commit was merged and became e7534aca8, and when it
was cherry-picked to 7.1.x, it became 835c0b113. The commit is also in
8.0.x and several other branches on the official repo that I have named
"apache" here.

This has to be done because the original commit hash was not used in the
cherry-pick, so we need to "join" them on something, and that may as well
be a string present in both commits.

This is not free from failure modes (reversions, non-unique strings,
conflicts/renames within the string), but at least it is a start.

On Tue, Aug 7, 2018 at 3:42 PM, Leif Hedstrom <zw...@apache.org> wrote:

> Hi,
>
> There are some serious shortcomings in Github when it comes to PRs with
> multiple commits. It not only makes it difficult to know what commits are
> associated with a PR, it also makes back porting / cherry-picking PRs from
> master to an LTS branch really problematic. Essentially, there’s no easy
> way (as far as I can tell) to find the upstream (master) commit IDs from
> such a multi-commit PR.
>
> To make this less confusing, at least until Github fixes this brain damage
> (if they can?), I’d like to propose that we have a strict policy for master
> (and 9.0.x) that there’s is exactly one commit per PR. This doesn’t mean
> that you shouldn’t break things up where it makes sense, it simply means
> you’d want to make multiple PRs off your branch instead.
>
> Thoughts?
>
> — Leif
>
> P.s
> If someone has a better idea, I’m willing to listen. But if you look at a
> PR with multiple commits, Github only shows one of those commits in the
> merge message. And the commit IDs it shows in the Commit tab is not the
> upstream commit IDs, but rather the local repos commit IDs (which are
> essentially useless for these purposes).




-- 
Derek

Reply via email to