Re: Implicit keep-alive after reintegrate merge

Mark Mielke Mon, 30 Jan 2012 07:01:54 -0800

Stefan: I believe you are agreeing that the merge in either direction is the same complexity, and describing how --reintegrate moves the responsibility for the complexity to the owner of the private branch, and requires resolution before submission. I think you are saying this is a good thing because diff3 isn't perfect.


In my experience:

No merge is perfect. The situation is either complex, or it is not complex - and moving resolution to the private branch is a matter of process - not a matter of algorithm. That is, it is the responsibility of the team to decide that "we will always make sure our private branch is up to date before submitting to the integration stream."

In particular, if I have a stream with 100 users working in parallel, all submitting on a regular basis because it is their full-time paid job to work on a piece of software, it may be a race to actually get the submission in - depending on whether the algorithm can detect that the same files are being changed or not.

The first thing the tool can do to be genuinely useful in this situation, is to accept some of the responsibility of detecting whether or not the race is one of these "diff3 is not idempotent" situations, and providing automatic handling. If the case has been hit, then --reintegrate could be used as a form of "special error checking" where it does the same as "merge", except in the case that the merge has a true conflict with any particular element of the change set (as opposed to a potential conflict with the end result), where the results of diff3 would need to be "trusted", then it could bail and provide the user with the information required to resolve the conflict locally before submission.

The second thing the tool can do to be genuinely useful in this situation, is to allow for this check to be overridden. If I didn't trust diff3 - I wouldn't use merges at all. Sometimes a source management tool just needs to help me resolve conflicts. Especially with merge tracking and intelligent designer workflows, many cases of so called "conflicts" touch unrelated lines of code, and it *is* safe to complete the merge, even to the integration stream. I should have the ability to choose to do this, rather than race for submission with 100 other users.
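
To make the distinction between a true conflict and changes that merely touch the same files concrete, here is a minimal sketch in Python. It is an illustration only, not anything Subversion implements: the function names, the file paths, and the representation of a change set as a mapping from path to inclusive (start, end) line ranges against a common base are all assumptions.

```python
def ranges_overlap(a, b):
    """True if two inclusive (start, end) line ranges share at least one line."""
    return a[0] <= b[1] and b[0] <= a[1]


def true_conflicts(ours, theirs):
    """Return (path, our_range, their_range) for every pair of hunks that
    touch the same lines of the same file.

    Both arguments are hypothetical change sets: dicts mapping a path to a
    list of inclusive (start, end) line ranges, relative to a common base.
    """
    hits = []
    for path in sorted(ours.keys() & theirs.keys()):
        for our_range in ours[path]:
            for their_range in theirs[path]:
                if ranges_overlap(our_range, their_range):
                    hits.append((path, our_range, their_range))
    return hits


# Hypothetical submissions racing for the integration stream.
mine = {"src/parser.c": [(10, 14)], "README": [(1, 2)]}
theirs_unrelated = {"src/parser.c": [(200, 210)], "Makefile": [(5, 5)]}
theirs_overlapping = {"src/parser.c": [(12, 20)]}

print(true_conflicts(mine, theirs_unrelated))
print(true_conflicts(mine, theirs_overlapping))
```

In this toy model the first pair of submissions shares a file but no lines, so nothing needs human resolution, while the second pair overlaps on src/parser.c and would be the case where the tool should stop and ask.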

The worst thing the tool can do is to declare that "diff3 is not idempotent therefore it should be disabled" during --reintegrate. Yuck. This is a partial solution and at least as I understand it - it is even dangerous. What happens if I use --reintegrate in a situation that actually does require merge resolution? Will every situation be blocked? Or will it take --reintegrate as a license to overwrite results, trusting that I can do all the necessary conflict checking myself? I have seen nothing so far that allows me to conclude that architecturally, Subversion requires the --reintegrate behaviour. It's a short cut in providing a complete branch merging solution for users of the system. Somebody started work on the canvas, and then drafted in the last corner rather than finish it. :-)


Cheers,
mark


On 01/30/2012 08:23 AM, Stefan Sperling wrote:

The same applies to "reintegrate", BTW. It is a Subversion-specific concept that might not be represented in CM theory because it is, as you point out, just a special case of the general merge (you didn't describe what "merge" means in your theory so I'm just going to make assumptions).

Just to make sure it's understood: When you create a branch, the origin
of the branch is an interesting bit of information. However, for
merging, it is entirely irrelevant if branch A was created from B or the
other way around. To illustrate:

     (1)
                +- b@r2 ---- b@r3 ----
      (branch) /              | (merge)
              /               v
        --- a@r1 -------------+- a@r4 ----

     (2)
        --- a@r1 ----------- a@r3 ----
              \               | (merge)
      (branch) \              v
                +- b@r2 ------+- b@r4 ----


Cases (1) and (2) are exactly equivalent as far as the merge algorithm
is concerned, but Subversion calls the first a reintegrate merge and the
second a sync merge, and treats them differently, as if branch (a) were
somehow special. It's not.

If you always use the 2-URL merge syntax all the abstractions go away
and you'll have symmetry.

  (1) svn co a@r4 wc; svn merge b@r2 b@r3 a
  (2) svn co b@r4 wc; svn merge a@r1 a@r3 b

See? Perfectly symmetrical.

Your example is too simple, though.
You only have one change being merged either way, and no cycles.

Generally, we want to avoid spurious conflicts from diff3 which happen
when changes are applied twice because diff3 is not idempotent.
I.e. we break the nice symmetry to work around a limitation of diff3.

In the following case we can avoid spurious conflicts by picking
our parameters carefully:

      (3)
                 +-b@r2--+ b@r3--b@r4-b@r5 ----
       (branch) /        ^             | (merge 2)
               /         | (merge 1)   v
         --- a@r1 ------a@r2-----------+- a@r6 ----

Merge 1 brings a@r2 into b@r2.
Merge 2 brings b@r4 into a@r5.

  (3.1) svn co b@r2 wc; svn merge a@r1 a@r2 b

There are two ways of performing merge 2.
The first is symmetrical and re-applies a@r2 to a@r6, via b@r3,
with possible spurious conflicts from diff3:

  (3.2 a) svn co a@r5 wc; svn merge b@r2 b@r5 a

The second does not re-apply a@r2, so there are no possible conflicts
from diff3 because of a@r2/b@r3. Only b@r4 can conflict.

  (3.2 b) svn co a@r5 wc; svn merge b@r3 b@r5 a

The result is the same, however.
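
The difference between (3.2 a) and (3.2 b) can be sketched with a toy three-way merge in Python. This is not diff3 and not Subversion's implementation: it assumes files of equal length merged line by line, and the file contents and variable names are made up for illustration. `svn merge LEFT RIGHT WC` is modelled as `merge3(base=LEFT, ours=WC, theirs=RIGHT)`.

```python
class Conflict(Exception):
    """Raised when both sides changed the same line in different ways."""


def merge3(base, ours, theirs):
    """Toy line-by-line three-way merge over equal-length files (assumption)."""
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if t == b:        # incoming side left the line alone
            merged.append(o)
        elif o == b:      # only the incoming side changed the line
            merged.append(t)
        elif o == t:      # both sides made the identical change
            merged.append(o)
        else:
            raise Conflict(f"base={b!r} ours={o!r} theirs={t!r}")
    return merged


# Hypothetical file contents for diagram (3).
a_r1 = ["x = 1", "y = 2"]
b_r2 = list(a_r1)                  # branch b is created from a@r1
a_r2 = ["x = 10", "y = 2"]         # the change made on a
b_r3 = merge3(a_r1, b_r2, a_r2)    # merge 1, as in (3.1)
b_r5 = ["x = 10 + 1", "y = 3"]     # later edits on b, one on the merged line
a_r5 = list(a_r2)                  # state of a just before merge 2

# (3.2 a): left side is b@r2, so the change from a@r2 is applied again.
try:
    merge3(b_r2, a_r5, b_r5)
    outcome_a = "clean"
except Conflict:
    outcome_a = "conflict"

# (3.2 b): left side is b@r3, so the change from a@r2 is skipped.
result_b = merge3(b_r3, a_r5, b_r5)

print(outcome_a)
print(result_b)
```

In this model (3.2 a) stops on the first line, because base, ours and theirs all differ there even though a already contains the change, while (3.2 b) merges cleanly and yields the contents of b@r5.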

What we use during --reintegrate is (3.2 b).
You can argue that this approach is broken and we should be using (3.2 a)
for symmetry, and let users deal with spurious conflicts.

But (3.2 b) is always correct and more convenient if diff3 fails to
produce a conflict-free diff when b@r3 is applied to a@r5.
So why not use it?

Alternatively, do you know of a diff3 replacement that is idempotent?



--
Mark Mielke
