Stefan: I believe you are agreeing that the merge is equally complex in either direction, and describing how --reintegrate moves the responsibility for that complexity to the owner of the private branch and requires resolution before submission. I think you are saying this is a good thing because diff3 isn't perfect.

In my experience:

No merge is perfect. The situation is either complex, or it is not complex - and moving resolution to the private branch is a matter of process - not a matter of algorithm. That is, it is the responsibility of the team to decide that "we will always make sure our private branch is up to date before submitting to the integration stream."

In particular, if I have a stream with 100 users working in parallel, all submitting on a regular basis because it is their full-time paid job to work on a piece of software, it may be a race to actually get the submission in, depending on whether the algorithm can detect that the same files are being changed.

The first thing the tool can do to be genuinely useful in this situation is to accept some of the responsibility for detecting whether or not the race is one of these "diff3 is not idempotent" situations, and to provide automatic handling. If the case has been hit, then --reintegrate could be used as a form of "special error checking": it does the same as "merge", except when the merge has a true conflict with a particular element of the change set (as opposed to a potential conflict with the end result), where the results of diff3 would need to be "trusted". In that case it could bail and give the user the information required to resolve the conflict locally before submission.

The second thing the tool can do to be genuinely useful in this situation is to allow this check to be overridden. If I didn't trust diff3, I wouldn't use merges at all. Sometimes a source management tool just needs to help me resolve conflicts. Especially with merge tracking and intelligent designer workflows, many cases of so-called "conflicts" touch unrelated lines of code, and it *is* safe to complete the merge, even to the integration stream. I should have the ability to choose to do this, rather than race for submission with 100 other users.
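
To make the two points above concrete, here is a minimal sketch of the policy I have in mind. Nothing in it is a Subversion API: FileMerge, MergeBlocked, checked_merge and the allow_conflicts switch are all hypothetical names, and the per-file "conflicted" flag stands in for whatever the real merge would report.

```python
from dataclasses import dataclass


@dataclass
class FileMerge:
    """Hypothetical per-file outcome of an ordinary merge."""
    path: str
    conflicted: bool  # True when the three-way merge could not decide


class MergeBlocked(Exception):
    """Raised when conflicts must be resolved locally before submission."""


def checked_merge(results, allow_conflicts=False):
    """Behave like a plain merge, but bail on true conflicts unless overridden.

    Returns the list of conflicted paths so the caller can report them.
    """
    conflicted = [r.path for r in results if r.conflicted]
    if conflicted and not allow_conflicts:
        # First point: stop, and tell the user what to resolve locally.
        raise MergeBlocked("resolve locally first: " + ", ".join(conflicted))
    # Second point: the user chose to trust the merge and go ahead.
    return conflicted


results = [FileMerge("a.c", False), FileMerge("b.c", True)]
try:
    checked_merge(results)
except MergeBlocked as err:
    print(err)
print(checked_merge(results, allow_conflicts=True))
```

The default refuses to submit and names the files that need attention; passing the override lets the merge go through and leaves the decision with the user.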

The worst thing the tool can do is to declare that "diff3 is not idempotent, therefore it should be disabled" during --reintegrate. Yuck. This is a partial solution and, at least as I understand it, it is even dangerous. What happens if I use --reintegrate in a situation that actually does require merge resolution? Will every situation be blocked? Or will it take --reintegrate as a license to overwrite results, trusting that I can do all the necessary conflict checking myself? I have seen nothing so far that allows me to conclude that, architecturally, Subversion requires the --reintegrate behaviour. It's a shortcut in providing a complete branch merging solution for users of the system. Somebody started work on the canvas, and then drafted in the last corner rather than finish it. :-)

Cheers,
mark


On 01/30/2012 08:23 AM, Stefan Sperling wrote:
The same applies to "reintegrate", BTW. It is a Subversion-specific concept that might not be represented in CM theory because it is, as you point out, just a special case of the general merge (you didn't describe what "merge" means in your theory so I'm just going to make assumptions).

Just to make sure it's understood: When you create a branch, the origin
of the branch is an interesting bit of information. However, for
merging, it is entirely irrelevant if branch A was created from B or the
other way around. To illustrate:

     (1)
                +- b@r2 ---- b@r3 ----
      (branch) /              | (merge)
              /               v
        --- a@r1 -------------+- a@r4 ----

     (2)
        --- a@r1 ----------- a@r3 ----
              \               | (merge)
      (branch) \              v
                +- b@r2 ------+- b@r4 ----


Cases (1) and (2) are exactly equivalent as far as the merge algorithm
is concerned, but Subversion calls the first a reintegrate merge and the
second a sync merge, and treats them differently, as if branch (a) were
somehow special. It's not.
If you always use the 2-URL merge syntax all the abstractions go away
and you'll have symmetry.

  (1) svn co a@r4 wc; svn merge b@r2 b@r3 a
  (2) svn co b@r4 wc; svn merge a@r1 a@r3 b

See? Perfectly symmetrical.

Your example is too simple, though.
You only have one change being merged either way, and no cycles.

Generally, we want to avoid spurious conflicts from diff3 which happen
when changes are applied twice because diff3 is not idempotent.
I.e. we break the nice symmetry to work around a limitation of diff3.

In the following case we can avoid spurious conflicts by picking
our parameters carefully:

      (3)
                 +-b@r2--+ b@r3--b@r4-b@r5 ----
       (branch) /        ^             | (merge 2)
               /         | (merge 1)   v
         --- a@r1 ------a@r2-----------+- a@r6 ----

Merge 1 brings a@r2 into b@r2.
Merge 2 brings b@r4 into a@r5.

  (3.1) svn co b@r2 wc; svn merge a@r1 a@r2 b

There are two ways of performing merge 2.
The first is symmetrical and re-applies a@r2 to a@r6, via b@r3,
with possible spurious conflicts from diff3:

  (3.2 a) svn co a@r5 wc; svn merge b@r2 b@r5 a

The second does not re-apply a@r2, so there are no possible conflicts
from diff3 because of a@r2/b@r3. Only b@r4 can conflict.

  (3.2 b) svn co a@r5 wc; svn merge b@r3 b@r5 a

The result is the same, however.
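
The difference between (3.2 a) and (3.2 b) can be illustrated with a toy model. The sketch below is not Subversion's diff3: merge3 is a hypothetical, deliberately simple line-by-line three-way merge that assumes all versions have the same number of lines, and the file contents are made up. It also assumes that b@r4 edits the same line that a@r2 changed.

```python
CONFLICT = "<<< conflict >>>"


def merge3(left, right, mine):
    """Apply the change left -> right on top of mine, one line at a time."""
    assert len(left) == len(right) == len(mine)
    merged = []
    for l, r, m in zip(left, right, mine):
        if r == l:
            merged.append(m)         # incoming side left this line alone
        elif m == l or m == r:
            merged.append(r)         # only incoming changed it, or both agree
        else:
            merged.append(CONFLICT)  # both sides changed it differently
    return merged


a_r1 = ["x = 1", "y = 1"]
b_r2 = list(a_r1)                  # branch b is created as a copy of a@r1
a_r2 = ["x = 2", "y = 1"]          # change committed on a
b_r3 = merge3(a_r1, a_r2, b_r2)    # (3.1) merge 1: a@r2 arrives on b
b_r5 = ["x = 3", "y = 2"]          # b@r4 and b@r5: further edits on b
a_r5 = list(a_r2)                  # a is unchanged since r2

merge_a = merge3(b_r2, b_r5, a_r5)  # (3.2 a): left side is b@r2
merge_b = merge3(b_r3, b_r5, a_r5)  # (3.2 b): left side is b@r3
print(merge_a)  # ['<<< conflict >>>', 'y = 2']
print(merge_b)  # ['x = 3', 'y = 2']
```

With the left side at b@r2, the merge tries to carry a@r2's change across a second time and collides with the copy already on a. With the left side at b@r3, only b's own changes are applied and the merge is clean.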

What we use during --reintegrate is (3.2 b).
You can argue that this approach is broken and we should be using (3.2 a)
for symmetry, and let users deal with spurious conflicts.

But (3.2 b) is always correct and more convenient if diff3 fails to
produce a conflict-free diff when b@r3 is applied to a@r5.
So why not use it?

Alternatively, do you know of a diff3 replacement that is idempotent?


--
Mark Mielke<m...@mielke.cc>
