I agree that a symmetric merge algorithm would suffice for both 'sync' and 
'reintegrate' merges.


I'm starting to write this up here: 
<http://wiki.apache.org/subversion/SvnMergeTheory>.

Subversion's current merge implementation makes assumptions in both the 'sync' 
and 'reintegrate' cases.  (The part of the 'algorithm' I'm talking about is the 
part that looks at the mergeinfo and determines which three-way merge to 
perform next.)  Specifically, I think the main assumptions are that 'sync' 
assumes the base for the 3-way merge is going to be on the source branch, while 
'reintegrate' assumes the base is going to be on the target branch.


A slightly more general merge algorithm should be able to pick a base on either 
the source or the target branch depending on what was merged before, and so 
satisfy both of those cases.  In order to do so, it needs to look more deeply 
at the mergeinfo that the courrent implementation does. 


Branko Čibej wrote:
> [...] given the slightly more complex example:
> 
>                  +b@r2--b@r4-+-b@r5---b@r6---+--b@r8-----
>        (branch) /            ^                ^ (merge 2) \
>                /              | (merge 1)      |            \ (merge 3)
>          --- a@r1---a@r3-----+---------a@r7--+-------------+-a@r9 ---
> 
> the results should be, effectively:
> 
>   * merge 1:
>     diff3 b@r4 a@r1 a@r3 | patch b
>   * merge 2:
>     diff3 b@r6 a@r3 a@r7 | patch b
>   * merge 3:
>     diff3 a@r7 b@r2 b@r4 | patch a
>     diff3 a@r7 b@r5 b@r6 | patch a

(I'm assuming that your b@r2 in this example is a simple copy with no diff,
 otherwise the first part of 'merge 3' should use a@r1 instead of b@r2 for its 
base.)

But no, merge 3 should be even simpler than that.  We don't have to go and 
fetch the old (b@r2:b@r4) and (b@r5:b@r6) diffs again and merge them into (a), 
because that's already been done recently and stored as (b@r8).  So all we have 
to do is locate the youngest common ancestor of (b@r8) and (a@head), which is 
(a@r7), and do a single 3-way merge:

  * merge 3: (mine=a@head, old=a@r7, yours=b@head)
    diff3 a@r7 a@r7 b@r8 | patch a

(In this case this merge is trivial because mine==old, but if there had been a 
further change on (a), say in (a@r8), then it would be non-trivial.)

> Merge 3 is a cherry-pick merge, of course. But whatever you do, you
> always pick your common ancestor so that it's the most recent merge
> point from the revision you're merging, and the "myfile" is always 
> the
> most recent version on the branch you're merging to (or, effectively, in
> the target WC).
> 
> You'll be cherry-picking as long as both branches are being actively
> modified, but you always have to do the check.

I don't follow all that about cherry-picking.

> The merge algorithm is symmetric.

That, I do agree with.

> SVN's mergeinfo has all the data that are
> required to automate this merging, it's just not being used correctly.

I know that Subversion stores sufficient info after a sync, and how to derive 
those merge arrows accurately from it; I'm about to check exactly what 
mergeinfo it stores after a reintegrate and make sure that's also sufficient.

Locating the youngest common ancestor is a step that needs further 
explanation; it's not trivially obvious.  We mean to follow the "merge 
arrows" that are shown on these graphs, as well as the basic lines of 
history.  Each of these "merge arrows" means that all changes in the 
source branch have been merged into the target branch at that point.  When 
modify this algorithm to cope with cherry-picking, that is
 to skip revisions that have already been cherry-picked, which we will need to 
do, then we will also have another kind of merge arrow on our graphs, a kind 
that doesn't mean all source changes have been merged to the target.  The 
algorithm will get more complex.  I won't talk about that here.)

- Julian

Reply via email to