Re: Subtree mergeinfo -- what I learnt at the Hackathon

Julian Foad Mon, 09 Jul 2012 13:17:31 -0700

To move forward and decide what behaviour is right, we need to be able to 
compare the 1.7 behaviour with the proposed behaviour in *specific* scenarios.  
So we need to be able to enumerate the specific scenarios that we mean by the 
general term "merging with subtree mergeinfo".  This is what I am doing 
currently.

The twist is that 1.7 merges and an ideal symmetric merge are best enumerated 
in different ways.  For example, the 1.7 non-reintegrate merge 
(let's say from branch A to branch B) doesn't look at any mergeinfo indicating 
merges from B to A, so the primary way to categorize those cases is by their 
last complete A->B merge (of the root of the merge, and of a subtree).  B->A 
merges can also affect the result if performed after the last A->B merge, so we 
categorize secondarily by their B->A mergeinfo.  By contrast, an ideal 
symmetric merge only cares about the last complete merge (A->B or B->A); any 
earlier merges in the other direction make no difference at all.

After separately enumerating the 1.7 cases and the (ideal) symmetric merge 
cases, we can split or combine categories as necessary to merge these cases 
into a single list.  Then we can decide what constitutes "same" or 
"backward-compatible" behaviour in each case.  Don't be suspicious -- I'm not 
trying to twist the word "compatible" to mean something else -- rather, what I 
mean is that not all of the possible scenarios are ones in which the 1.7 
behaviour is "good".  For example, we know that in 1.7 if the last complete 
merge was A->B (in r10, say) and then you reintegrate a subtree (B/foo -> A/foo 
in r20), then try to sync A->B again, Subversion will not notice that r20 merge 
and will attempt to re-merge r19:20 of A/foo into B/foo, which "works" only in 
rather trivial cases (due to auto-resolving of some types of duplicate change) 
and I trust we can agree it is in general wrong.
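
To make that scenario concrete, here is a toy model in Python.  It is not 
Subversion's actual algorithm: the function name and the data layout are made 
up for illustration, and it assumes only that a 1.7 sync merge A->B selects 
every source revision after the last complete A->B merge without consulting 
any B->A mergeinfo.

```python
def eligible_in_17(source_revs, last_a_to_b):
    """Toy model (assumption): revisions of the source that a 1.7 sync
    merge A->B would try to merge, looking only at A->B mergeinfo."""
    return sorted(r for r in source_revs if r > last_a_to_b)


# Hypothetical history of A/foo: revision -> what the commit was.
a_foo = {8: "edit", 15: "edit", 20: "reintegrate of B/foo"}

# The last complete A->B merge was r10, as in the example above.
to_merge = eligible_in_17(a_foo, last_a_to_b=10)

# r20 is selected even though its changes originated on B, so merging
# it back into B/foo would apply duplicate changes.
duplicates = [r for r in to_merge if a_foo[r].startswith("reintegrate")]
```

In this model r20 ends up in the list of revisions to merge, which is the 
duplicate re-merge described above.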

So far, I've only been compiling this categorization from my head; the next 
step is to write tests to confirm or correct this.

CATEGORIZING SUBTREE MERGES

For the purposes of this categorization, we consider:

  * The merging history of the root 'R' (the root node of the requested merge 
source and target trees), and of a subtree 'S' (a single node or subtree whose 
merging history differs from that of the root node in a significant way).  Only 
one subtree is considered; multiple subtrees are assumed to be handled 
independently, even if they are nested (such as root 'A', subtree 'A/D' whose 
history differs from 'A', and subtree 'A/D/foo' whose history differs from 
'A/D').

  * The "last complete merge" of the Root (in one or both directions), and the 
"last complete merge" of the Subtree (in one or both directions).  The "last 
complete merge" in direction A->B means the last revision R for which all 
revisions on A, up to and including R, are (currently) recorded as having been 
merged from A to B.  This state could have been reached through any kind of 
merge or sequence of merges; all that matters is what the current mergeinfo 
says has been merged.

  * There is an assumption that the actual content changes were in fact merged 
in accordance with what the mergeinfo says, subject to any editing that was 
required to resolve any conflicts detected by the merge process and any 
semantic conflicts.
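
As a minimal sketch of the "last complete merge" definition above, in Python: 
the function name, the inclusive (first, last) range representation, and the 
'base' parameter (standing for the revision of the YCA) are all assumptions 
made for illustration, and details such as inoperative revisions are ignored.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first, last) revision numbers


def last_complete_merge(merged: List[Range], base: int) -> Optional[int]:
    """Return the highest R such that every revision in base+1..R is
    covered by the recorded ranges, or None if base+1 is not covered."""
    last = base
    for first, end in sorted(merged):
        if first > last + 1:
            break  # gap: revisions last+1..first-1 are not recorded
        last = max(last, end)
    return last if last > base else None
```

Only the currently recorded ranges matter here, not the sequence of merges 
that produced them, which matches the definition given above.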

CATEGORIZING SUBTREE MERGES: 1.7 Non-Reintegrate

These cases are for a non-reintegrate merge A->B.

In each row of this table, up to two Root merges are indicated, and their 
relative ordering is significant; similarly for Subtree merges.  The ordering 
of R merges relative to S merges is not significant.

    Root     | Subtree   | Behaviour
    ---------+-----------+-----------------------------------
 1. never    | same      | OK (not a subtree scenario)
             +-----------+-----------------------------------
 2.          | [S<] S>   | Merge all needed changes
             +-----------+-----------------------------------
 3.          | [S>] S<   | All needed; & some duplicates in S
    ---------+-----------+-----------------------------------
 4. [R<] R>  | same      | OK (not a subtree scenario)
             +-----------+-----------------------------------
 5.          | never     | Merge all needed changes 
 6.          | [S<] S> * |
             +-----------+-----------------------------------
 7.          | [S>] S<   | All needed; & some duplicates in S
    ---------+-----------+-----------------------------------
 8. [R>] R<  | same      | All needed; & some duplicates in R
             |           | (not a subtree scenario)
             +-----------+-----------------------------------
 9.          | never     | All needed; & some duplicates in R
10.          | [S<] S>   |
             +-----------+-----------------------------------
11.          | [S>] S< * | All needed; & some duplicates in R and S
    ---------+-----------+----------------------------------- 

Key:
  *      -- S> not at same revision as R>, or S< not same as R<.
  R>     -- last complete Root merge in direction A->B.
  R<     -- last complete Root merge in direction B->A.
  never  -- never been merged in either direction since the YCA of A and B.
  [S<]   -- optional merge: covers both the case with no 'S<' merge and the
            case with an earlier 'S<' merge.
  duplicates -- changes that are already present in the target.

Example:  Row 11 represents the case where the Root's last complete merge was 
in the B->A direction, and its last complete A->B merge was earlier or never; 
and the Subtree likewise.  The Root's last complete merge was at a different 
revision from the Subtree's (either earlier or later).

CATEGORIZING SUBTREE MERGES: 1.7 Reintegrate

These cases are for a reintegrate merge B->A.

Treat this table as just a rough first draft for now; I'm not sure if this is 
the best way to categorize the reintegrate cases, and I need to investigate 
this more thoroughly and test it.

    Root     | Subtree   | Behaviour
    ---------+-----------+-----------------------------------
 1. never    | same      | OK (not a subtree scenario) 
             +-----------+-----------------------------------
 2.          | [S<] S>   | Rejected
 3.          |    S<     | Rejected??
 4.          |  S>  S<   | Rejected
    ---------+-----------+-----------------------------------
 5. [R<] R>  | same      | OK (not a subtree scenario) 
             +-----------+-----------------------------------
 6.          | never     | Rejected 
 7.          | [S<] S> * | Rejected
 8.          |    S<     | Rejected?? 
 9.          |  S>  S<   | Rejected
    ---------+-----------+-----------------------------------
10. [R>] R<  | same      | ?? (not a subtree scenario)
             +-----------+-----------------------------------
11.          | never     | Rejected?? )
12.          | [S<] S>   | Rejected   > non-reint. cases
13.          | [S>] S<   | Rejected   )
    ---------+-----------+----------------------------------- 

CATEGORIZING SUBTREE MERGES: Ideal Symmetric Merge

Now we consider the ideal symmetric merge (not what's currently implemented).

This is primarily concerned with the last complete merge of the root, in 
whichever direction that was, and similarly the last complete merge of the 
subtree.  The (earlier) last complete merge of the root in the other direction 
is not significant, nor is that of the subtree.

    Root  | Subtree       | Ideal behaviour
    ------+---------------+----------------------------------
 1.       | same          | OK (not a subtree scenario)
    R> or +---------------+----------------------------------
 2. never | S> later      | Merge all needed changes
 3.       | S< later      | 
    ------+---------------+ 
 4. R>    | never         |
 5.       | S> earlier    |
 6.       | S< earlier    |
    ------+---------------+----------------------------------
 7.       | same          | OK (not a subtree scenario)
    R<    +---------------+----------------------------------
 8.       | S> later      | Merge all needed changes
 9.       | S< later      |
10.       | never         |
11.       | S> earlier    |
12.       | S< earlier    |
    ------+---------------+---------------------------------- 

Key:
  R>       The last complete merge of Root was in the A->B direction.
  R<       The last complete merge of Root was in the B->A direction.
  never    Has never been merged between A & B in either direction.
  same     Same time and same direction as last complete merge of Root.
  earlier  Earlier than the last complete merge of the Root.
  later    Later than the last complete merge of the Root.
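
The table above can be read as a small classification function.  The sketch 
below is in Python; the function name and the (direction, when) tuple are 
hypothetical, with 'when' being any integer that orders the merges in time 
(a revision number, say) and None meaning "never".  It returns only the row 
number, not the behaviour.

```python
from typing import Optional, Tuple

# (direction, when): direction is ">" for A->B or "<" for B->A.
Merge = Tuple[str, int]


def symmetric_row(root: Optional[Merge], sub: Optional[Merge]) -> int:
    """Map the last complete merges of Root and Subtree to a table row."""
    root_dir = root[0] if root is not None else None
    if root == sub:
        # "same": identical time and direction (or both never merged).
        return 7 if root_dir == "<" else 1
    if sub is None:
        # Subtree never merged, but the Root has been.
        return 4 if root_dir == ">" else 10
    sub_dir, sub_when = sub
    if root is not None and sub_when == root[1]:
        raise ValueError("same time but different direction")
    later = root is None or sub_when > root[1]
    if root_dir == "<":
        if later:
            return 8 if sub_dir == ">" else 9
        return 11 if sub_dir == ">" else 12
    # Root is R> or never.
    if later:
        return 2 if sub_dir == ">" else 3
    return 5 if sub_dir == ">" else 6
```

For instance, symmetric_row((">", 10), ("<", 5)) gives row 6: the Root was 
last merged A->B, and the Subtree was last merged B->A at an earlier time.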

I'm working on integrating these tables, and also on writing tests (or rather 
scenarios using the test suite framework).

Please let me know if this seems like the approach we need, or any other 
thoughts.

- Julian

I (Julian Foad) wrote:
> [...] I'm not at all demanding we break backward
> compatibility.  Sorry if it sounded like it.  I'm just saying that
> we're proposing to change the behaviour of the plain merge command,
> and in doing that we need to work out what the details of the new
> behaviour will be, and this thread is helping us to do just that.
> I ended up with a bias towards trying to move toward a more rename-
> friendly approach, but I recognise we can't get there yet so the
> "follow each node's own ancestry" idea is just an idea for the
> future.  We need a simpler approach for now.
> 
> Stefan Sperling wrote:
[...]
>> Agreed. Ideally, the symmetric merge will support all currently supported
>> use cases, without throwing errors at users or requiring new command-line
>> switches.
>> 
>> I haven't yet made up my mind about interim measures for 1.8 though.
>> I suppose if symmetric merge won't support all currently supported use cases
>> in 1.8, we could keep the --symmetric option in place for 1.8, and drop it 
>> in 1.9 or later once the symmetric merge code can handle all use cases?
