On Fri, Mar 8, 2013 at 4:17 PM, Paul Burba <ptbu...@gmail.com> wrote: > On Tue, Mar 5, 2013 at 11:37 AM, Julian Foad <julianf...@btopenworld.com> > wrote: >> Paul Burba wrote: >>> On Wed, Feb 27, 2013 at 5:04 PM, Julian Foad wrote: >>>> Paul Burba wrote: >>>>> I found the cause of the conflict filled reintegrate merge. The >>>>> automatic merge code seems to be doing the right thing re Mark's >>>>> automatic merge above, the problem arises earlier. >>>> >>>> Paul, thanks for investigating. While I'm glad to hear the automatic >>>> merge was not the main root cause of the problem, it would make sense >>>> for it to detect that merges in the opposite direction have been done >>>> and at least warn the user. >>> >>> Hi Julian, >>> >>> What are we going to warn them to do exactly? >> >> What I was thinking of here is to detect the most general condition that >> isn't handled properly. Let me explain this part, although I'm sure you're >> familiar with it. >> >> Whenever the direction of merge is from A to B and the automatic merge code >> doesn't detect that a reintegrate merge is required, and yet one or more >> cherry-picked revision ranges and/or subtree merges have been merged from B >> to A, Subversion currently doesn't notice those merges from B to A, and >> proceeds to merge "everything" from A to B including those changes that were >> the result of B-to-A merges. That is what the plain "merge" from A to B has >> always done -- it has always ignored any mergeinfo describing merges from B >> to A. > > Yes, http://subversion.tigris.org/issues/show_bug.cgi?id=4255 for the curious. > >> In the example at hand, there has been a B-to-A merge of everything from B >> (except some non-operative revisions), and so we get loads of conflicts by >> ignoring this when we merge the other way. The intelligent user thinks, >> "This is clearly a reintegrate scenario." > > Do they? I'm not very confident that most users will immediately > think that, especially since we've deprecated the --reintegrate > option. Even if they do think "I need a reintegrate style merge!" I > suspect they are more likely to do just that, explicitly using the > --reintegrate option, rather than re-syncing B->A and then repeating > the automatic "reintegrate style" merge. > >> On the other hand, if there has been a B-to-A merge of not quite everything >> from B (leaving an operative unmerged part at the beginning or somewhere in >> the middle), then an old "reintegrate" merge from A to B would error out, >> whereas an old non-reintegrate merge would just do the merge and wrongly >> re-merge all the B-to-A changes back to B. The new automatic merge, if it >> determines (based on the merging of the branch root node) to do a >> non-reintegrate, currently does the latter. >> >> At some point, maybe not for 1.8, if we can detect any such condition, it >> would be a good to warn the user: >> >> A clean merge from A to B is not possible because some but not all >> changes have been merged from B to A. You're going to see conflicts >> (unless all your potential conflicts are trivially auto-resolved). > > But a reintegrate style merge already does this, if the automatic > merge could only determine up front that that is the correct type of > merge. > >> We could add some suggestions to the warning: >> >> If you are expecting this to be a reintegrate-like merge, check that >> all changes on B including non-operative revision ranges have been >> merged to A and recorded in the mergeinfo; try an automatic sync merge >> from B to A to fill in any gaps. >> >> If you are expecting a sync merge, note that there have been some >> cherry-pick and/or subtree merges from B to A, and this merge will >> attempt to merge these same changes back to B, which will likely >> produce conflicts. >> >> We could add more specific details of what's been found, and more specific >> suggestions related to the details, such as the case at hand where all >> operative revs have been merged B-to-A, if we think it's worth the effort. >> >> A basic implementation would not be cheap if there is a significant amout of >> subtree mergeinfo to check. And I have carefully avoided the issue of >> whether we should error out or proceed with the merge, as there's no point >> bike-shedding about that unless we really decide to do this. >> >> So let's take a closer look at what we could do for the specific case at >> hand. >> >> >>> To explicitly use >>> --reintegrate option? Because that would work in Stefan's case, so we >>> should just do it rather than warn...unless you mean something else. >> >> So, you're thinking more about detecting the very specific kind of scenario >> that Stefan encountered -- where everything except some leading >> non-operative revisions had been merged B-to-A, and a reintegrate would give >> the "correct" result. > > Why is "correct" in quotes? How is a reintegrate style merge ever the > wrong choice in that scenario? > > Actually, don't answer that, because I'm thinking of a more general > case: The missing non-operative revisions may be leading, as in > Stefan's case, or they may be interspersed with operative revisions > that have already been merged (including subtree merges where the > change is only operative in the given subtree -- i.e. would be a no-op > even if repeated at the branch root). Right now the automatic merge > code (i.e. merge.c:find_last_merged_location()) obviously doesn't > handle this, but we already have code that can answer this question: > svn_client_mergeinfo_log2[1]. > > Here's an example from our test suite of what I mean: > > Run merge_reintegrate_tests.py 13 then revert the local changes the > test leaves behind. We want to merge A_COPY to A, but there have been > prior cherrypicks and subtree merges from A to A_COPY: > >>svn pl -vR A_COPY > Properties on 'A_COPY': > svn:mergeinfo > /A:6 > Properties on 'A_COPY\B': > svn:mergeinfo > /A/B:5 > Properties on 'A_COPY\D\G\rho': > svn:mergeinfo > /A/D/G/rho:4 > Properties on 'A_COPY\D\H': > svn:mergeinfo > /A/D/H:3,6 > > However, we can see that these prior cherrypicks and subtree merges > actually constitute all operative revisions up to r8: > >>svn mergeinfo --show-revs eligible -R ^^/A A_COPY > r9 > > So an explicit reintegrate merge works fine: > >>svn merge ^^/A_COPY A --reintegrate > --- Merging differences between repository URLs into 'A': > U A\mu > --- Recording mergeinfo for merge between repository URLs into 'A': > U A > > But an automatic merge doesn't notice this, wrongly chooses a sync > style merge and results in spurious changes/conflicts: > >>svn revert -Rq . > >>svn merge ^^/A_COPY A > --- Merging r2 through r9 into 'A': > U A\mu > C A\D\H\psi > --- Recording mergeinfo for merge of r2 through r9 into 'A': > U A > Summary of conflicts: > Text conflicts: 1 > Conflict discovered in file 'A\D\H\psi'. > Select: (p) postpone, (df) diff-full, (e) edit, (m) merge, > (mc) mine-conflict, (tc) theirs-conflict, (s) show all options: p > > So my point is simply that if an explicit --reintegrate merge can > DTRT, why can't an automatic merge? > > [1] calculate_left_hand_side also does effectively the same thing -- > which makes me wonder if we can't use svn_client_mergeinfo_log2 there, > but that's another issue. > >> In an immediate sense, certainly we could "just do it", special-case this >> particular scenario (have you thought about what the exact condition should >> be?) and automatically choose the reintegrate code path in that case. > > See above "I'm thinking of a more general case". > >> The trouble is, that doesn't resolve the situation completely. It gives the >> correct result in terms of the changes merged, but it doesn't straighten out >> the mergeinfo. > > You mean the mergeinfo on the source? > >> That means we have to make sure the rest of the merge subsystem consistently >> handles this case too: the "sync"-style merge, merge graphing tools that >> determine what kind of merge was done and what kinds of merge are eligible, >> and so on. See below. > > I'm not sure how this is a problem. If we do a reintegrate style > merge in the general situation I describe above, how does that break > subsequent sync style merges? Can you point to an example? in terms > of determining eligible revisions, 'svn mergeinfo' already handles > these gaps. Which specific graphing tools are you speaking of? > >>> I filed an issue for this: >>> http://subversion.tigris.org/issues/show_bug.cgi?id=4329 and added a >>> test which demonstrates a very simple version of the problem. I set >>> issue #4329 a 1.8 blocker. While I think what Stefan is doing is >>> somewhat atypical[1], if we are deprecating --reintegrate, then >>> automatic merges should handle cases like this. After all, if we >>> explicitly use the --reintegrate option the merge works fine. What do >>> others think? >> >> I agree it's atypical (or rather unintended) usage, and I also acknowledge >> that if Stefan used merge like this, then sometimes other people will do too. >> >> However, I don't want us to put any more special cases in the code, >> especially without putting them into the *design*. If the old >> "--reintegrate" accepts a missing leading no-op revision range in >> considering whether "everything" has been merged, that should now be >> considered a bug. > > Seriously? As long as we allow cherrypicks and subtree merges I think > we want to support use cases like this...to call it a bug that we > DTRT, I guess I just don't get it. > >> The new rule is that if every revision on B is *recorded* as merged to A, >> then we can reintegrate from A to B, else we can't. > > I never came away with the understanding in all the discussion of > automatic merges...but maybe I'm the only one. > >> The main alternative would be to adopt some principle such as "When >> determining what's been merged and what needs to be merged, we only notice >> operative revisions, aka non-null changes". I considered that during the >> design of automatic merge, and was strongly drawn to the idea and started >> prototyping around it, but that turned out to be a rat-hole involving >> potentially exhaustive traversal of the whole mergeinfo graph (all branches) >> to determine whether a given bit of mergeinfo represented an operative or an >> inoperative change. And the benefits are near zero, because both sync and >> reintegrate merges were already filling in any gaps in the mergeinfo in the >> direction of the merge so the situation can only arise if you use manual >> merges. >> >> The sensible choice in the end was to require that revision ranges are >> explicitly recorded whether operative or not. >> >> Having found a case that we suspect may occur frequently enough to cause >> annoyance, I believe the best thing to do is to admit that we don't handle >> it automatically. Just because the old reintegrate handles this particular >> case doesn't mean we're making a mistake if the new automatic merge doesn't. >> This behaviour change is a small adjustment towards overall more consistent >> merge behaviour, and consistency is key to large-scale usability, much more >> so than special casing is. >> >> So, what to do exactly? Options seem to be: >> >> 0) leave it as it is >> >> 1) detect this specific case and warn or error out >> >> 2) detect this specific case and do a reintegrate > > The attached patch accomplishes, let's call it "2.5", because it > handles the more general case I outlined above, but not the case > represented by issue #4255. Would you mind taking a look?
fyi http://subversion.tigris.org/issues/show_bug.cgi?id=4329#desc7 explains my proposed patch in terms of how it changes the symmetric merge algorithm. -- Paul T. Burba CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development Skype: ptburba