I just know folks are going to hate me for mentioning this, but..... Tom Lord came up with a way to store/apply arbitrary rename applications against trees. Apparently my google-fu failed me tonight because I couldn't find the original reference/explanation. So bear with me as I recall the details...
Reminder: What follows does not take into account how SVN works, and is this written from the point of view of a generic version control system that does support renames in all of their goriness. Storage: You just store the renames as source/destination pairs as appropriate for your system Application of a series of renames: For Each Source Part of the Rename (ordered by deepest first (DFS I think)): Move the source AND its contents to <tempID>/source For Each Destination Part of the Rename (ordered by shallowest first (BFS I think)): Move the item and its contents from <tempID>/source -> the destination of the rename. This has the advantage of never caring how many items are involved in a dependent rename set or what order to apply them in because all dependencies are broken. The downside is that the process is expensive because it costs 2*(# of renames)*(# descendants of source of rename). The descendants matter if your system has an O(N) property for dealing with renames. This allows for weird rename sequences like: Assume you have a directory hierarchy: A/B/C/D Each of these has a file in it named: <dirname>.c i.e.: A/a.c A/B/b.c A/B/C/c.c A/B/C/D/d.c Additionally, we'll add some extra files in D: A/B/C/D/e.c and A/B/C/D/f.c Now the list of renames can begin: (Note: relative paths are listed here for the files to indicate that their containing parent didn't change) A/B/C -> A/C A/B -> A/C/B D/d.c -> D/e.c D/e.c -> D/f.c D/f.c -> D/g.c A/B/C/D -> A/C/E Now, quick, before I give the answer, what does the destination tree look like? Here we go: A/a.c A/C/c.c A/C/B/b.c A/C/E/e.c A/C/E/f.c A/C/E/g.c These renames show some of the nastier classes of renames: Tree Hierarchy changing ones, and simple dependent file renames where its container name changed. Here is the associated step by step application of the above pesudo-code: Deepest first means: A/B/C/D/f.c -> <temp1>/f.c A/B/C/D/e.c -> <temp2>/e.c A/B/C/D/d.c -> <temp3>/d.c A/B/C/D -> <temp4>/D A/B/C (and its contents: c.c) -> <temp5>/C A/B (and its contents: b.c) -> <temp6>/B Thus ends pass 1. Now we have pass 2: (ordered shallowest first by destination) <temp5>/C (and its contents: c.c) -> A/C <temp6>/B (and its contents: b.c) -> A/C/B <temp4>/D (and its contents: nothing atm) -> A/C/E <temp1>/f.c -> A/C/E/g.c <temp2>/e.c -> A/C/E/f.c <temp3>/d.c -> A/C/E/e.c Now we have the final tree structure. This approach also handles the A -> B and B -> A swap you were talking about Greg. Additional fun wrinkles to watch out for: Partial undoing of renames. (i.e. reverting a pending rename) Partial committing of renames. (committing A/C/E/e.c but not A/C/E/f.c) (If you allow this you undo non-committing renames on the set of renames before applying the submitted changes.) Partial merging of renames. Calculating the merged rename path where the destination tree has had hierarchy rearranging renames of its own. Resolving rename merge conflicts in such a way as to make sense to the user when you have an arbitrary tree structure you're merging into. A rename that is being merged consists of a source tree rename and a destination tree rename that you're trying to construct. The important parts of this operation are the source tree rename destination, the source tree rename destination parent (strdp), the current location of the source item in the destination tree and the current location of the strdp in the destination tree. (strdp') The suggested destination tree destination of the merged rename will be: strdp'\<the item's rename destination name> e.g.: You have this tree: A/B/C/c.c branch A -> A' rename A'/B -> A'/D rename A/B/C/c.c -> A/B/d.c merge A -> A' This merge needs to merge the c.c rename into A'. In this case strdp is A/B. strdp's name in branch A' is A'/D. The suggested merged rename destination is: A'/D/d.c Note: Just because I didn't show the two pass steps here don't mean it doesn't need to be done in the general case. :) The same approach is also used during checkin. (Someone could have checked in renames behind your back) (Although you should probably complain and say: updated, retest and then checkin) The same approach could also be used during update. (Pulling down someone else's rename on top of your already existing intersecting pending renames) How does this approach apply to Ev2? Not a clue at all. If you get me to ApacheCon (hah!) I'd be glad to draw on Whiteboards. ;) The above annoying cases are the major reasons that not many version control systems properly deal with renames. Perforce doesn't even try for example. But Bill, can't I just prevent some of these cases from occurring while pending changes in my working copy? Well, yes you can, but what are you going to do in the merge case or in the update case? Remember, anything you can do manually across more than checkin/commit will always be combined into one checkin/commit when you do a merge or an update. Bill On Sat, Nov 5, 2011 at 8:53 PM, Hyrum K Wright <hyrum.wri...@wandisco.com> wrote: > > > On Sat, Nov 5, 2011 at 7:21 PM, Greg Stein <gst...@gmail.com> wrote: >> >> Julian raised a question when I saw him in September: how does Ev2 >> deal with a node swap? ie. swap the contents of A and B, retaining >> metadata that they were moves [rather than copies from history]. >> >> In Ev2, we attempt to disallow "mv A B ; mv B C". The semantics around >> the interface say you should just do "mv A C". >> >> But to accomplish an A/B swap, you must "mv A temp; mv B A; mv temp >> B". Thus, A -> temp -> B, which is nominally disallowed in the Ev2 >> interface. >> >> Question: how to resolve this, given that we're starting to record >> *move* information? (the current interface could do: mv(replacing) A >> B; copy B@REV A) >> >> I've been thinking about a "multiple move" API to describe these kinds >> of changes. It isn't just a simple swap, since you could rotate >> content through an arbitrary set of N nodes. Suppose we had an >> interface that specified N nodes, and node[i] is moved to node[i+1], >> with the last one moving to node[0]. Would this be sufficient? I >> suspect that any moves *outside* of this ring uses the standard >> interface -- it is not participating in the rotation. Hrm. Maybe the >> API is named "rotate" rather than multiple move. (I say rotate because >> I'm envisioning the larger case where you have (say) 5 nodes, and each >> piece of content rotating to the next; looks like a clock or musical >> chairs) >> >> I'm not a master of topography or graphs, but I suspect that any given >> permutation of N nodes can be reduced to a set of rotations of subsets >> of those N nodes. Thus, a general "rotate" API should be sufficient. >> >> Cheers, >> -g >> >> ps. there is a corollary, unstated question of whether these kinds of >> rotations/multiple-move can be recorded in our WC schema. > > I'll do some thinking on this topic, but if you've got some time next week > during ApacheCon, it might be useful to have a sit down and hash some of > this out. > -Hyrum "down to 1154 test failures" Wright > > -- > > uberSVN: Apache Subversion Made Easy > http://www.uberSVN.com/ >