On 24.06.2013 17:22, Julian Foad wrote: > For move tracking we need a move-aware editor. svn_editor_t ("Ev2") is > designed to support moves. However, I'm not clear how its move support is > meant to work, on a fairly basic level: > > * What are the ordering rules for moves? > > * Are the move operations to be interprested in series or in parallel? > > * When an operation refers to a node that is involved in a move, how are > the paths to be interpreted?
I find that answeres to all of these questions can be inferred from the Once Rule. The way I interpret that, there's no restriction on the order of any operation, as long as the Once Rule is obeyed. I have not been able to come up with a case yet where the available operations would not cover all possible tree mutations while still adhering to that rule. That IMO makes the question of serial vs. parallel immaterial. > == Cyclic moves == > > We need to accommodate arbitrary sets of moves, including cycles. For an > example, we'll use a simple cycle, swapping two siblings A and B. This > ability is needed for two reasons: > > * Ev1 supports this (del A, copy A from B@OLD, del B, copy B from A@OLD). This is a cyclic copy, not a cyclic move. EV2 has the rotate operation that's intended to cover the effect of $ svn rename A tmp; $ svn rename B A $ svn rename tmp B Note that the above is the sequence of operations in the working copy, but the end result, as far as tree mutations are concerned, is rotate(A, B) -- regardless of how or if you want to represent intermediate states. > == Swap or Rotate == > > swap(A,B) == { A->B || B->A } > > is a primitive operation. ('||' indicates 'in parallel'.) > > rotate(A,B,...N) == { A->B || B->... || ...->N || N->A } > > is non-primitive, since any rotate can be composed from multiple sequential > swaps. For example, rotate(A,B,C) == { swap(B,C); swap(A,B) }. It is no > more useful to rotate more than two paths in one operation (rotate(A,B,C)) > than to move more than two paths in one operation (move(A,B,C) == { A->B || > B->C }). > > So if we have a sequential scheme and we don't want to use temporary paths, > we should include the 'swap' primitive rather than a 'rotate'. I don't understand why you want to devolve to more primitive operations. Most of the EV2 ops are not primitive; the add_* and alter_* can all be viewed as a combination of simpler operations. Furthermore, replacing rotate with swap would place the burden of representing a rotation with a series of swaps on every edit driver, which doesn't make much sense. Rotate is just a generalized form of swap. Given how moves are currently described in the working copy, rotate fits more naturally into the model anyway -- all you have to do is follow the moved-from links until you find a cycle, and your rotation is known. I can't think of a good reason to break that down into swaps. > == Ev2 == > > One of Ev2's goals is to parallelize text modifications, in order to take > advantage of Serf's parallelism. What about tree changes, and specifically > what are the semantics of the 'move' operation? The Ev2 documentation > indicates sequential requirements for the 'add' operation (parent before > children), and also in this rule: > > * - The ancestor of an added, copied-here, moved-here, rotated, or > * modified node may not be deleted. The ancestor may not be moved > * (instead: perform the move, *then* the edits). > > The doc string for 'svn_editor_move' says: > > * Move the node at @a src_relpath to @a dst_relpath. > * > * @a src_revision specifies the revision at which the receiver should > * expect to find this node. That is, @a src_relpath at the start of > * the whole edit and @a src_relpath at @a src_revision must lie within > * the same node-rev (aka history-segment). This is just like the > * revisions specified to svn_editor_delete() and svn_editor_rotate(). > * ... > > The discussion implies that "the node at src_relpath" refers to a node in the > initial tree state. In that respect, it's a parallel operation: the source > reference doesn't depend on previous moves. But look at the 'move_cb' > implementation in libsvn_fs/editor.c: > > # src_root == *initial* tree state > > svn_fs_copy(src_root, src_fspath, root, dst_fspath, scratch_pool) > /* Notice: we're deleting the src repos path from the dst root. */ > # root == *current* tree state (txn) > svn_fs_delete(root, src_fspath, scratch_pool) > > The delete of src_relpath within the current tree state is potentially at > odds with the copy from src_relpath relative to the initial state. The > delete would delete the wrong thing if the original node at this path has > already been replaced, or would try to delete a non-existent node if it had > already been deleted or moved away. Could it have been replaced or deleted > or moved away, if not itself then perhaps by a copy/move/add/delete of a > parent directory? This is what the once rule prevents. One of the reasons for it, as far as I could determine, is to be able to detect tree incompatible tree changes (i.e., changes that will make a commit fail) early in the edit instead of having to wait for the commit-time rebase to fail. > > Let's try the following scenario: > > A->A2 || A/B->B2 # move a child of a moved parent > > Try with Ev2: > > move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1) > move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1) You're breaking the once rule here. And the case you're describing can never occur. You cannot have a working copy that describes what you're doing. Tree mutations can only be parallelized across distinct subtrees, which isn't the case in your example; where the operations interact, they must be sequential or they don't mean anything. Your case is either: A->A2; A2/B -> B2 or A/B -> B2; A -> A2 which happen to be the two simplest sequences of working copy changes that'll generate your end result. > The implementation shown above would fail in the second move when trying to > delete 'A/B', because that path no longer exists in the transaction after the > first move deleted the whole subtree at 'A'. > > The other way works fine: > > move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1) > move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1) > > So these moves are not parallelizable with this implementation. (Is that > implementation wrong?) > > Ev2 also documents (as the first Driving Order Restriction) that there must > be an alter_directory call for the parent of moved-away node 'A/B'. What > does this look like? > > alter_dir('A', ...) > > or > > alter_dir('A2', ...) > > ? Can the alter_dir come before or after the move of 'A' to 'A2'? Is the > path it references 'A' in either case, or 'A2' in either case, or 'A' if it > comes before the move and 'A2' if it comes after the move? > > Can we start with a clear consensus of whether we're trying to deliver a > sequential or a parallel editor? Both :) However I am a bit worried that the "once rule", as stated, is always broken in the case you describe. -- Brane -- Branko Čibej | Director of Subversion WANdisco // Non-Stop Data e. br...@wandisco.com