Thanks, Stefan! Very interesting thoughts. Especially... (scroll down)...

Stefan Fuhrmann wrote:
On Wed, Sep 11, 2013 at 5:21 PM, Julian Foad wrote:
>> http://wiki.apache.org/subversion/MoveDev/MoveDev#Move_Semantics
>> http://wiki.apache.org/subversion/MoveDev/MovesInFSFS
>> [...]
>>
>> One issue that may be harder than it sounds at first is the concept of
>> 'node-line-id' rather than (node-id, copy-id) as the basis of the
>> definition. The point is that when we copy (ordinary copy, not move)
>> a directory, we lazy-copy the children, which means each child keeps
>> its old (node-id, copy-id) unless and until it is modified. That's
>> great for achieving the O(1) copy, but for move-tracking purposes each
>> child needs a unique "node-line-id" so its life-line can be uniquely
>> traced forward and back between this revision and a later revision by
>> which time it may have been modified and thus assigned a new copy-id.
>>
>> Clearly it would defeat the O(1) cost if we were to construct a
>> node-line-id explicitly for every node in the tree at copy time. Can
>> we instead define node-line-id such that we can compute it as needed,
>> from either an unmodified lazy-copied child or after such a child has
>> been modified, and get the same answer? Or perhaps re-state the
>> problem to avoid this need?
>
> I'm currently bogged down in svnlive prep work but here is my quick feedback;
> more to come next week.
>
> Bottom line: looks ok, especially the API seems fine and performance in f7
> should be acceptable even if it is O(changes in [rA .. rB]).
>
> General observations:
>
>* We need a format bump for the extra "M" entries in the changed path
> lists, potential "lazy" markers in the tree etc. But that is not a problem
> as the log-addressing branch probably gets merged in about 2 weeks
> time and bumps to f7 anyway. It also brings the infrastructure for
> "mixed addressing" such that we may introduce extended structures
> in existing repositories without touching existing revisions.
>
>* Existing copy&del pairs will not be treated as move since the node-line-id
> does not match. Maybe, we can add some intelligence to 'svnadmin load'.
>
>* A copy effectively destroys all move relationships below it. That seems
> unfortunate (say, you duplicate a project) but the solution to that would
> probably require hierarchical IDs ("match IDs within the context of this
> sub-tree").

That's a good observation. Here's an example, to clarify:

  r10: trunk/foo

  move foo to bar

  r20: trunk/bar

  copy trunk to branch1

  r30: trunk/bar
       branch1/bar

Now request "svn diff -r10:30 branch1". It would be useful if Subversion
could say trunk/foo@10 moved to branch1/bar@30 in the context of this diff.
(Where I say "diff" we can also substitute "update", "merge", and so on.)

This only makes sense for a copy at or above the root path of the requested
diff. In this example, it makes sense for "diff -r10:30 branch1" and for
"diff -r10:30 branch1/bar". It does not make sense across a copy that
happened below the target: in this case, "diff -r10:30 ^/" would NOT be
expected to show foo@10->bar@30 as a move.

One way of looking at this is that our history-tracing that's used to find
"-r10 branch@30" in such scenarios is *already* following copies at the root
of the subtree as if they are moves, and in a way this would be extending
that idea.

This seems like functionality that should be provided in a higher layer; the
FS layer just needs to provide some primitive queries to make this possible.
I'm not sure what, exactly.

>* Support for resurrection of deleted nodes *without* destroying any move
> relationship is potentially expensive but I think we should support this
> early on (maybe not in 1.9 but def. in 1.10). People just happen to delete
> their /trunk once in a while and you don't want to tell them that *now* they
> actually managed to break something ...

Yes.

> Proposal: Resurrect keeping the old node-line IDs, iff
> (a) the copy source (or a parent) got deleted in the next revision
> (b) no copies of that node (or any parent) were added since the source rev.
> That should keep normal copying relatively cheap and still provide the
> special behavior for our "undo" use-case.

I think you mean that the user-level copy should do this automatically.
Perhaps so. That need not be implemented in the FS layer; the FS layer could
just provide the primitives necessary to implement it.

> I'd like to implement that - after some more in-depth review and would even
> be willing to postpone the cache-server feature to 1.10 because move tracking
> is much more important atm.

Wonderful!

- Julian
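
P.S. To make the r10/r20/r30 example a bit more concrete, here is a rough Python sketch of the kind of higher-layer query I have in mind. Everything in it is hypothetical: the EVENTS table, trace_back() and the helper names are made up for illustration and are not existing Subversion APIs, and I have assumed that the move was committed in r20 and the copy in r30. It walks back from a path at rB, always following moves, and following a copy only when the copy destination is at or above the diff target.

```python
# Toy model only; none of these names exist in Subversion.
# rev -> (kind, source path, destination path).
# Assumption: the move is committed in r20 and the copy in r30.
EVENTS = {
    20: ("move", "trunk/foo", "trunk/bar"),
    30: ("copy", "trunk", "branch1"),
}


def at_or_under(path, ancestor):
    """True if `path` equals `ancestor` or lies below it ("" is the root)."""
    return ancestor == "" or path == ancestor or path.startswith(ancestor + "/")


def rebase(path, old_root, new_root):
    """Replace the leading `old_root` of `path` with `new_root`."""
    return new_root + path[len(old_root):]


def trace_back(path, target, rev_a, rev_b, events=EVENTS):
    """Map `path`@rev_b to its path at rev_a, for a diff rooted at `target`.

    Returns None if the node only came into existence through a copy
    that happened below the diff target.
    """
    for rev in sorted((r for r in events if rev_a < r <= rev_b), reverse=True):
        kind, src, dst = events[rev]
        if not at_or_under(path, dst):
            continue  # this change did not touch the node we are tracing
        if kind == "copy" and not at_or_under(target, dst):
            return None  # copy below the target: do not follow it
        # A move, or a copy at or above the target: step back to the source.
        path = rebase(path, dst, src)
        if at_or_under(target, dst):
            target = rebase(target, dst, src)
    return path


print(trace_back("branch1/bar", "branch1", 10, 30))
print(trace_back("branch1/bar", "", 10, 30))
print(trace_back("trunk/bar", "", 10, 30))
```

With these assumptions, the first call yields trunk/foo (the copy is at the diff root, so it is followed), the second yields None (for a diff of ^/ the copy is below the target, so branch1/bar is just an addition), and the third yields trunk/foo (the plain move within trunk). A real implementation would of course work on node-line-ids rather than on path strings; this only illustrates the "at or above the target" rule.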