<cmpilato> pburba: got a second (ha!) for a dry-run merge question?
<pburba> cmpilato: Shoot
<cmpilato> i'm trying to determine how the code handles notifications for added items which are children of other added items.
<cmpilato> i've managed to regress on my issue-4194-dev branch. got some merge --dry-run's not jiving with the actual merge.
<cmpilato> merge_tests.py 2 is the one i'm looking at now.
<pburba> cmpilato: Let me build that branch and take a look
<cmpilato> ok.
<cmpilato> but my general question is: the merge_file_added() logic branch, where does the dry-run code learn that its okay to not consider the file "foo/bar" obstructed even though "foo" doesn't exist on disk *because* "foo" was also added in the merge?
<cmpilato> i can see "foo" (or, in this case, ".../A/C/Q" in the merge_b->dry_run_added hash.
<cmpilato> but when handling the subsequent merge_file_added of ".../A/C/Q/bar" and ".../A/C/Q/bar2", i never see the code consult that hash to see if the reason why there's no A/C/Q on disk is because it, too, is being added as part of the merge.
<pburba> cmpilato: I don't immediately follow the question, I'm going to need a few to look at the test & code
<cmpilato> fair 'nuff.
<cmpilato> by the way, ra_serf is driving the editor a bit wonkily, per the norm.
<cmpilato> (so don't be surprised by that.)
<cmpilato> but it seems to be driving it differently wonkily than it does on trunk.
<pburba> cmpilato: Is the test failure only with ra_serf?
<cmpilato> yes.
<cmpilato> probably shoulda mentioned that. doh.
<cmpilato> oh! i think i finally found it!
<pburba> Old bug?
<cmpilato> Not in a universe where editors are driven to the specification. :-)
<cmpilato> But ra_serf...
<cmpilato> But yes, I think it's a preexisting bug that just gets tickled now (as much by chance as by anything else).
<cmpilato> We keep both a merge_b->added_path AND a merge_b->dry_run_added hash. I think we can drop the former and consult the latter in those situations.
<cmpilato> (though, the cost will be ridiculous...)
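The check discussed in the chat above (consulting the set of dry-run additions when a new file's parent is missing on disk) might look roughly like the sketch below. This is not Subversion's C code: the function name `parent_missing_is_expected`, the `on_disk` set, and the use of '/'-separated relative paths are all assumptions made for illustration, and `dry_run_added` is modelled as a plain Python set rather than an APR hash.

```python
import posixpath


def parent_missing_is_expected(path, on_disk, dry_run_added):
    """Return True if the parent of `path` is absent from disk only because
    some ancestor was itself "added" earlier in the same dry run.

    `on_disk` and `dry_run_added` are sets of '/'-separated paths
    (hypothetical stand-ins for the working copy and the dry-run hash).
    """
    parent = posixpath.dirname(path)
    if parent in on_disk:
        return False  # the parent really exists, so there is nothing to explain
    # Walk up the ancestors: one set lookup per path component.
    ancestor = parent
    while ancestor and ancestor != "/":
        if ancestor in dry_run_added:
            return True
        ancestor = posixpath.dirname(ancestor)
    return False


# Example: A/C/Q was added by the dry run, so it is not on disk.
on_disk = {"A", "A/C"}
dry_run_added = {"A/C/Q"}
print(parent_missing_is_expected("A/C/Q/bar", on_disk, dry_run_added))
print(parent_missing_is_expected("A/C/X/bar", on_disk, dry_run_added))
```

Walking up the ancestors costs one lookup per path component, so the cost grows with path depth rather than with the number of additions recorded so far.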
Maybe we keep merge_b->added_path and check it as we do now. We only consult merge_b->dry_run_added as a fall-back, but if the fall-back check does determine that the dry-run added path is the child of a previous (but not most recent) dry-run addition, then we update merge_b->added_path to the immediate parent of the latest added path. That would keep the check quick for cases like merge_tests.py 2 when a dir is added, then a sibling dir is added, and then children of the first dir are added.

  A    A\C\Q
  A    A\C\Q2
  A    A\C\Q\s1
  A    A\C\Q\s2
  .
  .
  A    A\C\Q\sN
  A    A\C\Q2\s1
  A    A\C\Q2\s2
  .
  .
  A    A\C\Q2\sN

Of course in the worst case this is just as bad as having to iterate over merge_b->dry_run_added checking for potential parents of a newly added path.

Alternatively, we pitch merge_b->added_path as you suggest and change merge_b->dry_run_added so it is not a complete list of added paths, but only tracks the *roots* of any added subtrees. Of course this makes dry_run_added_p more expensive since it has to iterate the hash rather than doing a single lookup.

Or possibly (this is the "likely to be judged overkill" option) we still get rid of merge_b->added_path, but we change merge_b->dry_run_added from a hash to a tree data structure whose leaf nodes map to the roots of added subtrees and whose root node maps to the nearest common ancestor of all the added subtree roots found so far. This is still O(n) like option 2, but n is the depth of the tree rather than the number of subtree roots -- which in the worst case is likely to be a lot less (he says without much conviction).

--
Paul T. Burba
CollabNet, Inc. -- www.collab.net -- Enterprise Cloud Development
Skype: ptburba

P.S. We still won't see ordering like this, will we?

  A    A\C\Q
  A    A\C\Q2
  A    A\C\Q\s1
  A    A\C\Q2\s1
  A    A\C\Q\s2
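To compare the first two options above, here is a minimal sketch in Python rather than Subversion's C. The class names, the `fallback_checks` counter, and the '/'-separated paths are hypothetical and exist only for illustration. The fall-back here walks up the new path's ancestors with set lookups, which is one possible way to do the fall-back check and not necessarily how merge.c would do it.

```python
import posixpath


def is_ancestor(parent, child):
    """True if `parent` is a strict ancestor of `child` ('/'-separated)."""
    return child.startswith(parent.rstrip("/") + "/")


class CachedParentTracker:
    """Option 1 (sketch): keep every added path, plus a cached added_path."""

    def __init__(self):
        self.dry_run_added = set()  # all paths added so far
        self.added_path = None      # cached root or parent of recent additions
        self.fallback_checks = 0    # how often the slow path was needed

    def record_add(self, path):
        """Record an addition; return True if it lies under an earlier one."""
        covered = self._under_added(path)
        self.dry_run_added.add(path)
        if not covered:
            self.added_path = path  # a new added subtree root
        return covered

    def _under_added(self, path):
        # Quick check against the cache.
        if self.added_path and is_ancestor(self.added_path, path):
            return True
        # Fall-back: look for any ancestor among the recorded additions.
        self.fallback_checks += 1
        parent = posixpath.dirname(path)
        ancestor = parent
        while ancestor and ancestor != "/":
            if ancestor in self.dry_run_added:
                # Re-point the cache at the immediate parent of the new path.
                self.added_path = parent
                return True
            ancestor = posixpath.dirname(ancestor)
        return False


class RootsOnlyTracker:
    """Option 2 (sketch): remember only the roots of added subtrees."""

    def __init__(self):
        self.added_roots = set()

    def record_add(self, path):
        covered = any(is_ancestor(r, path) for r in self.added_roots)
        if not covered:
            self.added_roots.add(path)
        return covered

    def was_added(self, path):
        # No longer a single lookup: every root has to be examined.
        return path in self.added_roots or any(
            is_ancestor(r, path) for r in self.added_roots)


grouped = ["A/C/Q", "A/C/Q2", "A/C/Q/s1", "A/C/Q/s2", "A/C/Q2/s1", "A/C/Q2/s2"]
tracker = CachedParentTracker()
for p in grouped:
    tracker.record_add(p)
print(tracker.fallback_checks)
```

With the grouped ordering, the fall-back runs once for each new root and once each time the notifications switch to another subtree. With the interleaved ordering from the P.S., the cache never hits and every notification takes the fall-back, which is the worst case described above.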