While trying to minimize our 1.8 issue count, I found myself looking at issues #4287 (http://subversion.tigris.org/issues/show_bug.cgi?id=4287) and #4100 (http://subversion.tigris.org/issues/show_bug.cgi?id=4100). Both of these involve problems using svnrdump to dump a revision range of a URL which no longer exists in HEAD of the repository.
Along the way, I uncovered something that I guess has just gone unnoticed for the past ... well, quite a few years. The svn_ra_replay() API -- which is the core of the 'svnrdump dump' and 'svnsync sync' functionalities -- is documented like so:

    /**
     * Replay the changes from @a revision through @a editor and @a edit_baton.
     *
     * Changes will be limited to those that occur under @a session's URL, and
     * the server will assume that the client has no knowledge of revisions
     * prior to @a low_water_mark.  These two limiting factors define the portion
     [...]

Understand, of course, that given a PATH and a revision range, there are two different ways to do path-and-revision-based filtering:

   1. Dump changes related to each revision in the range, filtering out those which affect paths not equal to or descendants of PATH.

   2. Crawl the history of PATH between the revisions in the range, dumping related changes.

'svn log' uses method #2. The primary operand for 'svn log' is the line of history of some versioned object. This is why you can run 'svn log' on a branch, and see the changes as they follow the branch's copy from the trunk, etc.

Not so for replay -- it's clearly a method-#1 type of operation, where the primary operands are revisions, with a path being used merely as a filter.

Unfortunately, our HTTP RA layers are using a method-#2 type of addressing scheme for replay. Both ra_neon and ra_serf issue "replay-report" REPORT requests against the public session URL. For example, if the API is used with a session URL of "http://svn.apache.org/repos/asf/subversion/trunk", then both ra_neon and ra_serf issue REPORT requests against that URL. That works most of the time, but what if the path has been deleted from HEAD? Well, today that's when both ra_neon and ra_serf get 404's back from the server, and can't deal.
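To make the method-#1 idea concrete, here is a toy sketch in Python. It is not Subversion code: the function names and the `changes_by_rev` mapping (revision number to list of changed paths) are made up purely for illustration. The point it tries to show is that a method-#1 filter only ever compares path strings revision by revision, so it never needs PATH to exist in HEAD.

```python
def is_under(filter_path, changed_path):
    """Return True if changed_path equals filter_path or is a descendant of it."""
    base = filter_path.strip("/")
    path = changed_path.strip("/")
    if base == "":
        # An empty filter means "the whole repository".
        return True
    # Compare on a path-component boundary so "/trunkish" is not under "/trunk".
    return path == base or path.startswith(base + "/")


def replay_filter(changes_by_rev, filter_path, start, end):
    """Method #1: visit every revision in [start, end] and keep only the
    changed paths at or below filter_path.

    changes_by_rev is a hypothetical stand-in for the repository: a dict
    mapping revision number to the list of paths changed in that revision.
    """
    filtered = {}
    for rev in range(start, end + 1):
        paths = changes_by_rev.get(rev, [])
        filtered[rev] = [p for p in paths if is_under(filter_path, p)]
    return filtered
```

Note that a revision with no matching paths still shows up in the result, just with nothing in it. That is the "revisions are the primary operands" part of the distinction; a method-#2 crawl would instead start from the object's line of history and skip revisions that history never touched.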
See, the correct approach when doing a method-#1 operation is to issue the REPORT request against the "me resource URL" (or, pre-httpv2, the "default VCC URL") and then to embed the path filter in the request body itself. That URL always exists, and is a generic way of addressing "the repository". mod_dav_svn would still apply the same path filtering, only it would find the FS-PATH on which to filter in the request body, not tacked onto the end of the request URL itself. ra_neon and ra_serf should be issuing the REPORT against "http://svn.apache.org/repos/asf/!svn/me" and dropping a "filter-path=\"subversion/trunk\"" XML attribute into that request body.

So. Looks like I'll be taking a little detour here to try to fix this in a compatibility-preserving way. Here's my plan:

Server-side:

   - Recognize which resource was hit with the request, looking for the (optional) path filter in the REPORT request body if the "me resource" or "default VCC" were used.

   - Advertise the new support so clients can construct the best REPORTs possible.

Client-side:

   - If the server supports the correct constructs, drop the path filter into the REPORT body and issue the request against the "me resource" or "default VCC" (if non-httpv2).

   - Otherwise, just keep on doing what we do today and wish for the best.

-- 
C. Michael Pilato <cmpil...@collab.net>
CollabNet <> www.collab.net <> Enterprise Cloud Development