On Mon, 2010-03-01, kame...@apache.org wrote: > Author: kameshj > Date: Mon Mar 1 13:48:01 2010 > New Revision: 917523 > > URL: http://svn.apache.org/viewvc?rev=917523&view=rev > Log: > With the below apache configuration(See the <space> character "/svn 1/"). > > <Location "/svn 1/"> [...] > Write through proxy is *not* happening and commit happens *directly* inside > the slave.
Hi Kamesh. Today I have studied this change thoroughly, and it appears to me that it changes the interpretation of the URL given in the <Location> directive (and the SVNMasterURI directive) in a way that is probably undesirable. Previously, the URL was assumed to be URI-encoded, and thus this one with a space in it did not work because it did not match the canonical equivalent ("/svn%201/"). That caused the "bug" that you were trying to fix, I believe. After this patch, it appears to me that the URL given in the <Location> directive (and the one in the SVNMasterURI directive) is assumed to be NOT encoded, and mod_dav_svn will encode it internally. The new behaviour will be wrong for any Location URL that is already written with %XX encoding, because it will encode it again. If we agree that the URL should be assumed already to be URI-encoded (at least to the extent necessary to make "%" characters unambiguous), I think there are two possible solutions: 1) Tell the user to use proper (canonically URI-encoded) URIs. 2) Assume the URI is already in (partly) URI-encoded form but re-encode it into canonical URI-encoded form, encoding characters such as SPACE but not re-encoding the "%" character, and also unencode any characters that should not be encoded in canonical URI-encoded form. (1) is rather unfriendly. More follows ... > * subversion/mod_dav_svn/mirror.c > (proxy_request_fixup): URI encode the to be proxied file name. > (dav_svn__proxy_request_fixup): r->unparsed_uri is in url encoded form while > root_dir is not in encoded form. So use r->uri to compare with root_dir. > (dav_svn__location_in_filter): URL Encode the 'find & replace' urls as > the request body has it in url encoded format. > (dav_svn__location_header_filter): Encode the master_uri as the response from > master has the Location header url encoded already. Set the outgoing Location > header url encoded. > (dav_svn__location_body_filter): URL Encode the 'find & replace' urls as > the response body has it in url encoded format. > > Modified: > subversion/trunk/subversion/mod_dav_svn/mirror.c > > Modified: subversion/trunk/subversion/mod_dav_svn/mirror.c > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/mod_dav_svn/mirror.c?rev=917523&r1=917522&r2=917523&view=diff > ============================================================================== > --- subversion/trunk/subversion/mod_dav_svn/mirror.c (original) > +++ subversion/trunk/subversion/mod_dav_svn/mirror.c Mon Mar 1 13:48:01 2010 > @@ -45,8 +45,10 @@ > > r->proxyreq = PROXYREQ_REVERSE; > r->uri = r->unparsed_uri; > - r->filename = apr_pstrcat(r->pool, "proxy:", master_uri, > - uri_segment, NULL); > + r->filename = (char *) svn_path_uri_encode(apr_pstrcat(r->pool, "proxy:", > + master_uri, > + uri_segment, > + NULL), r->pool); That looks odd. It may be functionally correct, but normally I would assume that any parameters with names like "master_uri" and "uri_segment" would be already URI-encoded. (If so, then encoding them again would be wrong.) Don't we want to keep URI data in URI-encoded form most of the time? In this case, it appears that you have modified this function and the caller so that the arguments passed are NOT URI-encoded. Looking at where the "uri_segment" comes from ... > r->handler = "proxy-server"; > ap_add_output_filter("LocationRewrite", NULL, r, r->connection); > ap_add_output_filter("ReposRewrite", NULL, r, r->connection); > @@ -78,7 +80,7 @@ > transaction tree resouces. */ > if (r->method_number == M_PROPFIND || > r->method_number == M_GET) { > - if ((seg = ap_strstr(r->unparsed_uri, root_dir))) { > + if ((seg = ap_strstr(r->uri, root_dir))) { "r->uri" is documented as "The path portion of the URI." It appears from this commit that you have discovererd that "r->uri" is not URI-encoded, whereas r->unparsed_uri (which is the full URL) is URI-encoded. OK so far. I went looking to see where the "master_uri" comes from, and it comes from dav_svn__get_master_uri() which gets its value directly from the Apache configuration directive "SVNMasterURI". It is not clear to me whether we should expect that to be URI-encoded, but I would have thought so. To clarify that, we need first to decide what it is meant to be, and then to document dav_svn__get_master_uri() and the other functions in that group. I attach a starting-point patch for the latter. > if (ap_strstr_c(seg, apr_pstrcat(r->pool, special_uri, > "/wrk/", NULL)) > || ap_strstr_c(seg, apr_pstrcat(r->pool, special_uri, > @@ -95,7 +97,7 @@ > /* If this is a write request aimed at a public URI (such as > MERGE, LOCK, UNLOCK, etc.) or any as-yet-unhandled request > using a "special URI", we have to doctor it a bit for proxying. */ > - seg = ap_strstr(r->unparsed_uri, root_dir); > + seg = ap_strstr(r->uri, root_dir); > if (seg && (r->method_number == M_MERGE || > r->method_number == M_LOCK || > r->method_number == M_UNLOCK || > @@ -158,6 +160,11 @@ > ### PUT requests and properties in PROPPATCH requests. > ### See issue #3445 for details. */ > > + /* We are url encoding the current url and the master url > + as incoming(from client) request body has it encoded already. */ > + canonicalized_uri = (char *) svn_path_uri_encode(canonicalized_uri, > + r->pool); > + root_dir = (char *) svn_path_uri_encode(root_dir, r->pool); In a separate discussion I think we decided that we will need to define "canonical" as including "in a canonical URI-encoded form". When we do that, uri_canonicalize() will (I think) require that its input is already URI-encoded, so we will need to call uri_encode() before uri_canonicalize(). It would be good to make the calls in that order now, but we will have to audit all calls when we make that change anyway so it is not essential right now. Are the (char *) type casts necessary? If not, please remove them. (Six of them in total.) Thanks. - Julian
### Patch in progress. Clarify doc strings of mod_dav_svn functions. * subversion/mod_dav_svn/dav_svn.h (dav_svn__get_special_uri, dav_svn__get_repo_name, dav_svn__get_xslt_uri, dav_svn__get_master_uri, dav_svn__get_activities_db, dav_svn__get_root_dir): Make the doc strings more precise. --This line, and those below, will be ignored-- Index: subversion/mod_dav_svn/dav_svn.h =================================================================== --- subversion/mod_dav_svn/dav_svn.h (revision 924197) +++ subversion/mod_dav_svn/dav_svn.h (working copy) @@ -337,23 +337,34 @@ svn_boolean_t dav_svn__get_list_parentpa the root_dir when the resource structure is built. */ -/* Return the special URI to be used for this resource. */ +/* Return the repo-root-relative URI of the special namespace to be used for + * this resource. + * Comes from the <SVNSpecialURI> directive. */ +/* ### Is this assumed to be URI-encoded? */ const char *dav_svn__get_special_uri(request_rec *r); -/* Return a descriptive name for the repository */ +/* Return a descriptive name for the repository. + * Comes from the <SVNReposName> directive. */ const char *dav_svn__get_repo_name(request_rec *r); -/* Return the URI of an XSL transform stylesheet */ +/* Return the server-relative URI of an XSL transform stylesheet. + Comes from the <SVNIndexXSLT> directive. */ +/* ### Is this assumed to be URI-encoded? */ const char *dav_svn__get_xslt_uri(request_rec *r); -/* Return the master URI (for mirroring) */ -const char * dav_svn__get_master_uri(request_rec *r); - -/* Return the activities db */ -const char * dav_svn__get_activities_db(request_rec *r); - -/* Return the root directory in canonicalized form */ -const char * dav_svn__get_root_dir(request_rec *r); +/* Return the full URL of the master repository (for mirroring). + Comes from the <SVNMasterURI> directive. */ +/* ### Is this assumed to be URI-encoded? */ +const char *dav_svn__get_master_uri(request_rec *r); + +/* Return the disk path to the activities db. + Comes from the <SVNActivitiesDB> directive. */ +const char *dav_svn__get_activities_db(request_rec *r); + +/* Return the server-relative URI of the repository root. + Comes from the <Location> directive. */ +/* ### Is this assumed to be URI-encoded? */ +const char *dav_svn__get_root_dir(request_rec *r); /** For HTTP protocol v2, these are the new URIs and URI stubs