On Tue, Sep 3, 2019 at 8:01 AM Branko Čibej <br...@apache.org> wrote: > > On 02.09.2019 16:20, Johan Corveleyn wrote: > > On Fri, Jan 15, 2016 at 1:58 PM Ivan Zhakov <i...@visualsvn.com> wrote: > >> On 7 January 2016 at 10:34, Ivan Zhakov <i...@visualsvn.com> wrote: > >>> On 6 January 2016 at 08:14, Greg Stein <gst...@gmail.com> wrote: > >>>> Personally, I'd be more interested in the effects on the network and its > >>>> caching ability. Do we really need to save CPU/IO on the server? Today's > >>>> servers seem more than capable, and are there really svn servers out in > >>>> the > >>>> wild getting so crushed, that this is important? It seems that as long as > >>>> proxies/etc can properly cache the results, and (thus) avoid future > >>>> touches > >>>> on the backend server, then we're good to go. > >>>> > >>>> If the patch doesn't affect the caching (which it sounds like "no"), then > >>>> just go with it. Sure, it is neat to look at ayscalls, but... why? I > >>>> don't > >>>> understand the need to examine/profile. Educate me? > >>>> > >>> The patch should not affect HTTP caching for two reasons: > >>> 1. Browsers and proxies supports ETag and use it instead of > >>> Last-Modified header. > >>> 2. ETag and Last-Modified headers are used only for cache > >>> re-validation when max-age is expired. But Subversion sets max-age=1 > >>> week for resources with specific revision in URL > >>> (http://server/!svn/rvr/1/path). max-age=0 is only used for public > >>> URLs without revision, i.e. http://server/path) > >>> > >>> As far I know proxy usage are limited to public servers with anonymous > >>> access, since caching of HTTP responses with Authorization is > >>> prohibited by RFC. > >>> > >>> Anyway I agree that trading bandwidth usage to save some CPU/IO on the > >>> server doesn't make sense, but Last-Modified case is the different: > >>> Subversion server wasting 10%+ of server resources to produce unused > >>> header. > >>> > >>> I don't have access to svn.apache.org server performance stats, but I > >>> suppose it's pretty busy server and Infra team would welcome any > >>> Subversion server performance improvements. > >>> > >> Committed in r1724790. > >> > >> -- > >> Ivan Zhakov > > A bit late perhaps, but apparently this change (removing the > > Last-Modified header from GET responses) broke a specific use case at > > my company (we just upgraded our SVN server from 1.9 to 1.10, bringing > > along this particular change): > > > > - We use Apache Ivy (http://ant.apache.org/ivy/) for dependency > > management of our Java applications. > > > > - Third party jar files are kept in our svn repo under > > /trunk/ivyrepository (and branched / tagged in release branches, so we > > have completely reproducible builds, even if our third party jars or > > their dependency structures change on trunk). > > > > - We use Ivy's "URL Resolver" [1], which downloads the files with > > regular GET requests (and HEAD requests to check the up-to-dateness of > > the cache on the client). We effectively use SVN in this case as a > > "regular" file server (which coincidentally has branches and tags so > > we can resolve against the correct tree when making a build). > > > > This last part now fails, i.e. Ivy's URLResolver no longer detects > > that a file has changed. It used to compare its own "last-mod time of > > the file on disk in the cache" with the Last-Modified header, which > > works fine with all kinds of file servers, and worked with SVN < 1.10. > > > > I think it's unfortunate that we broke compatibility here (even if > > it's not usage by a normal svn client) for the sake of some relatively > > small performance / load gain on the server. If we could get the old > > behavior back with some Apache directive, that would have been fine, > > but there is no such option at the moment. > > > > Also: if the Last-Modified would have been removed only for the > > "internal GET urls" (like http://server/!svn/rvr/1/path), for > > optimizing checkout (as executed by normal svn clients), that would > > have been understandable. But why remove it for the "external GET > > urls" (http://svnserver/path) as well? Those have nothing to do with > > checkout load, those are only used by browsers or "tools using SVN as > > a glorified file server" :-). > > > > I am by no means an expert in HTTP standards, and various online > > sources give different recommendations for these headers (ETag, > > Last-Modified, ... request headers for conditional GETs, ...). But we > > found an old discussion thread on the "d...@rapidsvn.tigris.org" > > mailinglist from 15 years ago, discussing "a very basic idea: let > > mod_dav_svn set the Last-Modified HTTP header ..." [2]. Perhaps the > > feature dates from back then (indicating that it wasn't an accidental > > feature)? > > I'm fairly sure that it wasn't accidental. The whole idea that you can > use a browser to look at a Subversion repository was intentional, adding > appropriate HTTP headers in responses was surely part of that. > > That said, Ivan's original argument about caching not being affected by > this change is correct ... but it ignores your particular use case. The > error was in assuming it's OK to break compatibility on the protocol > level like this. > > > Anyway, how about bringing this feature back in some form? > > - Revert r1724790? > > This is clearly the simplest solution, but I have no idea what the > performance impact would be. From looking at the diff, my best guess is > that svn_fs_node_created_rev() and svn_fs_revision_prop() dominate. > > > - or only for "external GET urls"? > > - or only if some Apache directive is set? > > > > Thoughts? > > I would prefer not to add yet another configuration knob to the server. > I agree that versioned-resource URLs are only interesting for DAV-aware > clients, and those clients already know how to check for modifications > without looking at Last-Modified. That would imply that adding the > header for external URLs is the right solution. > > -- Brane
Thanks for your response, Brane. I think the below patch would do it (set the Last-Modified header only for "external URLs"). It's basically a revert of r1724790 (and adding a test), plus wrapping the actual setting of Last-Modified inside the following condition: if ((resource->type == DAV_RESOURCE_TYPE_REGULAR) && (resource->info->repos_path == resource->info->uri_path->data)) I'm not entirely sure this is the perfect way to test for "only external url's", but it seems to work (i.e. it adds the header to /iota, but not to /!svn/rvr/1/iota). If I omit the second part of the condition, it's not selective enough (apparently /!svn/rvr/1/iota is also a REGULAR resource). That second part of the condition corresponds to what is specifically set in the case of an "external request" in parse_uri(), IIUC: http://svn.apache.org/viewvc/subversion/trunk/subversion/mod_dav_svn/repos.c?revision=1724790&view=markup&pathrev=1724790#l791 [[[ Index: subversion/mod_dav_svn/repos.c =================================================================== --- subversion/mod_dav_svn/repos.c (revision 1866345) +++ subversion/mod_dav_svn/repos.c (working copy) @@ -3139,6 +3139,50 @@ seek_stream(dav_stream *stream, apr_off_t abs_posi && resource->baselined)) +/* Return the last modification time of RESOURCE, or -1 if the DAV + resource type is not handled, or if an error occurs. Temporary + allocations are made from RESOURCE->POOL. */ +static apr_time_t +get_last_modified(const dav_resource *resource) +{ + apr_time_t last_modified; + svn_error_t *serr; + svn_revnum_t created_rev; + svn_string_t *date_time; + + if (RESOURCE_LACKS_ETAG_POTENTIAL(resource)) + return -1; + + if ((serr = svn_fs_node_created_rev(&created_rev, resource->info->root.root, + resource->info->repos_path, + resource->pool))) + { + svn_error_clear(serr); + return -1; + } + + if ((serr = svn_fs_revision_prop2(&date_time, resource->info->repos->fs, + created_rev, SVN_PROP_REVISION_DATE, + TRUE, resource->pool, resource->pool))) + { + svn_error_clear(serr); + return -1; + } + + if (date_time == NULL || date_time->data == NULL) + return -1; + + if ((serr = svn_time_from_cstring(&last_modified, date_time->data, + resource->pool))) + { + svn_error_clear(serr); + return -1; + } + + return last_modified; +} + + const char * dav_svn__getetag(const dav_resource *resource, apr_pool_t *pool) { @@ -3219,6 +3263,25 @@ set_headers(request_rec *r, const dav_resource *re if (!resource->exists) return NULL; + if ((resource->type == DAV_RESOURCE_TYPE_REGULAR) + && (resource->info->repos_path == resource->info->uri_path->data)) + { + /* Include Last-Modified header for 'external' GET requests + (i.e. requests to URI's not under /!svn), to support usage of an + SVN server as a file server, where the client needs timestamps + for instance to use as "last modification time" of files on disk. */ + apr_time_t last_modified; + + last_modified = get_last_modified(resource); + if (last_modified != -1) + { + /* Note the modification time for the requested resource, and + include the Last-Modified header in the response. */ + ap_update_mtime(r, last_modified); + ap_set_last_modified(r); + } + } + /* generate our etag and place it into the output */ apr_table_setn(r->headers_out, "ETag", dav_svn__getetag(resource, resource->pool)); Index: subversion/tests/cmdline/mod_dav_svn_tests.py =================================================================== --- subversion/tests/cmdline/mod_dav_svn_tests.py (revision 1866345) +++ subversion/tests/cmdline/mod_dav_svn_tests.py (working copy) @@ -640,6 +640,29 @@ def propfind_propname(sbox): actual_response = r.read() verify_xml_response(expected_response, actual_response) +@SkipUnless(svntest.main.is_ra_type_dav) +def last_modified_header(sbox): + "verify 'Last-Modified' header on 'external' GETs" + + sbox.build(create_wc=False, read_only=True) + + headers = { + 'Authorization': 'Basic ' + base64.b64encode(b'jconstant:rayjandom').decode(), + } + + h = svntest.main.create_http_connection(sbox.repo_url) + + # GET /repos/iota + # Expect to see a Last-Modified header. + h.request('GET', sbox.repo_url + '/iota', None, headers) + r = h.getresponse() + if r.status != httplib.OK: + raise svntest.Failure('Request failed: %d %s' % (r.status, r.reason)) + svntest.verify.compare_and_display_lines(None, 'Last-Modified', + svntest.verify.RegexOutput('.+'), + r.getheader('Last-Modified')) + r.read() + ######################################################################## # Run the tests @@ -652,6 +675,7 @@ test_list = [ None, propfind_404, propfind_allprop, propfind_propname, + last_modified_header, ] serial_only = True ]]] -- Johan