Hi Mike,

On Mon, Nov 5, 2012 at 9:39 PM, C. Michael Pilato <cmpil...@collab.net> wrote:
> On 11/05/2012 01:33 PM, C. Michael Pilato wrote:
>> On 11/05/2012 11:49 AM, Lieven Govaerts wrote:
>>> My current test with svn trunk & apache+mod_dav_svn 1.6.12 does not
>>> reproduce this issue.
>>>
>>> What I am seeing is that svn is using a lot of memory, up to 1GB.
>>> Currently looking at the root cause there.
>>>
>>> First clue: many directories are not closed until the call to
>>> close_all_dirs in libsvn_ra_serf/update.c:finish_report. This means
>>> that for the most part of the checkout, the associated info object is
>>> kept in memory.
>>
>> My debugging indicates that when close_all_dirs() is called, there are a
>> slew of unclosed directories which each have a non-zero ref_count. Save
>> for the root directory of the edit (/tags), fetch_props == TRUE,
>> propfind_handler->done == TRUE, and tag_closed == TRUE for all open child
>> directories. So I'm looking around at how file references are managed to
>> see if anything has leaked. I might have inadvertently botched something
>> when adding the pull-contents-from-the-wc-cache logic.
>
> So ... if I'm reading the ra_serf code correctly (and let me tell you,
> there's a good chance I'm not), the only time we opportunistically close
> directory batons is when we're processing the current list of completed
> file fetches. See the code which follows this comment:
>
>     /* If we have a valid directory and
>      * we have no open items in this dir and
>      * we've closed the directory tag (no more children can be added)
>      * and either:
>      *   we know we won't be fetching props or
>      *   we've already completed the propfind
>      * then, we know it's time for us to close this directory.
>      */
>
> That code is all encompassed within an outer loop that looks like so:
>
>     /* prune our fetches list if they are done. */
>     done_list = report->done_fetches;
>     while (done_list)
>       {
>
> And all that is nested within yet another loop that runs until the REPORT
> response is fully parsed and all auxiliary GETs and PROPFINDs are finished.
>
> So, back to the opportunistic directory closure: *if* we've finished any
> GETs since the last time we checked, and if ${STUFF_IN_THAT_BIG_COMMENT},
> then we try to close the parent directory of the file we fetched, plus any
> other close-able parent directories thereof.
>
> But what about when there are no file content fetches to manage?
>
> Using a trunk client and 1.7.x server, I just tried to export a tree with
> 15k directories (10 wide at the top, 15 or so below that, and 100 or so
> below those) and no files. 153 MB of peak memory usage, and not a single
> directory closure until the whole tree had been processed/PROPFINDed/etc.
>
> Unless I'm mistaken, we will delay until the very end the closure of every
> directory (and, by refcount-management-extension, every parent thereof) for
> which we don't execute at least one GET for an immediate child thereof.
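
To restate the closure scheme in code, here is a small self-contained toy
model I used to think it through. The ref_count / tag_closed / fetch_props /
propfind names mirror the fields mentioned in this thread; the types, the
function names and the "/tags/1.0.0" path are made up, and this is *not*
the actual update.c code:

  #include <stdio.h>

  typedef struct dir_t {
    const char *name;
    struct dir_t *parent;
    int ref_count;      /* open child items not yet closed */
    int tag_closed;     /* dir tag closed: no more children can be added */
    int fetch_props;    /* do we still need props for this dir... */
    int propfind_done;  /* ...and has that propfind completed? */
  } dir_t;

  /* The test from the big comment above: can DIR be closed right now? */
  static int closeable(const dir_t *dir)
  {
    return dir->ref_count == 0
           && dir->tag_closed
           && (!dir->fetch_props || dir->propfind_done);
  }

  /* Close DIR if possible, drop the parent's refcount, and keep walking
   * up through any other close-able ancestors. */
  static void maybe_close_dir(dir_t *dir)
  {
    while (dir && closeable(dir))
      {
        printf("closing %s\n", dir->name);
        if (dir->parent)
          dir->parent->ref_count--;
        dir = dir->parent;
      }
  }

  int main(void)
  {
    dir_t root  = { "/tags",       NULL,  1, 1, 0, 0 };
    dir_t child = { "/tags/1.0.0", &root, 0, 1, 1, 1 };

    /* Pretend the last file fetch under /tags/1.0.0 just completed. */
    maybe_close_dir(&child);  /* closes /tags/1.0.0, then /tags */
    return 0;
  }

In this model, a directory only ever gets closed from maybe_close_dir(),
i.e. as a side effect of a completed file fetch; a subtree that triggers no
file fetches never gets there at all.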
With Philip's test scenario, you can see this clearly, as the 3 empty
directories in svn trunk are only closed at the very end:

..
A    /tmp/tags/52/subversion/bindings/swig/proxy
A    /tmp/tags/51/subversion/tests/cmdline/import_tests_data/import_tree/DIR1.noo
A    /tmp/tags/51/subversion/tests/cmdline/import_tests_data/import_tree/DIR6/DIR7/DIR8.noo
A    /tmp/tags/51/subversion/bindings/swig/proxy
A    /tmp/tags/50/subversion/tests/cmdline/import_tests_data/import_tree/DIR1.noo
A    /tmp/tags/50/subversion/tests/cmdline/import_tests_data/import_tree/DIR6/DIR7/DIR8.noo
A    /tmp/tags/50/subversion/bindings/swig/proxy
Checked out revision 151.

> I expect this situation is not new to trunk. (In fact, repeating the above
> export with a 1.7 client, I see a peak memory usage of 458 MB!) Memory
> usage improvements in trunk help matters. But then, the feature which
> causes ra_serf to use local file contents where it could avoid hitting the
> network will, I expect, cause still more instances of directories for which
> no immediate file children's contents will be fetched.
>
> Philip, Lieven, somebody -- care to double-check my thinking here? I'm only
> 80% confident in my findings at the moment.

I haven't checked the code in full detail yet, but I can confirm from
looking at finish_report that directories are only closed once all the
files in them have been processed, and that empty directories aren't even
considered in the outer close-directory loop. Your analysis confirms what
I see with additional logging.

Do you plan on patching this issue? If so, I can continue looking at
Philip's original issue in this thread - assuming it is not caused by the
memory leak.

Lieven
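
P.S. To make the gap concrete, here is the shape of that pruning loop as I
read it (a paraphrase, not the actual update.c code; 'next' is a stand-in
for however the fetch list is really chained):

  /* prune our fetches list if they are done. */
  done_list = report->done_fetches;
  while (done_list)
    {
      /* Check the big-comment conditions against the fetched file's
       * parent directory; close it and any close-able ancestors. */
      done_list = done_list->next;
    }

A directory none of whose immediate file children ever appears in
done_fetches is never examined by this loop at all, so it is only released
by close_all_dirs() at the end of finish_report().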