Branko Čibej wrote:
> On 11.01.2011 16:01, Julian Foad wrote:
> >> I see a different issue here: The close_wcroot() call is normally
> >> handled from pool cleanup for users of the svn_client api. (The
> >> svn_wc_context_t is cached in the client context, which is only closed
> >> on pool cleanup).
> > Thanks for pointing that out. That is not when I would like the
> > pristine cleanup to happen. I would like it to happen after every
> > operation that changes the WC - say after every major call into
> > libsvn_wc, and/or every major call into libsvn_client, or whenever the
> > wc.db work queue is run. Any thoughts on where would be best?
>
> What exactly are you trying to achieve? Is this a disk-space optimization?
> My hunch says that you do not want to do this too often at all because
> it'll turn out to be space-vs-time. Deleting a file isn't cheap even on
> a local filesystem these days. Better to relegate this to an explicit
> "svn cleanup"; or better yet, follow CMike's advice.

From IRC:
[[[
<julianf> brane: got a minute re. pristine cleanup?
<brane> why, sure
<julianf> I want to achieve reasonable cache management at simplest possible dev cost at the moment. With options for enhancing it later.
<brane> (nod)
<julianf> So my thought is "Let's just delete them as soon as we know there might be some unref'd pristines."
<brane> good job on the automagic refcounting, btw
<julianf> Cheers. So I thought "It's got to be called from somewhere. Where? Upon closing the ... uh ... WC admin-handle object? WC API? WC DB? Main application pool? Dunno." But I really do think calling only from "svn cleanup" is not good enough. So my current position is I want to invite help and suggestions on where to do this for best effect while keeping the dev cost simple. Sure deleting a file isn't the cheapest op, but we only end up needing to do it when we've just been doing some WC file shuffling anyway, so it's not proportionally expensive either, is it?
<brane> well, i'm a bit worried about that on two counts: a) bugs - deleting something too soon would not be nice; i know that's always a "temporary" state of affairs, but still; b) time - i've seen deletes taking more than a second on a local XFS
<julianf> In other words, my intent is by cleaning up very often, the number of files deleted each time will be proportional to the size of the update/copy/revert/delete/etc. WC operation that has just happened.
<brane> you have a point there when does the work queue get flushed? i'm guessing you can't have a sqlite trigger run a callback written in C ...
<julianf> That might be possible; not sure. WC gets flushed within a libsvn_wc operation - e.g. several times within an "update", possibly even several times per file.
<brane> sounds like deleting during WC flush could do the right thing then
<julianf> As for "too soon" - yes, that's critical. That's partly why I decided to implement ref-counting and then defer deletion to some later time. Previously, I was assuming the higher-level parts of libsvn_wc would delete it as soon as the code determined that it was no longer needed by the current operation - but that gets tricky to analyze because of passing the checksum around here and there to be used by some other bit of code.
<julianf> So I'm looking for some place where we can say "By design, at this place any references that are not stored in the DB are not valid for looking up in the pristine store."
<julianf> I think WQ flush might be too low level, but not sure. brane: Has your speed concern been satisfied now, because it's only proportional to work already being done?
<julianf> brane: Thanks for the chat.
]]]

I think the speed concern is a red herring, as it would only be a
problem if we were to try to delete a large number of them at some
inappropriate point in the code, such as when exiting a long-running
application.

The files have to be deleted at some time. If we do it little and
often, the time cost is only proportional to the size of the WC
operation being done, and the peak disk usage is low. On a system where
deletion takes one second per file, I would imagine all WC work would be
very slow anyway, but running "svn cleanup" after a long period of WC
work would be intolerably slow.

I don't see any advantage, except for ultimate implementation
simplicity, in batching up this particular kind of deletion. We
certainly haven't found a need to adopt that strategy in order to make
the deletion of other WC files (temp files or working files) perform
acceptably.

And calling the cache management (currently "deletion") function at a
fine granularity won't get in the way of enhancements for longer term
caching and repository traffic optimizations based on using the cache.

- Julian
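
P.S. To make the "little and often" argument concrete, here is a rough sketch in Python rather than C. It is not the libsvn_wc code: the table name `pristine`, its columns, and every function name below are hypothetical and chosen only for illustration. It assumes a ref-counted store where a cleanup pass runs at the end of each WC operation, so the number of files deleted per pass is bounded by the number of references that operation just dropped.

```python
import os
import sqlite3


def open_store():
    """Create a toy ref-count table (hypothetical schema, not wc.db)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pristine ("
        " checksum TEXT PRIMARY KEY,"
        " refcount INTEGER NOT NULL)"
    )
    return db


def add_pristine(db, root, checksum, refcount):
    """Write a pristine file under root and record its reference count."""
    with open(os.path.join(root, checksum), "w") as f:
        f.write("pristine text for " + checksum)
    db.execute("INSERT INTO pristine VALUES (?, ?)", (checksum, refcount))


def drop_reference(db, checksum):
    """Record that one user of this pristine has gone away."""
    db.execute(
        "UPDATE pristine SET refcount = refcount - 1 WHERE checksum = ?",
        (checksum,),
    )


def cleanup_unreferenced(db, root):
    """Delete every pristine whose refcount is zero; return the count."""
    rows = db.execute(
        "SELECT checksum FROM pristine WHERE refcount = 0"
    ).fetchall()
    for (checksum,) in rows:
        os.remove(os.path.join(root, checksum))
        db.execute("DELETE FROM pristine WHERE checksum = ?", (checksum,))
    return len(rows)


def run_operation(db, root, dropped_checksums):
    """Stand-in for one WC operation followed by an immediate cleanup pass."""
    for checksum in dropped_checksums:
        drop_reference(db, checksum)
    # Only references recorded in the DB count at this point, so anything
    # at refcount zero is treated as safe to remove (an assumption of this
    # sketch, matching the "by design" point discussed on IRC).
    return cleanup_unreferenced(db, root)
```

With five pristines at refcount 1, an operation that drops two references deletes exactly two files in its cleanup pass, and an operation that drops none deletes nothing. That is the sense in which the deletion cost tracks the work already being done.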