Re: [ISSUE] Cancelling svn checkout leaves an open file handle

Daniel Sahlberg Wed, 09 Sep 2020 01:37:12 -0700

Den ons 9 sep. 2020 kl 09:39 skrev Branko Čibej <[email protected]>:

> On 09.09.2020 08:18, Daniel Sahlberg wrote:
>
> Den ons 9 sep. 2020 kl 06:44 skrev Nathan Hartman <
> [email protected]>:
>
>> On Tue, Sep 8, 2020 at 6:08 AM Uroš Jovanović <[email protected]> wrote:
>> >
>> > Then, mid downloading some of the larger files a temp file will appear
>> in .svn\tmp. Once that happens, hit the Cancel button.
>> > It will signal the cancellation to the svn client and it will throw the
>> SvnOperationCanceledException, the SvnClient gets disposed BUT an open file
>> handle remains on ".svn\tmp\svn-XYZ123" file.
>> > If you try to delete it, Windows will complain that it is used by our
>> test app. :(
>>
>> Moving this to the dev@ list...
>>
>> Potentially long-running APIs such as 'checkout' allow the client to
>> provide a 'cancel_func' callback, which is called at various strategic
>> places to ask the client whether the operation should be canceled.
>>
>> It sounds to me like one of those places sees a cancel request and
>> returns to its caller, forgetting to do some cleanup.
>>
>> Last night I tried to find such a place by reading code.
>>
>> The 'checkout' command sets up a working copy (if necessary) and then
>> calls the 'update' logic to do the heavy lifting.
>>
>> The 'update' logic is quite involved as it handles all sorts of
>> possibilities, which means the number of branches of the call tree
>> that need to be checked are too numerous for my code reading approach
>> to be sensible.
>>
>> My thoughts for an automated approach, provided there is a way for a
>> process to inquire how many open file handles it has (I assume there
>> is a way; I've just never had to do this): The idea is to write a
>> minimal client that does the following (on a ramdrive):
>>
>> 1. Check out a working copy of a repository, giving a cancel_func 'A'
>>    that increments a global variable 'n' each time it is called and
>>    always returns "don't cancel."
>>
>> 2. Loop n times, the loop counter being a global variable 'x':
>>
>>    2.1: Delete the working copy.
>>
>>    2.2: Check out a working copy of the same repository, giving a
>>         different cancel_func 'B' that returns "don't cancel" the
>>         first (x - 1) times it is called, and returns "cancel" the
>>         x-th time it is called.
>>
>>    2.3: Test whether there are open file handles. If there are, we
>>         know at which iteration the cleanup is not done, and we break
>>         out of the loop.
>>
>> 3. If x >= n, quit; we didn't find the problem.
>>
>> 4. Delete the working copy.
>>
>> 5. Check out a working copy of the same repository, giving a different
>>    cancel_func 'C' that returns "don't cancel" the first (x - 1) times
>>    it is called, and traps the x-th time it is called, allowing the
>>    call stack to be examined.
>>
>> Notes and caveats:
>>
>> 1. This could run for days (or years).
>>
>> 2. Then again, if it can be exposed pretty reliably by a user hitting
>>    a Cancel button in a GUI, that means cancel_func is called
>>    frequently enough from the offending location that it should
>>    (hopefully) be caught relatively soon in the process.
>>
>> 3. I think a huge repository isn't needed. The Greek Tree used by the
>>    test suite may suffice. If it doesn't expose the bug, I'd retry
>>    with a larger file thrown in. If that doesn't expose it, add
>>    increasing complexity such as externals, etc.
>>
>> 4. This relies on the logic being executed identically for each
>>    checkout (i.e., cancel_func is called the same number of times from
>>    the same call sites).
>>
>> 5. No idea how this could be turned into a regression test.
>>
>> 6. If there's a better way, I'd love to hear it!
>>
>
> For a regression test (as well as trying to pinpoint what goes wrong),
> wouldn't it be enough if the cancel_func check for the presence of a file
> in .svn/tmp (maybe even checking if it is open - in Linux that should be
> easy enough to check in /proc/$PID/fd) and then signal to cancel. That
> would "only" need a repository/file that is large enough to trigger calling
> the cancel_func.
>
>
> Not even that. The cancel check function is provided by the caller (in the
> client context) and can return 'true' when the conditions are right.
> Finding the correct temp directory is a bit more involved though because
> it's platform-dependent (i.e., I'm not sure if APR can reliably tell us
> that).
>


>
> I checked quickly and I also see the open file when checking out using
> TortoiseSVN and cancelling and it seems to occur all the time.
>
>
> It could be a but in our library, or it could be a bug in TortoiseSVN,
> since the API caller controls the lifetime of pools. The easiest way to
> check would be to run a checkout from the command line client built with
> APR pool debugging enabled (that means a custom build of APR is needed,
> too) and examining the debug output.
>

Thanks for your input! If it is in TortoiseSVN, then the same bug also
exists in SharpSVN, thus I'm leaning towards a bug in Subversion itself.
I'll try to find some time to check this later this week.

Would it be a better approach to enumerate all open fd:s before and after
the call to 'checkout' and compare the list of open fd:s (after any pool
cleanup required). Of course enumerating open fd:s most probably require
platform specific code in the regression test.

>
Daniel

>

Re: [ISSUE] Cancelling svn checkout leaves an open file handle

Reply via email to