Matt Dillon <[EMAIL PROTECTED]> wrote:
> This could explain a few bug reports I've had over the years in regards
> to systems leaking swap space. Good find!
>
> Hmmm. May I suggest an alternative?
>
> * Keep the part that changes vm->vm_refcnt == 1 to
> --vm->vm_refcnt == 0 when checking whether to free
> the underlying resources or not. This introduces one
> extra ref count decrement.
>
> * Instead of breaking out the vmspace freeing code or adding
> any check fields, simply change the way vmspace_free() operates
> so instead of checking --vm->vm_refcnt == 0, it instead checks
> for vm->vm_refcnt == -1.
That sounds like a perfectly good way to do it. When I was thinking of just
doing that simple solution, I for some reason dismissed it immediately
because I didn't like the idea of the refcnt being anything < 0. However,
as a less complex solution, it should likely be fine.
> Old vmspace_free() code:
>
> if (vm->vm_refcnt == 0)
> panic(...)
>
> if (--vm->vm_refcnt == 0) {
> ...
> }
>
> Suggested fix (pseudo code / incomplete):
>
> /*
> * One extra refcnt decrement occurs before freeing. The
> * process that takes responsibility for releasing the
> * vmspace resources decrements refcnt to 0, then the vmspace
> * is further decremented when released from cpu_wait().
> */
> if (vm->vm_refcnt <= -1)
> panic(...)
>
> if (--vm->vm_refcnt == -1) {
> ...
> }
Well, it doesn't particularly matter on whose behalf specifically the
vmspace is freed, only that it's one and only one time... As long as the
access here to vm->vm_refcnt is locked (it was by Giant the last time I had
looked...), I suppose this could easily be the simplest solution. It still
feels a tiny bit hackish that the refcnt should go negative, but the
behavior you propose seems correct.
Now, just to decide if it's ncessary having a separate vmspace_exitfree() to
separate the two... I don't think it should be now, but I'll need to review
the current vmspace_free() callers to make sure.
> You might have to make other adjustments in the codebase, since there
> are a couple of other places where vm_refcnt is used
> (fgrep vm_refcnt */*.c). This is only a suggestion. I have not
> tested it in any way. We use a similar trick for the vnode ref count.
>From a casual perusal, I don't really see where this kind of behavior is
implemented wrt vput()/vrele().
> I would be happy to review and test your final solution.
Wunderbar =)
--
Brian Fundakowski Feldman \ FreeBSD: The Power to Serve! /
[EMAIL PROTECTED] `------------------------------'
To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message