On 01/15/15 10:38, Konstantin Belousov wrote:
On Thu, Jan 15, 2015 at 08:41:31AM +0100, Hans Petter Selasky wrote:
On 01/15/15 04:31, Konstantin Belousov wrote:
On Wed, Jan 14, 2015 at 10:07:13PM +0000, Hans Petter Selasky wrote:
Author: hselasky
Date: Wed Jan 14 22:07:13 2015
New Revision: 277199
URL: https://svnweb.freebsd.org/changeset/base/277199
Log:
Avoid race with "dev_rel()" when using the recently added
"delist_dev()" function. Make sure the character device structure
doesn't go away until the end of the "destroy_dev()" function due to
concurrently running cleanup code inside "devfs_populate()".
MFC after: 1 week
Reported by: dchagin@
Modified:
head/sys/fs/devfs/devfs_devs.c
head/sys/kern/kern_conf.c
Modified: head/sys/fs/devfs/devfs_devs.c
==============================================================================
--- head/sys/fs/devfs/devfs_devs.c Wed Jan 14 22:05:57 2015 (r277198)
+++ head/sys/fs/devfs/devfs_devs.c Wed Jan 14 22:07:13 2015 (r277199)
@@ -137,6 +137,12 @@ devfs_alloc(int flags)
vfs_timestamp(&ts);
cdev->si_atime = cdev->si_mtime = cdev->si_ctime = ts;
cdev->si_cred = NULL;
+ /*
+ * Avoid race with dev_rel() by setting the initial
+ * reference count to 1. This last reference is taken
+ * by the destroy_dev() function.
+ */
+ cdev->si_refcount = 1;
This is wrong. Not all devices are destroyed with destroy_dev().
dev_rel() must be allowed to clean up allocated device.
That said, I do not understand what race you are trying to solve.
Freeing of the accessible cdev memory cannot happen in parallel while
dev_mtx is owned.
Please do not commit (to devfs) without seeking review first.
Hi Konstantin,
From my analysis there are basically three ways for a cdev to die:
1) Through dev_free_devlocked()
2) Through destroy_devl() which then later calls dev_free_devlocked()
3) Through destroy_dev_sched() which really is a wrapper around
destroy_devl().
You only look from the consumer's PoV. A devfs cdev can be dereferenced
because, e.g., a clone handler decides that the cdev is not valid/needed,
and now the memory is never freed due to the extra reference.
Do not assume that all cdevs go through destroy_dev().
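The concern above can be illustrated with a small user-space model. This is a sketch only, not the devfs code: "model_cdev", "model_alloc()", "model_ref()", "model_rel()" and "model_clone_discard()" are hypothetical names, and the model assumes the structure is freed exactly when its count reaches zero. Whether such a path actually exists in the tree is what the thread disputes.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cdev; only the reference count is modeled. */
struct model_cdev {
	int refcount;
};

/* Models allocation with a caller-chosen initial reference count. */
static struct model_cdev *
model_alloc(int initial_refs)
{
	struct model_cdev *cd = malloc(sizeof(*cd));

	if (cd != NULL)
		cd->refcount = initial_refs;
	return (cd);
}

/* Models taking a reference. */
static void
model_ref(struct model_cdev *cd)
{
	cd->refcount++;
}

/* Models dropping a reference; returns true if this call freed the structure. */
static bool
model_rel(struct model_cdev *cd)
{
	if (--cd->refcount == 0) {
		free(cd);
		return (true);
	}
	return (false);
}

/*
 * Scenario: a clone-handler-like path takes one reference and drops it
 * again, and nothing resembling destroy_dev() is ever called.
 * Returns true if the last release freed the structure.
 */
static bool
model_clone_discard(int initial_refs)
{
	struct model_cdev *cd = model_alloc(initial_refs);
	bool freed;

	if (cd == NULL)
		return (false);
	model_ref(cd);
	freed = model_rel(cd);
	if (!freed)
		free(cd);	/* Model cleanup only; in the scenario nobody does this. */
	return (freed);
}
```

In this model, "model_clone_discard(0)" reports that the release freed the structure, while "model_clone_discard(1)" reports that it did not, which is the leak being described.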
Hi,
All cdevs go through either case #2 or case #1 eventually from what I
can see, including clone devices, which call destroy_devl() in the end
as well. See the "clone_destroy()" function!
I did a simple test with /dev/dspX.Y which use clone devices. I did:
vmstat -m | grep -i devfs1
1) Before plugging USB audio device:
DEVFS1 157 79K - 189 512
2) Plug USB audio device:
DEVFS1 164 82K - 196 512
3) Play something (env AUDIODEV=/dev/dsp2.4 play track01.wav)
DEVFS1 165 83K - 197 512
4) Stop playing (clone device still exists):
DEVFS1 165 83K - 197 512
5) Detach USB audio device:
DEVFS1 157 79K - 197 512
I see no leakage in that case!
Other case:
1) After "kldload if_tap"
DEVFS1 158 79K - 201 512
2) After creating TAP device (cat /dev/tap99)
DEVFS1 159 80K - 204 512
3) After creating TAP device (cat /dev/tap101)
DEVFS1 160 80K - 207 512
5) After "kldunload if_tap":
DEVFS1 158 79K - 207 512
6) After "kldload if_tap" again:
DEVFS1 158 79K - 207 512
I see no leakage in that case either!
Are there more cases which I don't see?
In the case of direct free through #1, the reference count is ignored
and it doesn't matter if it is one or zero. Only in the case of
destruction through destroy_dev() does it matter.
Like the comment says in destroy_devl():
/* Avoid race with dev_rel() */
The problem is that the "cdev->si_refcount" is zero when the initial
devfs_create() is called. Then one ref is made. When we clear the
CDP_ACTIVE flag in devfs_destroy() it instructs a !parallel! running
process to destroy all the FS related structures and the reference count
goes back to zero when the "cdp" is removed from the "cdevp_list". Then
the cdev is freed too early. This happens because destroy_devl() is
dropping the dev_lock() to sleep waiting for pending references.
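The interleaving described above can be replayed in a fixed order with a small user-space model. This is a sketch under stated assumptions, not kernel code: "race_model", "race_rel()" and "race_freed_early()" are hypothetical names, there is no real locking or threading, and "freed" is just a flag set when the modeled count reaches zero.

```c
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct race_model {
	int refcount;	/* stands in for cdev->si_refcount */
	bool active;	/* stands in for the CDP_ACTIVE flag */
	bool listed;	/* entry is still on the cdevp_list-like list */
	bool freed;	/* set once the count reaches zero */
};

/* Models dropping a reference; marks the structure freed at zero. */
static void
race_rel(struct race_model *m)
{
	if (--m->refcount == 0)
		m->freed = true;
}

/*
 * Replays the described sequence and returns true if the structure
 * was freed before the destroying side had finished with it.
 */
static bool
race_freed_early(int initial_refs)
{
	struct race_model m = { initial_refs, true, true, false };
	bool freed_early;

	m.refcount++;		/* creation: one reference is taken */
	m.active = false;	/* destroying side clears the active flag */

	/* Destroying side drops the lock and sleeps; the populate side runs. */
	if (!m.active && m.listed) {
		m.listed = false;	/* entry removed from the list */
		race_rel(&m);		/* and its reference dropped */
	}

	/* Destroying side wakes up and still wants to use the structure. */
	freed_early = m.freed;

	/* With an extra initial reference, the destroying side drops it last. */
	if (!m.freed)
		race_rel(&m);
	return (freed_early);
}
```

With an initial count of zero the model reports an early free; with an initial count of one it does not, which is the effect the commit is aiming for.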
Basically, this is a very good explanation of why your delist hack is
wrong, for one reason. Another reason is explained below.
You are trying to cover it with additional reference, but this is wrong
as well.
Do you see something else?
I think that what you are trying to do with the CDP_ACTIVE hack is doomed
anyway, because you are allowing a devfs directory to have two entries
with the same name until the populate loop cleans up the inactive one.
In the meantime, any access to the directory operates on a random entry.
The entry will not be random, because upon an open() call to a character
device, I believe the devfs_lookup() function will be called, which
always populates the devfs tree first through calls to
devfs_populate_xxx(). Any delisted devices which don't have the
"CDP_ACTIVE" bit set will never be seen by any open.
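The argument above can be sketched as a user-space model. It assumes, as stated, that every lookup populates first and that populate drops entries whose active flag is clear. All names here ("model_entry", "model_populate()", "model_lookup()", the "dsp0" entry and its ids) are hypothetical and do not correspond to the devfs implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory entry; not a devfs structure. */
struct model_entry {
	const char *name;
	int id;
	bool active;	/* stands in for the CDP_ACTIVE flag */
	bool present;	/* still visible in the modeled directory */
};

/* Models the populate step: inactive entries are removed from view. */
static void
model_populate(struct model_entry *e, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!e[i].active)
			e[i].present = false;
}

/*
 * Models a lookup that always populates first. Returns the id of the
 * first visible entry with the given name, or -1 if there is none.
 */
static int
model_lookup(struct model_entry *e, size_t n, const char *name)
{
	model_populate(e, n);
	for (size_t i = 0; i < n; i++)
		if (e[i].present && strcmp(e[i].name, name) == 0)
			return (e[i].id);
	return (-1);
}

/* Scenario: a delisted entry and a new entry share the same name. */
static int
model_duplicate_name_lookup(void)
{
	struct model_entry dir[] = {
		{ "dsp0", 1, false, true },	/* delisted, awaiting cleanup */
		{ "dsp0", 2, true, true },	/* newly created, same name */
	};

	return (model_lookup(dir, 2, "dsp0"));
}
```

Under these assumptions the lookup only ever returns the active entry. The model says nothing about accesses that do not populate first, which is the case the objection is about.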
Regarding leftover file descriptors which still access the old "cdev":
this is not a problem, and these will be closed when the si_refcount
goes to zero after the destroy_devl() call.
The checks for existing names in make_dev() are performed for a reason,
and you go out of your way to effectively ignore them.
These checks are still correct and don't conflict with my patch from
what I can see. Else the existing destroy_devl() would also be broken
even before my patch with regard to the "random" selection of character
devices at open() from userspace.
--HPS