On Thu, Aug 25, 2011 at 05:12:09PM -0500, Brandon Gooch wrote:
> On Thu, Aug 25, 2011 at 4:53 PM, Kostik Belousov <kostik...@gmail.com> wrote:
> > On Thu, Aug 25, 2011 at 03:16:09PM -0600, Charlie Martin wrote:
> >> We're having a crash in some internal code running on FreeBSD 7.2
> >> (specifically 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE and yeah, I know
> >> it's quite a bit behind) in which after 18-30 hours of running load
> >> tests, the code panics with:
> >>
> >> panic: Bad link elm 0xffffff0044c09600 next->prev != elm
> >> cpuid = 0
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at 0xffffffff8019119a = db_trace_self_wrapper+0x2a
> >> panic() at 0xffffffff80307c72 = panic+0x182
> >> devfs_populate_loop() at 0xffffffff802a43a8 = devfs_populate_loop+0x548
> >>
> >>
> >> First question: where's the most appropriate place to ask about this
> >> kind of bug on a back version.
> > It is fine to ask there.
> >
> >>
> >> Second: does this remind anyone of any bugs? Googling came up with a
> >> few somewhat similar things but hasn't provided much insight so far.
> > In 99% of the cases, it means that you forgot to dev_ref() some cdev.
>
> So dev_ref increments the reference count for a cdev. Even though the
> word "loop" seems to indicate that we will iterate over a list of
> objects (one of which we may be missing a reference to via a missing
> dev_ref()), I'm not seeing how this can cause a panic from inside
> devfs_populate_loop().
>
> Can you help me understand this?
>

Missing dev_ref() means that the memory for the cdev (and cdev_priv) is
freed prematurely. If this happens before destroy_dev() is called, then
the list iterated over by populate_loop() is corrupted.
See e.g. MAKEDEV_REF flag for make_dev(9) and its use in the (old) clone handlers.
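
To make the failure mode concrete, here is a minimal userspace sketch of
the same pattern. It is not the devfs code: struct node, node_lookup(),
node_release() and node_destroy() are hypothetical stand-ins for a cdev,
a clone handler, a release of the caller's reference and destroy_dev().
The model assumes the list itself owns one reference and that a lookup is
supposed to take another one on behalf of the caller. Instead of really
freeing a still-linked node (which would leave its neighbours pointing at
freed memory, the state the "next->prev != elm" check trips over), the
model only reports that it would have happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cdev: a refcounted node on a circular list. */
struct node {
    struct node *next, *prev;
    int refs;
    bool linked;
};

enum rel_result { REL_ALIVE, REL_FREED, REL_PREMATURE };

/* Initialise an empty list; the head is a sentinel and is never freed. */
static void list_init(struct node *head)
{
    head->next = head->prev = head;
    head->refs = 0;
    head->linked = false;
}

/* Create a node at the tail; the list owns the initial reference. */
static struct node *node_create(struct node *head)
{
    struct node *n = calloc(1, sizeof(*n));
    if (n == NULL)
        return NULL;
    n->refs = 1;
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
    n->linked = true;
    return n;
}

/* Hand the node to a caller; take_ref = false models the missing reference. */
static struct node *node_lookup(struct node *n, bool take_ref)
{
    if (take_ref)
        n->refs++;
    return n;
}

/* Drop one reference. A count of zero on a linked node is the bug:
 * real code would free it here and corrupt the list, so only report it. */
static enum rel_result node_release(struct node *n)
{
    if (--n->refs > 0)
        return REL_ALIVE;
    if (n->linked)
        return REL_PREMATURE;
    free(n);
    return REL_FREED;
}

/* Unlink the node first, then drop the reference owned by the list. */
static enum rel_result node_destroy(struct node *n)
{
    assert(n->linked);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->linked = false;
    return node_release(n);
}
```

With node_lookup(n, true), the caller's node_release() leaves the node
alive and node_destroy() frees it after unlinking. With
node_lookup(n, false), the caller's node_release() consumes the list's own
reference while the node is still linked, which in this model is the
analogue of the cdev memory going away before destroy_dev().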