On Thu, Nov 15, 2007 at 06:16:24PM +0100, Kay Sievers wrote: > On Thu, 2007-11-15 at 09:06 -0800, Greg KH wrote: > > On Thu, Nov 15, 2007 at 04:14:07PM +0800, Dave Young wrote: > > > On Thu, Nov 15, 2007 at 03:38:13AM +0100, Kay Sievers wrote: > > > > On Thu, 2007-11-15 at 09:01 +0800, Dave Young wrote: > > > > > On Nov 15, 2007 5:27 AM, Kay Sievers <[EMAIL PROTECTED]> wrote: > > > > > > On Wed, 2007-11-14 at 20:19 +0100, Jiri Kosina wrote: > > > > > > > On Wed, 14 Nov 2007, Kay Sievers wrote: > > > > > > > > > > > > > > > Could it be an init-order problem, where something tries to use > > > > > > > > the > > > > > > > > block subsystem? Before it is initialized with: > > > > > > > > block/genhd.c :: subsys_initcall(genhd_device_init); > > > > > > > > If that's the case, we have an old bug that nobody noticed with > > > > > > > > static > > > > > > > > structures, which are zeroed that time, but definitely not > > > > > > > > properly > > > > > > > > initialized. I'll try to build loop non-modular now, and see if > > > > > > > > that > > > > > > > > makes the bug appear here. > > > > > > > > > > > > > my .config with which I reproduc this on 2.6.24-rc2-mm1 reliably > > > > > > > can be > > > > > > > obtained from http://www.jikos.cz/jikos/junk/.config > > > > > > > > > > > > Hmm, that config doesn't do anything here, and if I make it boot, it > > > > > > does not show the bug. > > > > > > > > > > > > Could you possibly enable kobject debugging and see if that exposes > > > > > > something, maybe something goes wrong with the kset refcount and it > > > > > > gets > > > > > > released while in use. > > > > > > > > > > > Hi, > > > > > I would do that. > > > > > > > > That would be great. > > > > > > > > > BTW, The bug report as EIP at __list_add with CONFIG_DEBUG_LIST=y > > > > > > > > Yeah, that hints that the kset, which contains the list, is not > > > > allocated at the time it is used, or it is already released (kfree) > > > > again by some buggy logic. > > > Yes, I debugged it, there's some new findings. > > > It is freed by put_disk. > > > The floppy driver alloc_disk and then call put_disk without register_disk. > > > in kobject_cleanup line 551: > > > if(s) > > > kset_put(s); > > > Now the kset is set in alloc_disk after kobject_init, so it is not > > > refereced yet. > > > please try this patch: > > > > > > block/genhd.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff -upr linux/block/genhd.c linux.new/block/genhd.c > > > --- linux/block/genhd.c 2007-11-15 15:59:11.000000000 +0800 > > > +++ linux.new/block/genhd.c 2007-11-15 15:59:39.000000000 +0800 > > > @@ -718,9 +718,9 @@ struct gendisk *alloc_disk_node(int mino > > > } > > > } > > > disk->minors = minors; > > > - kobject_init(&disk->kobj); > > > disk->kobj.kset = block_kset; > > > disk->kobj.ktype = &ktype_block; > > > + kobject_init(&disk->kobj); > > > rand_initialize_disk(disk); > > > INIT_WORK(&disk->async_notify, > > > media_change_notify_thread); > > > > Ah, yes, that is a bug, and it's my fault, let me go fix that in my > > patch series. > > Oh, this is an old bug, that just didn't crash with the static ksets, it > did all the refcounting wrong, but nobody noticed it because the kset > data was still there.
No, I messed it up when I did the initial kset changes. If you look at 2.6.24-rc2, it's correct there: disk->minors = minors; kobj_set_kset_s(disk,block_subsys); kobject_init(&disk->kobj); I have no idea why I switched those lines around, sorry about that. greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/