On 31-08-16, 12:46, Alan Stern wrote: > On Wed, 31 Aug 2016, Viresh Kumar wrote: > > > On 05-08-16, 11:51, Alan Stern wrote: > > > +++ usb-4.x/drivers/usb/core/hub.c > > > @@ -1052,7 +1052,7 @@ static void hub_activate(struct usb_hub > > > > > > /* Continue a partial initialization */ > > > if (type == HUB_INIT2 || type == HUB_INIT3) { > > > - device_lock(hub->intfdev); > > > + device_lock(&hdev->dev); > > > > Hi Alan, > > > > I have received reports of kernel crashes (NULL dereference) due to this > > patch > > in some of the corner cases. Note that we have backported this patch (and > > few > > other) to 3.10 kernel. I have attached my hub.c file as well for reference. > > > > Here is the reported kernel OOPs: > > > > [ 19.476228] Unable to handle kernel NULL pointer dereference at virtual > > address 00000000 > > [ 19.476231] pgd = ffffffc00007d000 > > [ 19.476236] [00000000] *pgd=000000000e90b003, *pmd=000000000e90c003, > > *pte=00e00000f9000407 > > [ 19.476242] Internal error: Oops: 96000045 [#1] PREEMPT SMP > > [ 19.476273] Modules linked in: gb_vibrator(O) gb_usb(O) gb_uart(O) > > gb_spi(O) gb_sdio(O) gb_raw(O) gb_pwm(O) gb_power_supply(O) gb) > > [ 19.476279] CPU: 0 PID: 344 Comm: kworker/0:3 Tainted: G O > > 3.10.97-g4b7224f-dirty #454 > > [ 19.476290] Workqueue: events hub_init_func2 > > [ 19.476293] task: ffffffc09b3560c0 ti: ffffffc09ada8000 task.ti: > > ffffffc09ada8000 > > [ 19.476300] PC is at __mutex_lock_slowpath+0x138/0x224 > > [ 19.476303] LR is at __mutex_lock_slowpath+0x128/0x224 > ... > > [ 19.476582] Call trace: > > [ 19.476586] [<ffffffc000ccf13c>] __mutex_lock_slowpath+0x138/0x224 > > [ 19.476590] [<ffffffc000ccf254>] mutex_lock+0x2c/0x48 > > [ 19.476593] [<ffffffc0006f6eac>] hub_activate+0x50/0x4d8 > > [ 19.476596] [<ffffffc0006f7388>] hub_init_func2+0x14/0x1c > > [ 19.476602] [<ffffffc0002387ac>] process_one_work+0x26c/0x3cc > > [ 19.476605] [<ffffffc000239988>] worker_thread+0x208/0x358 > > [ 19.476610] [<ffffffc00023f360>] kthread+0xbc/0xc4 > ... > > This happens when the device is infinitely generating connected and then > > removed > > (not manually, but due to some hardware issues). > > If I'm reading this right, it means that hub->hdev is NULL in
I am not sure I am in sync here :( > hub_activate(). We would have gotten the crash right from hub_activate() in that case, isn't it? The fact that the call sequence reached mutex_lock() here, it means that hub->hdev->dev was valid at least. The mutex dev->mutex is somewhat corrupted or uninitialized, etc.. And that's where it all went wrong. As __mutex_lock_slowpath() is called, it means that the mutex had a count of 0 instead of 1 during the lock and then we crashed during __mutex_lock_slowpath(), which can only happen if the mutex is uninitialized in the first place. The mutex gets initialized as part of device_add() and so things can go wrong if device_add() was skipped here somehow. I may be completely wrong, but that's what I read :) > I don't see how that could be true; it is initialized > to a non-NULL value and then never changed until the hub structure is > deallocated. > > Is it possible to add some debugging printk's in there to find out > what's happening? I will ask Vaibhav (cc'd) to do it, not sure if he is around today or not. Thanks for your quick reply. -- viresh -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html