On Tue, Sep 26, 2017 at 8:04 AM, Tariq Toukan <tar...@mellanox.com> wrote:
>
> On 26/09/2017 3:51 PM, Eric Dumazet wrote:
>>
>> On Tue, Sep 26, 2017 at 4:21 AM, Tariq Toukan <tar...@mellanox.com> wrote:
>>>
>>> Hi Eric,
>>>
>>> We see a regression introduced in this series, specifically in the
>>> patches touching lib/kobject_uevent.c.
>>> We tried to figure out what is wrong there, but couldn't point it out.
>>>
>>> Bug is that mlx4 driver restart fails, because mlx4_core is still in use.
>>> According to module dependencies, both mlx4_en and mlx4_ib should have
>>> been unloaded at this point
>>> Please see log below.
>>>
>>> This looks to be some kind of a race, as the repro is not deterministic.
>>> Probably the en/ib modules are now mistakenly reloaded.
>>>
>>> Any idea what could this be?
>>>
>>> Regards,
>>> Tariq
>>>
>>>
>>> [root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd stop
>>> Unloading HCA driver: [ OK ]
>>> [root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd start
>>> Loading HCA driver and Access Layer: [ OK ]
>>> [root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd stop
>>> Unloading mlx4_core [FAILED]
>>> rmmod: ERROR: Module mlx4_core is in use
>>
>> I have absolutely no idea. Please bisect.
>
> We previously saw a similar issue, that was reported in mailing list.
> Dmitry Torokhov suggested the following fix:
> https://lkml.org/lkml/2017/9/12/523
>
> And indeed, it solved the issue.
>
> We kept the suggested patch in our internal branch, and rebased.
> Issue appeared again once your series was accepted.
>
> By bisecting, we see that the issue re-appears in this patch:
> 4a336a23d619 kobject: copy env blob in one go
>
>>
>> Are you really using netns in the first place ?
>
> No. But seems like it still affects the modules load/unload.
>
> Regards,
> Tariq
Ah, this makes sense now.

Dmitry Torokhov's hack breaks the assumption I used in my patch.

Since it is not upstream yet, I believe that it will need more work
before being in a proper state.

Thanks.