On Wed, 2020-09-23 at 22:44 +0200, Heiner Kallweit wrote:
> On 23.09.2020 22:15, David Miller wrote:
> > From: Heiner Kallweit <hkallwe...@gmail.com>
> > Date: Wed, 23 Sep 2020 21:58:59 +0200
> >
> > > On 23.09.2020 20:35, Saeed Mahameed wrote:
> > > > Why would a driver detach the device on ndo_stop() ?
> > > > seems like this is the bug you need to be chasing ..
> > > > which driver is doing this ?
> > > >
> > > Some drivers set the device to PCI D3hot at the end of ndo_stop()
> > > to save power (using e.g. Runtime PM). Marking the device as
> > > detached makes clear to the net core that the device isn't
> > > accessible any longer.
> >
> > That being the case, the problem is that IFF_UP+!present is not a
> > valid netdev state.
> >
> If this combination is invalid, then netif_device_detach() should
> clear IFF_UP? At a first glance this should be sufficient to avoid
> the issue I was dealing with.
>
This feels like a workaround, and it would conflict with the assumption
that netif_device_detach() should only be called when !IFF_UP.

Maybe we need to clear IFF_UP before calling ops->ndo_stop(dev) in
__dev_close_many(), instead of after it. Assuming no driver checks the
IFF_UP state in its own ndo_stop(), the order shouldn't otherwise
matter, since clearing the flag and calling ndo_stop() should be
considered one atomic operation.

> > Is it simply the issue that, upon resume, IFF_UP is marked true
> > before the device is brought out from D3hot state and thus marked
> > as present again?
> >
> I can't really comment on that. The issue I was dealing with at the
> time I submitted this change was about an async linkwatch event
> (caused by powering down the PHY in ndo_stop) trying to access the
> device when it was powered down already.
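
To illustrate the ordering suggested above, here is a rough user-space
model, not kernel code. Everything prefixed with toy_ or TOY_ is a
hypothetical stand-in: toy_ndo_stop() models a driver that detaches the
device at the end of ndo_stop(), and toy_close() models only the
flag/stop ordering in __dev_close_many(). It assumes that an observer
(such as an async linkwatch event) may look at the device as soon as it
has been detached.

```c
#include <stdbool.h>

/* Toy stand-in for the kernel's IFF_UP flag; not the real definition. */
#define TOY_IFF_UP 0x1u

/* Minimal model of the two pieces of state discussed in this thread. */
struct toy_netdev {
	unsigned int flags;	/* models dev->flags */
	bool present;		/* models the "present" link state bit */
	bool saw_invalid;	/* set once IFF_UP && !present is observed */
};

/* Record whether the combination called invalid above is visible. */
static void toy_check_state(struct toy_netdev *dev)
{
	if ((dev->flags & TOY_IFF_UP) && !dev->present)
		dev->saw_invalid = true;
}

/* Models a driver ndo_stop() that powers down and detaches the device. */
static void toy_ndo_stop(struct toy_netdev *dev)
{
	dev->present = false;	/* models netif_device_detach() */
	toy_check_state(dev);	/* what an async observer could see here */
}

/*
 * Models only the ordering question: clear the flag before or after
 * calling the stop callback. Returns true if IFF_UP && !present was
 * ever observable during the close.
 */
static bool toy_close(bool clear_up_first)
{
	struct toy_netdev dev = {
		.flags = TOY_IFF_UP,
		.present = true,
		.saw_invalid = false,
	};

	if (clear_up_first)
		dev.flags &= ~TOY_IFF_UP;

	toy_ndo_stop(&dev);

	if (!clear_up_first)
		dev.flags &= ~TOY_IFF_UP;

	return dev.saw_invalid;
}
```

In this model, toy_close(false) (flag cleared after the stop callback)
exposes the IFF_UP+!present window, while toy_close(true) does not. It
says nothing about drivers that check IFF_UP inside their own
ndo_stop(), which is the open assumption mentioned above.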