Hi

From: Benoit Ganne
> Sent: Monday, March 30, 2020 3:03 PM
> To: Matan Azrad <ma...@mellanox.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/mlx5: fix link state update
>
> Hi Matan,
>
> >>>> mlx5 PMD refuses to update link state if link speed is defined but
> >>>> status is down or if link speed is undefined but status is up, even
> >>>> if the ioctl() succeeded.
> >>>> This prevents the application from detecting link up/down events,
> >>>> especially when the link speed is not correctly detected.
> >>> Do you use the wait option? Or no wait?
> >> We are using the no wait option.
> > I suggest calling again on failure, up to N retries.
>
> Unfortunately it will not solve our problem: if link speed is undefined but
> the link is up then the test '!dev_link.link_speed && dev_link.link_status' at
> http://git.dpdk.org/dpdk/tree/drivers/net/mlx5/mlx5_ethdev.c#n899
> will always be true and the function will always return EAGAIN.
> This actually happens in Azure with CX4-Lx VFs.
>
> >> What I meant was to let the app decide whether it should retry or
> >> not, based on the data it gets.
> >> Right now, the PMD *prevents* the app from getting the link state if
> >> the link speed is undefined, even if the app does not care about link
> >> speed.
>
> > In mlx5 this is not the case: it is not that one is updated and the
> > other is not, they go together.
> > You can see that we have 2 different system calls: one to get the
> > up/down status and a second to get the link speed.
> > If the link speed is inconsistent with the link state, it may mean that
> > something changed between the calls and the link status we got
> > from the first call is not correct anymore.
> > In this case, we should make both calls again; that's what we are
> > doing in the "nowait" option.
> > If the user doesn't want the "nowait" option (meaning the PMD is not
> > allowed to take more time to respond), he should call again when the
> > callback fails, with the timing and number of retries he prefers.
>
> Ok, now I understand the logic behind the current behavior: the 2 syscalls
> not being atomic, you try to detect inconsistencies that way.
> But if the link speed is undefined, then the state will never be correctly
> updated.
Why is the link speed undefined? Old kernel? Kernel mlx5 driver issue? Do you know?

> I still believe it is unnecessarily heavy-handed: in most networking
> applications I have seen (and I have 2 examples of currently shipping
> networking products), a missing link speed is not critical whereas the link
> being reported as down means no traffic flowing.
>
> Best
> ben
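
For readers following the thread, the disagreement can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct link_sample`, `check_strict`, `check_accept`, `poll_link` and `query_up_no_speed` are hypothetical names and not mlx5 PMD code. Only the speed/status mismatch test and the EAGAIN result are taken from the discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what the two separate queries returned. */
struct link_sample {
    uint32_t speed; /* 0 stands for "speed undefined" */
    bool up;        /* link status as reported by the status query */
};

/* Strict policy, as described in the thread: any speed/status mismatch
 * is treated as a possible race between the two queries. */
static int check_strict(const struct link_sample *s)
{
    if ((s->speed != 0 && !s->up) || (s->speed == 0 && s->up))
        return -EAGAIN;
    return 0;
}

/* Hypothetical alternative: accept whatever the queries returned and
 * leave it to the application to decide whether to retry. */
static int check_accept(const struct link_sample *s)
{
    (void)s;
    return 0;
}

typedef int (*policy_fn)(const struct link_sample *);

/* Retry up to max_tries times; query() stands in for the two syscalls. */
static int poll_link(struct link_sample (*query)(void), policy_fn policy,
                     int max_tries, struct link_sample *out)
{
    int ret = -EAGAIN;

    assert(max_tries > 0);
    for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
        struct link_sample s = query();

        ret = policy(&s);
        if (ret == 0)
            *out = s; /* publish the sample only when it was accepted */
    }
    return ret;
}

/* Models the reported case: link is up, speed is never reported. */
static struct link_sample query_up_no_speed(void)
{
    return (struct link_sample){ .speed = 0, .up = true };
}
```

With a query that always reports link up and speed 0, `poll_link(query_up_no_speed, check_strict, 5, &out)` keeps returning -EAGAIN however many retries are allowed, while the same call with `check_accept` returns 0 and fills `out`. That is the point made above about retries not helping when the speed stays undefined.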