On 21.09.2018 17:25, Zhang, Qi Z wrote: > > >> -----Original Message----- >> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Zhang, Qi Z >> Sent: Wednesday, September 12, 2018 4:29 PM >> To: Ilya Maximets <i.maxim...@samsung.com>; dev@dpdk.org >> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >> <konstantin.anan...@intel.com>; Laurent Hardy >> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; >> sta...@dpdk.org >> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link >> update >> >> >> >>> -----Original Message----- >>> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >>> Sent: Wednesday, September 12, 2018 4:05 PM >>> To: Zhang, Qi Z <qi.z.zh...@intel.com>; dev@dpdk.org >>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>> <konstantin.anan...@intel.com>; Laurent Hardy >>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; >>> sta...@dpdk.org >>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>> fiber link update >>> >>> On 12.09.2018 09:49, Zhang, Qi Z wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >>>>> Sent: Monday, September 10, 2018 11:09 PM >>>>> To: Zhang, Qi Z <qi.z.zh...@intel.com>; dev@dpdk.org >>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>>>> <konstantin.anan...@intel.com>; Laurent Hardy >>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; >>>>> sta...@dpdk.org >>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>>>> fiber link update >>>>> >>>>> On 04.09.2018 09:08, Zhang, Qi Z wrote: >>>>>> Hi Ilya: >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya >>>>>>> Maximets >>>>>>> Sent: Friday, August 31, 2018 8:40 PM >>>>>>> To: dev@dpdk.org >>>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>>>>>> <konstantin.anan...@intel.com>; Laurent Hardy >>>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; Ilya >>>>>>> Maximets <i.maxim...@samsung.com>; sta...@dpdk.org >>>>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>>>>>> fiber link update >>>>>>> >>>>>>> If the multispeed fiber link is in DOWN state, ixgbe_setup_link >>>>>>> could take around a second of busy polling. This is highly >>>>>>> inconvenient for the case where single thread periodically checks >>>>>>> the link >>> statuses. >>>>>>> For example, OVS main thread periodically updates the link >>>>>>> statuses and hangs for a really long time busy waiting on >>>>>>> ixgbe_setup_link() for a DOWN fiber ports. For case with 3 down >>>>>>> ports it hangs for a 3 seconds and unable to do anything including >> packet processing. >>>>>>> Fix that by shifting that workaround to a separate thread by >>>>>>> alarm handler that will try to set up link if it is DOWN. >>>>>> >>>>>> Does that mean we will block the interrupt thread for 3 seconds? >>>>> >>>>> Three times for one second. Other work could be scheduled between. >>>>> IMHO, it's much better than blocking usual caller for 3 seconds. >>>>> >>>>>> Also, can we guarantee there will not be any race condition if we >>>>>> call >>>>> ixgbe_setup_link at another thread, the base code API is not >>>>> assumed to be thread-safe as I know. >>>>> >>>>> The only user of 'ixgbe_setup_link' is 'ixgbe_dev_start', but it >>>>> could be called only if device stopped. 'ixgbe_dev_stop' cancels the >> alarm. >>>>> Race with 'link_update' avoided by 'IXGBE_FLAG_NEED_LINK_CONFIG' >> flag. >>>> >>>> I guess, it' not only about when ixgb_setup_link race with itself, >>>> but also >>> when it race with other APIs. >>>> Also the concern is, even in current version, we can prove there is >>>> no issue, >>> how can we guarantee we are safe for future base code update? It's not >>> designed as thread-safe. >>>> For my option, the change is risky. >>> >>> In current implementation interrupt handler already calls the >>> 'ixgbe_dev_link_update' which subsequently calls 'ixgbe_setup_link' >>> in our case if LSC interrupts enabled. So, my change makes the driver >>> even safer by moving 'ixgbe_setup_link' to the same interrupt thread. >>> Otherwise two threads (interrupts handler and the link status checking >>> thread) could call 'ixgbe_setup_link' simultaneously. >> >> Ok, you are right, seems the concern I have is already exist , your patch >> does >> not introduce new issue. >> So I have no objection if this will fix some issue. >> But let's check if any ixgbe experts will comment. >> >> Regards >> Qi >> >>> >>>> >>>> Btw, since ixgbe support LSC, it is not necessary for "single thread >>> periodically checks the link statuses", right? >>> >>> In current implementation it will take at least 5 seconds (4 + 1) for >>> the interrupt handler to detect DOWN link state for ixgbe multispeed >>> fiber. This is too much for many real world cases. >>> >>>> >>>>> >>>>>> >>>>>> Regards >>>>>> Qi >>>>>> >>>>>>> >>>>>>> Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated") >>>>>>> CC: sta...@dpdk.org >>>>>>> >>>>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com> > > Reviewed-by: Qi Zhang <qi.z.zh...@intel.com>
Hi. Is there any chance for this to be merged in a near future? Situation becomes even worse. Recently accepted commit b2815c41bd0b ("net/ixgbe: wait longer for link after fiber MAC setup") increases the time of the thread hang busy waiting up to 1.5 seconds per port on each link state check. Best regards, Ilya Maximets.