Hi, Ilya Maximets > -----Original Message----- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Monday, October 15, 2018 4:40 PM > To: Zhao1, Wei <wei.zh...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>; > Laurent Hardy <laurent.ha...@6wind.com> > Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin > <konstantin.anan...@intel.com>; sta...@dpdk.org; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link > update > > On 15.10.2018 06:03, Zhao1, Wei wrote: > > Hi, Ilya Maximets > > > >> -----Original Message----- > >> From: Ilya Maximets [mailto:i.maxim...@samsung.com] > >> Sent: Friday, October 12, 2018 6:15 PM > >> To: Zhao1, Wei <wei.zh...@intel.com>; Zhang, Qi Z > >> <qi.z.zh...@intel.com>; Laurent Hardy <laurent.ha...@6wind.com> > >> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin > >> <konstantin.anan...@intel.com>; sta...@dpdk.org; dev@dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while > >> fiber link update > >> > >> On 12.10.2018 12:19, Zhao1, Wei wrote: > >>> Hi, > >>> > >>>> -----Original Message----- > >>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com] > >>>> Sent: Thursday, October 11, 2018 6:27 PM > >>>> To: Zhao1, Wei <wei.zh...@intel.com>; Zhang, Qi Z > >>>> <qi.z.zh...@intel.com>; Laurent Hardy <laurent.ha...@6wind.com> > >>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin > >>>> <konstantin.anan...@intel.com>; sta...@dpdk.org; dev@dpdk.org > >>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while > >>>> fiber link update > >>>> > >>>> On 11.10.2018 12:56, Zhao1, Wei wrote: > >>>>> Hi, Ilya Maximets AND laurent.hardy > >>>> > >>>> Hi, thanks for sharing your thoughts. > >>>> Comments inline. > >>>> > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya > >>>>>> Maximets > >>>>>> Sent: Wednesday, September 12, 2018 4:05 PM > >>>>>> To: Zhang, Qi Z <qi.z.zh...@intel.com>; dev@dpdk.org > >>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin > >>>>>> <konstantin.anan...@intel.com>; Laurent Hardy > >>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; > >>>>>> sta...@dpdk.org > >>>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while > >>>>>> fiber link update > >>>>>> > >>>>>> On 12.09.2018 09:49, Zhang, Qi Z wrote: > >>>>>>> > >>>>>>> > >>>>>>>> -----Original Message----- > >>>>>>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com] > >>>>>>>> Sent: Monday, September 10, 2018 11:09 PM > >>>>>>>> To: Zhang, Qi Z <qi.z.zh...@intel.com>; dev@dpdk.org > >>>>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin > >>>>>>>> <konstantin.anan...@intel.com>; Laurent Hardy > >>>>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; > >>>>>>>> sta...@dpdk.org > >>>>>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling > >>>>>>>> while fiber link update > >>>>>>>> > >>>>>>>> On 04.09.2018 09:08, Zhang, Qi Z wrote: > >>>>>>>>> Hi Ilya: > >>>>>>>>> > >>>>>>>>>> -----Original Message----- > >>>>>>>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya > >>>>>>>>>> Maximets > >>>>>>>>>> Sent: Friday, August 31, 2018 8:40 PM > >>>>>>>>>> To: dev@dpdk.org > >>>>>>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, > Konstantin > >>>>>>>>>> <konstantin.anan...@intel.com>; Laurent Hardy > >>>>>>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; > Ilya > >>>>>>>>>> Maximets <i.maxim...@samsung.com>; sta...@dpdk.org > >>>>>>>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while > >>>>>>>>>> fiber link update > >>>>>>>>>> > >>>>>>>>>> If the multispeed fiber link is in DOWN state, > >>>>>>>>>> ixgbe_setup_link could take around a second of busy polling. > >>>>>>>>>> This is highly inconvenient for the case where single thread > >>>>>>>>>> periodically checks the > >>>>>> link statuses. > >>>>>>>>>> For example, OVS main thread periodically updates the link > >>>>>>>>>> statuses and hangs for a really long time busy waiting on > >>>>>>>>>> ixgbe_setup_link() for a DOWN fiber ports. For case with 3 > >>>>>>>>>> down ports it hangs for a 3 seconds and unable to do anything > >>>>>>>>>> including > >>>> packet processing. > >>>>>>>>>> Fix that by shifting that workaround to a separate thread by > >>>>>>>>>> alarm handler that will try to set up link if it is DOWN. > >>>>>>>>> > >>>>>>>>> Does that mean we will block the interrupt thread for 3 seconds? > >>>>>>>> > >>>>>>>> Three times for one second. Other work could be scheduled > >> between. > >>>>>>>> IMHO, it's much better than blocking usual caller for 3 seconds. > >>>>>>>> > >>>>>>>>> Also, can we guarantee there will not be any race condition if > >>>>>>>>> we call > >>>>>>>> ixgbe_setup_link at another thread, the base code API is not > >>>>>>>> assumed to be thread-safe as I know. > >>>>>>>> > >>>>>>>> The only user of 'ixgbe_setup_link' is 'ixgbe_dev_start', but > >>>>>>>> it could be called only if device stopped. 'ixgbe_dev_stop' > >>>>>>>> cancels the > >>>> alarm. > >>>>>>>> Race with 'link_update' avoided by > >> 'IXGBE_FLAG_NEED_LINK_CONFIG' > >>>>>> flag. > >>>>>>> > >>>>>>> I guess, it' not only about when ixgb_setup_link race with > >>>>>>> itself, but also > >>>>>> when it race with other APIs. > >>>>>>> Also the concern is, even in current version, we can prove there > >>>>>>> is no issue, > >>>>>> how can we guarantee we are safe for future base code update? > >>>>>> It's not designed as thread-safe. > >>>>>>> For my option, the change is risky. > >>>>>> > >>>>>> In current implementation interrupt handler already calls the > >>>>>> 'ixgbe_dev_link_update' which subsequently calls 'ixgbe_setup_link' > >>>>>> in our case if LSC interrupts enabled. So, my change makes the > >>>>>> driver even safer by moving 'ixgbe_setup_link' to the same > >>>>>> interrupt > >> thread. > >>>>>> Otherwise two threads (interrupts handler and the link status > >>>>>> checking > >>>>>> thread) could call 'ixgbe_setup_link' simultaneously. > >>>>>> > >>>>>>> > >>>>>>> Btw, since ixgbe support LSC, it is not necessary for "single > >>>>>>> thread > >>>>>> periodically checks the link statuses", right? > >>>>>> > >>>>>> In current implementation it will take at least 5 seconds (4 + 1) > >>>>>> for the interrupt handler to detect DOWN link state for ixgbe > >>>>>> multispeed fiber. This is too much for many real world cases. > >>>>> > >>>>> I have reviewed this patch, now I agree with you of the point > >>>>> that when port is down, " main thread periodically updates the > >>>>> link statuses and > >>>> hangs for a really long time busy waiting on ixgbe_setup_link() for > >>>> a DOWN fiber ports ". > >>>>> This is introduced by a patch in the following: > >>>>> SHA-1: c12d22f65b132c56db7b4fdbfd5ddce27d1e9572 > >>>>> * net/ixgbe: ensure link status is updated > >>>>> > >>>>> Because in this patch, ixgbe_setup_link() is called with input > >>>>> parameter > >>>> autoneg_wait_to_complete=1, this will cause loop check and sleep > delay. > >>>>> At least 82599 seems has this delay.(BTW, whivh type of NIC are > >>>>> you use? X550 or 82599) > >>>> > >>>> I have 82599. > >>>> > >>>>> Your solution is add a eal_alarm_set for ixgbe_setup_link in the > >>>>> thread of > >>>> PMD driver, and do the set up work in that thread, is that right? > >>>>> And main thread avoid hang by the flag of > >>>> IXGBE_FLAG_NEED_LINK_CONFIG. > >>>>> I think this is a good idea for this problem, but it may cause > >>>>> problem for other legacy user of ixgbe pmd, because their legacy > >>>>> code, which use > >>>> main thread to check link state and setup_link when port is down, > >>>> and they are not aware of it is done by other thread if add your patch. > >>>> > >>>> What are these applications? I see no public API for setup_link function. > >>>> It's internal to driver and should not be used externally. > >>>> Am I missing something? > >>> > >>> rte_eth_link_get() , it will also call the function of > >>> ixgbe_setup_link(). > >>> > >> > >> rte_eth_link_get() does not call ixgbe_setup_link(). > >> It only calls dev_ops->link_update(). > > > > No, dev_ops->link_update call function ixgbe_dev_link_update() > > -> ixgbe_dev_link_update_share() -> ixgbe_setup_link() > > But with my patch, calling of ixgbe_setup_link() happens in a separate > (interrupt) thread. There is no direct call from > ixgbe_dev_link_update_share(). > All the calls from the interrupt thread are sequenced by implementation.
Yes, that is just the point that I worry about. ixgbe_setup_link() process will hang 1s at least in other thread even if it is move out from the link status update which is being handle in user thread now. I can not foresee what impact it will have to ixgbe PMD thread that long time, it may be a hidden danger. And the wait time of 1s of link setup can not eliminate also. The difference is which thread to block. I agree with you of the exist of this hang time for user thread in link status update for set up link when port is down, but I can not support the code to move hang time to another thread with reason above. Thank you for your explanation. I am willing to see if anyone can give an definite conclusion for this code change and ACK this patch. Reviewed-by: Wei Zhao <wei.zh...@intel.com> > > > > > > >> > >>> > >>>> > >>>>> > >>>>> And is that ok if we change code in ixgbe_dev_link_update_share() > >>>>> to > >>>>> > >>>>> ixgbe_dev_link_update_share() > >>>>> { > >>>>> > >>>>> /* check if it needs to wait to complete, if lsc interrupt is > >>>>> enabled */ > >>>>> if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc > >>>>> != 0) > >>>>> wait = 0; > >>>>> > >>>>> if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) && > >>>>> ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) { > >>>>> speed = hw->phy.autoneg_advertised; > >>>>> if (!speed) > >>>>> ixgbe_get_link_capabilities(hw, &speed, > >>>>> &autoneg); > >>>>> ixgbe_setup_link(hw, speed, wait); > >>>>> } > >>>>> } > >>>>> > >>>>> Then, your application can call rte_eth_link_get_nowait () to make > >>>>> wait_to_complete=0 when doing periodic link state check, Which > >>>>> will not > >>>> cause loop check and sleep delay. Legacy code of other user call > >>>> rte_eth_link_get() will not be affected also. > >>>>> But, I am NOT confident ,whether this will introduce new problem > >>>>> when > >>>> set up link without wait! > >>>>> So, this is just a discussion topic. > >>>> > >>>> Unfortunately this will not help. Take a look to the function > >>>> 'ixgbe_setup_mac_link_multispeed_fiber()', which is the main > >>>> problematic function here. 'wait_to_complete' here used only as > >>>> argument for ixgbe_setup_mac_link(), and the busy waiting loops are > >> outside of it. > >>>> Regardless of the 'wait_to_complete' value, this function will busy > >>>> poll the link for 1040 ms trying to setup 10GB speed and 140ms more > >>>> trying to setup 1GB speed. After that, it will call itself > >>>> recursively and wait again... Looks like I miscalculated last time. > >>>> Right now it'll be more than 2 seconds for each down port since > >>>> following > >> patch merged: > >>>> 8fc1f32fa615 ("net/ixgbe: wait longer for link after fiber MAC setup"). > >>> > >>> Yes, you are right, link state check loop in function > >>> ixgbe_setup_mac_link_multispeed_fiber() are not blocked by bool > >> autoneg_wait_to_complete, It will cause about 1s wait when port is down. > >>> And also, can we update code in function > >> ixgbe_setup_mac_link_multispeed_fiber() to block link state check > >> loop using autoneg_wait_to_complete? > >>> I am not sure. Because there is a comment for this loop check: > >>> /* > >>> * Wait for the controller to acquire link. Per IEEE 802.3ap, > >>> * Section 73.10.2, we may have to wait up to 500ms if KR is > >>> * attempted. 82599 uses the same timing for 10g SFI. > >>> */ > >>> It seems we have to wait for at least 500ms for spec requirement > >>> before > >> we check link after configuration. > >>> If that is true, we can not do any change to these loop check. > >>> But why not main thread take some action to stop periodic link sate > >>> check > >> when it find it has been hang or link is down? > >> > >> To find that device is DOWN, thread will have to call this function > >> at least once for each port and wait a few seconds. > >> And how in that case we'll know that device is UP again? > >> As I already wrote in discussion for this patch, LSC is not an > >> option, because it takes at least 5 seconds to detect link state > >> change, which is way too much for many real world apps. > >> > >>> > >>> > >>> > >>>> > >>>>> > >>>>> Hi, laurent.hardy > >>>>> You are the author for the patch (* net/ixgbe: ensure link status > >>>>> is > >>>> updated), why do you implement code that way? > >>>>> Is that must that set up link with wait? > >>>>> > >>>>> ixgbe_setup_link(hw, speed, true); > >>>>> > >>>>> > >>>>> > >>>>>> > >>>>>>> > >>>>>>>> > >>>>>>>>> > >>>>>>>>> Regards > >>>>>>>>> Qi > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is > >>>>>>>>>> updated") > >>>>>>>>>> CC: sta...@dpdk.org > >>>>>>>>>> > >>>>>>>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com> > >>>>>>>>>> --- > >>>>>>>>>> drivers/net/ixgbe/ixgbe_ethdev.c | 43 > >>>>>>>>>> ++++++++++++++++++++++++-------- > >>>>>>>>>> 1 file changed, 32 insertions(+), 11 deletions(-) > >>>>>>>>>> > >>>>>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > >>>>>>>>>> b/drivers/net/ixgbe/ixgbe_ethdev.c > >>>>>>>>>> index 26b192737..a33b9a6e8 100644 > >>>>>>>>>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c > >>>>>>>>>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > >>>>>>>>>> @@ -221,6 +221,8 @@ static int > >>>>>>>>>> ixgbe_dev_interrupt_action(struct rte_eth_dev *dev, > >>>>>>>>>> struct rte_intr_handle *handle); > >>>> static > >>>>>> void > >>>>>>>>>> ixgbe_dev_interrupt_handler(void *param); static void > >>>>>>>>>> ixgbe_dev_interrupt_delayed_handler(void *param); > >>>>>>>>>> +static void ixgbe_dev_setup_link_alarm_handler(void > *param); > >>>>>>>>>> + > >>>>>>>>>> static int ixgbe_add_rar(struct rte_eth_dev *dev, struct > >>>>>>>>>> ether_addr *mac_addr, > >>>>>>>>>> uint32_t index, uint32_t pool); static void > >>>>>>>>>> ixgbe_remove_rar(struct rte_eth_dev *dev, uint32_t index); > @@ > >>>>>>>>>> -2791,6 +2793,8 @@ ixgbe_dev_stop(struct rte_eth_dev *dev) > >>>>>>>>>> > >>>>>>>>>> PMD_INIT_FUNC_TRACE(); > >>>>>>>>>> > >>>>>>>>>> + rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, > >>>> dev); > >>>>>>>>>> + > >>>>>>>>>> /* disable interrupts */ > >>>>>>>>>> ixgbe_disable_intr(hw); > >>>>>>>>>> > >>>>>>>>>> @@ -3969,6 +3973,25 @@ ixgbevf_check_link(struct ixgbe_hw > >> *hw, > >>>>>>>>>> ixgbe_link_speed *speed, > >>>>>>>>>> return ret_val; > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> +static void > >>>>>>>>>> +ixgbe_dev_setup_link_alarm_handler(void *param) { > >>>>>>>>>> + struct rte_eth_dev *dev = (struct rte_eth_dev *)param; > >>>>>>>>>> + struct ixgbe_hw *hw = > >>>>>>>>>> IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > >>>>>>>>>> + struct ixgbe_interrupt *intr = > >>>>>>>>>> + IXGBE_DEV_PRIVATE_TO_INTR(dev->data- > >>>>> dev_private); > >>>>>>>>>> + u32 speed; > >>>>>>>>>> + bool autoneg = false; > >>>>>>>>>> + > >>>>>>>>>> + speed = hw->phy.autoneg_advertised; > >>>>>>>>>> + if (!speed) > >>>>>>>>>> + ixgbe_get_link_capabilities(hw, &speed, &autoneg); > >>>>>>>>>> + > >>>>>>>>>> + ixgbe_setup_link(hw, speed, true); > >>>>>>>>>> + > >>>>>>>>>> + intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG; } > >>>>>>>>>> + > >>>>>>>>>> /* return 0 means link status changed, -1 means not changed > >>>>>>>>>> */ int ixgbe_dev_link_update_share(struct rte_eth_dev *dev, > >>>>>>>>>> @@ - > >>>>>> 3981,9 > >>>>>>>>>> +4004,7 @@ ixgbe_dev_link_update_share(struct rte_eth_dev > >>>> *dev, > >>>>>>>>>> IXGBE_DEV_PRIVATE_TO_INTR(dev->data- > >>>>> dev_private); > >>>>>>>>>> int link_up; > >>>>>>>>>> int diag; > >>>>>>>>>> - u32 speed = 0; > >>>>>>>>>> int wait = 1; > >>>>>>>>>> - bool autoneg = false; > >>>>>>>>>> > >>>>>>>>>> memset(&link, 0, sizeof(link)); > >>>>>>>>>> link.link_status = ETH_LINK_DOWN; @@ -3993,13 +4014,8 > >>>> @@ > >>>>>>>>>> ixgbe_dev_link_update_share(struct > >>>>>>>> rte_eth_dev > >>>>>>>>>> *dev, > >>>>>>>>>> > >>>>>>>>>> hw->mac.get_link_status = true; > >>>>>>>>>> > >>>>>>>>>> - if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) && > >>>>>>>>>> - ixgbe_get_media_type(hw) == > >>>> ixgbe_media_type_fiber) { > >>>>>>>>>> - speed = hw->phy.autoneg_advertised; > >>>>>>>>>> - if (!speed) > >>>>>>>>>> - ixgbe_get_link_capabilities(hw, &speed, > >>>> &autoneg); > >>>>>>>>>> - ixgbe_setup_link(hw, speed, true); > >>>>>>>>>> - } > >>>>>>>>>> + if (intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) > >>>>>>>>>> + return rte_eth_linkstatus_set(dev, &link); > >>>>>>>>>> > >>>>>>>>>> /* check if it needs to wait to complete, if lsc interrupt > >>>>>>>>>> is > >>>> enabled */ > >>>>>>>>>> if (wait_to_complete == 0 || dev->data- > >>>>> dev_conf.intr_conf.lsc > >>>>>>>>>> != > >>>>>>>>>> 0) @@ > >>>>>>>>>> -4017,11 +4033,14 @@ ixgbe_dev_link_update_share(struct > >>>>>> rte_eth_dev > >>>>>>>> *dev, > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> if (link_up == 0) { > >>>>>>>>>> - intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG; > >>>>>>>>>> + if (ixgbe_get_media_type(hw) == > >>>> ixgbe_media_type_fiber) > >>>>>> { > >>>>>>>>>> + intr->flags |= > >>>> IXGBE_FLAG_NEED_LINK_CONFIG; > >>>>>>>>>> + rte_eal_alarm_set(10, > >>>>>>>>>> + > >>>> ixgbe_dev_setup_link_alarm_handler, dev); > >>>>>>>>>> + } > >>>>>>>>>> return rte_eth_linkstatus_set(dev, &link); > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> - intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG; > >>>>>>>>>> link.link_status = ETH_LINK_UP; > >>>>>>>>>> link.link_duplex = ETH_LINK_FULL_DUPLEX; > >>>>>>>>>> > >>>>>>>>>> @@ -5128,6 +5147,8 @@ ixgbevf_dev_stop(struct rte_eth_dev > >>>> *dev) > >>>>>>>>>> > >>>>>>>>>> PMD_INIT_FUNC_TRACE(); > >>>>>>>>>> > >>>>>>>>>> + rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, > >>>> dev); > >>>>>>>>>> + > >>>>>>>>>> ixgbevf_intr_disable(dev); > >>>>>>>>>> > >>>>>>>>>> hw->adapter_stopped = 1; > >>>>>>>>>> -- > >>>>>>>>>> 2.17.1 > >>>>>>>>>