The request for stats should be happening only once every 2 seconds.  Do you 
have a script pounding on getting stats repeatedly?  Are you sure that it's the 
request for stats that is causing the issue you are seeing or are you guessing 
that this is the case?  Can you make this happen without bonding being involved 
(i.e. using multiple interfaces)?

Also, we see that at least some of you work for RH.  If this is on RHEL this is 
the incorrect forum for this and it should be handled through bugzillas and the 
weekly engineering call with Peter Martuccelli.   If it's for Fedora what's the 
RT stuff being used for there?  Please explain.

Cheers,
John


> -----Original Message-----
> From: Luis Claudio R. Goncalves [mailto:[email protected]]
> Sent: Friday, July 12, 2013 11:08 AM
> To: Wyborny, Carolyn
> Cc: [email protected]; Clark Williams
> Subject: Re: [E1000-devel] [RFC] igb: minimize busy loop on
> igb_get_hw_semaphore
> 
> On Thu, Jul 11, 2013 at 10:46:31PM +0000, Wyborny, Carolyn wrote:
> | > -----Original Message-----
> | > From: Luis Claudio R. Goncalves [mailto:[email protected]]
> | > Sent: Thursday, July 11, 2013 11:45 AM
> | > To: [email protected]
> | > Cc: Clark Williams
> | > Subject: Re: [E1000-devel] [RFC] igb: minimize busy loop on
> | > igb_get_hw_semaphore
> | >
> | > Hello,
> | >
> | > A customer noticed a strange issue on his setup, a bonding interface
> | > composed of two igb nics. After several debug sessions we are pretty
> | > sure the specific symptom reported is caused by a busy loop on
> | > igb_get_hw_semaphore(). The problem was reported on a 3.0.25 kernel
> | > but the patch below was written on 3.8.13.
> | >
> | > The complete scenario is described below and there is a great chance
> | > that this issue is only present (or at least more likely to be
> | > triggered) on the PREEMPT_RT enabled kernels... but I would like to
> | > confirm whether this solution is valid or if there is a better way
> | > to mitigate the problem.
> | >
> | > Thanks,
> | > Luis
> |
> | Hello Luis,
> |
> | This is a complicated setup and not something we'd be doing much
> | testing on.  The semaphore calls are intended to serialize access to
> | certain areas in the hw, usually the PHY.  Making the delays
> | pre-emptible does not necessarily accomplish the same thing.
> |
> | Have you tested the proposed patch and does it speed things up enough to
> | find what you need to find?   Another thing to try is to reduce the time
> | value rather than change the type of delay being used and see if you
> | can find a way to speed things up that way.
> 
> First of all, thanks for replying! :)
> 
> I have the impression that reducing the delay time on
> igb_get_hw_semaphore() wouldn't help much here because
> igb_release_swfw_sync_82575() has this piece of code:
> 
>       while (igb_get_hw_semaphore(hw) != 0);
> 
> So, even if the udelay used there was 1us, in cases like the one
> described below, you would be subjected to unbounded busy waits.
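
The difference between the unbounded retry quoted above and a capped one can be shown in user space. This is only a minimal sketch: `fake_sem`, `fake_get_hw_semaphore()` and `get_semaphore_bounded()` are hypothetical names invented for illustration, not igb driver code, and the only thing borrowed from the driver is the convention that 0 means the semaphore was acquired.

```c
#include <assert.h> /* only needed if you add checks like the ones below */
#include <stddef.h>

/* Hypothetical stand-in for the hardware semaphore: it stays busy for
 * `busy_polls` attempts and is free after that. Not the igb API. */
struct fake_sem {
    unsigned int busy_polls;
};

/* Same return convention as the quoted call: 0 on success, non-zero on failure. */
static int fake_get_hw_semaphore(struct fake_sem *sem)
{
    if (sem->busy_polls > 0) {
        sem->busy_polls--;
        return -1;
    }
    return 0;
}

/* Retry at most max_attempts times. The quoted
 * `while (igb_get_hw_semaphore(hw) != 0);` has no such cap. */
static int get_semaphore_bounded(struct fake_sem *sem, unsigned int max_attempts)
{
    unsigned int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        if (fake_get_hw_semaphore(sem) == 0)
            return 0;
    }
    return -1; /* the caller has to handle this failure explicitly */
}
```

With a cap, a holder that never lets go turns into an error the caller must deal with instead of an endless spin. Whether a release path can tolerate such a failure is a separate question that this sketch does not answer.
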
> 
> I can see that the issue happened because someone else (maybe even the
> HW) was holding the semaphore for a long time.
> 
> Busy waits/loops are dangerous on RT when the process is running at
> higher priorities. In this case ifconfig was a regular process but when
> it requested the NIC stats, it held the bond->lock.
> 
> Then, while ifconfig was busy waiting on
> 'while (igb_get_hw_semaphore(hw) != 0);'
> the igb TX threads (we use threaded IRQs on RT) needed that lock. These
> IRQ threads run at higher RT priorities. To shorten the time they spend
> blocked behind lower-priority threads, they perform a Priority
> Inheritance operation: they lend their priority to the lower-priority
> thread until it releases the lock.
> 
> This way, the regular process 'ifconfig' busy waiting for the HW
> semaphore becomes a Real Time thread, running (in this example) at
> FIFO:85 and therefore preventing any other thread of equal or lower
> priority from getting any CPU time. If this persists for a long time,
> several subsystems may experience problems and even collapse. One such
> example is RCU.
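
The boost described above can be modelled with a toy calculation. This is a rough sketch, not kernel code: `toy_task` and `effective_prio()` are hypothetical names, a higher number means a higher priority, and 0 stands for a regular (non-RT) process. It only captures the rule that a lock owner runs at the highest priority among itself and its blocked waiters.

```c
#include <assert.h> /* only needed if you add checks like the ones below */
#include <stddef.h>

/* Toy task: higher number means higher priority; 0 stands for a regular
 * (non-RT) process. Illustrative only, not a kernel structure. */
struct toy_task {
    const char *name;
    int base_prio;
};

/* Effective priority of a lock owner under priority inheritance: the
 * maximum of its own priority and the priority of every blocked waiter. */
static int effective_prio(const struct toy_task *owner,
                          const struct toy_task *waiters, size_t n_waiters)
{
    int prio = owner->base_prio;
    size_t i;

    for (i = 0; i < n_waiters; i++) {
        if (waiters[i].base_prio > prio)
            prio = waiters[i].base_prio;
    }
    return prio;
}
```

In this model, an owner with base priority 0 and one waiter at 85 yields an effective priority of 85, which corresponds to the regular 'ifconfig' process spinning at FIFO:85 in the scenario above.
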
> 
> Sorry if this email is getting a bit too big. While I understand the
> need for serialization and the way it was done on
> igb_get_hw_semaphore(), I would like to see if there is another way,
> less likely to create a corner case in RT.
> 
> Again, this was observed only once and may not be easy to reproduce.
> But it seems to be a real issue. All this scenario data was gathered by
> debugging the vmcores (created by kdump) using crash.
> 
> Luis
> 
> | Let me know if there is more info I can provide.  I can review your
> | full lspci -vvv , ethtool ethX output and your .CONFIG for anything
> | else to check and, of course a full dmesg that shows the problem you
> | are seeing.
> | I'm no bonding expert though, so if the problem is there, I may not
> | have much to offer.
> |
> | Hope this helps.
> |
> | Carolyn
> |
> | Carolyn Wyborny
> | Linux Development
> | Networking Division
> | Intel Corporation
> |
> |
> | >
> | > ----
> | >
> | > igb: minimize busy loop on igb_get_hw_semaphore
> | >
> | > Bugzilla: 976912
> | >
> | > In drivers/net/ethernet/intel/igb/e1000_82575.c, function
> | > igb_release_swfw_sync_82575() there is this line:
> | >
> | >   while (igb_get_hw_semaphore(hw) != 0);
> | >
> | > That is basically a busy loop waiting on a HW semaphore.
> | >
> | > A customer has a setup where two igb NICs are part of a bonding
> | > interface.
> | > This customer also has a monitoring script that calls ifconfig
> | > often. It was observed that in this scenario there is a chance that
> | > this ifconfig, that happens to hold the bond->lock while collecting
> | > statistics, enters this busy loop waiting for another thread to clear
> | > that HW semaphore.
> | >
> | > Meanwhile, the irq/xxx-ethY-Tx threads, running at FIFO:85, try to
> | > acquire the bond lock, held by ifconfig. As it happens on RT, a
> | > Priority Inheritance operation is started and ifconfig is boosted to
> | > FIFO:85 so that it may be able to finish its work sooner and release
> | > the bond->lock, desired by the aforementioned threads.
> | >
> | > As ifconfig is running on a busy loop, waiting for the HW semaphore,
> | > this thread now runs a busy loop at a very high priority, preventing
> | > other threads on that CPU from progressing.
> | >
> | > In that scenario, it seems that the thread holding the HW semaphore
> | > is also waiting for a lock held by another task. This whole scenario
> | > leads to RCU stall warnings, which have as a side effect a growing
> | > number of threads being stuck.
> | > As this progresses, the livelock reaches threads on other CPUs and
> | > the system becomes more and more unresponsive.
> | >
> | > This little patch aims to prevent the busy loop at a high priority
> | > (the code called by ifconfig in this example) from starving the threads
> | > on the same CPU. It may not solve the issue but will at least lead
> | > us closer to the real issue, masked by the RCU stalls created by
> | > the busy loop.
> | >
> | > This is mostly a debug patch for a testing kernel.
> | >
> | > Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
> | >
> | > diff --git a/drivers/net/ethernet/intel/igb/e1000_mac.c
> | > b/drivers/net/ethernet/intel/igb/e1000_mac.c
> | > index a5c7200..ec0be87 100644
> | > --- a/drivers/net/ethernet/intel/igb/e1000_mac.c
> | > +++ b/drivers/net/ethernet/intel/igb/e1000_mac.c
> | > @@ -1225,7 +1225,7 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
> | >           if (!(swsm & E1000_SWSM_SMBI))
> | >                   break;
> | >
> | > -         udelay(50);
> | > +         usleep_range(50,51);
> | >           i++;
> | >   }
> | >
> | > @@ -1244,7 +1244,7 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
> | >           if (rd32(E1000_SWSM) & E1000_SWSM_SWESMBI)
> | >                   break;
> | >
> | > -         udelay(50);
> | > +         usleep_range(50,51);
> | >   }
> | >
> | >   if (i == timeout) {
> | > --
> --
> [ Luis Claudio R. Goncalves             Red Hat  -  Realtime Team ]
> [ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]
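
The patch quoted above swaps udelay(), which spins, for usleep_range(), which sleeps and so lets other tasks on the CPU run. That is only valid where the caller is allowed to sleep. Below is a user-space analogue of a polling loop that sleeps between polls, as a sketch only: `fake_reg`, `fake_reg_ready()` and `poll_with_sleep()` are hypothetical names, and POSIX nanosleep() stands in for the kernel's usleep_range().

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#include <assert.h> /* only needed if you add checks like the ones below */
#include <time.h>

/* Hypothetical condition: reports "ready" once it has been polled
 * `polls_until_ready` times. It stands in for a register read. */
struct fake_reg {
    int polls_until_ready;
};

static int fake_reg_ready(struct fake_reg *reg)
{
    if (reg->polls_until_ready > 0) {
        reg->polls_until_ready--;
        return 0;
    }
    return 1;
}

/* Poll up to `timeout` times, sleeping delay_us microseconds between polls
 * (delay_us must stay below 1000000). Returns the number of polls that
 * failed before success, or -1 if the condition never became true. */
static int poll_with_sleep(struct fake_reg *reg, int timeout, long delay_us)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = delay_us * 1000L };
    int i;

    for (i = 0; i < timeout; i++) {
        if (fake_reg_ready(reg))
            return i;
        nanosleep(&delay, NULL); /* gives up the CPU; a busy delay would not */
    }
    return -1;
}
```

A sleeping poll keeps the same timeout structure as the loops in the diff, but each wait may last longer than requested, so the total time to acquire is less predictable than with a busy delay.
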
> 
> 
> ------------------------------------------------------------------------------
> See everything from the browser to the database with AppDynamics
> Get end-to-end visibility with application monitoring from AppDynamics
> Isolate bottlenecks and diagnose root cause in seconds.
> Start your free trial of AppDynamics Pro today!
> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
> _______________________________________________
> E1000-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel® Ethernet, visit
> http://communities.intel.com/community/wired
