On Thu, Jul 11, 2013 at 10:46:31PM +0000, Wyborny, Carolyn wrote:
| > -----Original Message-----
| > From: Luis Claudio R. Goncalves [mailto:[email protected]]
| > Sent: Thursday, July 11, 2013 11:45 AM
| > To: [email protected]
| > Cc: Clark Williams
| > Subject: Re: [E1000-devel] [RFC] igb: minimize busy loop on
| > igb_get_hw_semaphore
| > 
| > Hello,
| > 
| > A customer noticed a strange issue on his setup, a bonding interface
| > composed of two igb nics. After several debug sessions we are pretty sure
| > the specific symptom reported is caused by a busy loop on
| > igb_get_hw_semaphore(). The problem was reported on a 3.0.25 kernel but the
| > patch below was written on 3.8.13.
| > 
| > The complete scenario is described below and there is a great chance that
| > this issue is only present (or at least more likely to be triggered) on the
| > PREEMPT_RT enabled kernels... but I would like to confirm whether this
| > solution is valid or if there is a better way to mitigate the problem.
| > 
| > Thanks,
| > Luis
| 
| Hello Luis,
| 
| This is a complicated setup and not something we'd be doing much testing
| on.  The semaphore calls are intended to serialize access to certain areas
| in the hw, usually the PHY.  Making the delays pre-emptible does not
| necessarily accomplish the same thing.  
| 
| Have you tested the proposed patch and does it speed things up enough to
| find what you need to find?   Another thing to try is to reduce the time
| value rather than change the type of delay being used and see if you can
| find a way to speed things up that way.

First of all, thanks for replying! :)

I have the impression that reducing the delay time on igb_get_hw_semaphore()
wouldn't help much here because igb_release_swfw_sync_82575() has this
piece of code:

        while (igb_get_hw_semaphore(hw) != 0);

So, even if the udelay used there were 1us, in cases like the one described
below you would still be subject to unbounded busy waits, because the outer
loop simply retries every time igb_get_hw_semaphore() gives up.
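
To make the difference between an unbounded and a bounded retry concrete,
here is a minimal userspace sketch. It is not driver code: struct fake_hw,
fake_get_hw_semaphore() and fake_get_hw_semaphore_bounded() are hypothetical
names invented for this illustration, and the "semaphore" is just a counter
that stays busy for a given number of polls.

```c
#include <errno.h>

/* Hypothetical stand-in for the hardware; not the igb driver's struct. */
struct fake_hw {
	int busy_polls;	/* polls during which the semaphore stays held */
};

/* One acquisition attempt: fails with -EBUSY while the semaphore is held. */
static int fake_get_hw_semaphore(struct fake_hw *hw)
{
	if (hw->busy_polls > 0) {
		hw->busy_polls--;
		return -EBUSY;
	}
	return 0;
}

/*
 * Bounded alternative to "while (get(hw) != 0);": give up after
 * max_tries attempts so the caller can decide what to do next.
 */
static int fake_get_hw_semaphore_bounded(struct fake_hw *hw, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if (fake_get_hw_semaphore(hw) == 0)
			return 0;
		/* Kernel code would sleep or yield here instead of spinning. */
	}
	return -ETIMEDOUT;
}
```

With busy_polls = 3 and max_tries = 10 the call succeeds; with
busy_polls = 100 it returns -ETIMEDOUT after 10 attempts instead of spinning.
What the release path should do when it gets that error is a separate
question that this sketch does not try to answer.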

I can see that the issue happened because someone else (maybe even the HW)
was holding the semaphore for a long time. 

Busy waits/loops are dangerous on RT when the process is running at higher
priorities. In this case ifconfig was a regular process but when it
requested the NIC stats, it held the bond->lock.

Then, while ifconfig was busy waiting on
'while (igb_get_hw_semaphore(hw) != 0);', the igb TX threads (we use
threaded IRQs on RT) needed that lock. These IRQ threads run at higher RT
priorities, so to shorten the time they spend blocked behind a lower
priority lock holder they perform a Priority Inheritance operation: they
lend their priority to the lower priority thread until it releases the lock.

This way, the regular process 'ifconfig' busy waiting for the HW semaphore
becomes a Real Time thread, running (in this example) at FIFO:85 and
therefore preventing any other thread of equal or lower priority from
getting any CPU time. If this persists for a long time, several subsystems
may experience problems and even collapse. One such example is RCU.

Sorry if this email is getting a bit too big. While I understand the need
for serialization and the way it was done on igb_get_hw_semaphore(), I
would like to see if there is another way, less likely to create a corner
case in RT.

Again, this was observed only once and may not be easy to reproduce. But it
seems to be a real issue. All this scenario data was gathered by
debugging the vmcores (created by kdump) using crash.

Luis

| Let me know if there is more info I can provide.  I can review your full
| lspci -vvv , ethtool ethX output and your .CONFIG for anything else to
| check and, of course a full dmesg that shows the problem you are seeing.
| I'm no bonding expert though, so if the problem is there, I may not have
| much to offer.
| 
| Hope this helps.
| 
| Carolyn
| 
| Carolyn Wyborny 
| Linux Development 
| Networking Division 
| Intel Corporation 
| 
| 
| > 
| > ----
| > 
| > igb: minimize busy loop on igb_get_hw_semaphore
| > 
| > Bugzilla: 976912
| > 
| > In drivers/net/ethernet/intel/igb/e1000_82575.c, function
| > igb_release_swfw_sync_82575() there is this line:
| > 
| >     while (igb_get_hw_semaphore(hw) != 0);
| > 
| > That is basically a busy loop waiting on a HW semaphore.
| > 
| > A customer has a setup where two igb NICs are part of a bonding interface.
| > This customer also has a monitoring script that calls ifconfig often. It was
| > observed that in this scenario there is a chance that this ifconfig, which
| > happens to hold the bond->lock while collecting statistics, enters this busy
| > loop waiting for another thread to clear that HW semaphore.
| > 
| > Meanwhile, the irq/xxx-ethY-Tx threads, running at FIFO:85, try to acquire
| > the bond lock, held by ifconfig. As it happens on RT, a Priority Inheritance
| > operation is started and ifconfig is boosted to FIFO:85 so that it may be
| > able to finish its work sooner and release the bond->lock, desired by the
| > aforementioned threads.
| > 
| > As ifconfig is running on a busy loop, waiting for the HW semaphore, this
| > thread now runs a busy loop at a very high priority, preventing other
| > threads on that CPU from progressing.
| > 
| > In that scenario, it seems that the thread holding the HW semaphore is also
| > waiting for a lock held by another task. This whole scenario leads to RCU
| > stall warnings, which have as a side effect a growing number of stuck
| > threads. As this progresses, the livelock reaches threads on other CPUs and
| > the system becomes more and more unresponsive.
| > 
| > This little patch aims to prevent the busy loop at a high priority (the
| > code called by ifconfig in this example) from starving the threads on the
| > same CPU. It may not solve the issue but will at least lead us closer to the
| > real issue, masked by the RCU stalls created by the busy loop.
| > 
| > This is mostly a debug patch for a testing kernel.
| > 
| > Signed-off-by: Luis Claudio R. Goncalves <[email protected]>
| > 
| > diff --git a/drivers/net/ethernet/intel/igb/e1000_mac.c b/drivers/net/ethernet/intel/igb/e1000_mac.c
| > index a5c7200..ec0be87 100644
| > --- a/drivers/net/ethernet/intel/igb/e1000_mac.c
| > +++ b/drivers/net/ethernet/intel/igb/e1000_mac.c
| > @@ -1225,7 +1225,7 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
| >             if (!(swsm & E1000_SWSM_SMBI))
| >                     break;
| > 
| > -           udelay(50);
| > +           usleep_range(50,51);
| >             i++;
| >     }
| > 
| > @@ -1244,7 +1244,7 @@ s32 igb_get_hw_semaphore(struct e1000_hw *hw)
| >             if (rd32(E1000_SWSM) & E1000_SWSM_SWESMBI)
| >                     break;
| > 
| > -           udelay(50);
| > +           usleep_range(50,51);
| >     }
| > 
| >     if (i == timeout) {
| > --
-- 
[ Luis Claudio R. Goncalves             Red Hat  -  Realtime Team ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]

