On 06/13/2012 03:43 PM, Roch wrote:
> 
> Sašo Kiselkov writes:
>  > On 06/12/2012 05:37 PM, Roch Bourbonnais wrote:
>  > > 
>  > > So the xcalls are a necessary part of memory reclaiming: when one
>  > > needs to tear down the TLB entry mapping the physical memory (which
>  > > can from here on be repurposed), the xcalls are just part of that.
>  > > They should not cause trouble, but they do. They consume a CPU for
>  > > some time.
>  > >
>  > > That in turn can cause infrequent latency bubbles on the network. One
>  > > known root cause of these latency bubbles is that network threads are
>  > > bound by default, and if the xcall storm ends up on the CPU that the
>  > > network thread is bound to, it will wait for the storm to pass.
>  > 
>  > I understand, but the xcall storm only eats up a single core out of a
>  > total of 32, plus it's not a single specific one, it tends to change,
>  > so what are the odds of hitting the same core as the one on which the
>  > mac thread is running?
>  > 
> 
> That's easy :-) : 1/32 each time it needs to run. So depending on how
> often it runs (which depends on how much churn there is in the ARC) and
> how often you see the latency bubbles, that may or may not be it.
> 
> What is zio_taskq_batch_pct on your system? That is another bit of
> storm-generating code which causes bubbles. Setting it down to 50 (versus
> an older default of 100) should help if it's not done already.
> 
> -r
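
For reference, checking and lowering that tunable looks roughly like this
(a sketch from memory, assuming an illumos-style zfs module; the exact
syntax may differ by release):

# echo "zio_taskq_batch_pct/D" | mdb -k        (print the current value)
# echo "zio_taskq_batch_pct/W 0t50" | mdb -kw  (set it to 50 in the live kernel)

and persistently, in /etc/system:

set zfs:zio_taskq_batch_pct = 50

As far as I can tell the live write only affects taskqs created afterwards
(i.e. on the next pool import), so the /etc/system route plus a reboot is
the safer test. To see whether the xcall storms actually land on the CPU
the mac thread is bound to, counting them per CPU with something like

# dtrace -n 'sysinfo:::xcalls { @[cpu] = count(); } tick-10s { exit(0); }'

should answer that directly.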

So I tried all of the suggestions above (mac unbinding, zio_taskq
tuning) and none of them helped. I'm beginning to suspect it has
something to do with the network cards. When I try to snoop filtered
traffic from one interface into a file ("snoop -o /tmp/dump -rd vlan935
host a.b.c.d"), my multicast reception throughput plummets to about 1/3
of the original.

I'm running a link aggregation over four on-board Broadcom NICs:

# dladm show-aggr -x
LINK     PORT     SPEED DUPLEX   STATE     ADDRESS            PORTSTATE
aggr0    --       1000Mb full    up        d0:67:e5:fc:bd:38  --
         bnx1     1000Mb full    up        d0:67:e5:fc:bd:38  attached
         bnx2     1000Mb full    up        d0:67:e5:fc:bd:3a  attached
         bnx3     1000Mb full    up        d0:67:e5:fc:bd:3c  attached
         bnx0     1000Mb full    up        d0:67:e5:fc:bd:36  attached

# dladm show-vlan
LINK            VID      OVER         FLAGS
vlan49          49       aggr0        -----
vlan934         934      aggr0        -----
vlan935         935      aggr0        -----

Normally I'm getting around 46MB/s on vlan935; however, once I run any
snoop command that puts the network interfaces into promiscuous mode, my
throughput plummets to around 20MB/s. While that happens I can see context
switches skyrocket on 4 CPU cores, each of them around 75% busy. I
understand that snoop has some probe effect, but this is definitely too
large. I've never seen this kind of bad behavior on any of my other
Solaris systems (with similar load).
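
To dig into what those four cores are actually doing, something like the
following should work (quick, untested DTrace sketches alongside mpstat):

# mpstat 1
# dtrace -n 'sysinfo:::pswitch { @[cpu] = count(); } tick-10s { exit(0); }'
# dtrace -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-10s { trunc(@, 20); exit(0); }'

mpstat shows the csw/icsw/xcal columns per CPU, the first one-liner counts
context switches per CPU, and the second samples kernel stacks on the busy
cores to see where the time goes.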

Are there any network tunings I can make to track down the issue? The
bnx module on this system is:

# modinfo | grep bnx
169 fffffffff80a7000  63ba0 197   1  bnx (Broadcom NXII GbE 6.0.1)
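
Would it also make sense to look at how the bnx interrupts are spread
across the CPUs, e.g. with something like "intrstat 1"? (Just a guess on
my part at what's worth checking next.)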

Regards,
--
Saso