On 06/13/2012 03:43 PM, Roch wrote:
>
> Sašo Kiselkov writes:
> > On 06/12/2012 05:37 PM, Roch Bourbonnais wrote:
> > >
> > > So the xcalls are a necessary part of memory reclaiming: when memory is
> > > reclaimed, the TLB entries mapping the physical pages have to be torn
> > > down before those pages can be repurposed. The xcalls are just part of
> > > this. They shouldn't cause trouble, but they do: they consume a CPU for
> > > some time.
> > >
> > > That in turn can cause infrequent latency bubbles on the network. One
> > > root cause of these latency bubbles is that network threads are bound
> > > by default, and if the xcall storm ends up on the CPU that a network
> > > thread is bound to, that thread has to wait for the storm to pass.
> >
> > I understand, but the xcall storm only eats up a single core out of a
> > total of 32, plus it's not a single specific one, it tends to change,
> > so what are the odds of hitting the same core as the one on which the
> > mac thread is running?
>
> That's easy :-) : 1/32 each time it needs to run. So depending on how often
> it runs (which depends on how much churn there is in the ARC) and how often
> you see the latency bubbles, that may or may not be it.
>
> What is zio_taskq_batch_pct on your system? That is another storm-generating
> bit of code which causes bubbles. Setting it down to 50 (versus an older
> default of 100) should help if it's not done already.
>
> -r
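
For anyone who wants to try the same tuning, here is a rough sketch of how the
batch percentage can be checked and lowered on an illumos/Solaris box, assuming
mdb and /etc/system are available; the thread itself doesn't spell out the
mechanics, so treat this as a starting point rather than a recipe:

  Print the current value:
  # echo "zio_taskq_batch_pct/D" | mdb -k

  Lower it to 50 on the live kernel:
  # echo "zio_taskq_batch_pct/W 0t50" | mdb -kw

  Make the change persistent across reboots:
  # echo "set zfs:zio_taskq_batch_pct = 50" >> /etc/system

Note that the ZIO taskqs are sized when a pool is imported, so the live mdb
change may not take effect until the pool is re-imported or the machine is
rebooted; the /etc/system line covers the reboot case.
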
So I tried all of the suggestions above (mac unbinding, zio_taskq tuning) and
none of them helped. I'm beginning to suspect it has something to do with the
networking cards. When I try to snoop filtered traffic from one interface into
a file ("snoop -o /tmp/dump -rd vlan935 host a.b.c.d"), my multicast reception
throughput plummets to about 1/3 of the original.

I'm running a link aggregation of 4 on-board Broadcom NICs:

# dladm show-aggr -x
LINK     PORT    SPEED  DUPLEX  STATE  ADDRESS            PORTSTATE
aggr0    --      1000Mb full    up     d0:67:e5:fc:bd:38  --
         bnx1    1000Mb full    up     d0:67:e5:fc:bd:38  attached
         bnx2    1000Mb full    up     d0:67:e5:fc:bd:3a  attached
         bnx3    1000Mb full    up     d0:67:e5:fc:bd:3c  attached
         bnx0    1000Mb full    up     d0:67:e5:fc:bd:36  attached

# dladm show-vlan
LINK       VID    OVER     FLAGS
vlan49     49     aggr0    -----
vlan934    934    aggr0    -----
vlan935    935    aggr0    -----

Normally I'm getting around 46 MB/s on vlan935; however, once I run any snoop
command that puts the network interfaces into promiscuous mode, my throughput
plummets to around 20 MB/s. While that happens I can see context switches
skyrocket on 4 CPU cores, with those cores around 75% busy. I understand that
snoop has some probe effect, but this is definitely too large. I've never seen
this kind of bad behavior on any of my other Solaris systems (with similar
load). Are there any tunings I can make to my network to track down the issue?
(A rough diagnostic sketch follows below.)

My bnx module is:

# modinfo | grep bnx
169 fffffffff80a7000  63ba0 197   1  bnx (Broadcom NXII GbE 6.0.1)

Regards,
--
Saso
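
A rough sketch of how the slowdown might be narrowed down with the stock
observability tools, assuming a current illumos/Solaris install; none of these
commands come from the thread itself, and the aggregation check is only a guess
at where promiscuous mode could be interacting with the bnx ports:

  Watch per-CPU cross-calls (xcal) and context switches (csw/icsw) while snoop
  is running:
  # mpstat 1

  See which CPUs the cross-calls land on:
  # dtrace -n 'sysinfo:::xcalls { @[cpu] = count(); }'

  See which CPUs service the NIC interrupts:
  # intrstat 1

  Check how the aggregation spreads traffic across the bnx ports (POLICY
  column):
  # dladm show-aggr aggr0

If the four busy CPUs line up with the bnx interrupt CPUs rather than with the
xcall activity, that would point at the NIC/promiscuous-mode path rather than
at the ARC reclaim storms discussed above.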