Re: net80211 and interface requests

2011-03-31 Thread Bernhard Schmidt
On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote:
> Hello,
> 
> This list has helped me before so I'll email again with the hopes that
> somebody has an answer.  All is working well with my project, however for
> the life of me I cannot get the interface to inject the raw frames faster
> than 11mbps.  I'm following the example given in
> /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig.  I'm
> putting the card into pureg mode, and yet I still can't inject any faster.
>  I've even gone so far as to specify an ieee802211_txparam struct giving
> values of 255 both mcast and ucast rates within the struct (and of course
> anding them by 0xff).  I then used the ioctl call to set the flags within
> the interface request.  Any help would be greatly appreciated.

You've set the ibp_rate0 parameter, right? This one is in half-Mb/s units, so
a value of 108 should give you 54 Mb/s. The only thing I can think of right
now is that the device (or channel) is actually configured for 11b, not
11g mode. Can we rule that out? Which device are you using?
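
To illustrate the unit conversion mentioned above, here is a minimal sketch of the arithmetic only. The helper name rate_to_half_mbps is hypothetical and not part of net80211, and the 8-bit limit is an assumption about the width of the ibp_rate0 field.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a net80211 function): convert a legacy
 * 802.11 rate given in tenths of Mb/s into half-Mb/s units, the
 * encoding described above for ibp_rate0.
 * Returns 0 if the result would not fit in an 8-bit field (assumed).
 */
uint8_t
rate_to_half_mbps(unsigned int tenths_of_mbps)
{
	unsigned int code = tenths_of_mbps / 5;	/* 0.5 Mb/s == 5 tenths */

	return (code > 0xff) ? 0 : (uint8_t)code;
}
```

With this encoding, 54 Mb/s (540 tenths) maps to 108 and 11 Mb/s maps to 22.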

> I am doing nanosleeps in between transmissions as if I don't the bpf clone
> can't inject due to the buffer being too full.  There's probably a better
> way of doing this, but I doubt the nanosleeps are the issue (afterall, I get
> almost exactly 11mbps).  I should probably note I'm not doing any ACKs, this
> is pure transmits.
> 
> If anybody cares enough to look at my unpolished code to get a better idea,
> look here:
> 
> http://projhinternet.svn.sourceforge.net/
> 
> The idea is to allow unidirectional traffic so that with an FCC amateur
> license (yes I know I'm not currently broadcasting the call sign as of yet)
> you can broadcast unencrypted transmissions for miles (with a linear
> amplifier spec'd to 2.4ghz).  With the license FCC part15 no longer applies
> and you can operate just like in any other amateur band.
> ___
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
> 

-- 
Bernhard
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Kernel memory corruption(?) with age(4)

2011-03-31 Thread Yamagi Burmeister

On Wed, 30 Mar 2011, YongHyeon PYUN wrote:


>>Okay, I did a test run with RX checksum, TX checksum and both disabled.
>>In all three cases the crash occurs within about 20 minutes. I'm also
>>not sure that age(4) is the problem, but it definitely has something to
>>do with it, since with another NIC driver the same scenario is
>>rock solid...
>>
>
>OK.
>
>>The workload: It's an NFS3 server (FreeBSD's non-experimental
>>implementation), serving and receiving files of about 250 to 500
>>megabytes at about 20mb/s. The clients are FreeBSD 7 and 8 systems and
>>are mounting the shares via TCP. The connection is 1000mbit/s via a
>>"dumb" gigabit switch.
>>
>
>That's too broad to narrow down the issue. :-(
>I'm not sure, but your box seems to have more than 4GB memory. Could
>you limit the available memory to 3GB via loader.conf and test it
>again?


All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
After limiting the memory via hw.physmem to 3GB the problems are gone.
The box is running crashfree for more than 6 hours and has served over
300GB of data via age(4).

--
Homepage: www.yamagi.org
Jabber:   yam...@yamagi.org
GnuPG/GPG:0xEFBCCBCB


Re: UDP on FreeBSD

2011-03-31 Thread Julian Elischer

On 3/30/11 2:32 PM, Michael Proto wrote:

> On Wed, Mar 30, 2011 at 3:43 PM, Kyungsoo Lee wrote:
>> Hi All,
>>
>> I want to check UDP behavior on FreeBSD.
>>
>> I am using iperf on FreeBSD for wireless testing with a Proxim 8470 FC
>> PCMCIA card on IBM T42 and T61.
>>
>> When I transmit data from FreeBSD to FreeBSD or CentOS using iperf
>> with -u -b 100M, lots of packets are lost. A sniffer near the
>> two nodes shows that the sender could not send all packets. The iperf
>> sender reports that it tried to send 85469 packets but lost 68824 of
>> them. I think that the UDP buffer on the sender could not handle all
>> packets.
>>
>> But if I send data from CentOS to FreeBSD using iperf with the same
>> -u -b 100M options, the sender tries only 18636 packets and loses very
>> few, like 1 or 2. As a result, the reports show similar bandwidth.
>> I think this comes from an implementation difference between FreeBSD
>> and Linux.
>>
>> But I want to double check whether this is normal for FreeBSD or not.
>> If I am missing something, please let me know.
>>
>> Thank you!
>
> Just a guess, but have you tried adjusting the net.inet.udp.maxdgram
> sysctl? I believe the default is somewhat low for UDP transmit. I
> don't know what size packets iperf is using but increasing the
> maxdgram value might help your testing.


This is many years out of date, but a decade or so ago FreeBSD would
return ENOBUFS and Linux would block when the outgoing queues filled up.
The answer then was that the programs were all written for Linux and
didn't check for ENOBUFS.

But that may be out of date now in many different ways.
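
As a rough sketch of what checking for ENOBUFS could look like in a sender, assuming the behavior described above still applies: the function name send_retry_enobufs, the 1 ms backoff and the retry limit are all hypothetical choices, not taken from iperf or from any FreeBSD code.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Send one datagram, retrying up to max_tries times when the kernel
 * reports ENOBUFS (outgoing queue full) instead of blocking.
 * Returns the byte count from send(2), or -1 with errno set.
 */
ssize_t
send_retry_enobufs(int fd, const void *buf, size_t len, int max_tries)
{
	const struct timespec backoff = { 0, 1000000 };	/* 1 ms, arbitrary */

	for (int i = 0; i < max_tries; i++) {
		ssize_t n = send(fd, buf, len, 0);

		if (n >= 0 || errno != ENOBUFS)
			return n;	/* success, or an error not handled here */
		nanosleep(&backoff, NULL);	/* give the queue time to drain */
	}
	errno = ENOBUFS;	/* every attempt hit a full queue */
	return -1;
}
```

A sender that ignores the return value of send(2) would silently drop datagrams in this situation, which might explain a large gap between packets attempted and packets actually on the air.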


> -Proto


Re: net80211 and interface requests

2011-03-31 Thread Adam Stylinski
On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote:
> On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote:
> > Hello,
> > 
> > This list has helped me before so I'll email again with the hopes that
> > somebody has an answer.  All is working well with my project, however for
> > the life of me I cannot get the interface to inject the raw frames faster
> > than 11mbps.  I'm following the example given in
> > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig.  I'm
> > putting the card into pureg mode, and yet I still can't inject any faster.
> >  I've even gone so far as to specify an ieee802211_txparam struct giving
> > values of 255 both mcast and ucast rates within the struct (and of course
> > anding them by 0xff).  I then used the ioctl call to set the flags within
> > the interface request.  Any help would be greatly appreciated.
> 
> You've set the ibp_rate0 parameter right? This one is in half-mbps, so
> a value of 108 should give you 54m. The only thing I can think of right
> now is that the device (or channel) is actually configured for 11b not
> 11g mode. Can we rule that out? Which device are you using?
> 
> > I am doing nanosleeps in between transmissions as if I don't the bpf clone
> > can't inject due to the buffer being too full.  There's probably a better
> > way of doing this, but I doubt the nanosleeps are the issue (afterall, I get
> > almost exactly 11mbps).  I should probably note I'm not doing any ACKs, this
> > is pure transmits.
> > 
> > If anybody cares enough to look at my unpolished code to get a better idea,
> > look here:
> > 
> > http://projhinternet.svn.sourceforge.net/
> > 
> > The idea is to allow unidirectional traffic so that with an FCC amateur
> > license (yes I know I'm not currently broadcasting the call sign as of yet)
> > you can broadcast unencrypted transmissions for miles (with a linear
> > amplifier spec'd to 2.4ghz).  With the license FCC part15 no longer applies
> > and you can operate just like in any other amateur band.
> > 
> 
> -- 
> Bernhard

I'm using an Atheros AR2413 chipset running in pure g mode, with the card also
put into "mode 11g" and the ucast, mcast, and mgmt rates set to 54.  I think the
ibp_rate0 parameter only sets the rate in the header (but I could be
wrong).  Regardless, I am setting it; here are the exact source files where I
do this.

Line 38 in this file:
http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
 

And the setup_if function in this:
http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup




mpd5/Netgraph issues after upgrading to 7.4

2011-03-31 Thread Przemyslaw Frasunek
Hello,

I have upgraded one of my mpd5 based PPPoE access servers from 7.3-RELEASE to
7.4-RELEASE. Just after the upgrade, I started getting the following errors:

Mar 31 13:48:06 lsm-gw mpd: [B-150] Bundle: Interface ng149 created
Mar 31 13:48:06 lsm-gw mpd: [B-150] can't create ppp node at ".:"->"b150": Operation not permitted
Mar 31 13:48:06 lsm-gw mpd: [B-150] Bundle netgraph initialization failed
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] Bundle creation error
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] link did not validate in bundle
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] LCP: parameter negotiation failed
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] LCP: state change Opened --> Stopping

[root@lsm-gw /var/log]# grep -ic "operation not permitted" mpd.log
1756

It seems to occur only on specific bundles, and after a brief period of time
session establishment eventually succeeds. Is this related to a resource shortage?


Re: net80211 and interface requests

2011-03-31 Thread Bernhard Schmidt
On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote:
> On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote:
> > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote:
> > > Hello,
> > > 
> > > This list has helped me before so I'll email again with the hopes that
> > > somebody has an answer.  All is working well with my project, however for
> > > the life of me I cannot get the interface to inject the raw frames faster
> > > than 11mbps.  I'm following the example given in
> > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig.  
> > > I'm
> > > putting the card into pureg mode, and yet I still can't inject any faster.
> > >  I've even gone so far as to specify an ieee802211_txparam struct giving
> > > values of 255 both mcast and ucast rates within the struct (and of course
> > > anding them by 0xff).  I then used the ioctl call to set the flags within
> > > the interface request.  Any help would be greatly appreciated.
> > 
> > You've set the ibp_rate0 parameter right? This one is in half-mbps, so
> > a value of 108 should give you 54m. The only thing I can think of right
> > now is that the device (or channel) is actually configured for 11b not
> > 11g mode. Can we rule that out? Which device are you using?
> > 
> > > I am doing nanosleeps in between transmissions as if I don't the bpf clone
> > > can't inject due to the buffer being too full.  There's probably a better
> > > way of doing this, but I doubt the nanosleeps are the issue (afterall, I 
> > > get
> > > almost exactly 11mbps).  I should probably note I'm not doing any ACKs, 
> > > this
> > > is pure transmits.
> > > 
> > > If anybody cares enough to look at my unpolished code to get a better 
> > > idea,
> > > look here:
> > > 
> > > http://projhinternet.svn.sourceforge.net/
> > > 
> > > The idea is to allow unidirectional traffic so that with an FCC amateur
> > > license (yes I know I'm not currently broadcasting the call sign as of 
> > > yet)
> > > you can broadcast unencrypted transmissions for miles (with a linear
> > > amplifier spec'd to 2.4ghz).  With the license FCC part15 no longer 
> > > applies
> > > and you can operate just like in any other amateur band.
> > > 
> > 
> 
> I'm using an atheros AR2413 chipset, running in pure g mode, with also the 
> card put into "mode 11g" and ucast, mcast, and mgmt rates set to 54.  I think 
> the parameter for ibp_rate0 is just for setting it in the header (but I could 
> be wrong).  Regardless I am doing this, let me give you the exact source 
> files I'm doing this in.

Well, the ath_rate_* modules afaik do not honor the fixed rate
settings. At least I've heard something about those being broken. The
ibp_rate0 parameter set to 108 seems to be correct though.

No clue why that doesn't work; you may have to debug ath_tx_findrix().
Adding a printf of the passed-in rate and ridx should shed some light
on this, I guess.

> Line 38 in this file:
> http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
>  
> 
> And the setup_if function in this:
> http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup
> 

-- 
Bernhard


em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Arnaud Lacombe
Hi

[let's start a new thread :)]

On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel wrote:
> Read the code in HEAD, em_local_timer() has a test of ALL the rx queues and
> will schedule a task that refreshes mbufs if they are empty. This has
> exactly the same effect as checking for some interrupt cause, a cause that
> is not available when using MSIX on 82574, but this approach works for
> everything.
>
OK, it took me a long time to reproduce the issue with em(4) version
7.1.9: about 3h rather than a few minutes a month ago, and I only got
~875 allocation failures vs. several thousand before. Here are some
stats:

# sysctl -a | grep missed
dev.em.0.mac_stats.missed_packets: 1917112
dev.em.1.mac_stats.missed_packets: 0
dev.em.2.mac_stats.missed_packets: 0
dev.em.3.mac_stats.missed_packets: 0
dev.em.4.mac_stats.missed_packets: 0
dev.em.5.mac_stats.missed_packets: 0

# sysctl dev.em.0.debug=1
dev.em.0.debug: I-1nterface is RUNNING and INACTIVE
em0: hw tdh = 861, hw tdt = 861
em0: hw rdh = 929, hw rdt = 929
em0: Tx Queue Status = 0
em0: TX descriptors avail = 1024
em0: Tx Descriptors avail failure = 0
em0: RX discarded packets = 0
em0: RX Next to Check = 929
em0: RX Next to Refresh = 930
 -> -1

I backported the -current driver to 7.1 and re-ran the test overnight.
Now, the box is running 7.2.2. The box was hung this morning:

dev.em.0.mac_stats.missed_packets: 25513991
dev.em.1.mac_stats.missed_packets: 0
dev.em.2.mac_stats.missed_packets: 0
dev.em.3.mac_stats.missed_packets: 0
dev.em.4.mac_stats.missed_packets: 0
dev.em.5.mac_stats.missed_packets: 0

There have been about 1000 mbuf allocation denials. I changed the sysctl
output of the device to include some relevant fields of the RX soft state
[Of course, the only field of interest, `next_to_check', is invalid
because of a typo... I should not change code past a certain hour :)],
here it is:

# sysctl dev.em.0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086
subdevice=0x class=0x02
dev.em.0.%parent: pci1
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 3
dev.em.0.eee_control: 0
dev.em.0.link_irq: 11621474
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1477444168
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 904
dev.em.0.queue0.txd_tail: 904
dev.em.0.queue0.tx_irq: 10291170
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 766
dev.em.0.queue0.rxd_tail: 767
dev.em.0.queue0.rx_irq: 6937760
dev.em.0.queue0.rx_discarded: 0
dev.em.0.queue0.rx_forced_refill: 0
dev.em.0.queue0.next_to_check: 6937760
^^^ this field is invalid... bad code... :(
dev.em.0.queue0.next_to_refresh: 767
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 25752895
dev.em.0.mac_stats.recv_no_buff: 3
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 25752073
dev.em.0.mac_stats.total_pkts_recvd: 39996734
dev.em.0.mac_stats.good_pkts_recvd: 14243839
dev.em.0.mac_stats.bcast_pkts_recvd: 5
dev.em.0.mac_stats.mcast_pkts_recvd: 0
dev.em.0.mac_stats.rx_frames_64: 13878627
dev.em.0.mac_stats.rx_frames_65_127: 365212
dev.em.0.mac_stats.rx_frames_128_255: 0
dev.em.0.mac_stats.rx_frames_256_511: 0
dev.em.0.mac_stats.rx_frames_512_1023: 0
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 916346006
dev.em.0.mac_stats.good_octets_txd: 21377046229
dev.em.0.mac_stats.total_pkts_txd: 44415008
dev.em.0.mac_stats.good_pkts_txd: 18661905
dev.em.0.mac_stats.bcast_pkts_txd: 24822815
dev.em.0.mac_stats.mcast_pkts_txd: 0
dev.em.0.mac_stats.tx_frames_64: 1278447
dev.em.0.mac_stats.tx_frames_65_127: 1221602
dev.em.0.mac_stats.tx_frames_128_255: 503121
dev.em.0.mac_stats.tx_frames_256_511: 770073
dev.em.0.mac_stats.tx_frames_512_1023: 1921953
dev.em.0.mac_stats.tx_frames_1024_1522: 12966709
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 5297765
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 1
dev

Re: net80211 and interface requests

2011-03-31 Thread Adam Stylinski
On Thu, Mar 31, 2011 at 03:07:15PM +0200, Bernhard Schmidt wrote:
> On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote:
> > On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote:
> > > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote:
> > > > Hello,
> > > > 
> > > > This list has helped me before so I'll email again with the hopes that
> > > > somebody has an answer.  All is working well with my project, however 
> > > > for
> > > > the life of me I cannot get the interface to inject the raw frames 
> > > > faster
> > > > than 11mbps.  I'm following the example given in
> > > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> > > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig.  
> > > > I'm
> > > > putting the card into pureg mode, and yet I still can't inject any 
> > > > faster.
> > > >  I've even gone so far as to specify an ieee802211_txparam struct giving
> > > > values of 255 both mcast and ucast rates within the struct (and of 
> > > > course
> > > > anding them by 0xff).  I then used the ioctl call to set the flags 
> > > > within
> > > > the interface request.  Any help would be greatly appreciated.
> > > 
> > > You've set the ibp_rate0 parameter right? This one is in half-mbps, so
> > > a value of 108 should give you 54m. The only thing I can think of right
> > > now is that the device (or channel) is actually configured for 11b not
> > > 11g mode. Can we rule that out? Which device are you using?
> > > 
> > > > I am doing nanosleeps in between transmissions as if I don't the bpf 
> > > > clone
> > > > can't inject due to the buffer being too full.  There's probably a 
> > > > better
> > > > way of doing this, but I doubt the nanosleeps are the issue (afterall, 
> > > > I get
> > > > almost exactly 11mbps).  I should probably note I'm not doing any ACKs, 
> > > > this
> > > > is pure transmits.
> > > > 
> > > > If anybody cares enough to look at my unpolished code to get a better 
> > > > idea,
> > > > look here:
> > > > 
> > > > http://projhinternet.svn.sourceforge.net/
> > > > 
> > > > The idea is to allow unidirectional traffic so that with an FCC amateur
> > > > license (yes I know I'm not currently broadcasting the call sign as of 
> > > > yet)
> > > > you can broadcast unencrypted transmissions for miles (with a linear
> > > > amplifier spec'd to 2.4ghz).  With the license FCC part15 no longer 
> > > > applies
> > > > and you can operate just like in any other amateur band.
> > > > 
> > > 
> > 
> > I'm using an atheros AR2413 chipset, running in pure g mode, with also the 
> > card put into "mode 11g" and ucast, mcast, and mgmt rates set to 54.  I 
> > think the parameter for ibp_rate0 is just for setting it in the header (but 
> > I could be wrong).  Regardless I am doing this, let me give you the exact 
> > source files I'm doing this in.
> 
> Well, the ath_rate_* modules afaik do not honor the fixed rate
> settings. At least I've heard something about those being broken. The
> ibp_rate0 parameter set to 108 seems to be correct though.
> 
> No clue why that doesn't work, you may have to debug ath_tx_findrix().
> Adding a printf of the passed over rate and ridx should shed some light
> on this I guess.
> 
> > Line 38 in this file:
> > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
> >  
> > 
> > And the setup_if function in this:
> > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup
> > 
> 
> -- 
> Bernhard

Is there any way to do this without using a kernel debugger?  It'd be really 
disappointing if the whole problem is a bug in the driver.  




Re: net80211 and interface requests

2011-03-31 Thread Adam Stylinski
On Thu, Mar 31, 2011 at 05:35:40PM +0200, Bernhard Schmidt wrote:
> On Thursday, March 31, 2011 17:14:21 Adam Stylinski wrote:
> > On Thu, Mar 31, 2011 at 03:07:15PM +0200, Bernhard Schmidt wrote:
> > > On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote:
> > > > On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote:
> > > > > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > This list has helped me before so I'll email again with the hopes 
> > > > > > that
> > > > > > somebody has an answer.  All is working well with my project, 
> > > > > > however for
> > > > > > the life of me I cannot get the interface to inject the raw frames 
> > > > > > faster
> > > > > > than 11mbps.  I'm following the example given in
> > > > > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> > > > > > parameters such as ucastrate, mcastrate, and mgmtrate within 
> > > > > > ifconfig.  I'm
> > > > > > putting the card into pureg mode, and yet I still can't inject any 
> > > > > > faster.
> > > > > >  I've even gone so far as to specify an ieee802211_txparam struct 
> > > > > > giving
> > > > > > values of 255 both mcast and ucast rates within the struct (and of 
> > > > > > course
> > > > > > anding them by 0xff).  I then used the ioctl call to set the flags 
> > > > > > within
> > > > > > the interface request.  Any help would be greatly appreciated.
> > > > > 
> > > > > You've set the ibp_rate0 parameter right? This one is in half-mbps, so
> > > > > a value of 108 should give you 54m. The only thing I can think of 
> > > > > right
> > > > > now is that the device (or channel) is actually configured for 11b not
> > > > > 11g mode. Can we rule that out? Which device are you using?
> > > > > 
> > > > > > I am doing nanosleeps in between transmissions as if I don't the 
> > > > > > bpf clone
> > > > > > can't inject due to the buffer being too full.  There's probably a 
> > > > > > better
> > > > > > way of doing this, but I doubt the nanosleeps are the issue 
> > > > > > (afterall, I get
> > > > > > almost exactly 11mbps).  I should probably note I'm not doing any 
> > > > > > ACKs, this
> > > > > > is pure transmits.
> > > > > > 
> > > > > > If anybody cares enough to look at my unpolished code to get a 
> > > > > > better idea,
> > > > > > look here:
> > > > > > 
> > > > > > http://projhinternet.svn.sourceforge.net/
> > > > > > 
> > > > > > The idea is to allow unidirectional traffic so that with an FCC 
> > > > > > amateur
> > > > > > license (yes I know I'm not currently broadcasting the call sign as 
> > > > > > of yet)
> > > > > > you can broadcast unencrypted transmissions for miles (with a linear
> > > > > > amplifier spec'd to 2.4ghz).  With the license FCC part15 no longer 
> > > > > > applies
> > > > > > and you can operate just like in any other amateur band.
> > > > > > 
> > > > > 
> > > > 
> > > > I'm using an atheros AR2413 chipset, running in pure g mode, with also 
> > > > the card put into "mode 11g" and ucast, mcast, and mgmt rates set to 
> > > > 54.  I think the parameter for ibp_rate0 is just for setting it in the 
> > > > header (but I could be wrong).  Regardless I am doing this, let me give 
> > > > you the exact source files I'm doing this in.
> > > 
> > > Well, the ath_rate_* modules afaik do not honor the fixed rate
> > > settings. At least I've heard something about those being broken. The
> > > ibp_rate0 parameter set to 108 seems to be correct though.
> > > 
> > > No clue why that doesn't work, you may have to debug ath_tx_findrix().
> > > Adding a printf of the passed over rate and ridx should shed some light
> > > on this I guess.
> > > 
> > > > Line 38 in this file:
> > > > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
> > > >  
> > > > 
> > > > And the setup_if function in this:
> > > > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup
> > > > 
> > > 
> > 
> > It turns out strange coincidences can happen.  I decided to busy loop, 
> > thinking maybe it was my nanosleep call.  And what do you know, 52Mb/sec.  
> > Is there some sort of call I can use to probe the fd to see if the buffer 
> > has been sent yet?  
> 
> Honestly, no clue. The bpf transmit path is a bunch of ugly hacks..
> What you can try though is to enable various debug options for
> net80211 and ath to figure out what's going on, especially the bits
> for xmit.
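
The jump from about 11 Mb/s with nanosleep to 52 Mb/s with a busy loop, reported above, would be consistent with a sleep-granularity limit rather than a rate-selection problem. The sketch below is only arithmetic under assumed numbers: a minimum sleep of 1 ms per wakeup and 1400-byte frames, neither of which is stated in this thread. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Upper bound on throughput, in bits per second, when exactly one
 * frame is sent per wakeup and wakeups are interval_ns apart.
 */
uint64_t
paced_bits_per_sec(uint64_t frame_bytes, uint64_t interval_ns)
{
	if (interval_ns == 0)
		return 0;	/* no pacing; the bound does not apply */
	return frame_bytes * 8u * UINT64_C(1000000000) / interval_ns;
}
```

Under those assumptions, one 1400-byte frame per millisecond gives 11.2 Mb/s, close to the observed ceiling; actual frame sizes and timer resolution on the test machine would need to be checked.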
> 
> On a unrelated side note, how is the ath/wlan0 interface configured?
> I mean, is it in sta mode or ahdemo? I guess most tests have been done
> in ahdemo mode. Also I'm sure that all frames are simply di

Re: Kernel memory corruption(?) with age(4)

2011-03-31 Thread YongHyeon PYUN
On Thu, Mar 31, 2011 at 09:05:19AM +0200, Yamagi Burmeister wrote:
> On Wed, 30 Mar 2011, YongHyeon PYUN wrote:
> 
> >>Okay, I did a test run with RX checksum, TX checksum and both disabled.
> >>In all three cases the crash occurs within about 20 minutes. I'm also
> >>not sure that age(4) is the problem, but it definitely has something to
> >>do with it, since with another NIC driver the same scenario is
> >>rock solid...
> >>
> >
> >OK.
> >
> >>The workload: It's an NFS3 server (FreeBSD's non-experimental
> >>implementation), serving and receiving files of about 250 to 500
> >>megabytes at about 20mb/s. The clients are FreeBSD 7 and 8 systems and
> >>are mounting the shares via TCP. The connection is 1000mbit/s via a
> >>"dumb" gigabit switch.
> >>
> >
> >That's too broad to narrow down the issue. :-(
> >I'm not sure, but your box seems to have more than 4GB memory. Could
> >you limit the available memory to 3GB via loader.conf and test it
> >again?
> 
> All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
> After limiting the memory via hw.physmem to 3GB the problems are gone.
> The box is running crashfree for more than 6 hours and has served over
> 300GB of data via age(4).
> 

Thanks for testing. Remove the hw.physmem configuration and try
attached patch and let me know how it goes.
Index: sys/dev/age/if_age.c
===
--- sys/dev/age/if_age.c	(revision 220116)
+++ sys/dev/age/if_age.c	(working copy)
@@ -2452,6 +2452,9 @@
 		/* Update the consumer index. */
 		sc->age_cdata.age_rr_cons = rr_cons;
 
+		bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+		sc->age_cdata.age_rx_ring_map,
+		BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
 		/* Sync descriptors. */
 		bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
 		sc->age_cdata.age_rr_ring_map,

Re: Kernel memory corruption(?) with age(4)

2011-03-31 Thread Yamagi Burmeister

On Thu, 31 Mar 2011, YongHyeon PYUN wrote:


>>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
>>After limiting the memory via hw.physmem to 3GB the problems are gone.
>>The box is running crashfree for more than 6 hours and has served over
>>300GB of data via age(4).
>>
>
>Thanks for testing. Remove the hw.physmem configuration and try
>attached patch and let me know how it goes.


Thanks for your help, but the patch doesn't work. Another random panic -
this time "page fault in kernel mode" - with nothing related to age(4) or
the network stack in the backtrace...

Maybe it'll help to know about a bug fix in the Linux atl1 driver, now
replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
64-bit DMA was disabled:

  64-bit DMA causes data corruption with atl1.  We don't know why, and
  Atheros is working on it. For now, just use 32-bit DMA. This is a big
  hack that is probably wrong, but it stops the bleeding.

There was no later follow-up on it. I think that this can't be a problem
on FreeBSD, but maybe I'm reading the driver code wrong. The kernel.org
gitweb URL is:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
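
The earlier test result in this thread (no crashes once hw.physmem was limited to 3GB) fits the same pattern: with less than 4GB of usable memory, DMA buffers would normally sit at addresses that fit in 32 bits. The following is a minimal sketch of the condition a 32-bit-only DMA restriction enforces. The function name is hypothetical and this is not code from either driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the buffer [paddr, paddr + len) lies entirely below
 * 4 GB, i.e. every byte can be reached with a 32-bit DMA address.
 */
bool
fits_32bit_dma(uint64_t paddr, uint64_t len)
{
	const uint64_t limit = UINT64_C(1) << 32;

	return paddr < limit && len <= limit - paddr;
}
```

On a machine with 8GB of RAM, buffers allocated above the 4GB boundary would fail this check, which is the case the Linux workaround avoids.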

--
Homepage: www.yamagi.org
Jabber:   yam...@yamagi.org
GnuPG/GPG:0xEFBCCBCB


Re: Kernel memory corruption(?) with age(4)

2011-03-31 Thread YongHyeon PYUN
On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote:
> Maybe it'll help to know about a bug fix in the linux atl1 driver, now
> replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> 64 bit DMA was disabled:
> 
>   64-bit DMA causes data corruption with atl1.  We don't know why, and
>   Atheros is working on it. For now, just use 32-bit DMA. This is a big
>   hack that is probably wrong, but it stops the bleeding.
> 
> There was no later follow up on it. I think that this can't be problem
> on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org
> gitweb URL is:
> 
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> 

Thanks a lot! It seems the L1 controller has a data corruption issue
when 64bit DMA addressing is used. Try this one.
Index: sys/dev/age/if_age.c
===================================================================
--- sys/dev/age/if_age.c (revision 220116)
+++ sys/dev/age/if_age.c (working copy)
@@ -1092,10 +1092,13 @@
          * Create Tx/Rx buffer parent tag.
          * L1 supports full 64bit DMA addressing in Tx/Rx buffers
          * so it needs separate parent DMA tag.
+         * XXX
+         * It seems enabling 64bit DMA causes data corruption. Limit
+         * DMA address space to 32bit.
          */
         error = bus_dma_tag_create(
             bus_get_dma_tag(sc->age_dev), /* parent */
-            1, 0,                       /* alignment, boundary */
+            BUS_SPACE_MAXADDR_32BIT, 0, /* alignment, boundary */
             BUS_SPACE_MAXADDR,          /* lowaddr */
             BUS_SPACE_MAXADDR,          /* highaddr */
             NULL, NULL,                 /* filter, filterarg */
@@ -2452,6 +2455,9 @@
         /* Update the consumer index. */
         sc->age_cdata.age_rr_cons = rr_cons;
 
+        bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+            sc->age_cdata.age_rx_ring_map,
+            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
         /* Sync descriptors. */
         bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
             sc->age_cdata.age_rr_ring_map,

Re: Kernel memory corruption(?) with age(4)

2011-03-31 Thread YongHyeon PYUN
On Thu, Mar 31, 2011 at 11:16:52AM -0700, YongHyeon PYUN wrote:
> Thanks a lot! It seems the L1 controller has data corruption issue
> when 64bit DMA addressing is used. Try this one.

Oops, there was a bug in the previous patch.
Try this instead.
Index: sys/dev/age/if_age.c
===================================================================
--- sys/dev/age/if_age.c (revision 220116)
+++ sys/dev/age/if_age.c (working copy)
@@ -1092,11 +1092,14 @@
          * Create Tx/Rx buffer parent tag.
          * L1 supports full 64bit DMA addressing in Tx/Rx buffers
          * so it needs separate parent DMA tag.
+         * XXX
+         * It seems enabling 64bit DMA causes data corruption. Limit
+         * DMA address space to 32bit.
          */
         error = bus_dma_tag_create(
             bus_get_dma_tag(sc->age_dev), /* parent */
             1, 0,                       /* alignment, boundary */
-            BUS_SPACE_MAXADDR,          /* lowaddr */
+            BUS_SPACE_MAXADDR_32BIT,    /* lowaddr */
             BUS_SPACE_MAXADDR,          /* highaddr */
             NULL, NULL,                 /* filter, filterarg */
             BUS_SPACE_MAXSIZE_32BIT,    /* maxsize */
@@ -2452,6 +2455,9 @@
         /* Update the consumer index. */
         sc->age_cdata.age_rr_cons = rr_cons;
 
+        bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+            sc->age_cdata.age_rx_ring_map,
+            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
         /* Sync descriptors. */
         bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
             sc->age_cdata.age_rr_ring_map,
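
For context on why the corrected patch touches lowaddr rather than alignment: bus_dma(9) describes lowaddr and highaddr as the bounds of a window of bus addresses the device cannot access directly, covering every address greater than lowaddr and less than or equal to highaddr. A minimal user-space sketch of just that comparison follows. The helper name, the SKETCH_ constants (assumed to match the amd64 values of BUS_SPACE_MAXADDR_32BIT and BUS_SPACE_MAXADDR) and the sample addresses are illustrative only and do not come from if_age.c or the busdma implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the amd64 values of the FreeBSD constants. */
#define SKETCH_MAXADDR_32BIT 0xFFFFFFFFULL
#define SKETCH_MAXADDR       0xFFFFFFFFFFFFFFFFULL

/*
 * Hypothetical helper: true when a physical address lies in the
 * window (lowaddr, highaddr] that the device cannot reach directly,
 * i.e. a buffer there would need to be bounced to lower memory.
 */
static bool
in_exclusion_window(uint64_t paddr, uint64_t lowaddr, uint64_t highaddr)
{
	return (paddr > lowaddr && paddr <= highaddr);
}
```

With lowaddr left at BUS_SPACE_MAXADDR the window is empty, so a buffer at, say, 6GB would be handed to the controller unchanged; with the 32-bit limit the same address falls inside the window. On a box capped at 3GB via hw.physmem no address exceeds the 32-bit limit either way, which would fit the earlier observation that the crashes went away under that setting.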

Re: Kernel memory corruption(?) with age(4)

2011-03-31 Thread Yamagi Burmeister

On Thu, 31 Mar 2011, YongHyeon PYUN wrote:

>> Thanks a lot! It seems the L1 controller has data corruption issue
>> when 64bit DMA addressing is used. Try this one.
>
> Oops, there was a bug in previous patch.
> Try this instead.

Okay, that patch seems to do the trick. This was just a short test run
of about one hour with just 50GB copied, but without the patch the
system would have crashed in the first 20 minutes. I'll do a more
comprehensive test overnight and report back tomorrow morning.

--
Homepage: www.yamagi.org
Jabber:   yam...@yamagi.org
GnuPG/GPG:0xEFBCCBCB


Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Arnaud Lacombe
Hi Jack,

On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe  wrote:
> [...]
> I'll remove part of the changes I made to keep only `rx_forced_refill'
> and the associated sysctl, re-run the tests and come back with correct
> value, hopefully in a few hours.
>
Here it is:

# sysctl dev.em.0.%desc
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2

# sysctl dev.em.0.mac_stats.missed_packets
dev.em.0.mac_stats.missed_packets: 917428

# sysctl dev.em.0.debug=1
dev.em.0.debug: I-1nterface is RUNNING and INACTIVE
em0: hw tdh = 975, hw tdt = 975
em0: hw rdh = 884, hw rdt = 885
em0: Tx Queue Status = 0
em0: TX descriptors avail = 1024
em0: Tx Descriptors avail failure = 0
em0: RX discarded packets = 0
em0: RX Next to Check = 884
em0: RX Next to Refresh = 885
 -> -1

So the taskqueue cannot be scheduled to run and the driver is stuck.

> On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel  wrote:
>> Read the code in HEAD, em_local_timer() has a test of ALL the rx queues
>> and will schedule a task that refreshes mbufs if they are empty. This has
>> exactly the same effect as checking for some interrupt cause, a cause
>> that is not available when using MSIX on 82574, but this approach works
>> for everything.
>>
Can you please point me to a reference datasheet (or errata), provided
by Intel, about the RX Overrun interrupt not being available with
MSI-X on the 82574?

Currently, I only have access to [0], which specifies the following:

7.4 Interrupts
7.4.2 MSI-X Mode
[...]
The following configuration and parameters are involved:
• The IVAR.INT_Alloc[4:0] entries map two Tx queues, two Rx queues and
  other events to 5 interrupt vectors
• The ICR[24:20] bits reflect specific interrupt causes
• Five MSI-X interrupt vectors are provided (calculated based on four
  vectors for queues and one vector for other causes). The requested
  number of vectors is loaded from the MSI_X_N fields in the EEPROM into
  the PCIe MSI-X capability structure of the function.

10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)
[...]

about bit 24:

Other Interrupt. Indicates one of the following interrupts was set:
• Link Status Change.
• Receiver Overrun.
• MDIO Access Complete.
• Small Receive Packet Detected.
• Receive ACK Frame Detected.
• Manageability Event Detected.
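
To make the bit layout in the excerpt concrete, here is a small sketch that pulls the ICR[24:20] field out of a 32-bit register value and tests bit 24. The macro and function names are hypothetical (they are not the em(4) driver's own definitions), and only the positions stated in the excerpt are used; the meaning of the individual bits 23:20 is not assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the datasheet excerpt quoted above. */
#define SKETCH_ICR_CAUSE_SHIFT 20u                       /* lowest bit of ICR[24:20] */
#define SKETCH_ICR_CAUSE_MASK  (0x1Fu << SKETCH_ICR_CAUSE_SHIFT)
#define SKETCH_ICR_OTHER       (1u << 24)                /* "Other Interrupt" */

/* Return the five-bit ICR[24:20] field, right-aligned. */
static uint32_t
icr_msix_causes(uint32_t icr)
{
	return ((icr & SKETCH_ICR_CAUSE_MASK) >> SKETCH_ICR_CAUSE_SHIFT);
}

/* True when the "Other Interrupt" summary bit (bit 24) is set. */
static bool
icr_other_set(uint32_t icr)
{
	return ((icr & SKETCH_ICR_OTHER) != 0);
}
```

Note that bit 24 is only a summary: per the list above it says that one of several events occurred, Receiver Overrun among them, without saying which.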

Thanks in advance,
 - Arnaud

[0]: ftp://download.intel.com/design/network/datashts/82574.pdf


Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Jack Vogel
So, what is the evidence that the driver is stuck here?

I see that next_to_check != next_to_refresh, which is why the
local timer won't schedule anything. OH, and I also realized there
is a problem with local_timer anyway, it will run rxeof, but that won't help
if you can't enter the loop, so I need to add some code at the top to
call em_refresh_mbufs() when in this state.
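
As a reading aid, the timer condition described in the paragraph above can be modelled in a few lines. This is a sketch of the behaviour as stated in this message, not code copied from if_em.c; the struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two RX indices shown by the debug sysctl. */
struct rx_ring_sketch {
	uint32_t next_to_check;
	uint32_t next_to_refresh;
};

/*
 * Per the description above: the local timer only schedules the
 * refresh task when both indices are equal.
 */
static bool
timer_would_schedule_refresh(const struct rx_ring_sketch *rxr)
{
	return (rxr->next_to_check == rxr->next_to_refresh);
}
```

Fed the reported values (Next to Check = 884, Next to Refresh = 885) the function returns false, which matches the statement that the timer schedules nothing in that state.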

On this interrupt cause that you are focused upon: although it's there in the
design, I talked with some of our most seasoned developers on both
the Windows and Linux side of the house, and NO one has ever used this
'feature', because (and I'm quoting here) "there's no good use case for it".
Meaning, there's always some simpler way of handling the issue.

When you use MSIX you can't read causes, btw. If you configured it, you'd
just get into the regular RX handler, same as always, so why bother
specially with this cause?

On non-MSIX hardware there is just no particular reason to worry about the
cause either, we can just handle the RX situation in the interrupt handler.

Jack




any restrictions on nmbclusters vs nmbjumbop

2011-03-31 Thread Joe Schaefer
I have the following config in /boot/loader.conf

kern.ipc.nmbclusters="65536"
kern.ipc.nmbjumbop="65536"

When I ran a host with that config, it stopped responding to
commands (not even login worked) and I had to power cycle it.

My situation is that I have a need for a large nmbjumbop setting
but the nmbclusters size (according to netstat -m) can remain small.
Is this possible or do I need to bump the nmbclusters to 128K
in order to get nmbjumbop where I want (64K)?
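
For a rough sense of scale, the upper bound each tunable puts on memory can be worked out from the item sizes. The sketch below assumes 2048-byte standard clusters and 4096-byte page-size jumbo clusters (the amd64 page size; the architecture is not stated in this message), and the names are illustrative. It says nothing about whether the two limits depend on each other.

```c
#include <stdint.h>

/* Assumed item sizes; the page-size jumbo value depends on the platform. */
#define SKETCH_MCLBYTES     2048u  /* standard mbuf cluster */
#define SKETCH_MJUMPAGESIZE 4096u  /* page-size jumbo cluster (4k pages) */

/* Worst-case bytes a zone can hold if it grows to its configured limit. */
static uint64_t
zone_limit_bytes(uint64_t count, uint64_t item_size)
{
	return (count * item_size);
}
```

Under those assumptions, nmbclusters=65536 allows up to 128MB of standard clusters and nmbjumbop=65536 up to 256MB of page-size jumbo clusters.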


Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Arnaud Lacombe
Hi,

On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel  wrote:
> So, what is the evidence that the driver is stuck here?
>
About 800 pps (mostly SYN) present on the wire but never seen on em0,
plus a couple of ARP replies, which also never hit em0, plus the
`missed_packets' count increasing by the same 800 pps over the last
hour. Is that enough?

 - Arnaud

ps: I forgot to add that the MAC addresses on the wire are fine.



Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Jack Vogel
OK, but none of that is present in this data, which is what I was
asking about.

So, you have a hang for which we do not have a certain cause.  What does
netstat -m show?

Jack




Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Arnaud Lacombe
Hi,

On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel  wrote:
> OK, but those are not something present in this data, that was what I'm
> asking.
>
> So, you have a hang for which we do not have a certain cause.  What does
> netstat -m show?
>
# netstat -m
3073/74927/78000 mbufs in use (current/cache/total)
3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max)
0/383 mbuf+clusters out of packet secondary zone in use (current/cache)
0/12800/12800/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
6908K/129327K/136236K bytes allocated to network (current/cache/total)
0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/7/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Note that the mbuf allocation denials did not happen all at once. The
count has been increasing progressively in blocks of ~200 over the 5h of
uptime of the machine, until the current condition occurred.
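
The cluster line in the output above can also be checked mechanically. The following is a minimal sketch that parses the leading "current/cache/total/max" field of one netstat -m line and reports whether the zone has grown to its limit; the struct and function names are invented for this example and are not part of netstat.

```c
#include <stdbool.h>
#include <stdio.h>

/* The four counters netstat -m prints as current/cache/total/max. */
struct cluster_stats {
	unsigned long current, cache, total, max;
};

/* Parse the leading "a/b/c/d" field; returns false if it is not there. */
static bool
parse_cluster_line(const char *line, struct cluster_stats *out)
{
	return (sscanf(line, "%lu/%lu/%lu/%lu",
	    &out->current, &out->cache, &out->total, &out->max) == 4);
}

/* True when the zone has grown to its configured maximum. */
static bool
zone_at_limit(const struct cluster_stats *s)
{
	return (s->total >= s->max);
}
```

For the "mbuf clusters" line this gives total == max (32768) with only 3070 in use and the rest sitting in the cache, which is at least consistent with the 1080 denied cluster requests in the same output.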

I have previously tried to simulate the depletion and the hang,
but the driver recovered. I assume the condition is met in
em_local_timer() to refresh the ring; I'd still need to check that.

 - Arnaud


Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Jack Vogel
My validation group has some kind of hang... it happens when they use a
certain number of clients each running a stress test to the SUT. It's like
this: no real handle on what's wrong. If I knew what was wrong it would be
halfway or more to fixing it :)

The evidence shows you have hit the max clusters at one point, but have
freed most of them back up again; there is no shortage right at this point.
Your previous data showed a normal idle head/tail relationship.

Just as a data point, will you please disable msix, recompile and run in
MSI mode? I just want to see if that makes a difference. Search in the
driver for em_enable_msix and set it FALSE.

Jack


Re: The tale of a TCP bug

2011-03-31 Thread Stefan `Sec` Zehl
On Wed, Mar 30, 2011 at 08:38 -0400, John Baldwin wrote:
> There is at least one case I know of related to a bug I reported earlier
> where a window probe from a remote connection can cause rcv_nxt to advance
> past rcv_adv by one.  However, I think we want to know about those cases,
> and we should probably be treating rcv_adv - rcv_nxt as if it is zero in 
> that case, not -1 (my patch in my original e-mail does just that in a
> different place in tcp_output() when we calculate the window "for real").

I've been running for about a day now with the committed patch and
adv_neg is still zero:

| ice:~>uptime; sysctl net.inet.tcp.adv_neg
|  1:36AM  up 1 day,  4:52, 1 user, load averages: 0.12, 0.06, 0.05
| net.inet.tcp.adv_neg: 0

I'll of course monitor this value and report back if I ever see it
increase :-)
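
For reference, the clamping John describes above (treating rcv_adv - rcv_nxt
as zero once rcv_nxt has moved past rcv_adv) could be sketched roughly as
follows. This is only an illustration, not the committed patch: seq_diff()
and adv_window_left() are hypothetical names, and only rcv_adv and rcv_nxt
come from the thread.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/*
 * Signed distance from b to a in 32-bit sequence space.
 * The unsigned subtraction wraps modulo 2^32; the cast to int32_t
 * (two's complement on all common compilers) yields a negative value
 * when a is "behind" b.
 */
static int32_t
seq_diff(tcp_seq a, tcp_seq b)
{
	return (int32_t)(a - b);
}

/*
 * Window still covered by the last advertisement. If a window probe
 * pushed rcv_nxt one past rcv_adv, report 0 instead of -1.
 */
static uint32_t
adv_window_left(tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
	int32_t d = seq_diff(rcv_adv, rcv_nxt);

	return (d > 0) ? (uint32_t)d : 0;
}
```

With rcv_adv = 1000 and rcv_nxt = 1001 the helper returns 0, which is the
case a negative-difference counter such as adv_neg would flag.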

CU,
Sec
-- 
Diplomacy is the ability to tell a person to go to hell in such a nice way
that he or she looks forward to the trip.


Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Jack Vogel
You know what, Arnaud, I've looked at the numbers again, and I suddenly saw
that next_to_check and next_to_refresh are NOT in a good state. Exactly the
opposite: check is BEHIND refresh, which means the whole ring is empty. The
HEAD (next_to_check) is pointing at 929, but next_to_refresh is at 930, RIGHT
IN FRONT of it, so the whole ring is depleted!!

What this means is that just a test of check == refresh is not going to be
good enough to protect against all cases, so let me think about how to handle
this...

Jack


On Thu, Mar 31, 2011 at 4:38 PM, Jack Vogel  wrote:

> My validation group has some kind of hang... happens when they use a
> certain number
> of clients each running a stress test to the SUT, its like this, no real
> handle on what's
> wrong, if I knew what was wrong it would be half way or more to fixing it
> :)
>
> The evidence shows you have hit the max clusters at one point, but have
> freed most
> of them back up again, there is no shortage right at this point. Your
> previous data
> showed a normal idle head/tail relationship
>
> Just as a data point, will you please disable msix, recompile and run in
> MSI mode,
> I just want to see if that makes a difference. Search in the driver for
> em_enable_msix
> and set it FALSE.
>
> Jack
>
>
>
> On Thu, Mar 31, 2011 at 4:06 PM, Arnaud Lacombe wrote:
>
>> Hi,
>>
>> On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel  wrote:
>> > OK, but those are not something present in this data, that was what I'm
>> > asking.
>> >
>> > So, you have a hang for which we do not have a certain cause.  What does
>> > netstat -m show?
>> >
>> # netstat -m
>> 3073/74927/78000 mbufs in use (current/cache/total)
>> 3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max)
>> 0/383 mbuf+clusters out of packet secondary zone in use (current/cache)
>> 0/12800/12800/12800 4k (page size) jumbo clusters in use
>> (current/cache/total/max)
>> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>> 6908K/129327K/136236K bytes allocated to network (current/cache/total)
>> 0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>> 0/7/6656 sfbufs in use (current/peak/max)
>> 0 requests for sfbufs denied
>> 0 requests for sfbufs delayed
>> 0 requests for I/O initiated by sendfile
>> 0 calls to protocol drain routines
>>
>> Note that the mbuf allocation denial did not happen at once. It has
>> been progressively increasing by block of ~200 over the 5h of uptime
>> of the machine, until the current condition occurred.
>>
>> I have previously been trying to simulate the depletion and the hang,
>> but the driver recovered. I assume the condition is met in
>> em_local_timer() to refresh the ring, I'd still need to check that.
>>
>>  - Arnaud
>>
>> > Jack
>> >
>> >
>> > On Thu, Mar 31, 2011 at 3:15 PM, Arnaud Lacombe 
>> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel  wrote:
>> >> > So, what is the evidence that the driver is stuck here?
>> >> >
>> >> About 800 pps (mostly SYN) present on the wire but never seen on em0,
>> >> plus a couple of ARP reply, which still never hit em0, plus the
>> >> `missed_packets' count increasing by the same 800 pps in the last
>> >> hour. Is that enough ?
>> >>
>> >>  - Arnaud
>> >>
>> >> ps: I forgot to add that MAC address on the wire are fine.
>> >>
>> >> > I see that next_to_check != next_to_refresh, which is why the
>> >> > local timer won't schedule anything. OH, and I also realized there
>> >> > is a problem with local_timer anyway, it will run rxeof, but that
>> won't
>> >> > help
>> >> > if you can't enter the loop, so I need to add some code at the top to
>> >> > call em_refresh_mbufs() when in this state.
>> >> >
>> >> > On this interrupt cause that you are focused upon, although its there
>> in
>> >> > the
>> >> > design, I had talked with some of our most seasoned developers on
>> both
>> >> > the Windows and Linux side of the house, and NO one has ever used
>> this
>> >> > 'feature', because (and I'm quoting here) "there's no good use case
>> for
>> >> > it".
>> >> > Meaning, there's always some simpler way of handling the issue.
>> >> >
>> >> > When you use MSIX you can't read causes btw, if you configured it, it
>> >> > would
>> >> > mean you'd just get into the regular RX handler, same as always, so
>> why
>> >> > some special bother with this cause?
>> >> >
>> >> > On non-MSIX hardware there is just no particular reason to worry
>> about
>> >> > the
>> >> > cause either, we can just handle the RX situation in the interrupt
>> >> > handler.
>> >> >
>> >> > Jack
>> >> >
>> >> >
>> >> > On Thu, Mar 31, 2011 at 2:09 PM, Arnaud Lacombe 
>> >> > wrote:
>> >> >>
>> >> >> Hi Jack,
>> >> >>
>> >> >> On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe
>> >> >> wrote:
>> >> >> > [...]
>> >> >> > I'll remove part of the changes I made to keep only
>> >> >> > `rx_forc

Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

2011-03-31 Thread Jack Vogel
I know how I'm going to handle this and am formulating code for it. I should
have something that can be tested tomorrow; time to head out for the night.

Essentially, rather than just looking for equality, I will calculate the
number of unrefreshed mbufs given the check/refresh values, and then call
refresh when anything is unrefreshed. This will happen in rxeof, but I will
also put back the rx interrupt trigger into local timer. I'm pretty sure this
will be bulletproof, at least for this kind of hang.
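
A minimal sketch of that calculation in C, under stated assumptions and not
the actual driver change: rx_unrefreshed() and rx_needs_refresh() are
hypothetical names, the ring size is passed in as a parameter (1024 is only
assumed in the example below), and the count is taken as the distance from
next_to_refresh forward to next_to_check around the circular ring.

```c
#include <stdint.h>

/*
 * Number of RX descriptors that have been consumed but not yet given a
 * fresh mbuf: the distance from next_to_refresh forward to
 * next_to_check, wrapping at ring_size.
 */
static uint32_t
rx_unrefreshed(uint32_t next_to_check, uint32_t next_to_refresh,
    uint32_t ring_size)
{
	if (next_to_check >= next_to_refresh)
		return next_to_check - next_to_refresh;
	/* check has wrapped around behind refresh */
	return ring_size - next_to_refresh + next_to_check;
}

/* Non-zero when at least one descriptor is waiting for a refresh. */
static int
rx_needs_refresh(uint32_t next_to_check, uint32_t next_to_refresh,
    uint32_t ring_size)
{
	return rx_unrefreshed(next_to_check, next_to_refresh, ring_size) != 0;
}
```

With the values from the hang (next_to_check = 929, next_to_refresh = 930)
and an assumed ring of 1024 descriptors, this gives 1023 unrefreshed
entries, i.e. all but one slot of the ring.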

Jack


On Thu, Mar 31, 2011 at 5:28 PM, Jack Vogel  wrote:

> You know what Arnaud, I've looked at the numbers again, and I suddenly saw
> that next_to_check and next_to_refresh are NOT in a good state, exactly the
> opposite, check is BEHIND refresh, which means the whole ring is empty, the
> HEAD (next_to_check) is pointing at 929, but next_to_refresh is at 930,
> RIGHT
> IN FRONT of it, so the whole ring is depleted!!
>
> What this means is that just a test of check == refresh is not going to be
> good
> enough to protect against all cases,  so let me think about how to handle
> this...
>
> Jack
>
>
>
> On Thu, Mar 31, 2011 at 4:38 PM, Jack Vogel  wrote:
>
>> My validation group has some kind of hang... happens when they use a
>> certain number
>> of clients each running a stress test to the SUT, its like this, no real
>> handle on what's
>> wrong, if I knew what was wrong it would be half way or more to fixing it
>> :)
>>
>> The evidence shows you have hit the max clusters at one point, but have
>> freed most
>> of them back up again, there is no shortage right at this point. Your
>> previous data
>> showed a normal idle head/tail relationship
>>
>> Just as a data point, will you please disable msix, recompile and run in
>> MSI mode,
>> I just want to see if that makes a difference. Search in the driver for
>> em_enable_msix
>> and set it FALSE.
>>
>> Jack
>>
>>
>>
>> On Thu, Mar 31, 2011 at 4:06 PM, Arnaud Lacombe wrote:
>>
>>> Hi,
>>>
>>> On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel  wrote:
>>> > OK, but those are not something present in this data, that was what I'm
>>> > asking.
>>> >
>>> > So, you have a hang for which we do not have a certain cause.  What
>>> does
>>> > netstat -m show?
>>> >
>>> # netstat -m
>>> 3073/74927/78000 mbufs in use (current/cache/total)
>>> 3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max)
>>> 0/383 mbuf+clusters out of packet secondary zone in use (current/cache)
>>> 0/12800/12800/12800 4k (page size) jumbo clusters in use
>>> (current/cache/total/max)
>>> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>>> 6908K/129327K/136236K bytes allocated to network (current/cache/total)
>>> 0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>>> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>>> 0/7/6656 sfbufs in use (current/peak/max)
>>> 0 requests for sfbufs denied
>>> 0 requests for sfbufs delayed
>>> 0 requests for I/O initiated by sendfile
>>> 0 calls to protocol drain routines
>>>
>>> Note that the mbuf allocation denial did not happen at once. It has
>>> been progressively increasing by block of ~200 over the 5h of uptime
>>> of the machine, until the current condition occurred.
>>>
>>> I have previously been trying to simulate the depletion and the hang,
>>> but the driver recovered. I assume the condition is met in
>>> em_local_timer() to refresh the ring, I'd still need to check that.
>>>
>>>  - Arnaud
>>>
>>> > Jack
>>> >
>>> >
>>> > On Thu, Mar 31, 2011 at 3:15 PM, Arnaud Lacombe 
>>> wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel 
>>> wrote:
>>> >> > So, what is the evidence that the driver is stuck here?
>>> >> >
>>> >> About 800 pps (mostly SYN) present on the wire but never seen on em0,
>>> >> plus a couple of ARP reply, which still never hit em0, plus the
>>> >> `missed_packets' count increasing by the same 800 pps in the last
>>> >> hour. Is that enough ?
>>> >>
>>> >>  - Arnaud
>>> >>
>>> >> ps: I forgot to add that MAC address on the wire are fine.
>>> >>
>>> >> > I see that next_to_check != next_to_refresh, which is why the
>>> >> > local timer won't schedule anything. OH, and I also realized there
>>> >> > is a problem with local_timer anyway, it will run rxeof, but that
>>> won't
>>> >> > help
>>> >> > if you can't enter the loop, so I need to add some code at the top
>>> to
>>> >> > call em_refresh_mbufs() when in this state.
>>> >> >
>>> >> > On this interrupt cause that you are focused upon, although its
>>> there in
>>> >> > the
>>> >> > design, I had talked with some of our most seasoned developers on
>>> both
>>> >> > the Windows and Linux side of the house, and NO one has ever used
>>> this
>>> >> > 'feature', because (and I'm quoting here) "there's no good use case
>>> for
>>> >> > it".
>>> >> > Meaning, there's always some simpler way of handling the issue.
>>> >> >
>>> >> > When you use M