Current problem reports assigned to freebsd-net@FreeBSD.org

2011-07-11 Thread FreeBSD bugmaster
Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.  Description

o kern/158726  net    [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net    [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net    [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net    [em] TSO breaks BPF packet captures with em driver
f kern/158426  net    [e1000] [panic] _mtx_lock_sleep: recursed on non-recur
o kern/158156  net    [bce] bce driver shows "no carrier" on IBM blade (HS22
f kern/157802  net    [dummynet] [panic] kernel panic in dummynet
o kern/157785  net    amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157429  net    [re] Realtek RTL8169 doesn't work with re(4)
o kern/157418  net    [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net    [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net    [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157209  net    [ip6] [patch] locking error in rip6_input() (sys/netin
o kern/157200  net    [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net    [lagg] lagg interface not working together with epair
o kern/156978  net    [lagg][patch] Take lagg rlock before checking flags
o kern/156877  net    [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net    [em] em0 fails to init on CURRENT after March 17
o kern/156408  net    [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net    [icmp]: host can ping other subnet but no have IP from
o kern/156317  net    [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156283  net    [ip6] [patch] nd6_ns_input - rtalloc_mpath does not re
o kern/156279  net    [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net    [lagg]: failover does not announce the failover to swi
o kern/156030  net    [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155772  net    ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net    [multicast] problems with multicast
s kern/155642  net    [request] Add driver for Realtek RTL8191SE/RTL8192SE W
o kern/155604  net    [flowtable] Flowtable excessively caches dest MAC addr
o kern/155597  net    [panic] Kernel panics with "sbdrop" message
o kern/155585  net    [tcp] [panic] tcp_output tcp_mtudisc loop until kernel
o kern/155498  net    [ral] ral(4) needs to be resynced with OpenBSD's to ga
o kern/155420  net    [vlan] adding vlan break existent vlan
o bin/155365   net    [patch] routed(8): if.c in routed fails to compile if
o kern/155177  net    [route] [panic] Panic when inject routes in kernel
o kern/155030  net    [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net    [msk] ntfs-3g via iscsi using msk driver cause kernel
o kern/155004  net    [bce] [panic] kernel panic in bce0 driver
o kern/154943  net    [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net    [request]: Port brcm80211 driver from Linux to FreeBSD
o kern/154850  net    [netgraph] [patch] ng_ether fails to name nodes when t
p kern/154831  net    [arp] [patch] arp sysctl setting log_arp_permanent_mod
o kern/154679  net    [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net    [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net    [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net    [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net    [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net    [nfs] NFS not responding
o kern/154214  net    [stf] [panic] Panic when creating stf interface
o kern/154185  net    race condition in mb_dupcl
o kern/154169  net    [multicast] [ip6] Node Information Query multicast add
o kern/154134  net    [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net    [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net    [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net    [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net    [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net    [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net    [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net    [netgraph] netgraph panic due to race conditions
o kern/153454  net    [p

Repeating kernel panic within dummynet

2011-07-11 Thread Eugene Grosbein
Hi!

My FreeBSD 8.2/amd64 routers use dummynet heavily
and keep panicking with the *same* KDB backtrace:

dummynet: bad switch -256!


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xff81229d9a10
frame pointer           = 0x28:0xff81229d9a40
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (dummynet)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at 0x801aaaca = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0x80329667 = kdb_backtrace+0x37
panic() at 0x802f6cb7 = panic+0x187
trap_fatal() at 0x804d8b50 = trap_fatal+0x290
trap_pfault() at 0x804d8f2f = trap_pfault+0x28f
trap() at 0x804d940f = trap+0x3df
calltrap() at 0x804c0b44 = calltrap+0x8
--- trap 0xc, rip = 0, rsp = 0xff81229d9a10, rbp = 0xff81229d9a40 ---
uart_z8530_class() at 0
mb_dtor_pack() at 0x802e4787 = mb_dtor_pack+0x37
uma_zfree_arg() at 0x8049ba5a = uma_zfree_arg+0x3a
m_freem() at 0x803556a7 = m_freem+0x37
dummynet_send() at 0x803e909d = dummynet_send+0x2d
dummynet_task() at 0x803e93c6 = dummynet_task+0x1c6
taskqueue_run_locked() at 0x80335a65 = taskqueue_run_locked+0x85
taskqueue_thread_loop() at 0x80335bfe = taskqueue_thread_loop+0x4e
fork_exit() at 0x802ca4bf = fork_exit+0x11f
fork_trampoline() at 0x804c108e = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff81229d9d00, rbp = 0 ---
Uptime: 2d5h17m39s
Dumping 4087 MB (4 chunks)
  chunk 0: 1MB (150 pages) ... ok
  chunk 1: 3575MB (915072 pages) 3559 3543 3527 3511 3495 3479


The box never finishes writing the dump; it hangs until the IPMI watchdog
reboots it. I've tried debug.minidump=1, but it still hangs while the crash
dump is being generated and stops responding to Ctrl-Alt-ESC in the meantime.

Sadly, I cannot add options INVARIANTS to the kernel because it makes my
mpd-based routers panic very often (every 2-3 hours) due to the well-known
'dangling pointer' problem: a PPPoE user disconnects, its ngXXX interface is
removed, then its traffic leaves various system queues (netisr, dummynet,
etc.) and a different kind of panic occurs due to INVARIANTS checks
referencing the non-existent ifp.

Please help.

Eugene Grosbein
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Repeating kernel panic within dummynet

2011-07-11 Thread Eugene Grosbein
11.07.2011 18:45, Vlad Galu wrote:
> 
> On Jul 11, 2011, at 1:42 PM, Eugene Grosbein wrote:
> 
>> Hi!
>>
>> My FreeBSD 8.2/amd64 routers use dummynet heavily
>> and keep panic with the *same* KDB backtrace:
>>
>> dummynet: bad switch -256!

I forgot to mention that I use dummynet's io_fast mode
and have raised the pipe slot limit:

net.inet.ip.dummynet.pipe_slot_limit=1000
net.inet.ip.dummynet.io_fast=1

Individual pipes really are configured with long queues.

>> Sadly, I cannot add options INVARIANTS to the kernel because it makes my
>> mpd-based routers to panic very often (every 2-3 hours) due to famous
>> 'dangling pointer' problem - PPPoE user disconnects, its ngXXX interface
>> got removed, then its traffic goes out various system queues (netisr,
>> dummynet etc.) and another kind of panic occurs due to INVARIANTS'
>> references to non-existent ifp.
>
> Hi Eugene,
> If your ISR threads aren't already bound to CPUs, you can bind them and try
> using INVARIANTS.

Please explain how to bind them. I have 4-core boxes with 4 NICs grouped into
2 laggs: one lagg(4) for the uplink and another for the downlink.

Eugene Grosbein


Re: Repeating kernel panic within dummynet

2011-07-11 Thread Vlad Galu

On Jul 11, 2011, at 1:51 PM, Eugene Grosbein wrote:

> 11.07.2011 18:45, Vlad Galu wrote:
>> 
>> On Jul 11, 2011, at 1:42 PM, Eugene Grosbein wrote:
>> 
>>> Hi!
>>> 
>>> My FreeBSD 8.2/amd64 routers use dummynet heavily
>>> and keep panic with the *same* KDB backtrace:
>>> 
>>> dummynet: bad switch -256!
> 
> Forgot to mention that I use io_fast dummynet mode
> and have increased pipe lengths:
> 
> net.inet.ip.dummynet.pipe_slot_limit=1000
> net.inet.ip.dummynet.io_fast=1
> 
> Distinct pipes do really use long lengths.
> 
>>> Sadly, I cannot add options INVARIANTS to the kernel because it makes my
>>> mpd-based routers to panic very often (every 2-3 hours) due to famous
>>> 'dangling pointer' problem - PPPoE user disconnects, its ngXXX interface
>>> got removed, then its traffic goes out various system queues (netisr,
>>> dummynet etc.) and another kind of panic occurs due to INVARIANTS'
>>> references to non-existent ifp.
>>
>> Hi Eugene,
>> If your ISR threads aren't already bound to CPUs, you can bind them and try
>> using INVARIANTS.
>
> Please explain how to bind them. I have 4-core boxes with 4 NICs grouped to 2
> laggs, one lagg(4) for uplink and another one for downlink.
> 

net.isr.bindthreads=1

I'm not sure whether or how that would help your particular setup, but it did
help in Adrian Minta's recent netgraph/mpd experiments. According to an
off-list chat I had with him, the machine would panic unless the ISRs were
bound.

> Eugene Grosbein



Re: Repeating kernel panic within dummynet

2011-07-11 Thread Eugene Grosbein
11.07.2011 19:02, Vlad Galu wrote:

> net.isr.bindthreads=1
> 
> I'm not sure how and if that would help your particular setup, but it did so 
> in Adrian Minta's recent netgraph/mpd experiments. According to an off-list 
> chat I had with him, the machine would panic unless the ISRs were bound.

I disable ISR parallelism for my mpd routers using:

net.isr.direct=1
net.isr.direct_force=1

On the other hand, ISR is not the only place where traffic gets delayed; there
are other queues, and dummynet itself is an example. The router still panics
too often with INVARIANTS.

Eugene Grosbein



Re: Repeating kernel panic within dummynet

2011-07-11 Thread Vlad Galu

On Jul 11, 2011, at 1:42 PM, Eugene Grosbein wrote:

> Hi!
> 
> My FreeBSD 8.2/amd64 routers use dummynet heavily
> and keep panic with the *same* KDB backtrace:
> 
> dummynet: bad switch -256!
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x0
> fault code              = supervisor read instruction, page not present
> instruction pointer     = 0x20:0x0
> stack pointer           = 0x28:0xff81229d9a10
> frame pointer           = 0x28:0xff81229d9a40
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (dummynet)
> trap number             = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0x801aaaca = db_trace_self_wrapper+0x2a
> kdb_backtrace() at 0x80329667 = kdb_backtrace+0x37
> panic() at 0x802f6cb7 = panic+0x187
> trap_fatal() at 0x804d8b50 = trap_fatal+0x290
> trap_pfault() at 0x804d8f2f = trap_pfault+0x28f
> trap() at 0x804d940f = trap+0x3df
> calltrap() at 0x804c0b44 = calltrap+0x8
> --- trap 0xc, rip = 0, rsp = 0xff81229d9a10, rbp = 0xff81229d9a40 ---
> uart_z8530_class() at 0
> mb_dtor_pack() at 0x802e4787 = mb_dtor_pack+0x37
> uma_zfree_arg() at 0x8049ba5a = uma_zfree_arg+0x3a
> m_freem() at 0x803556a7 = m_freem+0x37
> dummynet_send() at 0x803e909d = dummynet_send+0x2d
> dummynet_task() at 0x803e93c6 = dummynet_task+0x1c6
> taskqueue_run_locked() at 0x80335a65 = taskqueue_run_locked+0x85
> taskqueue_thread_loop() at 0x80335bfe = taskqueue_thread_loop+0x4e
> fork_exit() at 0x802ca4bf = fork_exit+0x11f
> fork_trampoline() at 0x804c108e = fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xff81229d9d00, rbp = 0 ---
> Uptime: 2d5h17m39s
> Dumping 4087 MB (4 chunks)
>  chunk 0: 1MB (150 pages) ... ok
>  chunk 1: 3575MB (915072 pages) 3559 3543 3527 3511 3495 3479
> 
> 
> It does not finish writing dump and hangs until IPMI watchdog reboots the box.
> I've tried to use debug.minidump=1 but it still hangs while crashdumps is
> generating and stops responding to Ctrl-Alt-ESC meantime.
>
> Sadly, I cannot add options INVARIANTS to the kernel because it makes my
> mpd-based routers to panic very often (every 2-3 hours) due to famous
> 'dangling pointer' problem - PPPoE user disconnects, its ngXXX interface
> got removed, then its traffic goes out various system queues (netisr,
> dummynet etc.) and another kind of panic occurs due to INVARIANTS'
> references to non-existent ifp.

Hi Eugene,
If your ISR threads aren't already bound to CPUs, you can bind them and try 
using INVARIANTS.



Re: kern/152036: [libc] getifaddrs(3) returns truncated sockaddrs for netmasks

2011-07-11 Thread Sergey Kandaurov
The following reply was made to PR kern/152036; it has been noted by GNATS.

From: Sergey Kandaurov 
To: bug-follo...@freebsd.org, kby...@gmail.com
Cc:  
Subject: Re: kern/152036: [libc] getifaddrs(3) returns truncated sockaddrs for netmasks
Date: Mon, 11 Jul 2011 17:59:47 +0400

 [Some thoughts and testing...]
 This looks like a kernel bug rather than a getifaddrs() bug;
 testing with the (undocumented) SIOCGIFNETMASK ioctl confirms this.
 I found that the bug manifests for IPv4, but not for link-layer (lladdr)
 or IPv6 addresses.


Re: RFC 6296 (NPT v6)

2011-07-11 Thread Sergey Matveychuk

10.07.2011 7:13, Rémy Sanchez wrote:

> Hi,
>
> I was wondering if there is anyone currently implementing NPTv6 for FreeBSD?
>
> If nobody is, since I need this feature and the RFC is quite simple, I
> think I'll implement it (or run out of time trying to). However, it looks like
> you can't divert IPv6, and then I don't know what would be the best option to

An IPv6 patch for divert(4) was committed to HEAD a couple of weeks ago by
glebius@ (r223593).




MFC Re: soreceive_stream: issues with O_NONBLOCK

2011-07-11 Thread Andrew Boyer
On Jul 8, 2011, at 6:51 AM, Andre Oppermann wrote:

> On 07.07.2011 21:24, Mikolaj Golub wrote:
>> 
>> On Thu, 07 Jul 2011 12:47:15 +0200 Andre Oppermann wrote:
>> 
>>  AO>  Please try this patch:
>>  AO>   http://people.freebsd.org/~andre/soreceive_stream.diff-20110707
>> 
>> It works for me. No issues detected so far. Thanks.
> 
> Committed in r223863. Many thanks for testing!
> 
> -- 
> Andre

Hello Andre,
It appears that r197236 was never MFC'd, so soreceive_stream is still on by
default in stable/8.  Would you be able to MFC it along with r223839 and
r223863?

Thank you,
  Andrew

--
Andrew Boyer    abo...@averesystems.com






MFC of 218627 (SO_SETFIB 0)

2011-07-11 Thread Andrew Boyer
Would someone please MFC r218627 back to stable/8 and stable/7?  They are both
affected.

Thank you,
  Andrew

--
Andrew Boyer    abo...@averesystems.com






ESP Raw Socket: Returned IP packet incorrect

2011-07-11 Thread Matthew Cini Sarreo
Hello all;

I recently ran into a problem with ESP raw sockets on FreeBSD 8
(8.0-RELEASE).

I have created a raw esp socket using:
socket(AF_INET, SOCK_RAW, 50);
which works fine. However, when there is a packet on the socket, recvfrom()
returns a packet where the length bytes in the IP header are incorrect; they
are swapped (MSB is placed in the LSB and vice-versa)

tcpdump shows the following:

tcpdump: listening on le0, link-type EN10MB (Ethernet), capture size 96 bytes
15:00:53.993810 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ESP (50), length 120)
    10.0.251.228 > 10.0.252.231: ESP(spi=0xa0534f17,seq=0x3), length 100
        0x0000:  4500 0078 0000 4000 4032 2d88 0a00 fbe4
        0x0010:  0a00 fce7 a053 4f17 0000 0003 6885 8abd
        0x0020:  0000 5ded 44dc 842f 3081 8fa3 bde4 2265
        0x0030:  7438 2bf4 049c 664b 7dc4 44ef 1f6f 5e7d
        0x0040:  b8c1 482f 8c3b f488 a19a 3d9a d5fe ed9d
        0x0050:  b1c2


However, recvfrom() returns the following buffer:
4500 6400 0000 0040 4032 2D88 0A00 FBE4
0A00 FCE7 A053 4F17 0000 0003 6885 8ABD
0000 5DED 44DC 842F 3081 8FA3 BDE4 2265
7438 2BF4 049C 664B 7DC4 44EF 1F6F 5E7D
B8C1 482F 8C3B F488 A19A 3D9A D5FE ED9D
B1C2

As is easy to see, the length field is wrong (bytes 2 and 3 are 0x6400
instead of 0x0064), and it does not match the value returned by
recvfrom().

Is this a known issue? Am I missing some options for raw sockets that are
required for FreeBSD? I have attempted this on a socket to a TUN interface
(not with an ESP socket) and the buffer had the proper length; it seems to
only happen with ESP. This code runs fine on multiple Linux distributions
and on Windows; it was only noticed with FreeBSD. Could it be that there is
some other ESP application running and interfering (I have not installed
any; don't know if there are by default and I'm quite new to any of the
BSDs)?

Any help would be much appreciated.
Matt


Re: ESP Raw Socket: Returned IP packet incorrect

2011-07-11 Thread Michael Tüxen
On Jul 11, 2011, at 5:26 PM, Matthew Cini Sarreo wrote:

> Hello all;
> 
> I have recently encountered a problem when using raw sockets on FreeBSD 8
> (8.0-RELEASE) when using ESP raw sockets.
> 
> I have created a raw esp socket using:
> socket(AF_INET, SOCK_RAW, 50);
> which works fine. However, when there is a packet on the socket, recvfrom()
> returns a packet where the length bytes in the IP header are incorrect; they
> are swapped (MSB is placed in the LSB and vice-versa)
> 
> tcpdump shows the following:
> 
> tcpdump: listening on le0, link-type EN10MB (Ethernet), capture size 96 bytes
> 15:00:53.993810 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ESP (50), length 120)
>     10.0.251.228 > 10.0.252.231: ESP(spi=0xa0534f17,seq=0x3), length 100
>         0x0000:  4500 0078 0000 4000 4032 2d88 0a00 fbe4
>         0x0010:  0a00 fce7 a053 4f17 0000 0003 6885 8abd
>         0x0020:  0000 5ded 44dc 842f 3081 8fa3 bde4 2265
>         0x0030:  7438 2bf4 049c 664b 7dc4 44ef 1f6f 5e7d
>         0x0040:  b8c1 482f 8c3b f488 a19a 3d9a d5fe ed9d
>         0x0050:  b1c2
>
>
> However, recvfrom() returns the following buffer:
> 4500 6400 0000 0040 4032 2D88 0A00 FBE4
> 0A00 FCE7 A053 4F17 0000 0003 6885 8ABD
> 0000 5DED 44DC 842F 3081 8FA3 BDE4 2265
> 7438 2BF4 049C 664B 7DC4 44EF 1F6F 5E7D
> B8C1 482F 8C3B F488 A19A 3D9A D5FE ED9D
> B1C2
> 
> As it is easy to see, the length is not correct (bytes 2 and 3 are 0x6400
> instead of 0x0064) and it does not correspond to the value returned by
> recvfrom().
> 
> Is this a known issue? Am I missing some options for raw sockets that are
> required for FreeBSD? I have attempted this on a socket to a TUN interface
> (not with an ESP socket) and the buffer had the proper length; it seems to
> only happen with ESP. This code runs fine on multiple Linux distributions
> and on Windows; it was only noticed with FreeBSD. Could it be that there is
> some other ESP application running and interfering (I have not installed
> any; don't know if there are by default and I'm quite new to any of the
> BSDs)?
I think Linux provides the tot_len field in network byte order, whereas
FreeBSD provides it in host byte order. At least that is how each of them
expects it when you use a send call.

So you have to handle this in the application source code with
an #ifdef...
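
The #ifdef approach could look roughly like the sketch below. It is an
illustration only, not code from this thread: the helper name
raw_ip_total_len() and the macro RAW_IP_LEN_HOST_ORDER are hypothetical. It
assumes that a FreeBSD 8 raw socket delivers ip_len in host byte order with
the IP header length already subtracted. That assumption fits the dumps quoted
above (total length 120 = 0x0078 on the wire, 100 = 0x64 stored little-endian
in the recvfrom() buffer), but it should be checked against ip(4) for the
release in use, since other releases may behave differently. The flags/offset
word (bytes 6 and 7) appears to be swapped in the same way.

```c
#include <arpa/inet.h>  /* ntohs() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: total IPv4 datagram length (header + payload) for a
 * packet read from a raw socket.
 *
 * host_order_minus_hdr != 0: assume the kernel stored ip_len in host byte
 *   order with the header length subtracted (what the dumps in this thread
 *   suggest for FreeBSD 8).
 * host_order_minus_hdr == 0: assume ip_len is untouched, i.e. network byte
 *   order and including the header (as reported here for Linux).
 */
static uint16_t
raw_ip_total_len(const uint8_t *buf, size_t len, int host_order_minus_hdr)
{
    uint16_t field;
    unsigned hdr_len;

    assert(buf != NULL && len >= 20);        /* minimal IPv4 header */
    hdr_len = (unsigned)(buf[0] & 0x0f) * 4; /* IHL is in 32-bit words */
    memcpy(&field, buf + 2, sizeof(field));  /* avoid unaligned access */

    if (host_order_minus_hdr)
        return (uint16_t)(field + hdr_len);
    return ntohs(field);
}

/* Pick the convention at compile time, as suggested above. */
#if defined(__FreeBSD__)
#define RAW_IP_LEN_HOST_ORDER 1  /* assumption: holds for FreeBSD 8.x */
#else
#define RAW_IP_LEN_HOST_ORDER 0
#endif
```

A caller would then use raw_ip_total_len(buf, n, RAW_IP_LEN_HOST_ORDER) on the
buffer returned by recvfrom() instead of reading the length field directly.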

Best regards
Michael
> 
> Any help would be much appreciated.
> Matt


RE: bce packet loss

2011-07-11 Thread David Christensen
> I'm running 8.1 and at least on the bce hosts, it looks like flow control
> isn't supported, it was added on 4/30/2010:
>
> http://svnweb.freebsd.org/base/head/sys/dev/bce/if_bce.c?r1=206268&r2=207411
>
> In my 8.1 sources I still see this comment, which was removed in the above
> commit:
>  /* ToDo: Enable flow control support in brgphy and bge. */

This really applies to whether the user can set flow control
manually.  By default the NIC should auto-negotiate link speed
and flow-control which is the most common case.  For example, 
you can't set RX flow control and disable TX flow control with
ifconfig using the current implementation, though it is possible
in Linux with ethtool.

> So at least on the bce hosts (and bge it seems), I do not have flow
> control available on the NIC.  

Flow control will be set according to auto-negotiation results.
For most cases that means flow control will be enabled since
both sides normally support it.

> The sysctl stats do show that it's received "XON/XOFF" frames, which I
> assume are flow control messages, but there's no indication that the NIC
> does anything with them.

There won't be any indication in the driver since flow control
is managed in hardware.  You'd need a wire capture to see that
bce(4) has stopped sending frames in response to receiving an
XOFF flow control frame or started sending frames in response
to receiving an XON flow control frame.

> >> We are running 8.1, am I correct in that flow control is not implemented
> >> there?  We do have an 8.2-STABLE image from a month or so ago that we are
> >> testing with zfs v28, might that implement flow control?
> >
> > Flow control will depend on the NIC driver implementation.  Older
> > versions of the bce(4) firmware will rarely generate pause frames
> > (frames would be dropped by firmware but statistics should show
> > the frame drop occurring) and should always honor pause frames
> > from the link partner when flow control is enabled.
>
> I think my nics probably lack it.  I am also guessing that if any
> high-traffic host ignores flow control frames, that's going to screw up
> other hosts as well since the one causing the buffers to fill is not going
> to throttle and the overflow will continue, correct?

Flow control is asymmetric and operates independently in both
directions.  If the traffic source ignores flow control frames
or did not auto-negotiate flow control then it can certainly
overwhelm the switch or traffic sink's buffers, causing frame
drop and retransmits.

> 
> >>
> >> Although reading this:
> >>
> >> http://en.wikipedia.org/wiki/Ethernet_flow_control
> >>
> >> It sounds like flow control is not terribly optimal since it forces the
> >> host to block all traffic.  Not sure if this means drops are eliminated,
> >> reduced or shuffled around.

Frame drops should be eliminated, though congestion could
spread upstream to other devices which don't have flow control
and result in frame drops and retransmits there.

> > When congestion is detected the switch should buffer up to a certain
> > limit (say 80% of full) and then start sending pause frames to avoid
> > dropping frames.  This will affect all hosts connecting through the
> > switch so congestion at one host can spread to other hosts (see
> > http://www.ieee802.org/3/cm_study/public/september04/thaler_3_0904.pdf).
>
> Wow.  I did not catch that.  I do recall something about the flow control
> frames being multicast - so every host gets them and pauses.  That's...
> interesting, isn't it?

Pause frames are multicast frames but they are only transmitted
between link partners (NIC to switch) and are never forwarded further
into the network.  Flow control is intended to be a local behavior, but
the presentation linked above indicates it can have an unintended global
effect.
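
For anyone who wants to spot these frames in a wire capture, here is a minimal
sketch of a classifier. It is not taken from bce(4) or any other driver; the
names classify_pause() and pause_dst_is_reserved() are hypothetical. It relies
on the standard IEEE 802.3x layout: MAC Control EtherType 0x8808, PAUSE opcode
0x0001, followed by a 16-bit pause time in quanta of 512 bit times. A non-zero
pause time asks the link partner to stop sending (what the bce counters call
XOFF) and a zero pause time lets it resume (XON). The reserved destination
address 01:80:C2:00:00:01 is one that bridges do not forward, which is why the
frames stay on the local link.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum pause_kind { NOT_PAUSE, PAUSE_XOFF, PAUSE_XON };

/* Reserved multicast destination used for PAUSE frames. */
static const uint8_t pause_dst[6] = {0x01, 0x80, 0xc2, 0x00, 0x00, 0x01};

/* Nonzero if the frame (at least 6 bytes) is addressed to the reserved group. */
static int
pause_dst_is_reserved(const uint8_t *frame)
{
    return memcmp(frame, pause_dst, sizeof(pause_dst)) == 0;
}

/*
 * Classify an untagged Ethernet frame that starts at the destination MAC.
 * The destination is not checked here because a PAUSE frame may also be
 * sent to the partner's unicast address.
 */
static enum pause_kind
classify_pause(const uint8_t *frame, size_t len)
{
    uint16_t ethertype, opcode, quanta;

    if (frame == NULL || len < 18)  /* 14-byte header + opcode + pause time */
        return NOT_PAUSE;

    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    opcode    = (uint16_t)((frame[14] << 8) | frame[15]);
    quanta    = (uint16_t)((frame[16] << 8) | frame[17]);

    if (ethertype != 0x8808 || opcode != 0x0001)
        return NOT_PAUSE;
    return quanta != 0 ? PAUSE_XOFF : PAUSE_XON;
}
```

Whether a given capture setup shows these frames at all depends on the NIC,
since many adapters consume MAC Control frames in hardware before the host
sees them.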

Dave



RE: bce packet loss

2011-07-11 Thread Charles Sprickman

On Mon, 11 Jul 2011, David Christensen wrote:


>> I'm running 8.1 and at least on the bce hosts, it looks like flow control
>> isn't supported, it was added on 4/30/2010:
>>
>> http://svnweb.freebsd.org/base/head/sys/dev/bce/if_bce.c?r1=206268&r2=207411
>>
>> In my 8.1 sources I still see this comment, which was removed in the above
>> commit:
>>  /* ToDo: Enable flow control support in brgphy and bge. */
>
> This really applies to whether the user can set flow control
> manually.  By default the NIC should auto-negotiate link speed
> and flow-control which is the most common case.  For example,
> you can't set RX flow control and disable TX flow control with
> ifconfig using the current implementation, though it is possible
> in Linux with ethtool.


OK, well that explains a lot.  I've had it hammered into my brain over the
years that for servers it's always best to set link speed and duplex
manually at both ends to remove any possible issues with link negotiation.
This advice was from back when FE was still new, and I recall
autonegotiation causing issues, I believe specifically with some vintage
Cisco switches.



>> So at least on the bce hosts (and bge it seems), I do not have flow
>> control available on the NIC.
>
> Flow control will be set according to auto-negotiation results.
> For most cases that means flow control will be enabled since
> both sides normally support it.


It sounds like I'm causing myself trouble here by not letting everything
autonegotiate.  I'll move things to auto and see what happens.


>> The sysctl stats do show that it's received "XON/XOFF" frames, which I
>> assume are flow control messages, but there's no indication that the NIC
>> does anything with them.
>
> There won't be any indication in the driver since flow control
> is managed in hardware.  You'd need a wire capture to see that
> bce(4) has stopped sending frames in response to receiving an
> XOFF flow control frame or started sending frames in response
> to receiving an XON flow control frame.


Ah.  I was hoping for something in the ifconfig output.  I'll see if
tcpdump and wireshark can tell me anything about this host.

On the one host (w/bce) I just set to full auto, the switch claims to
have negotiated 1000FD w/flow control (this specifically shows as
"auto+enabled" on the switch side).


I see that the "sysctl dev.bce.1" tree has some info, and I can see that 
the NIC is receiving flow control frames:


dev.bce.1.stat_XonPauseFramesReceived: 16638
dev.bce.1.stat_XoffPauseFramesReceived: 17239

These lines are a bit puzzling though:

dev.bce.1.stat_FlowControlDone: 0
dev.bce.1.stat_XoffStateEntered: 0
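
To check whether these counters move during a test transfer, one option is to
sample them before and after and subtract. The sketch below parses a single
"name: value" line in the format shown above. It is only an illustration: the
function name parse_sysctl_counter() is hypothetical, and on FreeBSD the value
could instead be read directly with sysctlbyname(3).

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split one "name: value" line of sysctl output into
 * its name and numeric value.  Returns 1 on success, 0 if the line has no
 * colon, the name does not fit, or the value is not a decimal number.
 */
static int
parse_sysctl_counter(const char *line, char *name, size_t name_sz,
    unsigned long long *value)
{
    const char *colon;
    char *end;
    size_t n;
    unsigned long long v;

    if (line == NULL || name == NULL || value == NULL)
        return 0;
    colon = strchr(line, ':');
    if (colon == NULL)
        return 0;
    n = (size_t)(colon - line);
    if (n == 0 || n >= name_sz)
        return 0;

    errno = 0;
    v = strtoull(colon + 1, &end, 10);  /* skips the blank after ':' */
    if (end == colon + 1 || errno != 0)
        return 0;

    memcpy(name, line, n);
    name[n] = '\0';
    *value = v;
    return 1;
}
```

Parsing the same counter from two samples and subtracting the values gives the
number of pause frames received during the interval between them.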


>>>> We are running 8.1, am I correct in that flow control is not implemented
>>>> there?  We do have an 8.2-STABLE image from a month or so ago that we are
>>>> testing with zfs v28, might that implement flow control?
>>>
>>> Flow control will depend on the NIC driver implementation.  Older
>>> versions of the bce(4) firmware will rarely generate pause frames
>>> (frames would be dropped by firmware but statistics should show
>>> the frame drop occurring) and should always honor pause frames
>>> from the link partner when flow control is enabled.
>>
>> I think my nics probably lack it.  I am also guessing that if any
>> high-traffic host ignores flow control frames, that's going to screw up
>> other hosts as well since the one causing the buffers to fill is not going
>> to throttle and the overflow will continue, correct?
>
> Flow control is asymmetric and operates independently in both
> directions.  If the traffic source ignores flow control frames
> or did not auto-negotiate flow control then it can certainly
> overwhelm the switch or traffic sink's buffers, causing frame
> drop and retransmits.


I ran a quick scp of a large file to another host with 100Mb connectivity
and those xon/xoff counters incremented, but they were doing that
previously.  I assume that confirms the switch is at least asking for a
pause. I still saw about 5000 dropped ingress packets on the switch, but I 
assume that could be due to some other host filling the buffers.






>>>> Although reading this:
>>>>
>>>> http://en.wikipedia.org/wiki/Ethernet_flow_control
>>>>
>>>> It sounds like flow control is not terribly optimal since it forces the
>>>> host to block all traffic.  Not sure if this means drops are eliminated,
>>>> reduced or shuffled around.
>
> Frame drops should be eliminated, though congestion could
> spread upstream to other devices which don't have flow control
> and result in frame drops and retransmits there.
>
>>> When congestion is detected the switch should buffer up to a certain
>>> limit (say 80% of full) and then start sending pause frames to avoid
>>> dropping frames.  This will affect all hosts connecting through the
>>> switch so congestion at one host can spread to other hosts (see
>>> http://www.ieee802.org/3/cm_study/public/september04/thaler_3_0904.pdf).
>>
>> Wow.  I did not catch that.  I do recall something about the flow control
>> frames being multicast - so every host gets them and pauses.  That's...
>> interesting, isn't it?
>
> Pause frames are multicast f

Re: bce packet loss

2011-07-11 Thread Doug Barton
On 07/11/2011 21:09, Charles Sprickman wrote:
> I've had it hammered into my brain over the years that for servers it's
> always best to set link speed and duplex manually at both ends to remove
> any possible issues with link negotiation.

That hasn't been the right thing to do for at least 8 years or so,
probably 10 or more.

Yes, back in the 90's when all of this stuff was still new it was not
uncommon to have autonegotiation issues, but any even sort of modern
hardware (on either side of the link) will do better with auto than not.


hth,

Doug

-- 

Nothin' ever doesn't change, but nothin' changes much.
-- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: bce packet loss

2011-07-11 Thread Charles Sprickman

On Mon, 11 Jul 2011, Doug Barton wrote:


> On 07/11/2011 21:09, Charles Sprickman wrote:
>> I've had it hammered into my brain over the years that for servers it's
>> always best to set link speed and duplex manually at both ends to remove
>> any possible issues with link negotiation.
>
> That hasn't been the right thing to do for at least 8 years or so,
> probably 10 or more.
>
> Yes, back in the 90's when all of this stuff was still new it was not
> uncommon to have autonegotiation issues, but any even sort of modern
> hardware (on either side of the link) will do better with auto than not.


Some of us still work at places where the hardware is 10 years old, you 
know. :)


I do still see fixed setups in service provider handoffs - for example 
this colo, Level3 and Hurricane.  Also all our metro ethernet stuff 
specifies a fixed configuration.


From what I can gather, this seems to be the standard practice in that 
space, but then again you're supposed to be plugging into equipment that 
wouldn't have the buffer issues that a $450 Dell switch would have.


The rule I recall is never do autoneg on one side and fixed on the other, 
that more often than not will end up in a duplex mismatch.


Charles








Re: bce packet loss

2011-07-11 Thread Doug Barton
On 07/11/2011 22:47, Charles Sprickman wrote:
> On Mon, 11 Jul 2011, Doug Barton wrote:
> 
>> On 07/11/2011 21:09, Charles Sprickman wrote:
>>> I've had it hammered into my brain over the years that for servers it's
>>> always best to set link speed and duplex manually at both ends to remove
>>> any possible issues with link negotiation.
>>
>> That hasn't been the right thing to do for at least 8 years or so,
>> probably 10 or more.
>>
>> Yes, back in the 90's when all of this stuff was still new it was not
>> uncommon to have autonegotiation issues, but any even sort of modern
>> hardware (on either side of the link) will do better with auto than not.
> 
> Some of us still work at places where the hardware is 10 years old, you
> know. :)

True ... hence my careful specification of "sort of modern." :)

> I do still see fixed setups in service provider handoffs - for example
> this colo, Level3 and Hurricane.  Also all our metro ethernet stuff
> specifies a fixed configuration.
> 
> From what I can gather, this seems to be the standard practice in that
> space, but then again you're supposed to be plugging into equipment that
> wouldn't have the buffer issues that a $450 Dell switch would have.

Well one could also say that this sort of thing tends to result from
the, "There is a knob, I MUST twist it!" syndrome.

> The rule I recall is never do autoneg on one side and fixed on the
> other, that more often than not will end up in a duplex mismatch.

Yes, that's definitely true, and I should have mentioned it. Whatever
you do on one side (auto/manual) you must also do on the other.


Doug

