date:20110201

Re: taps in rc.config

2011-02-01 Thread Randy Bush

cloned_interfaces="tap0 tap1 tap2 tap3 tap4 tap5 tap6 tap7 tap8 tap9"
ifconfig_tap0=147.28.224.41/30
ifconfig_tap1=147.28.224.45/30
ifconfig_tap2=147.28.224.49/30
ifconfig_tap3=147.28.224.53/30
ifconfig_tap4=147.28.224.57/30
ifconfig_tap5=147.28.224.61/30
ifconfig_tap6=147.28.224.65/30
ifconfig_tap7=147.28.224.69/30
ifconfig_tap8=147.28.224.73/30
ifconfig_tap9=147.28.224.77/30
autobridge_interfaces=bridge0
autobridge_bridge0="tap* igb0"

gets me no bridge.  do i need a cloned interface for it?

igb0: flags=8843 metric 0 mtu 1500
	options=13b
	ether 00:30:48:d6:6c:22
	inet 198.180.152.11 netmask 0xffffffc0 broadcast 198.180.152.63
	inet 198.180.152.30 netmask 0xffffffff broadcast 198.180.152.30
	inet 198.180.152.31 netmask 0xffffffff broadcast 198.180.152.31
	inet 198.180.152.32 netmask 0xffffffff broadcast 198.180.152.32
	inet 198.180.152.33 netmask 0xffffffff broadcast 198.180.152.33
	inet 198.180.152.34 netmask 0xffffffff broadcast 198.180.152.34
	inet 198.180.152.35 netmask 0xffffffff broadcast 198.180.152.35
	inet 198.180.152.36 netmask 0xffffffff broadcast 198.180.152.36
	inet 198.180.152.37 netmask 0xffffffff broadcast 198.180.152.37
	inet 198.180.152.38 netmask 0xffffffff broadcast 198.180.152.38
	inet 198.180.152.39 netmask 0xffffffff broadcast 198.180.152.39
	media: Ethernet autoselect (1000baseT )
	status: active
igb1: flags=8843 metric 0 mtu 1500
	options=13b
	ether 00:30:48:d6:6c:23
	inet 10.0.0.2 netmask 0xffffff00 broadcast 10.0.0.255
	media: Ethernet autoselect (1000baseT )
	status: active
lo0: flags=8049 metric 0 mtu 16384
	options=3
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
	inet6 ::1 prefixlen 128
	inet 127.0.0.1 netmask 0xff000000
	nd6 options=3
tap0: flags=8843 metric 0 mtu 1500
	ether 00:bd:b8:25:00:00
	inet 147.28.224.41 netmask 0xfffffffc broadcast 147.28.224.43
tap1: flags=8843 metric 0 mtu 1500
	ether 00:bd:bc:25:00:01
	inet 147.28.224.45 netmask 0xfffffffc broadcast 147.28.224.47
tap2: flags=8843 metric 0 mtu 1500
	ether 00:bd:c1:25:00:02
	inet 147.28.224.49 netmask 0xfffffffc broadcast 147.28.224.51
tap3: flags=8843 metric 0 mtu 1500
	ether 00:bd:c5:25:00:03
	inet 147.28.224.53 netmask 0xfffffffc broadcast 147.28.224.55
tap4: flags=8843 metric 0 mtu 1500
	ether 00:bd:c9:25:00:04
	inet 147.28.224.57 netmask 0xfffffffc broadcast 147.28.224.59
tap5: flags=8843 metric 0 mtu 1500
	ether 00:bd:cd:25:00:05
	inet 147.28.224.61 netmask 0xfffffffc broadcast 147.28.224.63
tap6: flags=8843 metric 0 mtu 1500
	ether 00:bd:d1:25:00:06
	inet 147.28.224.65 netmask 0xfffffffc broadcast 147.28.224.67
tap7: flags=8843 metric 0 mtu 1500
	ether 00:bd:d5:25:00:07
	inet 147.28.224.69 netmask 0xfffffffc broadcast 147.28.224.71
tap8: flags=8843 metric 0 mtu 1500
	ether 00:bd:d9:25:00:08
	inet 147.28.224.73 netmask 0xfffffffc broadcast 147.28.224.75
tap9: flags=8843 metric 0 mtu 1500
	ether 00:bd:dd:25:00:09
	inet 147.28.224.77 netmask 0xfffffffc broadcast 147.28.224.79
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: taps in rc.config

2011-02-01 Thread Julian Elischer


On 1/31/11 11:43 PM, Randy Bush wrote:

> > 1/ wow does that (dynamips ciscos) actually run on BSD?
>
> yep
>
> > 2/ "why?"
>
> so we can have a routing research topology testbed of real cisco and
> real juniper code.
>
> > first you need to create them right?
> > ifconfig tap0 create 192.168.3.1/28 up
> >
> > I think you do:
> > in rc.conf:
> > cloned_interfaces="tap0 tap1 tap2 tap3"
> > ifconfig_tap0=192.168.3.1/28
> > ifconfig_tap1=192.168.4.1/28
> > ifconfig_tap2=192.168.5.1/28
> > ifconfig_tap3=192.168.6.1/28
> >
> > but I may not be remembering right.
>
> thanks.  and what binds them to a particular ether so they respond to
> arp?

does the dynamips thingy know how to read the /dev side of a tap device?
if so then I think you can use the bridging facilities,
you should be able to use the autobridge config settings..
look in /etc/defaults/rc.conf for an example..
it shows tap/if examples..
man rc.conf
for more details..
remember don't change that file... put your changes in /etc/rc.conf

> randy




Re: taps in rc.config

2011-02-01 Thread batcilla itself

2011/2/1 Randy Bush 

> cloned_interfaces="bridge0 tap0 tap1 tap2 tap3 tap4 tap5 tap6 tap7 tap8 tap9"
> ifconfig_tap0=147.28.224.41/30
> ifconfig_tap1=147.28.224.45/30
> ifconfig_tap2=147.28.224.49/30
> ifconfig_tap3=147.28.224.53/30
> ifconfig_tap4=147.28.224.57/30
> ifconfig_tap5=147.28.224.61/30
> ifconfig_tap6=147.28.224.65/30
> ifconfig_tap7=147.28.224.69/30
> ifconfig_tap8=147.28.224.73/30
> ifconfig_tap9=147.28.224.77/30
> autobridge_interfaces=bridge0
> autobridge_bridge0="tap* igb0"
>
> gets me no bridge.  do i need a cloned interface for it?
>
Yes, it should be in cloned_interfaces list.


Re: taps in rc.config

2011-02-01 Thread batcilla itself

Hi Randy,

I believe this may help (found in /etc/defaults/rc.conf):

#autobridge_interfaces="bridge0"   # List of bridges to check
#autobridge_bridge0="tap* em0"     # Interface glob to automatically add
                                   # to the bridge

if used in addition to example below.

//batcilla


2011/2/1 Randy Bush 

> > 1/ wow does that (dynamips ciscos) actually run on BSD?
>
> yep
>
> > 2/ "why?"
>
> so we can have a routing research topology testbed of real cisco and
> real juniper code.
>
> > first you need to create them right?
> > ifconfig tap0 create 192.168.3.1/28 up
> >
> > I think you do:
> > in rc.conf:
> > cloned_interfaces="tap0 tap1 tap2 tap3"
> > ifconfig_tap0=192.168.3.1/28
> > ifconfig_tap1=192.168.4.1/28
> > ifconfig_tap2=192.168.5.1/28
> > ifconfig_tap3=192.168.6.1/28
> >
> > but I may not be remembering right.
>
> thanks.  and what binds them to a particular ether so they respond to
> arp?
>
> randy

Re: taps in rc.config

2011-02-01 Thread Randy Bush

>> gets me no bridge.  do i need a cloned interface for it?
> Yes, it should be in cloned_interfaces list.

works perfectly.  thank you!!

randy

if_alloc() page fault, VirtualBox + VIMAGE

2011-02-01 Thread Monthadar Al Jaberi

Hi,

I am running FreeBSD Current (201010) as guest in VirtualBox on an
Ubuntu 10.04. FreeBSD is compiled with VIMAGE option.

I loaded a module (my own) that calls if_alloc(IFT_IEEE80211), but I
get a panic:

Kernel page fault with the following non-sleepable locks held:
exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xc0fc8284) locked @
/usr/src/sys/net/if.c:414
KDB: stack backtrace:
db_trace_self_wrapper(c0cf3cdb,1,0,0,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(19e,1,,c0f9b194,c2fc9a1c,...) at kdb_backtrace+0x2a
_witness_debugger(c0cf6408,c2fc9a30,4,1,0,...) at _witness_debugger+0x25
witness_warn(5,0,c0d2c479,3,c4070d48,...) at witness_warn+0x1fe
trap(c2fc9abc) at trap+0x195
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0970999, esp = 0xc2fc9afc, ebp = 0xc2fc9b1c ---
ifindex_alloc_locked(c0d003cf,c2fc9b36,19e,19e,c15ab714,...) at
ifindex_alloc_locked+0x19
if_alloc(47,c4085a16,3,c0de9614,c32aa780,...) at if_alloc+0x85
wtap_attach(c31a7800,c40857c0,0,4,0,...) at wtap_attach+0x29
new_wtap(c32aa780,0,c2fc9bf0,c083ac9b,c3cbb200,...) at new_wtap+0x9b
wtap_ioctl(c3cbb200,80045701,c31edaa0,1,c3f90b40,...) at wtap_ioctl+0x36
devfs_ioctl_f(c3cfe3b8,80045701,c31edaa0,c3185d00,c3f90b40,...) at
devfs_ioctl_f+0x10b
kern_ioctl(c3f90b40,3,80045701,c31edaa0,fc9cec,...) at kern_ioctl+0x20d
ioctl(c3f90b40,c2fc9cec,c2fc9d28,c0cf5783,0,...) at ioctl+0x134
syscallenter(c3f90b40,c2fc9ce4,c2fc9ce4,0,0,...) at syscallenter+0x263
syscall(c2fc9d28) at syscall+0x34
Xint0x80_syscall() at Xint0x80_syscall+0x21
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x28181203, esp =
0xbfbfec3c, ebp = 0xbfbfec58 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc0970999
stack pointer   = 0x28:0xc2fc9afc
frame pointer   = 0x28:0xc2fc9b1c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1203 (ioctl)
panic: from debugger
cpuid = 0
Uptime: 21s
Physical memory: 495 MB
Dumping 55 MB: 40 24 8


Without the VIMAGE option if_alloc() returns fine. I thought that if I don't
create any VNETs the system should behave as normal... what is the
problem?

Best regards,

-- 
//Monthadar Al Jaberi

Re: panic: bufwrite: buffer is not busy???

2011-02-01 Thread John Baldwin

On Tuesday, February 01, 2011 12:53:36 am Eugene Grosbein wrote:
> On 31.01.2011 22:46, John Baldwin wrote:
> 
> >># gdb kernel
> >> GNU gdb 6.1.1 [FreeBSD]
> >> Copyright 2004 Free Software Foundation, Inc.
> >> GDB is free software, covered by the GNU General Public License, and you 
> >> are
> >> welcome to change it and/or distribute copies of it under certain 
> > conditions.
> >> Type "show copying" to see the conditions.
> >> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> >> This GDB was configured as "amd64-marcel-freebsd"...
> >> (gdb) l *0x803c1315
> >> 0x803c1315 is in ng_address_hook 
> > (/home/src/sys/netgraph/ng_base.c:3504).
> >> 3499 * Quick sanity check..
> >> 3500 * Since a hook holds a reference on it's node, once we 
> >> know
> >> 3501 * that the peer is still connected (even if invalid,) we 
> > know
> >> 3502 * that the peer node is present, though maybe invalid.
> >> 3503 */
> >> 3504if ((hook == NULL) ||
> >> 3505NG_HOOK_NOT_VALID(hook) ||
> >> 3506NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
> >> 3507NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
> >> 3508NG_FREE_ITEM(item);
> > 
> > Hmmm.  I think you might have a hardware problem.  Notice the fault 
> > address, 
> > it is 0x20030.  Can you do 'x/i '?
> 
> (gdb) x/i 0x803c1315
> 0x803c1315 :testb  $0x1,0x28(%rdx)

Hmm, offset is 0x28, so the original pointer would have been 0x20008,
which has two bits set.  That is a bit more of a stretch.

-- 
John Baldwin

Re: Bogus KASSERT() in tcp_output()?

2011-02-01 Thread John Baldwin

On Monday, January 31, 2011 9:40:09 pm Lawrence Stewart wrote:
> On 02/01/11 04:17, John Baldwin wrote:
> > Somewhat related fallout to the bug reported on security@ recently, I think 
> > this KASSERT() in tcp_output() is bogus:
> > 
> > 
> > KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL),
> > ("%s: mbuf chain shorter than expected", __func__));
> > 
> > Specifically, just a few lines earlier in tcp_output() we set the packet 
> > header length to just 'len + hdrlen':
> > 
> > /*
> >  * Put TCP length in extended header, and then
> >  * checksum extended header and data.
> >  */
> > m->m_pkthdr.len = hdrlen + len; /* in6_cksum() need this */
> > 
> > Also, the ipoptions are stored in a separate mbuf chain in the in pcb 
> > (inp_options) that is passed as a separate argument to ip_output().  Given 
> > that, I would think that m_length() should not reflect ipoptlen since it 
> > should not include IP options in that chain?
> > 
> 
> There is some relevant prior discussion on src-committers@ for r212803
> between Andre and Bjoern.

I still don't see where ipoptlen bytes are reserved in the mbuf chain.  After
this block where 'm' is allocated and initialized:

/*
 * Grab a header mbuf, attaching a copy of data to
 * be transmitted, and initialize the header from
 * the template for sends on this connection.
 */
if (len) {
...
m->m_len = hdrlen;
...
if (len <= MHLEN - hdrlen - max_linkhdr) {
...
m->m_len += len;
} else {
m->m_next = m_copy(mb, moff, (int)len);
...
}
...
} else {
...
m->m_len = hdrlen;
}

The length of the mbuf chain headed by 'm' is clearly hdrlen + len.

At no point anywhere do we do any sort of m_prepend() or other operation to
allocate space in the mbuf chain for the IP options.  They are merged in in
ip_output().  I think the only reason this KASSERT() isn't firing in HEAD is
that IP options are rarely used?

Is there an easy way to test a connection with IP options enabled with this
KASSERT() enabled?

-- 
John Baldwin

Re: if_alloc() page fault, VirtualBox + VIMAGE

2011-02-01 Thread John Baldwin

On Tuesday, February 01, 2011 9:38:16 am Monthadar Al Jaberi wrote:
> Hi,
> 
> I am running FreeBSD Current (201010) as guest in VirtualBox on an
> Ubuntu 10.04. FreeBSD is compiled with VIMAGE option.
> 
> I loaded a module (my own) that calls if_alloc(IFT_IEEE80211), but I
> get a panic:

Did you compile your module with VIMAGE enabled?

-- 
John Baldwin

Re: if_alloc() page fault, VirtualBox + VIMAGE

2011-02-01 Thread Bjoern A. Zeeb


On Tue, 1 Feb 2011, Monthadar Al Jaberi wrote:


Hi,

I am running FreeBSD Current (201010) as guest in VirtualBox on an
Ubuntu 10.04. FreeBSD is compiled with VIMAGE option.

I loaded a module (my own) that calls if_alloc(IFT_IEEE80211), but I
get a panic:


see discussions and patch on freebsd-virtualization@




Kernel page fault with the following non-sleepable locks held:
exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xc0fc8284) locked @
/usr/src/sys/net/if.c:414
KDB: stack backtrace:
db_trace_self_wrapper(c0cf3cdb,1,0,0,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(19e,1,,c0f9b194,c2fc9a1c,...) at kdb_backtrace+0x2a
_witness_debugger(c0cf6408,c2fc9a30,4,1,0,...) at _witness_debugger+0x25
witness_warn(5,0,c0d2c479,3,c4070d48,...) at witness_warn+0x1fe
trap(c2fc9abc) at trap+0x195
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0970999, esp = 0xc2fc9afc, ebp = 0xc2fc9b1c ---
ifindex_alloc_locked(c0d003cf,c2fc9b36,19e,19e,c15ab714,...) at
ifindex_alloc_locked+0x19
if_alloc(47,c4085a16,3,c0de9614,c32aa780,...) at if_alloc+0x85
wtap_attach(c31a7800,c40857c0,0,4,0,...) at wtap_attach+0x29
new_wtap(c32aa780,0,c2fc9bf0,c083ac9b,c3cbb200,...) at new_wtap+0x9b
wtap_ioctl(c3cbb200,80045701,c31edaa0,1,c3f90b40,...) at wtap_ioctl+0x36
devfs_ioctl_f(c3cfe3b8,80045701,c31edaa0,c3185d00,c3f90b40,...) at
devfs_ioctl_f+0x10b
kern_ioctl(c3f90b40,3,80045701,c31edaa0,fc9cec,...) at kern_ioctl+0x20d
ioctl(c3f90b40,c2fc9cec,c2fc9d28,c0cf5783,0,...) at ioctl+0x134
syscallenter(c3f90b40,c2fc9ce4,c2fc9ce4,0,0,...) at syscallenter+0x263
syscall(c2fc9d28) at syscall+0x34
Xint0x80_syscall() at Xint0x80_syscall+0x21
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x28181203, esp =
0xbfbfec3c, ebp = 0xbfbfec58 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc0970999
stack pointer   = 0x28:0xc2fc9afc
frame pointer   = 0x28:0xc2fc9b1c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1203 (ioctl)
panic: from debugger
cpuid = 0
Uptime: 21s
Physical memory: 495 MB
Dumping 55 MB: 40 24 8


Without VIMAGE option if_alloc returns fine, I thought if I dont
create any VNETs the system should behave like normal... what is the
problem?

Best regards,




--
Bjoern A. Zeeb You have to have visions!
 Going to jail sucks --  All my daemons like it!
  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html

Re: if_alloc() page fault, VirtualBox + VIMAGE

2011-02-01 Thread Monthadar Al Jaberi

On Tue, Feb 1, 2011 at 3:49 PM, John Baldwin  wrote:
> On Tuesday, February 01, 2011 9:38:16 am Monthadar Al Jaberi wrote:
>> Hi,
>>
>> I am running FreeBSD Current (201010) as guest in VirtualBox on an
>> Ubuntu 10.04. FreeBSD is compiled with VIMAGE option.
>>
>> I loaded a module (my own) that calls if_alloc(IFT_IEEE80211), but I
>> get a panic:
>
> Did you compile your module with VIMAGE enabled?

How do I check that? My makefile is:

# Note: It is important to make sure you include the <bsd.kmod.mk>
# makefile after declaring the KMOD and SRCS variables.
.PATH:  ${.CURDIR}/wtap_hal

# Declare Name of kernel module
KMOD=  wtap

# Enumerate Source files for kernel module
SRCS=  if_wtap_module.c if_wtap.c if_medium.c hal.c

# Include kernel module makefile
.include <bsd.kmod.mk>

>
> --
> John Baldwin
>

Thnx Bjoern, I will check the mailing list

br

-- 
//Monthadar Al Jaberi

Re: Bogus KASSERT() in tcp_output()?

2011-02-01 Thread Bjoern A. Zeeb


On Tue, 1 Feb 2011, John Baldwin wrote:


> On Monday, January 31, 2011 9:40:09 pm Lawrence Stewart wrote:
> > On 02/01/11 04:17, John Baldwin wrote:
> > > Somewhat related fallout to the bug reported on security@ recently, I think
> > > this KASSERT() in tcp_output() is bogus:
> > >
> > > KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL),
> > > ("%s: mbuf chain shorter than expected", __func__));
> > >
> > > Specifically, just a few lines earlier in tcp_output() we set the packet
> > > header length to just 'len + hdrlen':
> > >
> > > /*
> > >  * Put TCP length in extended header, and then
> > >  * checksum extended header and data.
> > >  */
> > > m->m_pkthdr.len = hdrlen + len; /* in6_cksum() need this */
> > >
> > > Also, the ipoptions are stored in a separate mbuf chain in the in pcb
> > > (inp_options) that is passed as a separate argument to ip_output().  Given
> > > that, I would think that m_length() should not reflect ipoptlen since it
> > > should not include IP options in that chain?
> >
> > There is some relevant prior discussion on src-committers@ for r212803
> > between Andre and Bjoern.
>
> I still don't see where ipoptlen bytes are reserved in the mbuf chain.  After
> this block where 'm' is allocated and initialized:
>
> /*
>  * Grab a header mbuf, attaching a copy of data to
>  * be transmitted, and initialize the header from
>  * the template for sends on this connection.
>  */
> if (len) {
> ...
> m->m_len = hdrlen;
> ...
> if (len <= MHLEN - hdrlen - max_linkhdr) {
> ...
> m->m_len += len;
> } else {
> m->m_next = m_copy(mb, moff, (int)len);
> ...
> }
> ...
> } else {
> ...
> m->m_len = hdrlen;
> }
>
> The length of the mbuf chain headed by 'm' is clearly hdrlen + len.

It is.

> At no point anywhere do we do any sort of m_prepend() or other operation to
> allocate space in the mbuf chain for the IP options.  They are merged in in
> ip_output().  I think the only reason this KASSERT() isn't firing in HEAD is
> that IP options are rarely used?

Right, and that is probably the reason why I also hit it with IPsec, as a
result of

 718 #ifdef IPSEC
 719 ipoptlen += ipsec_optlen;
 720 #endif

which wasn't because of ipsec_optlen really; I had just stopped
looking too soon back last year.

> Is there an easy way to test a connection with IP options enabled with this
> KASSERT() enabled?

Yes, see the patch at [1]. Using my modified KASSERT I still get the panic
below, which btw sounds wrong to me as well, as I wouldn't expect ipoptlen
to be 4 here given the test case.

# ./tcpconnect client 127.0.0.1 12345 1 ipopt

panic: tcp_output: mbuf chain shorter than expected: 0 + 60 + 4 - 0 != 60
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
tcp_output() at tcp_output+0x1d01
tcp_usr_connect() at tcp_usr_connect+0x15f
soconnect() at soconnect+0x14f
kern_connect() at kern_connect+0x12e
connect() at connect+0x41
syscallenter() at syscallenter+0x1cb
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xe2
--- syscall (98, FreeBSD ELF64, connect), rip = 0x80072934c, rsp = 
0x7fffe9d8, rbp = 0x3 ---

/bz

References:

[1] http://people.freebsd.org/~bz/20110201-01-tcpconnect-ipopt.diff

--
Bjoern A. Zeeb You have to have visions!
 Going to jail sucks --  All my daemons like it!
  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html

Re: if_alloc() page fault, VirtualBox + VIMAGE

2011-02-01 Thread Bjoern A. Zeeb


On Tue, 1 Feb 2011, Monthadar Al Jaberi wrote:


On Tue, Feb 1, 2011 at 3:49 PM, John Baldwin  wrote:

On Tuesday, February 01, 2011 9:38:16 am Monthadar Al Jaberi wrote:

Hi,

I am running FreeBSD Current (201010) as guest in VirtualBox on an
Ubuntu 10.04. FreeBSD is compiled with VIMAGE option.

I loaded a module (my own) that calls if_alloc(IFT_IEEE80211), but I
get a panic:


Did you compile your module with VIMAGE enabled?


How do I check that? My makefile is:

# Note: It is important to make sure you include the <bsd.kmod.mk>
# makefile after declaring the KMOD and SRCS variables.
.PATH:  ${.CURDIR}/wtap_hal

# Declare Name of kernel module
KMOD=  wtap

# Enumerate Source files for kernel module
SRCS=  if_wtap_module.c if_wtap.c if_medium.c hal.c

# Include kernel module makefile
.include <bsd.kmod.mk>



--
John Baldwin



Thnx Bjoern, I will check the mailing list


Never mind; I was clearly not awake enough.  The patch etc. is for
virtualbox on a FreeBSD with VIMAGE.  But people will be able to help
you with your module there as well if it's related to a VIMAGE kernel.

/bz

--
Bjoern A. Zeeb You have to have visions!
 Going to jail sucks --  All my daemons like it!
  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html

Re: panic: bufwrite: buffer is not busy???

2011-02-01 Thread Eugene Grosbein

On 31.01.2011 14:20, Julian Elischer wrote:

> replace with:
>
> 3504    if ((hook == NULL) ||
> 3505        NG_HOOK_NOT_VALID(hook) ||
>             ((peer = NG_HOOK_PEER(hook)) == NULL) ||
> 3506        NG_HOOK_NOT_VALID(peer) ||
>             ((peernode = NG_PEER_NODE(hook)) == NULL) ||
> 3507        NG_NODE_NOT_VALID(peernode)) {
>                 if (peer)
>                         KASSERT((peernode != NULL), ("peer node NULL while peer hook exists"));
> 3508            NG_FREE_ITEM(item);

Today I updated the panicking router to RELENG_8 and combined the changes
proposed by Julian and Gleb. After 8 hours it panicked again and once more
could not finish writing the crash dump:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x63
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x803d4ccd
stack pointer   = 0x28:0xff80ebffc600
frame pointer   = 0x28:0xff80ebffc680
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 2390 (mpd5)
trap number = 12
panic: page fault
cpuid = 3
Uptime: 8h3m51s
Dumping 4087 MB (3 chunks)
  chunk 0: 1MB (150 pages) ... ok
  chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not busy???
cpuid = 3
Uptime: 8h3m52s
Automatic reboot in 15 seconds - press a key on the console to abort

# gdb kernel
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) l *0x803d4ccd
0x803d4ccd is in ng_pppoe_disconnect (netgraph.h:191).
186 int line);
187
188 static __inline void
189 _chkhook(hook_p hook, char *file, int line)
190 {
191 if (hook->hk_magic != HK_MAGIC) {
192 printf("Accessing freed hook ");
193 dumphook(hook, file, line);
194 }
195 hook->lastline = line;
(gdb) x/i 0x803d4ccd
0x803d4ccd :   cmpl   $0x78573011,0x64(%rbx)


Here is a patch I've applied to the tree:

--- sys/netgraph/ng_base.c.orig 2011-02-01 12:34:09.0 +0600
+++ sys/netgraph/ng_base.c  2011-02-01 12:00:17.0 +0600
@@ -1643,10 +1643,8 @@
node_p *destp, hook_p *lasthook)
 {
charfullpath[NG_PATHSIZ];
-   char   *nodename, *path, pbuf[2];
+   char   *nodename, *path;
node_p  node, oldnode;
-   char   *cp;
-   hook_p hook = NULL;
 
/* Initialize */
if (destp == NULL) {
@@ -1664,11 +1662,6 @@
TRAP_ERROR();
return EINVAL;
}
-   if (path == NULL) {
-   pbuf[0] = '.';  /* Needs to be writable */
-   pbuf[1] = '\0';
-   path = pbuf;
-   }
 
/*
 * For an absolute address, jump to the starting node.
@@ -1690,41 +1683,41 @@
NG_NODE_REF(node);
}
 
+   if (path == NULL) {
+   if (lasthook != NULL)
+   *lasthook = NULL;
+   *destp = node;
+   return (0);
+   }
+
/*
 * Now follow the sequence of hooks
-* XXX
-* We actually cannot guarantee that the sequence
-* is not being demolished as we crawl along it
-* without extra-ordinary locking etc.
-* So this is a bit dodgy to say the least.
-* We can probably hold up some things by holding
-* the nodelist mutex for the time of this
-* crawl if we wanted.. At least that way we wouldn't have to
-* worry about the nodes disappearing, but the hooks would still
-* be a problem.
+*
+* XXXGL: The path may demolish as we go the sequence, but if
+* we hold the topology mutex at critical places, then, I hope,
+* we would always have valid pointers in hand, although the
+* path behind us may no longer exist.
 */
-   for (cp = path; node != NULL && *cp != '\0'; ) {
+   for (;;) {
+   hook_p hook;
char *segment;
 
/*
 * Break out the next path segment. Replace the dot we just
-* found with a NUL; "cp" points to the next segment (or the
+* found with a NUL; "path" points to the next segment (or the
 * NUL at the end).
 */
-   for (segment = cp; *cp != '\0';

Re: panic: bufwrite: buffer is not busy???

2011-02-01 Thread Eugene Grosbein

On 02.02.2011 00:30, Eugene Grosbein wrote:

> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 06
> fault virtual address   = 0x63
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x803d4ccd
> stack pointer   = 0x28:0xff80ebffc600
> frame pointer   = 0x28:0xff80ebffc680
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 2390 (mpd5)
> trap number = 12
> panic: page fault
> cpuid = 3
> Uptime: 8h3m51s
> Dumping 4087 MB (3 chunks)
>   chunk 0: 1MB (150 pages) ... ok
>   chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not 
> busy???
> cpuid = 3
> Uptime: 8h3m52s
> Automatic reboot in 15 seconds - press a key on the console to abort
> 
> # gdb kernel
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> (gdb) l *0x803d4ccd
> 0x803d4ccd is in ng_pppoe_disconnect (netgraph.h:191).
> 186 int line);
> 187
> 188 static __inline void
> 189 _chkhook(hook_p hook, char *file, int line)
> 190 {
> 191 if (hook->hk_magic != HK_MAGIC) {
> 192 printf("Accessing freed hook ");
> 193 dumphook(hook, file, line);
> 194 }
> 195 hook->lastline = line;
> (gdb) x/i 0x803d4ccd
> 0x803d4ccd :   cmpl   $0x78573011,0x64(%rbx)

Forgot to mention, this time kernel has options NETGRAPH_DEBUG.

Eugene Grosbein

Re: panic: bufwrite: buffer is not busy???

2011-02-01 Thread Gleb Smirnoff

On Wed, Feb 02, 2011 at 12:30:20AM +0600, Eugene Grosbein wrote:
E> On 31.01.2011 14:20, Julian Elischer wrote:
E> 
E> > replace with:
E> >
E> > 3504    if ((hook == NULL) ||
E> > 3505        NG_HOOK_NOT_VALID(hook) ||
E> >             ((peer = NG_HOOK_PEER(hook)) == NULL) ||
E> > 3506        NG_HOOK_NOT_VALID(peer) ||
E> >             ((peernode = NG_PEER_NODE(hook)) == NULL) ||
E> > 3507        NG_NODE_NOT_VALID(peernode)) {
E> >                 if (peer)
E> >                         KASSERT((peernode != NULL), ("peer node NULL while peer hook exists"));
E> > 3508            NG_FREE_ITEM(item);
E>
E> Today I updated the panicking router to RELENG_8 and combined the changes
E> proposed by Julian and Gleb. After 8 hours it panicked again and once more
E> could not finish writing the crash dump:
E> 
E> Fatal trap 12: page fault while in kernel mode
E> cpuid = 3; apic id = 06
E> fault virtual address   = 0x63
E> fault code  = supervisor read data, page not present
E> instruction pointer = 0x20:0x803d4ccd
E> stack pointer   = 0x28:0xff80ebffc600
E> frame pointer   = 0x28:0xff80ebffc680
E> code segment= base 0x0, limit 0xf, type 0x1b
E> = DPL 0, pres 1, long 1, def32 0, gran 1
E> processor eflags= interrupt enabled, resume, IOPL = 0
E> current process = 2390 (mpd5)
E> trap number = 12
E> panic: page fault
E> cpuid = 3
E> Uptime: 8h3m51s
E> Dumping 4087 MB (3 chunks)
E>   chunk 0: 1MB (150 pages) ... ok
E>   chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not busy???
E> cpuid = 3
E> Uptime: 8h3m52s
E> Automatic reboot in 15 seconds - press a key on the console to abort
E> 
E> # gdb kernel
E> GNU gdb 6.1.1 [FreeBSD]
E> Copyright 2004 Free Software Foundation, Inc.
E> GDB is free software, covered by the GNU General Public License, and you are
E> welcome to change it and/or distribute copies of it under certain conditions.
E> Type "show copying" to see the conditions.
E> There is absolutely no warranty for GDB.  Type "show warranty" for details.
E> This GDB was configured as "amd64-marcel-freebsd"...
E> (gdb) l *0x803d4ccd
E> 0x803d4ccd is in ng_pppoe_disconnect (netgraph.h:191).
E> 186 int line);
E> 187
E> 188 static __inline void
E> 189 _chkhook(hook_p hook, char *file, int line)
E> 190 {
E> 191 if (hook->hk_magic != HK_MAGIC) {
E> 192 printf("Accessing freed hook ");
E> 193 dumphook(hook, file, line);
E> 194 }
E> 195 hook->lastline = line;
E> (gdb) x/i 0x803d4ccd
E> 0x803d4ccd :   cmpl   $0x78573011,0x64(%rbx)

This looks like ng_pppoe_disconnect() was called with a NULL argument.

Can you add the KDB_TRACE option to the kernel? For some reason your boxes
can't dump core, but with this option we will at least have a trace.

-- 
Totus tuus, Glebius.

Re: panic: bufwrite: buffer is not busy???

2011-02-01 Thread John Baldwin

On Tuesday, February 01, 2011 1:30:20 pm Eugene Grosbein wrote:
> On 31.01.2011 14:20, Julian Elischer wrote:
> 
> > replace with:
> > 
> > 3504 if ((hook == NULL) ||
> > 3505     NG_HOOK_NOT_VALID(hook) ||
> >          ((peer = NG_HOOK_PEER(hook)) == NULL) ||
> > 3506     NG_HOOK_NOT_VALID(peer) ||
> >          ((peernode = NG_PEER_NODE(hook)) == NULL) ||
> > 3507     NG_NODE_NOT_VALID(peernode)) {
> >              if (peer)
> >                  kassert((peernode != NULL), ("peer node NULL while peer hook exists"));
> > 3508         NG_FREE_ITEM(item);
> 
> Today I updated the panicking router to RELENG_8 and combined the changes
> proposed by Julian and Gleb. After 8 hours it panicked again and once more
> could not finish writing the crash dump:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 06
> fault virtual address   = 0x63
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x803d4ccd
> stack pointer   = 0x28:0xff80ebffc600
> frame pointer   = 0x28:0xff80ebffc680
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 2390 (mpd5)
> trap number = 12
> panic: page fault
> cpuid = 3
> Uptime: 8h3m51s
> Dumping 4087 MB (3 chunks)
>   chunk 0: 1MB (150 pages) ... ok
>   chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not busy???
> cpuid = 3
> Uptime: 8h3m52s
> Automatic reboot in 15 seconds - press a key on the console to abort
> 
> # gdb kernel
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> (gdb) l *0x803d4ccd
> 0x803d4ccd is in ng_pppoe_disconnect (netgraph.h:191).
> 186 int line);
> 187
> 188 static __inline void
> 189 _chkhook(hook_p hook, char *file, int line)
> 190 {
> 191 if (hook->hk_magic != HK_MAGIC) {
> 192 printf("Accessing freed hook ");
> 193 dumphook(hook, file, line);
> 194 }
> 195 hook->lastline = line;
> (gdb) x/i 0x803d4ccd
> 0x803d4ccd :   cmpl   $0x78573011,0x64(%rbx)

So %rbx (hook) was -1 here.  Perhaps the locking is insufficient for whatever
structure contains the hook pointer?
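
The arithmetic behind that reading can be written out as a small user-space
sketch. The faulting instruction dereferences 0x64(%rbx) and the trap reports
fault virtual address 0x63, so the base register must have held 0x63 - 0x64,
which wraps to all ones. The helper names base_from_fault() and
fault_for_base() are made up for this illustration and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * For `cmpl $imm, disp(%rbx)` the CPU computes rbx + disp, so the base
 * register can be recovered as fault_addr - disp with 64-bit wraparound.
 * Hypothetical helper, for illustration only.
 */
uint64_t
base_from_fault(uint64_t fault_addr, int32_t disp)
{
	return (fault_addr - (uint64_t)(int64_t)disp);
}

/*
 * The address a given base pointer would have faulted at. A NULL hook
 * would fault at exactly the member offset, not one byte below it.
 */
uint64_t
fault_for_base(uint64_t base, int32_t disp)
{
	assert(disp >= 0);	/* struct member offsets are non-negative */
	return (base + (uint64_t)(int64_t)disp);
}
```

With the values from the trap, base_from_fault(0x63, 0x64) is
0xffffffffffffffff, i.e. -1, whereas a NULL hook would have produced a fault
address of 0x64.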

-- 
John Baldwin

Re: panic: bufwrite: buffer is not busy???

2011-02-01 Thread Eugene Grosbein

On 02.02.2011 00:50, Gleb Smirnoff wrote:

> This looks like ng_pppoe_disconnect() was called with a NULL argument.
> 
> Can you add the KDB_TRACE option to the kernel? For some reason your boxes
> can't dump core, but with this option we will at least have a trace.

Of course. I was pretty sure I had this option, but it's commented out :-(

Eugene Grosbein

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Sean Bruno

On Fri, 2011-01-28 at 08:10 -0800, Mike Tancsa wrote:
> On 1/23/2011 10:21 AM, Mike Tancsa wrote:
> > On 1/21/2011 4:21 AM, Jan Koum wrote:
> > One other thing I noticed is that when the nic is in its hung state, the
> > WOL option is gone ?
> > 
> > e.g
> > 
> > em1: flags=8843 metric 0 mtu 1500
> > options=19b
> > ether 00:15:17:ed:68:a4
> > 
> > vs
> > 
> > 
> > em1: flags=8843 metric 0 mtu 1500
> > 
> > options=219b
> > ether 00:15:17:ed:68:a4
> 
> 
> Another hang last night :(
> 
> What's really strange is that the WOL_MAGIC and TSO4 got turned back on
> somehow? I had explicitly turned it off, but when the NIC was in its
> bad state
> 
> em1: flags=8843 metric 0 mtu 1500
> options=2198
> 
> ... it's back on along with TSO?  Not sure if it's a coincidence or a side
> effect or what.  For now, I have had to re-purpose this NIC for something
> else.
> 
> debug info shows
> 
> Jan 28 00:25:10 backup3 kernel: Interface is RUNNING and INACTIVE
> Jan 28 00:25:10 backup3 kernel: em1: hw tdh = 625, hw tdt = 625
> Jan 28 00:25:10 backup3 kernel: em1: hw rdh = 903, hw rdt = 903
> Jan 28 00:25:10 backup3 kernel: em1: Tx Queue Status = 0
> Jan 28 00:25:10 backup3 kernel: em1: TX descriptors avail = 1024
> Jan 28 00:25:10 backup3 kernel: em1: Tx Descriptors avail failure = 0
> Jan 28 00:25:10 backup3 kernel: em1: RX discarded packets = 0
> Jan 28 00:25:10 backup3 kernel: em1: RX Next to Check = 903
> Jan 28 00:25:10 backup3 kernel: em1: RX Next to Refresh = 904
> Jan 28 00:25:27 backup3 kernel: em1: link state changed to DOWN
> Jan 28 00:25:30 backup3 kernel: em1: link state changed to UP
> 
> 
>   ---Mike
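
For anyone comparing the three options= values quoted above, here is a minimal
user-space sketch that decodes them. The bit values are my recollection of the
IFCAP_* constants in FreeBSD's net/if.h from that era and should be checked
against your own source tree. decode_options() and the caps table are
hypothetical names for this illustration and are not part of ifconfig(8).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct cap {
	unsigned int bit;
	const char *name;
};

/* Assumed capability bits (believed to match IFCAP_* in net/if.h; verify). */
static const struct cap caps[] = {
	{ 0x0001, "RXCSUM" },
	{ 0x0002, "TXCSUM" },
	{ 0x0008, "VLAN_MTU" },
	{ 0x0010, "VLAN_HWTAGGING" },
	{ 0x0080, "VLAN_HWCSUM" },
	{ 0x0100, "TSO4" },
	{ 0x2000, "WOL_MAGIC" },
};

/* Write a comma-separated list of the set capability names into buf. */
char *
decode_options(unsigned int options, char *buf, size_t len)
{
	size_t i, used;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(caps) / sizeof(caps[0]); i++) {
		if ((options & caps[i].bit) == 0)
			continue;
		used = strlen(buf);	/* always < len, so len - used >= 1 */
		snprintf(buf + used, len - used, "%s%s",
		    used ? "," : "", caps[i].name);
	}
	return (buf);
}
```

Under those assumed bit values, 0x19b and 0x219b differ only in WOL_MAGIC,
while 0x2198 has WOL_MAGIC and TSO4 set with RXCSUM and TXCSUM cleared.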


I'm trying to get some more testing done regarding my suggestions around
the OACTIVE assertions in the driver.  More or less, it looks like
intense periods of activity can push the driver into the OACTIVE hold-off
state, and the logic in igb(4) and em(4) isn't quite right for handling
it.

I suspect that something like this modification to igb(4) may be
required for em(4).

Comments?

Sean
--- p4/freebsd_7/src/sys/dev/e1000/if_igb.c	2010-12-23 11:06:17.127417000 -0800
+++ p4/ybsd_7/src/sys/dev/e1000/if_igb.c	2010-12-23 11:28:50.476993000 -0800
@@ -784,10 +784,14 @@
 		return;
 
 	/* Call cleanup if number of TX descriptors low */
+#if 0
 	if (txr->tx_avail <= IGB_TX_CLEANUP_THRESHOLD)
 		igb_txeof(txr);
+#endif
 
 	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
+		if (txr->tx_avail <= IGB_TX_CLEANUP_THRESHOLD)
+			igb_txeof(txr);
 		if (txr->tx_avail <= IGB_TX_OP_THRESHOLD) {
 			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
 			break;
@@ -1162,10 +1166,10 @@
 		IGB_TX_LOCK(txr);
 		if (igb_txeof(txr))
 			more = TRUE;
-		if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
-			igb_start_locked(txr, ifp);
+		/*if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) Pointless as igb_start_locked() checks this right off the bat*/
+		igb_start_locked(txr, ifp);
 		IGB_TX_UNLOCK(txr);
-		if (more) {
+		if (more || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {
 			taskqueue_enqueue(que->tq, &que->que_task);
 			return;
 		}
@@ -1361,7 +1370,7 @@
 
 no_calc:
 	/* Schedule a clean task if needed*/
-	if (more_tx || more_rx) 
+	if (more_tx || more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE))
 		taskqueue_enqueue(que->tq, &que->que_task);
 	else
 		/* Reenable this interrupt */
@@ -1535,6 +1545,14 @@
 	if (m_head->m_flags & M_VLANTAG)
 		cmd_type_len |= E1000_ADVTXD_DCMD_VLE;
 
+/*
+ * We just did this before invocation, so it seems completely
+ * redundant: igb_handle_queue -> igb_txeof.
+ * Pretty sure this is impossible, as we check for the
+ * IGB_TX_CLEANUP_THRESHOLD in igb_start_locked(), which happens
+ * before this function is invoked.
+ */
+#if 0
 /*
  * Force a cleanup if number of TX descriptors
  * available hits the threshold
@@ -1547,6 +1565,7 @@
 			return (ENOBUFS);
 		}
 	}
+#endif
 
 	/*
  * Map the packet for DMA.
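
One way to see why the extra OACTIVE test in the handler matters is a toy
user-space model of the TX path. This is only a sketch under stated
assumptions, not the igb(4) code: the ring size and threshold are invented,
the toy_* names are hypothetical, all outstanding descriptors are assumed to
be reclaimable on every pass, and no further TX interrupt is assumed to arrive
once the handler decides not to re-enqueue itself.

```c
#include <assert.h>

#define RING_SIZE	8	/* toy value; real rings are far larger */
#define OP_THRESHOLD	2	/* stop sending at or below this many free slots */

struct txring {
	int avail;	/* free descriptors */
	int pending;	/* handed to hardware, not yet reclaimed */
	int queued;	/* packets waiting in the software send queue */
	int oactive;	/* models IFF_DRV_OACTIVE */
};

/* Reclaim completed descriptors; returns nonzero if cleanup work remains. */
int
toy_txeof(struct txring *r)
{
	r->avail += r->pending;	/* assumption: hardware finished everything */
	r->pending = 0;
	if (r->avail > OP_THRESHOLD)
		r->oactive = 0;
	return (0);
}

/* Send until the queue is empty or the ring is nearly full. */
void
toy_start_locked(struct txring *r)
{
	while (r->queued > 0) {
		if (r->avail <= OP_THRESHOLD) {
			r->oactive = 1;
			break;
		}
		r->avail--;
		r->pending++;
		r->queued--;
	}
}

/* One handler pass; returns nonzero if the task would be re-enqueued. */
int
toy_handle_queue(struct txring *r, int resched_on_oactive)
{
	int more = toy_txeof(r);

	toy_start_locked(r);
	return (more || (resched_on_oactive && r->oactive));
}

/* Run until nothing re-enqueues; returns packets left stuck in the queue. */
int
toy_run(int packets, int resched_on_oactive)
{
	struct txring r = { RING_SIZE, 0, packets, 0 };
	int passes = 0;

	assert(packets >= 0);
	while (toy_handle_queue(&r, resched_on_oactive) && ++passes < 1000)
		;
	return (r.queued);
}
```

In this model toy_run(20, 0) leaves 14 packets queued with OACTIVE set after a
single pass, while toy_run(20, 1) drains the queue. Whether the real hang
follows this pattern is what the testing in this thread is meant to find out.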
 
 

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Jack Vogel

At this point I'm open to any ideas; this sounds like a good one, Sean,
thanks.
Mike, do you want to test this?

Jack



Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 3:05 PM, Jack Vogel wrote:
> At this point I'm open to any ideas, this sounds like a good one Sean,
> thanks.
> Mike, you want to test this ?

Sure, I am feeling lucky ;-)  If someone generates the appropriate em
diffs for me, I will apply them on the box that sees this issue the most.

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Sean Bruno

On Tue, 2011-02-01 at 12:05 -0800, Jack Vogel wrote:
> At this point I'm open to any ideas, this sounds like a good one Sean,
> thanks.
> Mike, you want to test this ?
> 
> Jack


Does the logic I've implemented look sane?

Sean


Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Jack Vogel

Looks good, except I don't like code #if 0'd out. I'll make an if_em.c to
try and send it shortly.

Jack



Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Jack Vogel

Mike, just to remind me, are you running these 82574 adapters with MSIX?

Jack



Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 3:55 PM, Jack Vogel wrote:
> Mike, just to remind me, are you running these 82574 adapters with MSIX?

Yes. Board is an Intel MB (S3420GPX), 8G RAM, AMD64. Kernel is from a few
days ago.


0(backup3)# vmstat -i | grep em1
irq257: em1:rx 0   113712958159
irq258: em1:tx 096623551135
irq259: em1:link 488  0
0(backup3)# grep ^em1 /var/run/dmesg.boot
em1:  port 0x2000-0x201f mem 0xb410-0xb411,0xb412-0xb4123fff irq 16 at device 0.0 on pci10
em1: Using MSIX interrupts with 3 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:15:17:ed:68:a4
em1: link state changed to UP
0(backup3)#
em1@pci0:10:0:0:class=0x02 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
class  = network
subclass   = ethernet
cap 01[c8] = powerspec 2  supports D0 D3  current D0
cap 05[d0] = MSI supports 1 message, 64 bit
cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 0003[140] = Serial 1 001517ed68a4

---Mike

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Jack Vogel

But you aren't defining EM_MULTIQUEUE, are you? (It's not on by default.)

Jack



A flood of bacula traffic causes igb interface to go offline.

2011-02-01 Thread Mike Carlson


Hey net@,

I have a FreeBSD 8.2-RC2 system running on a HP DL180 G6, using the 
onboard Intel controller, and it is our primary Bacula storage node and 
director node.


We have 96 clients that are scheduled to run at 8:30pm. After about 9 - 
10 minutes of activity (mrtg graphs show about 50-60MB/sec incoming 
traffic), the igb1 interface is no longer able to communicate with the 
Cisco switch.


The interesting part is, the interface is still "up", there is nothing
in the kernel message buffer, and nothing relevant in the log file (just
syslogd and ldap errors because they cannot reach their respective
network servers). The system does not respond on the network until I either
reboot, or run 'ifconfig igb1 down ; ifconfig igb1 up'. There is no
firewall loaded/configured.


Thankfully, I have a KVM over IP, so when this happens I can at least 
run script(1) and capture some useful information.

ifconfig igb1
igb1: flags=8843 metric 0 mtu 1500

options=1bb

ether 1c:c1:de:e9:fb:af
inet 128.15.136.105 netmask 0xff00 broadcast 128.15.136.255
inet 128.15.136.108 netmask 0xff00 broadcast 128.15.136.255
inet 128.15.136.102 netmask 0xff00 broadcast 128.15.136.255
media: Ethernet autoselect (1000baseT )
status: active

I can ping the internal IP (but I realize that is probably a useless
test...)
root@write /etc]> ping 128.15.136.105
PING 128.15.136.105 (128.15.136.105): 56 data bytes
64 bytes from 128.15.136.105: icmp_seq=0 ttl=64 time=0.024 ms
64 bytes from 128.15.136.105: icmp_seq=1 ttl=64 time=0.015 ms
^C
--- 128.15.136.105 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.015/0.019/0.024/0.005 ms

Attempting to ping the router:
root@write /etc]> ping 128.15.136.254
PING 128.15.136.254 (128.15.136.254): 56 data bytes
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
ping: sendto: Host is down
^C
--- 128.15.136.254 ping statistics ---
9 packets transmitted, 0 packets received, 100.0% packet loss


The only thing that seems to solve this problem is to either reboot, or
do an "ifconfig down/up":

root@write /etc]> ifconfig igb1 down
root@write /etc]> ifconfig igb1
root@write /etc]> ping 128.15.136.254
PING 128.15.136.254 (128.15.136.254): 56 data bytes
64 bytes from 128.15.136.254: icmp_seq=1 ttl=255 time=1.015 ms
64 bytes from 128.15.136.254: icmp_seq=2 ttl=255 time=0.217 ms
64 bytes from 128.15.136.254: icmp_seq=3 ttl=255 time=0.278 ms
64 bytes from 128.15.136.254: icmp_seq=4 ttl=255 time=0.238 ms
^C
--- 128.15.136.254 ping statistics ---
5 packets transmitted, 4 packets received, 20.0% packet loss
round-trip min/avg/max/stddev = 0.217/0.437/1.015/0.334 ms

I was able to run tcpdump during all of this, and it showed *nothing* between
the system and the switch until I ran ifconfig igb1 down/up, and then
you see the CDP and Spanning Tree traffic.


The networking team here has told me there are no errors on the switch, 
or the port I am on, and they even moved me from one port to another, 
but this is still happening on a fairly regular basis now that I've 
added more backup clients.


Is this a possible bug with my hardware and the intel driver? I have a 
pcap file and more system information that might provide a lot more 
information, but I don't want to send that out to a mailing list.


Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 4:17 PM, Jack Vogel wrote:
> But you aren't defining EM_MULTIQUEUE, are you? (It's not on by default.)

Nope. Everything is the default wrt the em driver. Nothing odd in
loader.conf

0(backup3)% grep -v ^# /boot/loader.conf
ahci_load="YES"
siis_load="YES"
if_em_load="YES"
coretemp_load="YES"
comconsole_speed="115200"         # Set the current serial console speed
console="comconsole,vidconsole"   # A comma separated list of console(s)
aesni_load="YES"
cryptodev_load="YES"


---Mike

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 4:43 PM, Jack Vogel wrote:
> To those who are going to test, here is the if_em.c, based on head, with my
> changes, I have to leave for the afternoon, and have not had a chance to
> build
> this, but it should work. I will check back in the later evening.
> 
> Any blatant problems Sean, feel free to fix them :)

My boxes are RELENG_7 and RELENG_8. Apart from hand-editing
those pesky sysctl changes out, is there a better way to generate
RELENG_8 and RELENG_7 diffs?

---Mike

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Sean Bruno

On Tue, 2011-02-01 at 13:43 -0800, Jack Vogel wrote:
> To those who are going to test, here is the if_em.c, based on head,
> with my
> changes, I have to leave for the afternoon, and have not had a chance
> to build
> this, but it should work. I will check back in the later evening.
> 
> Any blatant problems Sean, feel free to fix them :)
> 
> Jack
> 


I suspect that line 1490 should be:
if (more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {


Sean
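
The condition suggested above can be sketched in isolation. This is only a
toy model, not the driver code: should_requeue() and TOY_IFF_DRV_OACTIVE are
hypothetical names made up for illustration, and the flag value is arbitrary.
It shows the intent as described in the thread: re-enqueue the task when more
RX work is pending or when the interface is still marked OACTIVE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for IFF_DRV_OACTIVE; the value is arbitrary here. */
#define TOY_IFF_DRV_OACTIVE 0x1u

/*
 * Decide whether the deferred task should be queued again.
 * more_rx:   the RX cleanup reported that packets are still pending.
 * drv_flags: toy copy of the interface driver flags.
 */
static bool
should_requeue(bool more_rx, unsigned int drv_flags)
{
	return (more_rx || (drv_flags & TOY_IFF_DRV_OACTIVE) != 0);
}
```

With only "if (more)", a transmit side stuck in OACTIVE with no pending RX
work would not cause the task to run again; the extra term covers that case.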


Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Sean Bruno

On Tue, 2011-02-01 at 13:51 -0800, Mike Tancsa wrote:
> On 2/1/2011 4:43 PM, Jack Vogel wrote:
> > To those who are going to test, here is the if_em.c, based on head, with my
> > changes, I have to leave for the afternoon, and have not had a chance to
> > build
> > this, but it should work. I will check back in the later evening.
> > 
> > Any blatant problems Sean, feel free to fix them :)
> 
> My boxes are RELENG_7 and RELENG_8. Apart from manually hand editing
> those pesky sysctl changes out, is there a better way to generate
> RELENG_8 and 7 diffs ?
> 
>   ---Mike


Not at the moment.

sean


Current state of FreeBSD routing

2011-02-01 Thread Markus Oestreicher

Hi there!

After a few hours of reading list archives and source code I need some
clarification on the current state of FreeBSD forwarding capabilities.

Given the following setup:
- Quad Core CPU
- Intel 82576 NIC (igb)
- 8.2-RELEASE
- Router with BGP full table

1) Queues:
Card and driver seem to have support for multiple TX/RX queues.
How many cores will it use for RX / TX per NIC?

2) Fastforwarding vs multiple netisr:
In the past (6.x) using fastforwarding=1 was the best option for dedicated routers.
I found "multiple netisr" added to 8.0. Can that help with routing on multiple cores?
Any experience from using it in production?

3) lagg:
I found lagg(4) mostly mentioned on home user setups.
Any experience with using lagg in high-pps environments? (>100k pps)
Will lagg play nicely together with multiple netisr routing or fastforwarding?
How much overhead will it add versus a single connection?

Thanks a lot

Markus



Re: Bogus KASSERT() in tcp_output()?

2011-02-01 Thread Andre Oppermann


On 01.02.2011 18:37, Bjoern A. Zeeb wrote:

On Tue, 1 Feb 2011, John Baldwin wrote:


On Monday, January 31, 2011 9:40:09 pm Lawrence Stewart wrote:

On 02/01/11 04:17, John Baldwin wrote:

Somewhat related fallout to the bug reported on security@ recently, I think
this KASSERT() in tcp_output() is bogus:


KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL),
("%s: mbuf chain shorter than expected", __func__));

Specifically, just a few lines earlier in tcp_output() we set the packet
header length to just 'len + hdrlen':

/*
* Put TCP length in extended header, and then
* checksum extended header and data.
*/
m->m_pkthdr.len = hdrlen + len; /* in6_cksum() need this */

Also, the ipoptions are stored in a separate mbuf chain in the in pcb
(inp_options) that is passed as a separate argument to ip_output(). Given
that, I would think that m_length() should not reflect ipoptlen since it
should not include IP options in that chain?



There is some relevant prior discussion on src-committers@ for r212803
between Andre and Bjoern.


I still don't see where ipoptlen bytes are reserved in the mbuf chain. After
this block where 'm' is allocated and initialized:

/*
* Grab a header mbuf, attaching a copy of data to
* be transmitted, and initialize the header from
* the template for sends on this connection.
*/
if (len) {
...
m->m_len = hdrlen;
...
if (len <= MHLEN - hdrlen - max_linkhdr) {
...
m->m_len += len;
} else {
m->m_next = m_copy(mb, moff, (int)len);
...
}
...
} else {
...
m->m_len = hdrlen;
}

The length of the mbuf chain headed by 'm' is clearly hdrlen + len.


It is.



At no point anywhere do we do any sort of m_prepend() or other operation to
allocate space in the mbuf chain for the IP options. They are merged in in
ip_output(). I think the only reason this KASSERT() isn't firing in HEAD is
that IP options are rarely used?


Right, and probably the reason why I also hit it with IPSec as a result of

718 #ifdef IPSEC
719 ipoptlen += ipsec_optlen;
720 #endif

which wasn't because of ipsec_optlen really, I had just stopped
looking too soon back last year.


IPSEC and TCP is very sub-optimal at the moment. The size of the IPSEC header/
overhead is calculated per packet including a full lookup into the SADB. After
the discussions at EuroBSDCon I attempted to solve that in a better way but
didn't finish. Have to dig it up again.


Is there an easy way to test a connection with IP options enabled with this
KASSERT() enabled?


Yes, see patch at [1], and using my modified KASSERT I still get...
which btw sounds wrong to me as well, as I wouldn't expect ipoptlen
to be 4 here given the test case.


Byte swap issue?


# ./tcpconnect client 127.0.0.1 12345 1 ipopt

panic: tcp_output: mbuf chain shorter than expected: 0 + 60 + 4 - 0 != 60
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
tcp_output() at tcp_output+0x1d01
tcp_usr_connect() at tcp_usr_connect+0x15f
soconnect() at soconnect+0x14f
kern_connect() at kern_connect+0x12e
connect() at connect+0x41
syscallenter() at syscallenter+0x1cb
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xe2
--- syscall (98, FreeBSD ELF64, connect), rip = 0x80072934c, rsp = 0x7fffe9d8, rbp = 0x3 ---

/bz

References:

[1] http://people.freebsd.org/~bz/20110201-01-tcpconnect-ipopt.diff



Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Chris Peiffer

On Tue, Feb 01, 2011 at 04:51:37PM -0500, Mike Tancsa wrote:
> On 2/1/2011 4:43 PM, Jack Vogel wrote:
> > To those who are going to test, here is the if_em.c, based on head, with my
> > changes, I have to leave for the afternoon, and have not had a chance to
> > build
> > this, but it should work. I will check back in the later evening.
> > 

Did this get sent to the list? I didn't get this quoted message and I
can't find it in the archives. 

If someone could post the current revision of if_em.c that would be
great; we are also very eager to test. 

Thanks.

Re: Bogus KASSERT() in tcp_output()?

2011-02-01 Thread Andre Oppermann


On 01.02.2011 14:29, John Baldwin wrote:

On Monday, January 31, 2011 9:40:09 pm Lawrence Stewart wrote:

On 02/01/11 04:17, John Baldwin wrote:

Somewhat related fallout to the bug reported on security@ recently, I think


What was the bug reported to security@?


this KASSERT() in tcp_output() is bogus:

KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL),
("%s: mbuf chain shorter than expected", __func__));

Specifically, just a few lines earlier in tcp_output() we set the packet
header length to just 'len + hdrlen':


Yes.  ipoptlen should not be added in this comparison as the space for
the ipoptions is not reserved in the header mbuf.  The value of ipoptlen
is used for earlier checks to make sure the overall packet length does
not exceed 64K.


/*
 * Put TCP length in extended header, and then
 * checksum extended header and data.
 */
m->m_pkthdr.len = hdrlen + len; /* in6_cksum() need this */

Also, the ipoptions are stored in a separate mbuf chain in the in pcb
(inp_options) that is passed as a separate argument to ip_output().  Given
that, I would think that m_length() should not reflect ipoptlen since it
should not include IP options in that chain?


Correct.


There is some relevant prior discussion on src-committers@ for r212803
between Andre and Bjoern.


I have a pile of other patches to TCP and other parts lying around. Many
of them are fixes for long-standing PRs. After EuroBSDCon I got stalled and
lost a bit of interest because of a severe reviewer shortage (none of Lawrence,
Qing or Björn responded with reviews to my requests due to severe time shortage
on their part). I then got sidetracked with other work.


I still don't see where ipoptlen bytes are reserved in the mbuf chain.  After
this block where 'm' is allocated and initialized:

/*
 * Grab a header mbuf, attaching a copy of data to
 * be transmitted, and initialize the header from
 * the template for sends on this connection.
 */
if (len) {
...
m->m_len = hdrlen;
...
if (len<= MHLEN - hdrlen - max_linkhdr) {
...
m->m_len += len;
} else {
m->m_next = m_copy(mb, moff, (int)len);
...
}
...
} else {
...
m->m_len = hdrlen;
}

The length of the mbuf chain headed by 'm' is clearly hdrlen + len.

At no point anywhere do we do any sort of m_prepend() or other operation to
allocate space in the mbuf chain for the IP options.  They are merged in in
ip_output().  I think the only reason this KASSERT() isn't firing in HEAD is
that IP options are rarely used?


Essentially non-existent on protocols other than ICMP (for ping).  In
most parts of the Internet IP options are filtered.  They haven't been
very useful except perhaps for record route.  Even that is of very limited
use, since it can record only the eight most recent hops.


Is there an easy way to test a connection with IP options enabled with this
KASSERT() enabled?


--
Andre
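
The length mismatch discussed above can be reproduced with a small userland
model. This is a sketch under stated assumptions, not kernel code: struct
toy_mbuf, toy_m_length() and toy_check() are hypothetical names, and the
model always places the payload in a second buffer, whereas tcp_output() may
copy small payloads into the header mbuf. The total length is the same
either way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for the kernel's struct mbuf. */
struct toy_mbuf {
	struct toy_mbuf *m_next;	/* next buffer in the chain */
	size_t m_len;			/* bytes of data in this buffer */
};

/* Walk the chain and add up m_len, as m_length(m, NULL) does conceptually. */
static size_t
toy_m_length(const struct toy_mbuf *m)
{
	size_t total = 0;

	for (; m != NULL; m = m->m_next)
		total += m->m_len;
	return (total);
}

/*
 * Model the chain described in the thread: a header buffer of hdrlen bytes
 * plus, when len != 0, a buffer holding the payload copy.  No space for IP
 * options is added.  Returns 1 if the chosen expectation matches the chain.
 */
static int
toy_check(size_t len, size_t hdrlen, size_t ipoptlen, int count_ipopt)
{
	struct toy_mbuf data = { NULL, len };
	struct toy_mbuf hdr = { len != 0 ? &data : NULL, hdrlen };
	size_t expected = len + hdrlen + (count_ipopt ? ipoptlen : 0);

	assert(hdrlen > 0);	/* a TCP segment always carries a header */
	return (expected == toy_m_length(&hdr));
}
```

With len = 0, hdrlen = 60 and ipoptlen = 4 (the values in the panic message
quoted elsewhere in this thread), toy_check(0, 60, 4, 1) fails while
toy_check(0, 60, 4, 0) holds, which matches the argument that ipoptlen does
not belong in the comparison.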

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 7:56 PM, Chris Peiffer wrote:
> 
> Did this get sent to the list? I didn't get this quoted message and I
> can't find it in the archives. 
> 
> If someone could post the current revision of if_em.c that would be
> great; we are also very eager to test. 
> 

Strange, it seems to be eaten/held by mailman ?  I posted the file to
http://www.tancsa.com/if_em.c


% md5 if_em.c
MD5 (if_em.c) = 0f2d48c7734496c2262f468cd1ab9117

% ident if_em.c
if_em.c:
 $FreeBSD: src/sys/dev/e1000/if_em.c,v 1.68 2011/01/19 18:20:11 jfv Exp $

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 8:44 PM, Mike Tancsa wrote:
> 
> % md5 if_em.c
> MD5 (if_em.c) = 0f2d48c7734496c2262f468cd1ab9117

Sorry, that's

MD5 (if_em.c) = 9cede4ab0d833e0f97172ed715e2b4e3

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Mike Tancsa

On 2/1/2011 5:03 PM, Sean Bruno wrote:
> On Tue, 2011-02-01 at 13:43 -0800, Jack Vogel wrote:
>> To those who are going to test, here is the if_em.c, based on head,
>> with my
>> changes, I have to leave for the afternoon, and have not had a chance
>> to build
>> this, but it should work. I will check back in the later evening.
>>
>> Any blatant problems Sean, feel free to fix them :)
>>
>> Jack
>>
> 
> 
> I suspect that line 1490 should be:
>   if (more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {
> 


I have hacked up a RELENG_8 version which I think is correct including
the above change

http://www.tancsa.com/if_em-8.c



--- if_em.c.orig 2011-02-01 21:47:14.0 -0500
+++ if_em.c 2011-02-01 21:47:19.0 -0500
@@ -30,7 +30,7 @@
   POSSIBILITY OF SUCH DAMAGE.

 **/
-/*$FreeBSD: src/sys/dev/e1000/if_em.c,v 1.21.2.20 2011/01/22 01:37:53 jfv Exp $*/
+/*$FreeBSD$*/

 #ifdef HAVE_KERNEL_OPTION_HEADERS
 #include "opt_device_polling.h"
@@ -93,7 +93,7 @@
 /*
  *  Driver version:
  */
-char em_driver_version[] = "7.1.9";
+char em_driver_version[] = "7.1.9-test";

 /*
  *  PCI Device ID Table
@@ -927,11 +927,10 @@
if (!adapter->link_active)
return;

-/* Call cleanup if number of TX descriptors low */
-   if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
-   em_txeof(txr);
-
while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
+   /* First cleanup if TX descriptors low */
+   if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+   em_txeof(txr);
if (txr->tx_avail < EM_MAX_SCATTER) {
ifp->if_drv_flags |= IFF_DRV_OACTIVE;
break;
@@ -1411,8 +1410,7 @@
if (!drbr_empty(ifp, txr->br))
em_mq_start_locked(ifp, txr, NULL);
 #else
-   if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
-   em_start_locked(ifp, txr);
+   em_start_locked(ifp, txr);
 #endif
EM_TX_UNLOCK(txr);

@@ -1475,11 +1473,10 @@
struct ifnet *ifp = adapter->ifp;
struct tx_ring  *txr = adapter->tx_rings;
struct rx_ring  *rxr = adapter->rx_rings;
-   bool more;
-

if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
-   more = em_rxeof(rxr, adapter->rx_process_limit, NULL);
+   bool more_rx;
+   more_rx = em_rxeof(rxr, adapter->rx_process_limit, NULL);

EM_TX_LOCK(txr);
em_txeof(txr);
@@ -1487,12 +1484,10 @@
if (!drbr_empty(ifp, txr->br))
em_mq_start_locked(ifp, txr, NULL);
 #else
-   if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
-   em_start_locked(ifp, txr);
+   em_start_locked(ifp, txr);
 #endif
-   em_txeof(txr);
EM_TX_UNLOCK(txr);
-   if (more) {
+   if (more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {
taskqueue_enqueue(adapter->tq, &adapter->que_task);
return;
}
@@ -1604,7 +1599,6 @@
if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
em_start_locked(ifp, txr);
 #endif
-   em_txeof(txr);
E1000_WRITE_REG(&adapter->hw, E1000_IMS, txr->ims);
EM_TX_UNLOCK(txr);
 }
@@ -3730,17 +3724,17 @@
txr->queue_status = EM_QUEUE_HUNG;

 /*
- * If we have enough room, clear IFF_DRV_OACTIVE
+ * If we have a minimum free, clear IFF_DRV_OACTIVE
  * to tell the stack that it is OK to send packets.
  */
-if (txr->tx_avail > EM_TX_CLEANUP_THRESHOLD) {
+if (txr->tx_avail > EM_MAX_SCATTER)
 ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
-   /* Disable watchdog if all clean */
-if (txr->tx_avail == adapter->num_tx_desc) {
-   txr->queue_status = EM_QUEUE_IDLE;
-   return (FALSE);
-   }
-}
+
+   /* Disable watchdog if all clean */
+   if (txr->tx_avail == adapter->num_tx_desc) {
+   txr->queue_status = EM_QUEUE_IDLE;
+   return (FALSE);
+   }

return (TRUE);
 }
@@ -5064,8 +5058,8 @@
char namebuf[QUEUE_NAME_LEN];

/* Driver Statistics */
-   SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "link_irq",
-   CTLFLAG_RD, &adapter->link_irq, 0,
+   SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "link_irq",
+   CTLFLAG_RD, &adapter->link_irq,0,
"Link MSIX IRQ Handled");
SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "mbuf_alloc_fail",
 CTLFLAG_RD, &adapter->mbuf_alloc_failed

Re: kern/154443: [bridge] Kernel module bridgestp.ko missing after upgrade (if_bridge.ko depend on it)

2011-02-01 Thread linimon

Old Synopsis: Kernel module bridgestp.ko missing after upgrade (if_bridge.ko depend on it)
New Synopsis: [bridge] Kernel module bridgestp.ko missing after upgrade (if_bridge.ko depend on it)

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Feb 2 04:20:20 UTC 2011
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=154443

Re: em driver, 82574L chip, and possibly ASPM

2011-02-01 Thread Jack Vogel

To those who are going to test, here is the if_em.c, based on head, with my
changes, I have to leave for the afternoon, and have not had a chance to
build
this, but it should work. I will check back in the later evening.

Any blatant problems Sean, feel free to fix them :)

Jack


On Tue, Feb 1, 2011 at 12:37 PM, Jack Vogel wrote:

> Looks good, except I don't like code #if 0'd out, I'll make an if_em.c to
> try and
> send it shortly.
>
> Jack
>
>
>
> On Tue, Feb 1, 2011 at 12:19 PM, Sean Bruno  wrote:
>
>> On Tue, 2011-02-01 at 12:05 -0800, Jack Vogel wrote:
>> > At this point I'm open to any ideas, this sounds like a good one Sean,
>> > thanks.
>> > Mike, you want to test this ?
>> >
>> > Jack
>> >
>> >
>> > On Tue, Feb 1, 2011 at 11:56 AM, Sean Bruno 
>> > wrote:
>> >
>> > On Fri, 2011-01-28 at 08:10 -0800, Mike Tancsa wrote:
>> > > On 1/23/2011 10:21 AM, Mike Tancsa wrote:
>> > > > On 1/21/2011 4:21 AM, Jan Koum wrote:
>> > > > One other thing I noticed is that when the nic is in its
>> > hung state, the
>> > > > WOL option is gone ?
>> > > >
>> > > > e.g
>> > > >
>> > > > em1: flags=8843
>> > metric 0 mtu 1500
>> > > >
>> >
>> options=19b
>> > > > ether 00:15:17:ed:68:a4
>> > > >
>> > > > vs
>> > > >
>> > > >
>> > > > em1: flags=8843
>> > metric 0 mtu 1500
>> > > >
>> > > >
>> >
>> options=219b
>> > > > ether 00:15:17:ed:68:a4
>> > >
>> > >
>> > > Another hang last night :(
>> > >
>> > > Whats really strange is that the WOL_MAGIC and TSO4 got
>> > turned back on
>> > > somehow ? I had explicitly turned it off, but when the NIC
>> > was in its
>> > > bad state
>> > >
>> > > em1: flags=8843
>> > metric 0 mtu 1500
>> > >
>> > options=2198
>> > >
>> > > ... its back on along with TSO?  Not sure if its coincidence
>> > or a side
>> > > effect or what.  For now, I have had to re-purpose this nic
>> > to something
>> > > else.
>> > >
>> > > debug info shows
>> > >
>> > > Jan 28 00:25:10 backup3 kernel: Interface is RUNNING and
>> > INACTIVE
>> > > Jan 28 00:25:10 backup3 kernel: em1: hw tdh = 625, hw tdt =
>> > 625
>> > > Jan 28 00:25:10 backup3 kernel: em1: hw rdh = 903, hw rdt =
>> > 903
>> > > Jan 28 00:25:10 backup3 kernel: em1: Tx Queue Status = 0
>> > > Jan 28 00:25:10 backup3 kernel: em1: TX descriptors avail =
>> > 1024
>> > > Jan 28 00:25:10 backup3 kernel: em1: Tx Descriptors avail
>> > failure = 0
>> > > Jan 28 00:25:10 backup3 kernel: em1: RX discarded packets =
>> > 0
>> > > Jan 28 00:25:10 backup3 kernel: em1: RX Next to Check = 903
>> > > Jan 28 00:25:10 backup3 kernel: em1: RX Next to Refresh =
>> > 904
>> > > Jan 28 00:25:27 backup3 kernel: em1: link state changed to
>> > DOWN
>> > > Jan 28 00:25:30 backup3 kernel: em1: link state changed to
>> > UP
>> > >
>> > >
>> > >   ---Mike
>> >
>> >
>> >
>> > I'm trying to get some more testing done regarding my
>> > suggestions around
>> > the OACTIVE assertions in the driver.  More or less, it looks
>> > like
>> > intense periods of activity can push the driver into the
>> > OACTIVE hold
>> > off state and the logic isn't quite right in igb(4) or em(4)
>> > to handle
>> > it.
>> >
>> > I suspect that something like this modification to igb(4) may
>> > be
>> > required for em(4).
>> >
>> > Comments?
>> >
>> > Sean
>> >
>>
>>
>> Does the logic I've implemented look sane?
>>
>> Sean
>>
>>
>

Re: Current state of FreeBSD routing

2011-02-01 Thread Sergey Kandaurov

On 2 February 2011 02:11, Markus Oestreicher wrote:
> Hi there!
>
> After a few hours of reading list archives and source code I need some
> clarification on the current state of FreeBSD forwarding capabilities.
>
> Given the following setup:
> - Quad Core CPU
> - Intel 82576 NIC (igb)
> - 8.2-RELEASE
> - Router with BGP full table
>
> 1) Queues:
> Card and driver seem to have support for multiple TX/RX queues.
> How many cores will it use for RX / TX per NIC?

That depends on how many CPU cores you have.
For example, with several 82576 NICs installed I have 8 queues per port.
So it looks like
# vmstat -ia | grep igb7
irq320: igb7:que 0 0  0
irq321: igb7:que 1 0  0
irq322: igb7:que 2 0  0
irq323: igb7:que 3 0  0
irq324: igb7:que 4 0  0
irq325: igb7:que 5 0  0
irq326: igb7:que 6 0  0
irq327: igb7:que 7 0  0
irq328: igb7:link  0  0

With a quad-core CPU you will have 4 queues.
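
As a rough sketch of that rule of thumb (not the igb(4) source):
toy_queue_count() is a hypothetical helper, and the per-port cap of 8 used in
the examples is an assumption taken from the eight queues in the vmstat
output above. The actual driver may also take the number of available MSI-X
vectors and tunables into account.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of RX/TX queue pairs per port, taken as the
 * smaller of the CPU core count and an assumed per-port maximum.
 */
static int
toy_queue_count(int ncpus, int max_queues)
{
	assert(ncpus > 0 && max_queues > 0);
	return (ncpus < max_queues ? ncpus : max_queues);
}
```

Under these assumptions toy_queue_count(4, 8) gives 4 for the quad-core case
and toy_queue_count(8, 8) gives 8, as in the igb7 listing.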

>
> 2) Fastforwarding vs multiple netisr:
> In the past (6.x) using fastforwarding=1 was the best option for dedicated 
> routers.
> I found "multiple netisr" added to 8.0. Can that help with routing on 
> multiple cores?
> Any experience from using it in production?
>
> 3) lagg:
> I found lagg(4) mostly mentioned on home user setups.
> Any experience with using lagg in high-pps environments? (>100k pps)
> Will lagg play nicely together with multiple netisr routing or fastforwarding?
> How much overhead will it add versus a single connection?
>
> Thanks a lot
>
> Markus

-- 
wbr,
pluknet

Re: Current state of FreeBSD routing

2011-02-01 Thread Eugene Grosbein

On 02.02.2011 05:11, Markus Oestreicher wrote:

> 2) Fastforwarding vs multiple netisr:
> In the past (6.x) using fastforwarding=1 was the best option for dedicated 
> routers.
> I found "multiple netisr" added to 8.0. Can that help with routing on 
> multiple cores?

Yes, it allows more even distribution of input traffic processing over cores.

> Any experience from using it in production?

It helps greatly, but I was forced to disable it for an mpd-based router
where there are many dynamically created/destroyed network interfaces.

I suspect it increases the possibility of a kernel panic in such a
configuration due to the famous 'dangling pointer' problem: an interface
ngXXX gets destroyed while packets received from it still reside in netisr
queues. The kernel might then panic while processing these packets if it
needs to check the incoming interface, e.g. due to ipfw antispoofing rules.
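
To make the failure mode concrete, here is a toy userland model of one
possible defence: queued packets remember an interface index plus a
generation number instead of a raw pointer, and the lookup refuses stale
references. This is not how the FreeBSD netisr code works and is not proposed
in this thread; all toy_* names are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_IFS 4	/* hypothetical size of the interface table */

struct toy_ifnet {
	unsigned int gen;	/* bumped every time the slot is (re)created */
	int alive;		/* nonzero while the interface exists */
};

struct toy_pkt {
	unsigned int if_index;	/* slot of the receiving interface */
	unsigned int if_gen;	/* generation observed at enqueue time */
};

static struct toy_ifnet toy_ifs[TOY_MAX_IFS];

static void
toy_if_create(unsigned int idx)
{
	assert(idx < TOY_MAX_IFS);
	toy_ifs[idx].gen++;
	toy_ifs[idx].alive = 1;
}

static void
toy_if_destroy(unsigned int idx)
{
	assert(idx < TOY_MAX_IFS);
	toy_ifs[idx].alive = 0;
}

/* Record which interface (and which incarnation of it) received a packet. */
static struct toy_pkt
toy_enqueue(unsigned int idx)
{
	struct toy_pkt p;

	assert(idx < TOY_MAX_IFS && toy_ifs[idx].alive);
	p.if_index = idx;
	p.if_gen = toy_ifs[idx].gen;
	return (p);
}

/* Return the receiving interface, or NULL if it is gone or was recreated. */
static struct toy_ifnet *
toy_rcvif(const struct toy_pkt *p)
{
	struct toy_ifnet *ifp;

	if (p->if_index >= TOY_MAX_IFS)
		return (NULL);
	ifp = &toy_ifs[p->if_index];
	if (!ifp->alive || ifp->gen != p->if_gen)
		return (NULL);
	return (ifp);
}
```

A packet enqueued before toy_if_destroy() resolves to NULL afterwards, even
if the same slot is created again, so a consumer can drop it instead of
dereferencing a dead interface.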

> 3) lagg:
> I found lagg(4) mostly mentioned on home user setups.
> Any experience with using lagg in high-pps environments? (>100k pps)

Works fine for me.

> Will lagg play nicely together with multiple netisr routing or fastforwarding?
> How much overhead will it add versus a single connection?

Unnoticed.

Eugene Grosbein

Re: Current state of FreeBSD routing

2011-02-01 Thread Julian Elischer


On 2/1/11 10:10 PM, Eugene Grosbein wrote:

On 02.02.2011 05:11, Markus Oestreicher wrote:


2) Fastforwarding vs multiple netisr:
In the past (6.x) using fastforwarding=1 was the best option for dedicated 
routers.
I found "multiple netisr" added to 8.0. Can that help with routing on multiple 
cores?

Yes, it allows more even distribution of input traffic processing over cores.


Any experience from using it in production?

It helps greatly but I was forced to disable it for mpd-based router
where there are many dynamically born/destroyed network interfaces.

I suspect it increases possibility of kernel panic in such configuration
due to famous 'dangling pointer' problem: an interface ngXXX got destroyed
while packets received from it reside in netisr queues. Then kernel might
panic while processing these packets if needs to check incoming interface,
f.e. due to ipfw antispoofing rules.


workaround for that may be to delay ng interface destruction by 2 
seconds or something.


I'll think about it..


3) lagg:
I found lagg(4) mostly mentioned on home user setups.
Any experience with using lagg in high-pps environments? (>100k pps)

Works fine for me.


Will lagg play nicely together with multiple netisr routing or fastforwarding?
How much overhead will it add versus a single connection?

Unnoticed.

Eugene Grosbein