Re: kern/118975: [bge] [patch] Broadcom 5906 not handled by FreeBSD

2008-01-02 Thread Benjamin Close
The following reply was made to PR kern/118975; it has been noted by GNATS.

From: Benjamin Close <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Cc:  
Subject: Re: kern/118975: [bge] [patch] Broadcom 5906 not handled by FreeBSD
Date: Wed, 02 Jan 2008 18:42:34 +1030

The -CURRENT portion of this patch works like a charm under 20080102
-CURRENT!

Cheers,
Benjamin
(benjsc)
 


Re: if_ral regression

2008-01-02 Thread Sepherosa Ziehau
On Jan 1, 2008 9:32 PM, Dag-Erling Smørgrav <[EMAIL PROTECTED]> wrote:
> "Sepherosa Ziehau" <[EMAIL PROTECTED]> writes:
> > I don't know whether the following things will fix your problem:
> > [...]
>
> Can you provide a diff?

http://people.freebsd.org/~sephe/rt2560_test.diff

Hope it will have some effect.

Best Regards,
sephe

-- 
Live Free or Die


Re: if_ral regression

2008-01-02 Thread Sepherosa Ziehau
On Jan 2, 2008 10:38 AM, Weongyo Jeong <[EMAIL PROTECTED]> wrote:
>
> > Even with these in place in dfly, I still have a strange TX performance
> > regression in sta mode (a drop from 20Mb/s to 3Mb/s under very good
> > conditions) on certain hardware after 20sec~30sec of TCP_STREAM netperf
> > testing; didn't have enough time to dig, however, all of the tested
> > hardware stayed connected during testing (I usually run the netperf
> > stream test for 12 hours or more).
>
> I also saw some regression in TX performance while porting malo(4).

Have you tried turning bgscan off completely?  My problem seems to be
hardware (I suspect rf) related.  The TX performance regression does
not happen with a UDP stream, which only uses 802.11 ACKs; i.e. the
hardware seems to have trouble switching between data RX and data TX
at high frequency.

> Problems were fixed after removing the following lines in *_start:
>
> /*
>  * Cancel any background scan.
>  */
> if (ic->ic_flags & IEEE80211_F_SCAN)
> ieee80211_cancel_scan(ic);
>
> and (optionally)
>
> if (m->m_flags & M_TXCB)
> ...
> ieee80211_process_callback(ni, m, 0);   /* XXX status?
> ...

I don't think you can remove TXCB processing in drivers :)
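
For reference, a minimal sketch of what that TX-completion handling
looks like on the driver side (txeof_sketch and the status argument are
illustrative, not actual driver code; M_TXCB and
ieee80211_process_callback() are the real net80211 interfaces):

#include <sys/param.h>
#include <sys/mbuf.h>
#include <net80211/ieee80211_var.h>

static void
txeof_sketch(struct ieee80211_node *ni, struct mbuf *m, int status)
{
	/*
	 * If the stack requested a TX completion callback (M_TXCB),
	 * deliver the status before freeing; management frames rely
	 * on this to advance the 802.11 state machine.
	 */
	if (m->m_flags & M_TXCB)
		ieee80211_process_callback(ni, m, status);
	m_freem(m);
	ieee80211_free_node(ni);
}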

Best Regards,
sephe

-- 
Live Free or Die


Re: bikeshed for all!

2008-01-02 Thread Randall Stewart

Alfred Perlstein wrote:
> * Julian Elischer <[EMAIL PROTECTED]> [071212 15:13] wrote:
>> Alfred Perlstein wrote:
>>> try using "instance".
>>>
>>> "Oh I'm going to use the FOO routing instance."
>>
>> what do Juniper call it?
>
> "Instance" and "vrf".

VRF is the same thing we call it at Cisco :-)

R
--
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369  803-317-4952 (cell)


ath0 Ierrs

2008-01-02 Thread Randy Bush
i seem to be logging massive errors on an ath in hostap mode with
only two wireless clients.

mtu is set low as the tun0 pppoe over ntt B Flets on vr0 recommends
it.  wireless on the two clients is set to an mtu of 1454 too.

seeking pointers on how to debug.

randy

---

# netstat -i
Name    Mtu Network       Address               Ipkts   Ierrs    Opkts Oerrs  Coll
vr0    1500 <Link#1>      00:00:24:c8:b3:28  20300694       0  9586690     0     0
vr1    1500 <Link#2>      00:00:24:c8:b3:29         0       0     8495     0     0
vr2    1454 <Link#3>      00:00:24:c8:b3:2a      1470       0    14571     0     0
vr3    1454 <Link#4>      00:00:24:c8:b3:2b    144471       0   289752     0     0
ath0   1454 <Link#5>      00:0b:6b:83:59:25   9333193 9765589 19873060  1729     0
lo0   16384 <Link#6>                            96360       0    96360     0     0
lo0   16384 fe80:6::1     fe80:6::1                 0       -        0     -     -
lo0   16384 localhost     ::1                       0       -        0     -     -
lo0   16384 your-net      localhost             96362       -    96362     -     -
bridg  1500 <Link#7>      6a:fe:c7:ad:96:89   9476742       0 20046340     0     0
bridg  1500 192.168.0.0   soek0                 14920       -    82683     -     -
tun0   1454 <Link#8>                         20247165       0  9582931     0     0
tun0   1454 210.138.216.5 50.216.138.210.bn    306221       -   180432     -     -

config is

            .-----------------.
            |                 |
            |     b --- ath0  |
            |     r           |  192.168.0.0/24
 ext iij    |     i --- vr1   |
 PPP/NAT ---|vr0   d          |  LAN hosts &
   WAN      |     g --- vr2   |
            |     e           |  DHCP Clients
            |     0 --- vr3   |
            |                 |
            `-----------------'

# ifconfig -a
vr0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=b<RXCSUM,TXCSUM,VLAN_MTU>
        ether 00:00:24:c8:b3:28
        media: Ethernet autoselect (100baseTX)
        status: active
vr1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9<RXCSUM,VLAN_MTU>
        ether 00:00:24:c8:b3:29
        media: Ethernet autoselect (none)
        status: no carrier
vr2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1454
        options=9<RXCSUM,VLAN_MTU>
        ether 00:00:24:c8:b3:2a
        media: Ethernet autoselect (100baseTX)
        status: active
vr3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1454
        options=9<RXCSUM,VLAN_MTU>
        ether 00:00:24:c8:b3:2b
        media: Ethernet autoselect (100baseTX)
        status: active
ath0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1454
        ether 00:0b:6b:83:59:25
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: associated
        ssid rgnet-aden channel 11 (2462 MHz 11g) bssid 00:0b:6b:83:59:25
        authmode OPEN privacy ON deftxkey 1 wepkey 1:104-bit txpower 31.5
        scanvalid 60 bgscan bgscanintvl 300 bgscanidle 250 roam:rssi11g 7
        roam:rate11g 5 protmode CTS burst dtimperiod 1
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x6
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 6a:fe:c7:ad:96:89
        inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: ath0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 5 priority 128 path cost 370370
        member: vr3 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 4 priority 128 path cost 20
        member: vr2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 3 priority 128 path cost 20
        member: vr1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 2 priority 128 path cost 55
tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1454
        inet 210.138.216.50 --> 210.149.34.66 netmask 0xffffffff
        Opened by PID 566



Re: if_ral regression

2008-01-02 Thread Dag-Erling Smørgrav
"Sepherosa Ziehau" <[EMAIL PROTECTED]> writes:
> http://people.freebsd.org/~sephe/rt2560_test.diff

Thank you, I'll try that.

Could you explain what the RT2560_BBP_BUSY loop is about?

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]


Re: if_ral regression

2008-01-02 Thread Dag-Erling Smørgrav
"Sepherosa Ziehau" <[EMAIL PROTECTED]> writes:
> http://people.freebsd.org/~sephe/rt2560_test.diff
>
> Hope it will have some effect.

I built a new kernel with the patch applied, and it seems to help,
though it's a bit early to say for sure.

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]


Re: Routing SMP benefit

2008-01-02 Thread Andre Oppermann

Tiffany Snyder wrote:
> Hi Andre,
>   are those numbers for small (64 byte) packets? Good job on pushing
> the base numbers higher on the same HW.

Yes, 64 bytes.  Haven't measured lately, but I assume PCI-E hardware
instead of PCI-X could push quite some more.

> What piqued my attention was the note that our forwarding
> performance doesn't scale with multiple CPUs. Which means there's a lot
> of work to be done :-) Have we taken a look at OpenSolaris' Surya
> (http://www.opensolaris.org/os/community/networking/surya-design.pdf)
> project? They allow multiple readers/single writer on the radix_node_head
> (and not a mutex as we do) and we may be able to do the same to gain some
> parallelism. There are other things in Surya that exploit multiple CPUs.
> It's definitely worth a read. DragonFlyBSD seems to achieve parallelism
> by classifying packets as flows and then redirecting the flows to
> different CPUs. OpenSolaris also does something similar. We can
> definitely think along those lines.


So far the PPS rate limit has primarily been the cache miss penalties
on the packet access.  Multiple CPUs can help here of course for bi-
directional traffic.  Hardware based packet header cache prefetching, as
done by some embedded MIPS based network processors, at least doubles the
performance.  Intel has something like this for a couple of chipset and
network chip combinations.  We don't support that feature yet though.

Many of the things you mention here are planned for FreeBSD 8.0 in the
same or a different form.  Work in progress is the separation of the ARP
table from the kernel routing table.  If we can prevent references to
radix nodes generally, almost all locking can be done away with.  Instead
only a global rmlock (read-mostly) could govern the entire routing table.
Obtaining the rmlock for reading is essentially free.  Table changes
are very infrequent compared to lookups (like 700,000 to 300-400) in
default-free Internet routing.  The radix trie nodes are rather big
and could use some more trimming to make them fit a single cache line.
I've already removed some stuff a couple of years ago and more can be
done.
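
As a sketch of that locking model (the rt_* names are illustrative and
not the actual routing code; rm_rlock()/rm_wlock() are the rmlock(9)
KPI):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/rmlock.h>

static struct rmlock rt_rmlock;		/* would govern the whole table */

static void
rt_locking_init(void)
{
	rm_init(&rt_rmlock, "rtable");	/* early rmlock versions also took a flags argument */
}

/* Hot path: the read side is essentially free. */
static void
rt_lookup_sketch(void)
{
	struct rm_priotracker tracker;

	rm_rlock(&rt_rmlock, &tracker);
	/* ... radix lookup, taking no references on radix nodes ... */
	rm_runlock(&rt_rmlock, &tracker);
}

/* Cold path: the rare table change takes the write side. */
static void
rt_change_sketch(void)
{
	rm_wlock(&rt_rmlock);
	/* ... insert/delete radix nodes ... */
	rm_wunlock(&rt_rmlock);
}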

It's very important to keep this in mind: "profile, don't speculate".

For example, while the DragonFly model may seem good in theory, it has
so far not shown itself to be faster.  Back when I had the Agilent N2X
network tester, DFBSD was the poorest performer in the test set of
FreeBSD, OpenBSD and DFBSD.

Haven't tested Solaris yet, nor retested the others, but until
that is done we should not jump to conclusions.  At the time we
were more than two to three times faster than the other BSDs.


> NOTE:
> 1) I said multiple instead of dual CPUs on purpose.
> 2) I mentioned OpenSolaris and DragonFlyBSD as examples, to acknowledge
> the work they are doing, and to show that FreeBSD is far behind and is
> losing its lustre as the networking platform of choice.

Like I said: don't jump to conclusions without real testing and
profiling.  Reality may turn out to be different from theory. ;)

--
Andre


> Thanks,
>
> Tiffany.
>
> On 12/29/05, Andre Oppermann <[EMAIL PROTECTED]> wrote:
>> Markus Oestreicher wrote:
>>> Currently running a few routers on 5-STABLE I have read the
>>> recent changes in the network stack with interest.
>>
>> You should run 6.0R.  It contains many improvements over 5-STABLE.
>>
>>> A few questions come to my mind:
>>>
>>> - Can a machine that mainly routes packets between two em(4)
>>> interfaces benefit from a second CPU and SMP kernel? Can both
>>> CPUs process packets from the same interface in parallel?
>>
>> My testing has shown that a machine can benefit from it, but not
>> much in forwarding performance.  The main benefit is the
>> prevention of livelock if you have very high packet loads.  The
>> second CPU on SMP keeps on doing all userland tasks and running
>> routing protocols.  Otherwise your BGP sessions or OSPF hellos
>> would stop and remove you from the routing cloud.
>>
>>> - From reading the lists it appears that net.isr.direct
>>> and net.ip.fastforwarding are doing similar things. Should
>>> they be used together or rather not?
>>
>> net.inet.ip.fastforwarding has precedence over net.isr.direct, and
>> enabling both at the same time doesn't gain you anything.
>> Fastforwarding is about 30% faster than all other methods available,
>> including polling.  On my test machine with two em(4) and an AMD
>> Opteron 852 (2.6GHz) I can route 580'000 pps with zero packet loss
>> on -CURRENT.  An upcoming optimization that will go into -CURRENT in
>> the next few days pushes that to 714'000 pps.  Further optimizations
>> are underway to make a stock kernel do close to or above 1'000'000 pps
>> on the same hardware.
>>
>> --
>> Andre

Re: Routing SMP benefit

2008-01-02 Thread Bruce M. Simpson

Andre Oppermann wrote:
> So far the PPS rate limit has primarily been the cache miss penalties
> on the packet access.  Multiple CPUs can help here of course for bi-
> directional traffic.  Hardware based packet header cache prefetching as
> done by some embedded MIPS based network processors at least doubles the
> performance.  Intel has something like this for a couple of chipset and
> network chip combinations.  We don't support that feature yet though.

What sort of work is needed in order to support header prefetch?

> Many of the things you mention here are planned for FreeBSD 8.0 in the
> same or different form.  Work in progress is the separation of the ARP
> table from the kernel routing table.  If we can prevent references to
> radix nodes generally, almost all locking can be done away with.
> Instead only a global rmlock (read-mostly) could govern the entire
> routing table.  Obtaining the rmlock for reading is essentially free.

This is exactly what I'm thinking; this feels like the right way forward.

A single rwlock should be fine: route table updates should generally
only be happening from one process, and thus a single thread, at any
given time.



> Table changes
> are very infrequent compared to lookups (like 700,000 to 300-400) in
> default-free Internet routing.  The radix trie nodes are rather big
> and could use some more trimming to make them fit a single cache line.
> I've already removed some stuff a couple of years ago and more can be
> done.
>
> It's very important to keep this in mind: "profile, don't speculate".

Beware though that functionality isn't sacrificed at the expense of this.

For example it would be very, very useful to be able to merge the
multicast routing implementation with the unicast -- with the proviso of
course that mBGP requires that RPF can be performed with a separate set
of FIB entries from the unicast FIB.

Of course this requires that next-hops themselves are held in a container
separately referenced from the radix node, such as a simple linked list
as per the OpenBSD code.

If we ensure the parent radix trie node object fits in a cache line,
then that's fine.

[I am looking at some stuff in the dynamic/ad-hoc/mesh space which is
really going to need support for multipath similar to this.]


later
BMS


Re: Routing SMP benefit

2008-01-02 Thread Andre Oppermann

Bruce M. Simpson wrote:
> Andre Oppermann wrote:
>> So far the PPS rate limit has primarily been the cache miss penalties
>> on the packet access.  Multiple CPUs can help here of course for bi-
>> directional traffic.  Hardware based packet header cache prefetching as
>> done by some embedded MIPS based network processors at least doubles
>> the performance.  Intel has something like this for a couple of chipset
>> and network chip combinations.  We don't support that feature yet
>> though.
>
> What sort of work is needed in order to support header prefetch?


Extracting the documentation out of Intel is the first step.  It's
called Direct Cache Access (DCA).  At least in the Linux implementation
it has been intermingled with I/OAT, which is an asynchronous memory
controller based DMA copy mechanism.  Don't know if they really have
to be together.  The idea of DCA is to have the memory controller,
upon DMA'ing a packet into main memory, also load it into the
CPU cache(s) right away.  For packet forwarding the first 128 bytes
are sufficient.  For server applications and TCP it may be beneficial
to prefetch the whole packet.  May cause some considerable cache
pollution though, depending on usage.

Some pointers:

http://www.stanford.edu/group/comparch/papers/huggahalli05.pdf
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=tree;f=drivers/dca;hb=HEAD
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=tree;f=drivers/dma;hb=HEAD
http://download.intel.com/technology/comms/perfnet/download/ServerNetworkIOAccel.pdf
http://download.intel.com/design/network/prodbrf/317796.pdf
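
Until DCA is supported, the effect can be approximated in software by
prefetching the header before the forwarding path touches it.  A minimal
sketch (CACHE_LINE, HDR_BYTES and the function name are assumptions, not
a real driver interface):

#include <stddef.h>

#define	CACHE_LINE	64	/* assumed cache line size */
#define	HDR_BYTES	128	/* first 128 bytes suffice for forwarding */

static inline void
hdr_prefetch(const void *hdr)
{
	const char *p = hdr;
	size_t off;

	/* Pull the header cache lines in before they are dereferenced. */
	for (off = 0; off < HDR_BYTES; off += CACHE_LINE)
		__builtin_prefetch(p + off, 0, 3);	/* read, high locality */
}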


>> Many of the things you mention here are planned for FreeBSD 8.0 in the
>> same or different form.  Work in progress is the separation of the ARP
>> table from the kernel routing table.  If we can prevent references to
>> radix nodes generally, almost all locking can be done away with.
>> Instead only a global rmlock (read-mostly) could govern the entire
>> routing table.  Obtaining the rmlock for reading is essentially free.
>
> This is exactly what I'm thinking; this feels like the right way forward.
>
> A single rwlock should be fine: route table updates should generally
> only be happening from one process, and thus a single thread, at any
> given time.


rmlocks are even faster, and the change-to-use ratio here is also quite
right for them.


>> Table changes
>> are very infrequent compared to lookups (like 700,000 to 300-400) in
>> default-free Internet routing.  The radix trie nodes are rather big
>> and could use some more trimming to make them fit a single cache line.
>> I've already removed some stuff a couple of years ago and more can be
>> done.
>>
>> It's very important to keep this in mind: "profile, don't speculate".
>
> Beware though that functionality isn't sacrificed at the expense of this.
>
> For example it would be very, very useful to be able to merge the
> multicast routing implementation with the unicast -- with the proviso of
> course that mBGP requires that RPF can be performed with a separate set
> of FIB entries from the unicast FIB.
>
> Of course this requires that next-hops themselves are held in a
> container separately referenced from the radix node, such as a simple
> linked list as per the OpenBSD code.


Haven't looked at the multicast code so I can't comment.  The other
stuff is just talk so far.  No work in progress, at least from my side.

> If we ensure the parent radix trie node object fits in a cache line,
> then that's fine.
>
> [I am looking at some stuff in the dynamic/ad-hoc/mesh space which is
> really going to need support for multipath similar to this.]


I was looking at a parallel forwarding table for fastforward
that is highly optimized for IPv4 and cache efficiency.  It was
supposed to be 8-bit stride based (256-ary) with SSE based multi-
segment longest prefix match updates.  Never managed to get this past
the design stage though.  And it's not one of the pressing issues.
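
The lookup side of such a design is simple; a sketch under those
assumptions (four 8-bit strides, a next-hop index per node; illustrative
only, not the actual radix code, and without the SSE update path):

#include <stdint.h>
#include <stddef.h>

struct stride_node {
	struct stride_node	*child[256];	/* NULL = no more-specific */
	int			 nexthop;	/* -1 = no route at this prefix */
};

static int
stride_lookup(const struct stride_node *root, uint32_t dst)
{
	const struct stride_node *n = root;
	int best = -1;
	int shift;

	/* Consume one byte per level, remembering the longest match seen. */
	for (shift = 24; shift >= 0 && n != NULL; shift -= 8) {
		if (n->nexthop != -1)
			best = n->nexthop;
		n = n->child[(dst >> shift) & 0xff];
	}
	if (n != NULL && n->nexthop != -1)
		best = n->nexthop;
	return (best);
}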

The radix trie is pretty efficient though for being architecture
independent.  Even though the depth and variety in destination
addresses matters, it never really turned out to become a bottleneck
in my profiles at the time.  It does have its limitations though,
which become more apparent at very high PPS and with very large
routing tables as in the DFZ.

--
Andre



Re: if_ral regression

2008-01-02 Thread Sepherosa Ziehau
On Jan 3, 2008 12:16 AM, Dag-Erling Smørgrav <[EMAIL PROTECTED]> wrote:
> "Sepherosa Ziehau" <[EMAIL PROTECTED]> writes:
> > http://people.freebsd.org/~sephe/rt2560_test.diff
>
> Thank you, I'll try that.
>
> Could you explain what the RT2560_BBP_BUSY loop is about?

A BBP read involves one write to RT2560_BBPCSR, and this register
should not be touched until its RT2560_BBP_BUSY bit is off (according
to Ralink's Linux driver).
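
In other words, the diff makes the access pattern look roughly like this
(a sketch only; the retry count and timeout handling are illustrative,
the register names are those used in the diff):

static int
rt2560_bbp_wait_sketch(struct rt2560_softc *sc)
{
	int ntries;

	/*
	 * Wait for a previous BBP operation to complete before
	 * writing RT2560_BBPCSR again.
	 */
	for (ntries = 0; ntries < 100; ntries++) {
		if (!(RAL_READ(sc, RT2560_BBPCSR) & RT2560_BBP_BUSY))
			return (0);
		DELAY(1);
	}
	return (ETIMEDOUT);
}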

Best Regards,
sephe

-- 
Live Free or Die


unp_connect() locking problems with early returns

2008-01-02 Thread James Juran
There are two early returns in unp_connect() that need to re-acquire the
UNP global lock before returning.  This program will trigger a panic on
a WITNESS-enabled system.  I tested on the December snapshot of
8.0-CURRENT, but the same problem occurs in RELENG_7.

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
int s;
struct sockaddr_un un;

s = socket(PF_LOCAL, SOCK_STREAM, 0);
if (s == -1)
{
perror("socket");
exit(1);
}

memset(&un, 0, sizeof(un));
un.sun_family = AF_UNIX;
if ((connect(s, (struct sockaddr *)&un, 2)) == -1)
{
perror("connect");
exit(1);
}
return 0;
}

I believe this patch will fix the problem, but unfortunately I do not
have time to test it.  Could someone please try this out?  Instead of
this approach, it may be possible to move the unlocking to after the
early returns are done (a rough sketch follows the patch below), but I
have not analyzed what impact this would have.

Index: uipc_usrreq.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.210
diff -u -p -r1.210 uipc_usrreq.c
--- uipc_usrreq.c	1 Jan 2008 01:46:42 -0000	1.210
+++ uipc_usrreq.c	3 Jan 2008 02:53:51 -0000
@@ -1129,13 +1129,16 @@ unp_connect(struct socket *so, struct so
 	KASSERT(unp != NULL, ("unp_connect: unp == NULL"));
 
 	len = nam->sa_len - offsetof(struct sockaddr_un, sun_path);
-	if (len <= 0)
+	if (len <= 0) {
+		UNP_GLOBAL_WLOCK();
 		return (EINVAL);
+	}
 	strlcpy(buf, soun->sun_path, len + 1);
 
 	UNP_PCB_LOCK(unp);
 	if (unp->unp_flags & UNP_CONNECTING) {
 		UNP_PCB_UNLOCK(unp);
+		UNP_GLOBAL_WLOCK();
 		return (EALREADY);
 	}
 	unp->unp_flags |= UNP_CONNECTING;
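
For comparison, the alternative mentioned above -- moving the unlock
instead of re-acquiring -- might look roughly like this (an untested
sketch based only on the code visible in the patch; the placement of
UNP_PCB_UNLOCK()/UNP_GLOBAL_WUNLOCK() afterwards is assumed, not taken
from the actual source):

	len = nam->sa_len - offsetof(struct sockaddr_un, sun_path);
	if (len <= 0)
		return (EINVAL);	/* UNP global lock still held */
	strlcpy(buf, soun->sun_path, len + 1);

	UNP_PCB_LOCK(unp);
	if (unp->unp_flags & UNP_CONNECTING) {
		UNP_PCB_UNLOCK(unp);
		return (EALREADY);	/* UNP global lock still held */
	}
	unp->unp_flags |= UNP_CONNECTING;
	UNP_PCB_UNLOCK(unp);
	UNP_GLOBAL_WUNLOCK();		/* only now drop it for the namei() path */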


-- 
James Juran
Lead Secure Systems Engineer
BAE Systems Information Technology
Information Assurance Group
XTS Operating Systems
[EMAIL PROTECTED]
