Debugging em(4) driver

2010-11-13 Thread Patrick Mahan

Good afternoon,

I am trying to run down the root cause of a link failure between two of my
HP ProLiant DL350s.  pciconf shows them as the 82571EB chip; it is a
4-port card in each HP.

We are doing some routing code in the kernel and have a call to our entry
function in the forwarding path in ip_input().  We have WITNESS and
INVARIANTS enabled in our kernel configuration.

Our current testbed has the two HPs' em3 ports connected via an Ethernet
crossover cable.  I am generating traffic using 'iperf -u -b 20M', and these
links show up as 1000 Mbps, so I should not be saturating the interface.

The traffic is fine for a while, but then we stop seeing any more traffic.
A ping started before generating the traffic just stops almost as soon as
the iperf traffic begins.

I cannot find any error messages indicating that there is a problem with
em(4).  Is there anything I can use to debug this issue?  If not, then
I guess I will need to add some debugging to the xmit path of em(4).

Another issue that sometimes rears its head, especially if I have a lot of
printf's occurring on a per-packet basis, is that I get a double-deallocation
panic from uma_dbg_free().  When this has occurred, I have freed the packet
using m_freem() (since we don't want the packet to be forwarded), but then
it looks like em(4) takes an interrupt and frees the packet again at
em_init_locked()+0x91f, which may be in the receive handler code.

Any and all help/ideas will be appreciated.

Output of 'ifconfig em3'
em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1f:29:5f:c6:aa
inet 172.16.13.30 netmask 0xffffff00 broadcast 172.16.13.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

Output of 'netstat -I em3'
Name  Mtu   Network       Address              Ipkts  Ierrs  Opkts  Oerrs  Coll
em3   1500                00:1f:29:5f:c6:aa    11099  89322  11298      0     0
em3   1500  172.16.13.0   172.16.13.30         11096      -  11296      -     -


pciconf -lv shows -

e...@pci0:21:0:0: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
vendor = 'Intel Corporation'
device = '82571EB Gigabit Ethernet Controller (Copper)'
class  = network
subclass   = ethernet
e...@pci0:21:0:1: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
vendor = 'Intel Corporation'
device = '82571EB Gigabit Ethernet Controller (Copper)'
class  = network
subclass   = ethernet
e...@pci0:22:0:0: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
vendor = 'Intel Corporation'
device = '82571EB Gigabit Ethernet Controller (Copper)'
class  = network
subclass   = ethernet
e...@pci0:22:0:1: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
vendor = 'Intel Corporation'
device = '82571EB Gigabit Ethernet Controller (Copper)'
class  = network
subclass   = ethernet

Thanks,

Patrick
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Debugging em(4) driver

2010-11-13 Thread Patrick Mahan



On 11/13/2010 02:27 PM, Ryan Stone wrote:

It looks to me that you're getting a ton of input drops.  That's
presumably the cause of your issue.  You can get the em driver to
print debug information to the console by running:

# sysctl dev.em.3.stats=1
# sysctl dev.em.3.debug=1

The output should be available in dmesg and /var/log/messages

Hopefully that can shed some light on the nature of the drops.


Ryan,

Thanks for the tip.  But I see I forgot to mention this was FreeBSD 8.0.
The em(4) driver is actually the one found in FreeBSD 8.1 as we needed
the AltQ fixes.

However, I do not see these sysctls in the code or when I do a

'sysctl dev.em.3'

It looks like something changed between 8.0 and 8.1?  I now see an if_lem.c,
which has the sysctls you are referring to.

Here is the output of my sysctl

npxk3# sysctl dev.em.3.stats=1
sysctl: unknown oid 'dev.em.3.stats'
npxk3# sysctl dev.em.3
dev.em.3.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
dev.em.3.%driver: em
dev.em.3.%location: slot=0 function=1
dev.em.3.%pnpinfo: vendor=0x8086 device=0x10bc subvendor=0x103c subdevice=0x704b class=0x02
dev.em.3.%parent: pci22
dev.em.3.nvm: -1
dev.em.3.rx_int_delay: 0
dev.em.3.tx_int_delay: 66
dev.em.3.rx_abs_int_delay: 66
dev.em.3.tx_abs_int_delay: 66
dev.em.3.rx_processing_limit: 100
dev.em.3.link_irq: 0
dev.em.3.mbuf_alloc_fail: 0
dev.em.3.cluster_alloc_fail: 0
dev.em.3.dropped: 0
dev.em.3.tx_dma_fail: 0
dev.em.3.fc_high_water: 30720
dev.em.3.fc_low_water: 29220
dev.em.3.mac_stats.excess_coll: 0
dev.em.3.mac_stats.symbol_errors: 0
dev.em.3.mac_stats.sequence_errors: 0
dev.em.3.mac_stats.defer_count: 0
dev.em.3.mac_stats.missed_packets: 0
dev.em.3.mac_stats.recv_no_buff: 0
dev.em.3.mac_stats.recv_errs: 0
dev.em.3.mac_stats.crc_errs: 0
dev.em.3.mac_stats.alignment_errs: 0
dev.em.3.mac_stats.coll_ext_errs: 0
dev.em.3.mac_stats.rx_overruns: 0
dev.em.3.mac_stats.watchdog_timeouts: 0
dev.em.3.mac_stats.xon_recvd: 0
dev.em.3.mac_stats.xon_txd: 0
dev.em.3.mac_stats.xoff_recvd: 0
dev.em.3.mac_stats.xoff_txd: 0
dev.em.3.mac_stats.total_pkts_recvd: 58365716
dev.em.3.mac_stats.good_pkts_recvd: 58365716
dev.em.3.mac_stats.bcast_pkts_recvd: 9
dev.em.3.mac_stats.mcast_pkts_recvd: 0
dev.em.3.mac_stats.rx_frames_64: 16
dev.em.3.mac_stats.rx_frames_65_127: 5612
dev.em.3.mac_stats.rx_frames_128_255: 10355
dev.em.3.mac_stats.rx_frames_256_511: 29103556
dev.em.3.mac_stats.rx_frames_512_1023: 6633
dev.em.3.mac_stats.rx_frames_1024_1522: 29239544
dev.em.3.mac_stats.good_octets_recvd: 0
dev.em.3.mac_stats.good_octest_txd: 0
dev.em.3.mac_stats.total_pkts_txd: 165551
dev.em.3.mac_stats.good_pkts_txd: 165551
dev.em.3.mac_stats.bcast_pkts_txd: 8
dev.em.3.mac_stats.mcast_pkts_txd: 2
dev.em.3.mac_stats.tx_frames_64: 19
dev.em.3.mac_stats.tx_frames_65_127: 5573
dev.em.3.mac_stats.tx_frames_128_255: 10348
dev.em.3.mac_stats.tx_frames_256_511: 3308
dev.em.3.mac_stats.tx_frames_512_1023: 6680
dev.em.3.mac_stats.tx_frames_1024_1522: 139623
dev.em.3.mac_stats.tso_txd: 0
dev.em.3.mac_stats.tso_ctx_fail: 0
dev.em.3.interrupts.asserts: 0
dev.em.3.interrupts.rx_pkt_timer: 0
dev.em.3.interrupts.rx_abs_timer: 0
dev.em.3.interrupts.tx_pkt_timer: 0
dev.em.3.interrupts.tx_abs_timer: 0
dev.em.3.interrupts.tx_queue_empty: 0
dev.em.3.interrupts.tx_queue_min_thresh: 0
dev.em.3.interrupts.rx_desc_min_thresh: 0
dev.em.3.interrupts.rx_overrun: 0
dev.em.3.host.breaker_tx_pkt: 0
dev.em.3.host.host_tx_pkt_discard: 0
dev.em.3.host.rx_pkt: 0
dev.em.3.host.breaker_rx_pkts: 0
dev.em.3.host.breaker_rx_pkt_drop: 0
dev.em.3.host.tx_good_pkt: 0
dev.em.3.host.breaker_tx_pkt_drop: 0
dev.em.3.host.rx_good_bytes: 0
dev.em.3.host.tx_good_bytes: 0
dev.em.3.host.length_errors: 0
dev.em.3.host.serdes_violation_pkt: 0
dev.em.3.host.header_redir_missed: 0

Thanks,

Patrick


Re: routed source code

2010-11-13 Thread Patrick Mahan



On 11/13/2010 05:23 PM, Milen Dzhumerov wrote:

Hi all,

We're investigating some ways to perform symbolic execution of distributed systems and 
we're looking for real-world programs to test. The "routed" daemon[1] which is 
included with FreeBSD seemed like a good candidate and I was wondering whether anyone can 
point me to its implementation location in the source code repositories.

Thanks,
Milen



Milen,

routed resides in /sbin, so look in /usr/src/sbin/routed.

Patrick


Re: Debugging em(4) driver

2010-11-14 Thread Patrick Mahan



On 11/13/2010 09:08 PM, Jack Vogel wrote:

The stats changed quite a bit for 8.1, they are much more informative now,
and they can be collected from anywhere not just the console.

I don't quite understand what you are trying to do: debug an em problem,
or just debug a problematic situation by using em?

Mike is right, the driver in HEAD has some significant fixes and would be
the best thing to use.



We were noticing a lot of slowdown on the link between the two HPs.  I would
start a normal ping between the two boxes, start the traffic generation, and
I would see the pings completely stop, even though I could see I was still
getting traffic on the em(4) interface.  Once the pings stopped, we eventually
started seeing failures in our network app that needs to query its peer.  It
would time out trying to send a notification message to its peer.

I wanted to see if there was a failure somewhere in the interface layer, such
as dropped packets.  Plus I wanted to ensure the ring buffer em(4) was using
wasn't starving for packets.

Patrick


Re: Setting up a running FreeBSD/PCBSD system to enter kgdb on panic

2011-04-05 Thread Patrick Mahan


On 4/5/11 10:38 AM, fbsdm...@dnswatch.com wrote:
> 
> On Mon, April 4, 2011 5:07 pm, Eitan Adler wrote:
>> On Mon, Apr 4, 2011 at 7:35 PM, David Somayajulu
>>  wrote:
>>
>>> Hi All,
>>> Is there some way I can setup a running FreeBSD - (I use PCBSD7.2) - to
>>> break into kgdb when the system panics. I am trying to get a stack
>>> trace when "Fatal trap 12: page fault while in kernel mode" happens.
>>
>> debug.debugger_on_panic=1
> 
> Does this line go in C:\Windows\system32\win.ini?
> 

No, it's a sysctl setting.  Issue it as root or via 'sudo':

% sudo sysctl debug.debugger_on_panic=1

This assumes your kernel has been built with DDB/KDB enabled.
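To spell that out (a sketch; the option names are from ddb(4)/kdb(4), and the sysctl can also be made persistent via sysctl.conf):

```
# kernel configuration: the debugger must be compiled in
options KDB     # kernel debugger framework
options DDB     # interactive console debugger backend

# /etc/sysctl.conf: keep the setting across reboots
debug.debugger_on_panic=1
```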

Patrick


Intel Pro/1000 PT Quad Port Bypass Server Adapter

2011-06-21 Thread Patrick Mahan
All,

We have a requirement for fail-to-wire that we are meeting by using these
types of bypass NICs.  We have some from Silicom, which provided us a
modified em(4) driver, but now we have a few NICs coming that are straight
from Intel.  However, the website doesn't list the correct driver(s) for
this card.

Will the current (or even the HEAD) em(4) driver work for this type of NIC?
I don't see anything in the sysctls for enabling/disabling the bypass mode.

Thanks for any help,

Patrick


Usage of IFQ_DEQUEUE vs IFQ_DRV_DEQUEUE

2011-08-24 Thread Patrick Mahan
Can somebody confirm my assumption on the following:

  If I am supporting ALTQ in a driver, then I should use the
  IFQ_DRV_DEQUEUE() macro.  If I am not supporting ALTQ, then
  it is okay to use the IFQ_DEQUEUE() macro?  If not, what's the difference?

Slightly confused...
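As I understand altq(9) (worth verifying there), both macros are ALTQ-aware: IFQ_DEQUEUE() always dequeues a single packet through ALTQ, while IFQ_DRV_DEQUEUE() serves a small driver-managed staging queue that is bulk-filled when ALTQ is inactive and falls back to one-at-a-time dequeue when it is active.  A kernel-style sketch of a start routine using the driver-managed variant (pseudocode, not buildable on its own; xx_start_locked is a made-up driver name):

```
/* sketch after altq(9): drain if_snd in a driver start routine */
static void
xx_start_locked(struct ifnet *ifp)
{
	struct mbuf *m;

	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
		IFQ_DRV_DEQUEUE(&ifp->if_snd, m);  /* ALTQ-aware dequeue */
		if (m == NULL)
			break;
		/* ...encapsulate m and hand it to the hardware... */
	}
}
```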

Thanks,

Patrick


Re: Usage of IFQ_DEQUEUE vs IFQ_DRV_DEQUEUE

2011-08-26 Thread Patrick Mahan


On 8/24/11 11:42 AM, Sergey Kandaurov wrote:
> On 24 August 2011 22:12, Patrick Mahan  wrote:
>> Can somebody confirm my assumption on the following:
>>
>>  If I am supporting ALTQ in a driver, then I should use the
>>  IFQ_DRV_DEQUEUE() macro.  If I am not supporting ALTQ then
>>  it is okay IFQ_DEQUEUE() macro?  If not what's the difference?
>>
>> Slightly confused...
>>
> 
> Just in case, have you read man 9 altq? It has a good description of
> these macros.
> 

Sergey,

That is exactly what I was looking for.  I don't know how I missed that
(I sometimes get caught up in reading the source).

Thanks,

Patrick


Any plans to upgrade the tftp client and server images for FreeBSD?

2009-12-30 Thread Patrick Mahan

Not sure if this is the correct list, but I am working as part of
a kernel team that is using FreeBSD 8.0 as its base OS.

We have had an ongoing issue with our bootloader (u-boot) being
unable to tftp from the tftp server running on our FreeBSD server.
We traced it down to the tftp code in u-boot using the 'blksize'
option and not handling the option NAK correctly.  Since we didn't
want to require a change in the bootloader, it was instead decided
to fix the tftp server to support RFC 2348.  After looking around
the internet, we found that the tftp server under NetBSD did support
RFC 2348.  This made it an easy port: a one-line change to the
usr.bin/tftp/Makefile and a slight change to libexec/tftpd.c (changing
the name of an internal function from 'sendfile' back to 'xmitfile').
It has been working just fine for us.

So I have been tasked with asking if the FreeBSD developers would
like this code for future inclusion (or one of the current developers
could just grab it from NetBSD).

Reading the website, it seems that to contribute we need to be running
-CURRENT, which is not currently possible for us (for other reasons we are
using 8.0; this is actually a recent upgrade, as we were previously on
FreeBSD 6.2).

So if this is something that could be useful, I have the code and a patch
to modify the original NetBSD code to contribute.

Also, if it has already been done, then I was not able to find it (I tried
the CVS and SVN web source browsers and did not see any changes related to
adding RFC 2348 support).

Thanks for listening,

Patrick


Re: Anon port selection

2010-01-08 Thread Patrick Mahan

See inline -

Janne Huttunen wrote:

Hi!

The selection of an anonymous port in FreeBSD seems to act
a bit weird (bug?).  This was first observed in actual
use on FreeBSD 6.2, but I have verified that it
behaves the same on a December snapshot of CURRENT too.

1. A process creates a UDP socket and sends a packet
   from it (at which point a local port is assigned
   for it).
2. Another process creates a UDP socket, sets
   SO_REUSEADDR (or SO_REUSEPORT) and sends a packet
   from it (at which point a local port is assigned
   for it).

Every now and then it happens that the second process
gets the same local port as the first one. If the
second process doesn't set the socket option this
won't happen. Note however, that the first process
does not have to cooperate in any way i.e. it does
not set any options.

Now, I'm fairly newbie when it comes to the FreeBSD
IP stack, but it seems to me that this phenomenon is
caused by the code in in_pcbconnect_setup().  If the
local port is zero, in_pcbbind_setup() is called
to select a port.  That routine is called with the
local address set to the source address selected for
the outgoing packet, but when the port has been
selected, it is committed with INADDR_ANY as the
local address. Then when the second process in
in_pcbbind_setup() tries to check if the port is
already in use, it won't match the INADDR_ANY and
assigns the same port again.


Well it has been almost 20 years since I first ran across
this issue and was told back then that it was "as designed".
I believe you will see that this only happens when INADDR_ANY
is in effect.  If instead you use a specific IP address as
your source it should not happen.  I have not had a chance
to really go over the FreeBSD TCP/IP stack since the beginnings
of FreeBSD back in the early 90's (we were using basically the
same code for our product on a different architecture).

As an example of what he was explaining, he pointed to the BIND
code, which expressly binds to each interface's IP address
instead of to INADDR_ANY to prevent snooping.

I apologize if I am somewhat off base, having only re-entered
playing with FreeBSD in the last few months.

Patrick


Issues with em(4) device under FreeBSD 8.0

2010-02-18 Thread Patrick Mahan

All,

I have seen a few mentions on the mailing lists of throughput issues
with em(4) under FreeBSD 8.0.

We are also seeing similar issues on HP ProLiant systems with
the HP GbE interfaces.  Previously we were running FreeBSD 6.2, and
iperf showed ~900 Mbits/sec between two directly connected
systems.  After the upgrade, iperf only shows around ~350 Mbits/sec.

This seems to be happening only on the HPs.  When we upgraded another
(privately built) x86 box, we still see ~900 Mbits/sec, even to
one of the HP systems.

I haven't seen anything yet to account for this behavior.  Has anyone
else seen similar issues?

Thanks,

Patrick


freebsd-net@freebsd.org

2010-02-23 Thread Patrick Mahan

>
>[12] 
>http://caia.swin.edu.au/newtcp/tools/caia_modularcc_v0.9.4_9.x.r203910.patch
>
>[13] http://caia.swin.edu.au/newtcp/tools/modularcc-readme-0.9.4.txt
>

I believe these are incorrect.  I find these documents at the following URLs:

[12] 
http://caia.swin.edu.au/urp/newtcp/tools/caia_modularcc_v0.9.4_9.x.r203910.patch
[13] http://caia.swin.edu.au/urp/newtcp/tools/modularcc-readme-0.9.4.txt

Thanks,

Patrick


Multicast under FBSD 8.0

2010-06-22 Thread Patrick Mahan

All,

Hoping for a little insight, as I am not a user of multicast, nor
do I know much about the servers that use it.

In my day job, I am helping move my company's product
from FreeBSD 6.2 (i386) to FreeBSD 8.0 (amd64).  One of the daemons
wants to use the 224.0.0.9 (routed? RIP?) multicast group.

The problem is this worked fine on 6.2 but when we moved to 8.0
the daemon started reporting "Network unreachable" errors when
it was trying to send a packet out to the multicast group.

I tracked it down to the following in the routing table:

% netstat -nr

Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            10.10.1.1          UG          0        0   bce0
10.10.0.0/16       link#5             U           3     1253   bce0
...
224.0.0.2          127.0.0.1          UH          0        0    lo0
224.0.0.9          127.0.0.1          UH          0        0    lo0

Notice that 224.0.0.9 has a route pointing to the loopback interface,
even though the code uses the IP_MULTICAST_IF socket option to specify
the interface.  If this entry does not exist or points to a true physical
interface, then there is no issue.

I did some research and found that this code was changed in in_pcb.c
as part of revision 105629 for FreeBSD 7.2, but I don't understand the
change or why the loopback route is no longer allowed.  I get asked
daily by the developers of this daemon for the reason, so I was hoping
to get some enlightenment here.

Thanks for listening,

Patrick


Re: Multicast under FBSD 8.0

2010-06-25 Thread Patrick Mahan

Pierre,

The RIP source is all the BSD boxes in the current broadcast domain
that run our product.

The app does pick which interface to send the message out on and sets
that using the appropriate MULTICAST setsockoptions.

It is built for 8.0 (Or rather it is built using the 8.0 toolchain :-))
However, I believe this was something that was automatically configured
for each box under 6.2 as part of the normal configuration.  We have
since removed it and the app is now working fine.

I was being asked why the change occurred, and since multicast is not
something I am familiar with, I turned to this list.

Thanks,

Patrick

Pierre Lamy wrote:
Multicast traffic doesn't get routed in a traditional sense; it sort of 
gets repackaged for delivery to requesting recipients.


And 224.0.0.0/24 should never get retransmitted; it's for within a broadcast 
domain only.


Is the RIP source the BSD box itself? If so, the app should determine 
what interfaces to send on, and then use that. Can you recompile the 
daemon for 8?


Pierre

Patrick Mahan wrote:

All,

Hoping for a little insight as I am not a user of multicast nor
do I know much about the servers that use them.

In my day job, I am helping with the moving of my company's product
from FreeBSD 6.2 (i386) to FreeBSD 8.0 (amd64).  One of the daemons
wants to use 224.0.0.9 (routed? rip?) multicast group.

The problem is this worked fine on 6.2 but when we moved to 8.0
the daemon started reporting "Network unreachable" errors when
it was trying to send a packet out to the multicast group.

I tracked it down to the following in the routing table:

% netstat -nr

Routing tables

Internet:
DestinationGatewayFlagsRefs  Use  Netif 
Expire

default10.10.1.1  UG  00   bce0
10.10.0.0/16   link#5 U   3 1253   bce0
...
224.0.0.2  127.0.0.1  UH  00lo0
224.0.0.9  127.0.0.1  UH  00lo0

Notice that 224.0.0.9 has a route pointing to the loopback interface,
even though the code uses the IP_MULTICAST_IF socket option to specify
the interface.  If this entry does not exist or points to a true physical
interface, then there is no issue.

I did some research on this and found this code all changed in in_pcb.c
as part of revision 105629 for FreeBSD 7.2.  But I don't understand why
he change and why the loopback was no longer allowed.  I get asked
daily by the developers of this daemon for the reason, so I was hoping
to get some enlightment here.

Thanks for listening,

Patrick








Looking for some education on ALTQ

2010-07-20 Thread Patrick Mahan

I am the first to admit I don't understand ALTQ and its impact on QoS, but
that said, I am trying to learn.

I have the following three systems, all running FreeBSD 8.0-p2 RELEASE.
I am attempting to learn how AltQ can be used to prioritize traffic and set
up bandwidth pipelines.

At the end of this message are the topology and all (I hope) the relevant
info for someone to help me understand what is happening.

I have set up AltQ on em0 on NPX3 with a queue configured to run at 1.9 Mbps.
However, in testing by generating UDP traffic on NPX4 and having it received
on NPX3, I am seeing what I believe to be really low throughput for queue
test7788 (see the pf.conf below).

pfctl -vv -s queue shows the bandwidth starts at 4.84 Kb/s, not the 1.9 Mbps
I was expecting.

Am I:

1. Not driving the data stream high enough to eat 1.9 Mbps (what should I run
   as my iperf -b value)?

2. Misconfiguring AltQ?

or

3. Hitting a bug in AltQ?
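The pf.conf itself did not survive in the archive, but reconstructed from the pfctl -vv -s queue output quoted later in the thread, the ruleset being described is roughly of this shape (a sketch, not the actual file):

```
# sketch of an ALTQ CBQ setup on em0 (queue name per the thread)
altq on em0 cbq bandwidth 1Gb queue { test7788 }
queue test7788 bandwidth 1.9Mb cbq(default)
pass out on em0 proto udp from any to any queue test7788
```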

Thanks for the education -

Patrick


Network topology:


+--+   +--+
|  |   |  |
|   NPX4   |   |NPX3  |
| (em1)+ = +(em2) |
|  |   |  |
|  |   |(em0) |
+--+   +--+---+
  I
  I
  I
  I
  I
  I
  I
  I
   +--+---+
   |(em0) |
   |  |
   |NPX1  |
   |  |
   |  |
   +--+

NPX4:

  em1: 172.16.34.40/24

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1f:29:5f:c3:b8
inet 172.16.34.40 netmask 0xffffff00 broadcast 172.16.34.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active


NPX3:

  em2: 172.16.34.30/24

em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1f:29:5f:c6:ab
inet 172.16.34.30 netmask 0xffffff00 broadcast 172.16.34.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

  em0: 172.16.13.30/24

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1f:29:5f:c6:a9
inet 172.16.13.30 netmask 0xffffff00 broadcast 172.16.13.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

NPX1:

  em0: 172.16.13.10/24

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1c:c4:47:1a:35
inet 172.16.13.10 netmask 0xffffff00 broadcast 172.16.13.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

NPX4 IPv4 Routing table

npx4# netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs       Use  Netif Expire
default            10.10.1.1          UG          0       248   bce0
10.10.0.0/16       link#5             U           4   1000928   bce0
10.10.20.44        link#5             UHS         0    296522    lo0
122.16.1.0/24      122.16.2.11        UGS         0     40392    em3
122.16.1.3         122.16.2.11        UGHS        0    251513    em3
122.16.2.0/24      122.16.2.3         U           0         3    em3
122.16.2.3         link#4             UHS         0         0    lo0
127.0.0.0/8        127.0.0.1          UR          0         0    lo0
127.0.0.1          link#8             UH          0  45442955    lo0
172.16.13.0/24     172.16.34.30       UGS         0   1754682    em1
172.16.24.0/24     172.16.24.40       U           0   7805039    em0
172.16.24.40       link#1             UHS         0         0    lo0
172.16.34.0/24     172.16.34.40       U           0     10425    em1
172.16.34.40       link#2             UHS         0         0    lo0

NPX3 IPv4 Routing Table

npx3# netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs       Use  Netif Expire
default            10.10.1.1          UG          0         0   bce0
10.10.0.0/16       link#5             U           5     21972   bce0
10.10.20.43        link#5             UHS         0      5113    lo0
127.0.0.0/8        127.0.0.1          UR          0         0    lo0
127.0.0.1          link#9             UH          0    310084    lo0
172.16.13.0/24     link#1             U           0   1754798    em0
172.16.13.30       link#1             UHS         0         0    lo0
172.16.23.0/24     link#2             U           0       138    em1
172.16.23.30   

AltQ throughput issues (long message)

2010-07-30 Thread Patrick Mahan

All,

I am looking for (again) some understanding of AltQ and how it works
w.r.t. packet throughput.  I posted earlier this month about how
to initially configure AltQ (thanks for everyone's help) and now have
it working over the em(4) driver on a FreeBSD 8.0 platform (HP DL350 G5).

I had to bring in the em(4) driver from the 8-STABLE branch, but it is
working just fine so far (I needed to add drbr_needs_enqueue() to
if_var.h).

I have now gone back to trying to set up one queue with a bandwidth
of 1900 Kb/s (1.9 Mb/s).  I ran a test with 'iperf' using UDP and setting
the bandwidth to 25 Mb/s.  I then ran a test setting the queue bandwidth
to 20 Mb/s and running 'iperf' again using UDP and 25 Mb/s bandwidth.

In both cases, the throughput only seems to be 89% of the requested
throughput.

Test 1
  AltQ queue bandwidth 1.9 Mbs, iperf -b 25M
  pfctl -vv -s queue reported:

queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {test7788}
  [ pkts:  28298  bytes:   42771988  dropped pkts:  0 bytes:  0 ]
  [ qlength:   0/ 50  borrows:  0  suspends:  0 ]
  [ measured:   140.8 packets/s, 1.70Mb/s ]
queue  test7788 on em0 bandwidth 1.90Mb cbq( default )
  [ pkts:  28298  bytes:   42771988  dropped pkts: 397077 bytes: 600380424 ]
  [ qlength:  50/ 50  borrows:  0  suspends:   3278 ]
  [ measured:   140.8 packets/s, 1.70Mb/s ]

  iperf reported

[ ID] Interval   Transfer Bandwidth   Jitter   Lost/Total Datagrams
[  3]  0.0-200.4 sec  39.7 MBytes  1.66 Mbits/sec  6.998 ms 397190/425533 (93%)

Test 2
  AltQ queue bandwidth 20 Mbs, iperf -b 25M
  pfctl -vv -s queue reported:

queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {test7788}
  [ pkts: 356702  bytes:  539329126  dropped pkts:  0 bytes:  0 ]
  [ qlength:   0/ 50  borrows:  0  suspends:  0 ]
  [ measured:  1500.2 packets/s, 18.15Mb/s ]
queue  test7788 on em0 bandwidth 20Mb cbq( default )
  [ pkts: 356702  bytes:  539329126  dropped pkts: 149198 bytes: 225587376 ]
  [ qlength:  46/ 50  borrows:  0  suspends:  39629 ]
  [ measured:  1500.2 packets/s, 18.15Mb/s ]

  iperf reported

[ ID] Interval   Transfer Bandwidth   Jitter   Lost/Total Datagrams
[  3]  0.0-240.0 sec505 MBytes  17.6 Mbits/sec  0.918 ms 150584/510637 (29%)

Why can AltQ not drive it at full bandwidth?  This is just some preliminary
testing, but I want to scale this up to use all available AltQ CBQ queues
for various operations.


As always, my knowledge increases when I ask questions on
this list.

Thanks,

Patrick

=== Test Results ===
Network topology:


+--+   +--+
|  |   |  |
|   NPX8   |   |NPX3  |
| (em1)+ = +(em3) |
|  |   |  |
|  |   |(em0) |
+--+   +--+---+
  I
  I
  I
  I
  I
  I
  I
  I
   +--+---+
   |(em0) |
   |  |
   |NPX6  |
   |  |
   |  |
   +--+

NPX8:

  em1: 172.16.38.80/24

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1c:c4:48:93:10
inet 172.16.38.80 netmask 0xffffff00 broadcast 172.16.38.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active


NPX3:

  em3: 172.16.38.30/24

em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1f:29:5f:c6:aa
inet 172.16.38.30 netmask 0xffffff00 broadcast 172.16.38.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

  em0: 172.16.13.30/24

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1f:29:5f:c6:a9
inet 172.16.13.30 netmask 0xffffff00 broadcast 172.16.13.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

NPX6:

  em0: 172.16.13.60/24

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1c:c4:48:95:d1
inet 172.16.13.60 netmask 0xffffff00 broadcast 172.16.13.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

NPX8 IPv4 Routing table

npx8# netstat -nr
Routing tables

Internet:
DestinationGatewayFlagsRefs  Use  Ne

Re: AltQ throughput issues (long message)

2010-07-31 Thread Patrick Mahan


See my responses inline - PLM

On 07/30/2010 04:30 PM, Luigi Rizzo wrote:

On Fri, Jul 30, 2010 at 04:07:04PM -0700, Patrick Mahan wrote:

All,

I am looking for (again) some understanding of AltQ and how it works
w.r.t. packet through put.  I posted earlier this month regarding how
to initially configure AltQ (thanks to everyone's help) and now have
it working over the em(4) drive on a FreeBSD 8.0 platform (HP DL350 G5).

I had to bring the em(4) driver from the 8-Stable branch, but it is
working just fine so far (needed to add the drbr_needs_enqueue() to
if_var.h).

I have now gone back to trying to setup up one queue with a bandwith
of 1900 Kbs (1.9 Mbs).  I ran a test with 'iperf' using udp and setting
the bandwidth to 25 Mbs.  I then ran a test setting the queue bandwith
to 20 Mbs and running 'iperf' again using udp and 25 Mbs bandwith.

In both cases, the throughput only seems to be 89% of the requested
throughput.


part of it can be explained because AltQ counts the whole packet
(eg. 1514 bytes for a full frame) whereas iperf only considers the
UDP payload (e.g. 1470 bytes in your case).



Okay, but that only accounts for about 3%, and I am seeing around 11%.  Any
idea what might account for the remaining 8%?


The other thing you should check is whether there is any extra
traffic going through the interface that competes for the bottleneck
bandwidth. You have such huge drop rates in your tests that i
would not be surprised if you had ICMP packets going around
trying to slow down the sender.


No extra traffic.  All machines have a bce0 (not shown in the diagram)
that acts as the management port on a 10.10.0.0 network.  Neither NPX8
nor NPX6 has forwarding enabled.  That said, NPX3, where AltQ is
enabled, does have IP forwarding enabled, but it should not be trying to
route any packets incoming on bce0; I will need to confirm.  Would
running pfctl for a few seconds before starting iperf be enough?  Should
I create a second queue to capture these possible 'extra' packets?

As for the em(4) interfaces, they are all connected directly to each other
using Cat 6 crossover cables, no hub/switch involved.

Where do you see the drop?  If you are looking at the end of the pfctl output,
that is probably occurring after iperf has finished its run.  I have noticed
that the queue bandwidth numbers sharply decline after iperf has finished.




BTW have you tried dummynet in your config?



How would you suggest using dummynet?  Is it workable as a QoS solution?

Thanks as always,

Patrick


cheers
luigi



Test 1
   AltQ queue bandwidth 1.9 Mbs, iperf -b 25M
   pfctl -vv -s queue reported:

queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {test7788}
   [ pkts:  28298  bytes:   42771988  dropped pkts:  0 bytes:  0
   ]
   [ qlength:   0/ 50  borrows:  0  suspends:  0 ]
   [ measured:   140.8 packets/s, 1.70Mb/s ]
queue  test7788 on em0 bandwidth 1.90Mb cbq( default )
   [ pkts:  28298  bytes:   42771988  dropped pkts: 397077 bytes:
   600380424 ]
   [ qlength:  50/ 50  borrows:  0  suspends:   3278 ]
   [ measured:   140.8 packets/s, 1.70Mb/s ]

   iperf reported

[ ID] Interval   Transfer Bandwidth   Jitter   Lost/Total
Datagrams
[  3]  0.0-200.4 sec  39.7 MBytes  1.66 Mbits/sec  6.998 ms 397190/425533
(93%)

Test 2
   AltQ queue bandwidth 20 Mbs, iperf -b 25M
   pfctl -vv -s queue reported:

queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {test7788}
   [ pkts: 356702  bytes:  539329126  dropped pkts:  0 bytes:  0
   ]
   [ qlength:   0/ 50  borrows:  0  suspends:  0 ]
   [ measured:  1500.2 packets/s, 18.15Mb/s ]
queue  test7788 on em0 bandwidth 20Mb cbq( default )
   [ pkts: 356702  bytes:  539329126  dropped pkts: 149198 bytes:
   225587376 ]
   [ qlength:  46/ 50  borrows:  0  suspends:  39629 ]
   [ measured:  1500.2 packets/s, 18.15Mb/s ]

   iperf reported

[ ID] Interval   Transfer Bandwidth   Jitter   Lost/Total
Datagrams
[  3]  0.0-240.0 sec505 MBytes  17.6 Mbits/sec  0.918 ms 150584/510637
(29%)

Why can AltQ not drive it at full bandwidth?  This is just some preliminary
testing,
but I want to scale this up to use all available AltQ CBQ queues for
various operations.

As always, my knowledge is always increased with I ask questions on
this list.

Thanks,

Patrick

=== Test Results ===
Network topology:


+--+   +--+
|  |   |  |
|   NPX8   |   |NPX3  |
| (em1)+ = +(em3) |
|  |   |  |
|  |   |(em0) |
+--+   +--+---+
   I