Supermicro Bladeserver

2011-01-07 Thread Jack Vogel
I am trying to track down a problem being experienced at icir.org using
SuperMicro blade servers: the SERDES 82575 interfaces are having connectivity
or perhaps autoneg problems, resulting in link transitions and watchdog resets.

The closest hardware my org at Intel has is a Fujitsu server whose blades also
have this device, but testing on that has failed to repro the problem.

I was wondering if anyone else out there has this hardware. If so, could you
let me know your experience: have you had problems or not?

Thanks much for any information!

Jack
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: em 7.1.9

2011-01-20 Thread Jack Vogel
NO, and I was rather irritated by a checkin that broke backward compatibility
without even asking me first, btw. That should be the only issue, however, and
it can be fixed by a define. I'll get there soon.

Jack


On Thu, Jan 20, 2011 at 9:00 AM, Mike Tancsa  wrote:

> Hi Jack,
> I was hoping to pull down the latest em drivers from HEAD to stable to
> see if it will help with the one issue I am still seeing, as well as fix
> the tftp issue with small UDP packets (kern/152853). Apart from removing
> the sysctl changes (SVN rev 217556 on 2011-01-18 21:14:23Z by mdf), is
> there anything else that needs to be done in order to use this version
> of the driver on RELENG_8?
>
>---Mike
>


Re: High interrupt rate on a PF box + performance

2011-01-27 Thread Jack Vogel
If you go to 8.2 and the latest driver you will get better stats also,
ahem...

Jack


On Thu, Jan 27, 2011 at 11:57 AM, Jeremy Chadwick wrote:

> On Thu, Jan 27, 2011 at 08:39:40PM +0100, Damien Fleuriot wrote:
> >
> >
> > On 1/27/11 7:46 PM, Sergey Lobanov wrote:
> > > On Friday, 28 January 2011 00:55:35, Damien Fleuriot wrote:
> > >> On 1/27/11 6:41 PM, Vogel, Jack wrote:
> > >>> Jeremy is right, if you have a problem the first step is to try the
> > >>> latest code.
> > >>>
> > >>> However, when I look at the interrupts below I don't see what the
> > >>> problem is? The Broadcom seems to have about the same rate, it just
> > >>> doesn't have MSIX (multiple vectors).
> > >>>
> > >>> Jack
> > >>
> > >> My main concern is that the CPU %interrupt is quite high; also, we
> > >> seem to be experiencing input errors on the interfaces.
> > > Would you show the igb tuning done in loader.conf and the output of
> > > sysctl dev.igb.0?
> > > Did you raise the number of igb descriptors, such as:
> > > hw.igb.rxd=4096
> > > hw.igb.txd=4096 ?
> >
> > There is no tuning at all on our part in the loader's conf.
> >
> > Find below the sysctls:
> >
> > # sysctl -a |grep igb
> > dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
> > dev.igb.0.%driver: igb
> > dev.igb.0.%location: slot=0 function=0
> > dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086
> > subdevice=0x145a class=0x02
> > dev.igb.0.%parent: pci14
> > dev.igb.0.debug: -1
> > dev.igb.0.stats: -1
> > dev.igb.0.flow_control: 3
> > dev.igb.0.enable_aim: 1
> > dev.igb.0.low_latency: 128
> > dev.igb.0.ave_latency: 450
> > dev.igb.0.bulk_latency: 1200
> > dev.igb.0.rx_processing_limit: 100
> > dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 1.7.3
> > dev.igb.1.%driver: igb
> > dev.igb.1.%location: slot=0 function=1
> > dev.igb.1.%pnpinfo: vendor=0x8086 device=0x10d6 subvendor=0x8086
> > subdevice=0x145a class=0x02
> > dev.igb.1.%parent: pci14
> > dev.igb.1.debug: -1
> > dev.igb.1.stats: -1
> > dev.igb.1.flow_control: 3
> > dev.igb.1.enable_aim: 1
> > dev.igb.1.low_latency: 128
> > dev.igb.1.ave_latency: 450
> > dev.igb.1.bulk_latency: 1200
> > dev.igb.1.rx_processing_limit: 100
>
> I'm not aware of how to tune igb(4), so the advice Sergey gave you may
> be applicable.  You'll need to schedule downtime to adjust those
> tunables however (since a reboot will be required).
>
> I also reviewed the munin graphs.  I don't see anything necessarily
> wrong.  However, you omitted yearly graphs for the network interfaces.
> Why I care about that:
>
> The pf state table (yearly) graph basically correlates with the CPU
> usage (yearly) graph, and I expect that the yearly network graphs would
> show a similar trend: an increase in your overall traffic over the
> course of a year.
>
> What I'm trying to figure out is what you're concerned about.  You are
> in fact pushing anywhere between 60-120MBytes/sec across these
> interfaces.  Given those numbers, I'm not surprised by the "high"
> interrupt usage.
>
> Graphs of this nature usually indicate that you're hitting a
> "bottleneck" (for lack of better word) where you're simply doing "too
> much" with a single machine (given its network throughput).  The machine
> is spending a tremendous amount of CPU time handling network traffic,
> and equally as much with regards to the pf usage.
>
> If you want my opinion based on the information I have so far, it's
> this: you need to scale your infrastructure.  You can no longer rely on
> a single machine to handle this amount of traffic.
>
> As for the network errors you see -- to get low-level NIC and driver
> statistics, you'll need to run "sysctl dev.igb.X.stats=1" then run
> "dmesg" and look at the numbers shown (the sysctl command won't output
> anything itself).  This may help indicate where the packets are being
> lost.  You should also check the interface counters on the switch which
> these interfaces are connected to.  I sure hope it's a managed switch
> which can give you those statistics.
>
> Hope this helps, or at least acts as food for thought.
>
> --
> | Jeremy Chadwick   j...@parodius.com |
> | Parodius Networking   http://www.parodius.com/ |
> | UNIX Systems Administrator  Mountain View, CA, USA |
> | Making life hard for others since 1977.   PGP 4BD6C0CB |
>


Re: High interrupt rate on a PF box + performance

2011-01-27 Thread Jack Vogel
On Thu, Jan 27, 2011 at 12:58 PM, Jeremy Chadwick wrote:

>
> On Thu, Jan 27, 2011 at 09:38:22PM +0100, Damien Fleuriot wrote:
> > On 1/27/11 8:57 PM, Jeremy Chadwick wrote:
> >
>
> <...snipping out stuff...>
>
> > We're also considering moving to faster machines but I don't think that
> > will help much with our problem.
> >
> > I suppose additional CPU cores will be of no help at all, considering
> > the kernel is single threaded and runs on cpu0 only ?
>
> Kernel folks should be able to talk about this in detail, but my
> understanding is that the kernel itself supports multiple threads, but
> the question is whether or not the drivers or relevant "pieces" (e.g.
> igb(4) driver, pf, TCP stack, etc.) support SMP (multi-core/threading)
> or not.  I think this is referred to as something being "MPSAFE" or not.
>
>
The 8.X kernel is NOT single-threaded. Anything but. And the stack has
also been improved; I believe there are still bottlenecks, but it's far better
than in the old days.

The igb driver in 8.2 creates up to 8 queues on the right hardware, and each
is auto-bound to a particular CPU.

The older version you are running had issues and hence multiqueue was
not enabled.  So, do upgrade once 8.2 is finalized :)

Cheers,

Jack


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Jack Vogel
I don't test POLLING; it sounds like it's broken. I don't understand
why you think you need it. This hardware supports MSI, so why not use it?

Jack


2011/1/31 Lev Serebryakov 

> Hello, Freebsd-stable.
> You wrote on 1 February 2011, 10:24:16:
>
> >   And all connections are reset. Before the latest commits to the driver
> > this system panicked in swi_clock. Now it works without panics, but
> > it seems that the problem is not completely fixed.
>   I forgot to give one last piece of information: POLLING is in action.
> Without it, a single-threaded copy from this server via SMB eats one CPU
> core completely.
>
> --
> // Black Lion AKA Lev Serebryakov 
>


Re: em0 with latest driver hangs again and again (without "Watchdog timeout" message!)

2011-02-23 Thread Jack Vogel
To anyone on net and stable who wants it: the list limits blocked it, so send
me personal email and I'll send it to you.

Jack


On Wed, Feb 23, 2011 at 9:47 AM, Jack Vogel  wrote:

> Here is the 7.2.2 tarball. IMPORTANT: if you use this, DO NOT try to put it
> into your kernel source tree; it will break that. What you must do is
> config the em driver OUT of your kernel, then untar this, build it
> standalone, and then load it.
>
> This is just a temporary thing; once I have data to decide on this change
> vs the earlier one, it will get integrated.
>
> Jack
>
>
> 2011/2/23 Özkan KIRIK 
>
>> Hi,
>>
>> How can we get the 7.2.2 version of the if_em driver?
>> I want to test it.
>>
>> I can help you test changes to the em drivers.
>>
>>
>> Regards,
>> Ozkan KIRIK
>>
>> On Wed, Feb 23, 2011 at 1:36 PM, Lev Serebryakov wrote:
>> > Hello, Mike.
>> > You wrote on 23 February 2011, 14:16:28:
>> >
>> >>>   The driver from the "em driver, 82574L chip, and possibly ASPM" thread
>> >>>  doesn't really help: it seems that it decreases the frequency of hangs,
>> >> Looking at your sysctl output, you are not using the test drivers
>> >> posted in that thread.
>> >  Yes, as it doesn't help, I've reverted to the "stock" one.
>> >
>> >> If you want to try 7.1.9-test, you can download it at
>> >> http://www.tancsa.com/if_em-8.c for releng_8.
>> >  I've tried it. It worked without hangs for 7-8 days, and after
>> > that it hung 2 times in 3 days with "7.1.9-test"  :(
>> >
>> > --
>> > // Black Lion AKA Lev Serebryakov 
>> >


Re: em0 with latest driver hangs again and again (without "Watchdog timeout" message!)

2011-03-06 Thread Jack Vogel
Missed packets just mean that some temporary resource shortage or error
caused the packet to be dropped. I don't believe this is indicative of a
problem; just let it keep running. 2 days is good but 2 weeks is better :)
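
To put the missed-packet count in proportion, here is a minimal Python sketch
using the em2 counters from the report quoted below. The function name is made
up for illustration; the three numbers are copied from the sysctl output in
that report.

```python
# Counters copied from the dev.em.2.mac_stats output in the quoted report.
total_pkts_recvd = 265358324
good_pkts_recvd = 265352438
missed_packets = 5886


def missed_fraction(missed: int, total: int) -> float:
    """Return the fraction of received packets that were counted as missed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return missed / total


# In this report, total minus good received packets equals the missed count.
assert total_pkts_recvd - good_pkts_recvd == missed_packets

fraction = missed_fraction(missed_packets, total_pkts_recvd)
print(f"missed: {fraction:.6%} of received packets")
```

For these counters the fraction works out to roughly two thousandths of one
percent, which fits the reading that this is an occasional shortage rather
than a sustained fault.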

Thanks for testing it!

Jack


On Sun, Mar 6, 2011 at 4:37 AM, Özkan KIRIK  wrote:

> Hello,
>
> I've been testing the em 7.2.2 driver as a kld. The system has been up about
> 2 days 6 hours.
> The system has 4 em interfaces; throughput is about 200Mbit/s. The system
> didn't hang, but em2 has input errors.
>
> I saw that dev.em.2.mac_stats.missed_packets is not zero. What could
> be the problem?
>
> # uname -r
> 8.2-RELEASE
>
> # sysctl dev.em.| grep miss
> dev.em.0.mac_stats.missed_packets: 0
> dev.em.1.mac_stats.missed_packets: 0
> dev.em.2.mac_stats.missed_packets: 5886
> dev.em.3.mac_stats.missed_packets: 0
>
> # netstat -nWI em2 | grep Link
> Name  Mtu Network   Address               Ipkts Ierrs Idrop     Opkts Oerrs  Coll
> em2  1500           00:23:8b:89:e4:9e 267256324  5886     0 273081628     0     0
>
> # sysctl dev.em.2.
> dev.em.2.%desc: Intel(R) PRO/1000 Network Connection 7.2.2
> dev.em.2.%driver: em
> dev.em.2.%location: slot=0 function=0 handle=\_SB_.PCI0.P0P4.BR1E
> dev.em.2.%pnpinfo: vendor=0x8086 device=0x105e subvendor=0x108e
> subdevice=0x125e class=0x02
> dev.em.2.%parent: pci12
> dev.em.2.nvm: -1
> dev.em.2.debug: -1
> dev.em.2.rx_int_delay: 0
> dev.em.2.tx_int_delay: 66
> dev.em.2.rx_abs_int_delay: 66
> dev.em.2.tx_abs_int_delay: 66
> dev.em.2.rx_processing_limit: 100
> dev.em.2.flow_control: 3
> dev.em.2.eee_control: 0
> dev.em.2.link_irq: 0
> dev.em.2.mbuf_alloc_fail: 0
> dev.em.2.cluster_alloc_fail: 0
> dev.em.2.dropped: 0
> dev.em.2.tx_dma_fail: 0
> dev.em.2.rx_overruns: 7
> dev.em.2.watchdog_timeouts: 0
> dev.em.2.device_control: 1075577409
> dev.em.2.rx_control: 67141634
> dev.em.2.fc_high_water: 30720
> dev.em.2.fc_low_water: 29220
> dev.em.2.queue0.txd_head: 3025
> dev.em.2.queue0.txd_tail: 3025
> dev.em.2.queue0.tx_irq: 0
> dev.em.2.queue0.no_desc_avail: 0
> dev.em.2.queue0.rxd_head: 1826
> dev.em.2.queue0.rxd_tail: 1825
> dev.em.2.queue0.rx_irq: 0
> dev.em.2.mac_stats.excess_coll: 0
> dev.em.2.mac_stats.single_coll: 0
> dev.em.2.mac_stats.multiple_coll: 0
> dev.em.2.mac_stats.late_coll: 0
> dev.em.2.mac_stats.collision_count: 0
> dev.em.2.mac_stats.symbol_errors: 0
> dev.em.2.mac_stats.sequence_errors: 0
> dev.em.2.mac_stats.defer_count: 0
> dev.em.2.mac_stats.missed_packets: 5886
> dev.em.2.mac_stats.recv_no_buff: 3407
> dev.em.2.mac_stats.recv_undersize: 0
> dev.em.2.mac_stats.recv_fragmented: 0
> dev.em.2.mac_stats.recv_oversize: 0
> dev.em.2.mac_stats.recv_jabber: 0
> dev.em.2.mac_stats.recv_errs: 0
> dev.em.2.mac_stats.crc_errs: 0
> dev.em.2.mac_stats.alignment_errs: 0
> dev.em.2.mac_stats.coll_ext_errs: 0
> dev.em.2.mac_stats.xon_recvd: 0
> dev.em.2.mac_stats.xon_txd: 0
> dev.em.2.mac_stats.xoff_recvd: 0
> dev.em.2.mac_stats.xoff_txd: 0
> dev.em.2.mac_stats.total_pkts_recvd: 265358324
> dev.em.2.mac_stats.good_pkts_recvd: 265352438
> dev.em.2.mac_stats.bcast_pkts_recvd: 701728
> dev.em.2.mac_stats.mcast_pkts_recvd: 4076
> dev.em.2.mac_stats.rx_frames_64: 0
> dev.em.2.mac_stats.rx_frames_65_127: 140801982
> dev.em.2.mac_stats.rx_frames_128_255: 3553397
> dev.em.2.mac_stats.rx_frames_256_511: 3418754
> dev.em.2.mac_stats.rx_frames_512_1023: 8096866
> dev.em.2.mac_stats.rx_frames_1024_1522: 109481439
> dev.em.2.mac_stats.good_octets_recvd: 177455051448
> dev.em.2.mac_stats.good_octets_txd: 274861571704
> dev.em.2.mac_stats.total_pkts_txd: 270439410
> dev.em.2.mac_stats.good_pkts_txd: 270439410
> dev.em.2.mac_stats.bcast_pkts_txd: 194927
> dev.em.2.mac_stats.mcast_pkts_txd: 48
> dev.em.2.mac_stats.tx_frames_64: 23050855
> dev.em.2.mac_stats.tx_frames_65_127: 54156414
> dev.em.2.mac_stats.tx_frames_128_255: 4299280
> dev.em.2.mac_stats.tx_frames_256_511: 7837146
> dev.em.2.mac_stats.tx_frames_512_1023: 8272014
> dev.em.2.mac_stats.tx_frames_1024_1522: 172823701
> dev.em.2.mac_stats.tso_txd: 0
> dev.em.2.mac_stats.tso_ctx_fail: 0
> dev.em.2.interrupts.asserts: 283674059
> dev.em.2.interrupts.rx_pkt_timer: 33585
> dev.em.2.interrupts.rx_abs_timer: 0
> dev.em.2.interrupts.tx_pkt_timer: 11022
> dev.em.2.interrupts.tx_abs_timer: 22449
> dev.em.2.interrupts.tx_queue_empty: 0
> dev.em.2.interrupts.tx_queue_min_thresh: 0
> dev.em.2.interrupts.rx_desc_min_thresh: 0
> dev.em.2.interrupts.rx_overrun: 0
>
> Regards,
> Ozkan KIRIK
>


Re: em0 with latest driver hangs again and again (without "Watchdog timeout" message!)

2011-03-11 Thread Jack Vogel
> dev.em.3.queue0.tx_irq: 0
> dev.em.3.queue0.no_desc_avail: 0
> dev.em.3.queue0.rxd_head: 1489
> dev.em.3.queue0.rxd_tail: 1488
> dev.em.3.queue0.rx_irq: 0
> dev.em.3.mac_stats.excess_coll: 0
> dev.em.3.mac_stats.single_coll: 0
> dev.em.3.mac_stats.multiple_coll: 0
> dev.em.3.mac_stats.late_coll: 0
> dev.em.3.mac_stats.collision_count: 0
> dev.em.3.mac_stats.symbol_errors: 0
> dev.em.3.mac_stats.sequence_errors: 0
> dev.em.3.mac_stats.defer_count: 0
> dev.em.3.mac_stats.missed_packets: 5518
> dev.em.3.mac_stats.recv_no_buff: 31
> dev.em.3.mac_stats.recv_undersize: 0
> dev.em.3.mac_stats.recv_fragmented: 0
> dev.em.3.mac_stats.recv_oversize: 0
> dev.em.3.mac_stats.recv_jabber: 0
> dev.em.3.mac_stats.recv_errs: 0
> dev.em.3.mac_stats.crc_errs: 0
> dev.em.3.mac_stats.alignment_errs: 0
> dev.em.3.mac_stats.coll_ext_errs: 0
> dev.em.3.mac_stats.xon_recvd: 0
> dev.em.3.mac_stats.xon_txd: 0
> dev.em.3.mac_stats.xoff_recvd: 0
> dev.em.3.mac_stats.xoff_txd: 0
> dev.em.3.mac_stats.total_pkts_recvd: 1004852864
> dev.em.3.mac_stats.good_pkts_recvd: 1004847345
> dev.em.3.mac_stats.bcast_pkts_recvd: 19377766
> dev.em.3.mac_stats.mcast_pkts_recvd: 1713418
> dev.em.3.mac_stats.rx_frames_64: 1031384
> dev.em.3.mac_stats.rx_frames_65_127: 612329188
> dev.em.3.mac_stats.rx_frames_128_255: 21097424
> dev.em.3.mac_stats.rx_frames_256_511: 16515533
> dev.em.3.mac_stats.rx_frames_512_1023: 36547146
> dev.em.3.mac_stats.rx_frames_1024_1522: 317326670
> dev.em.3.mac_stats.good_octets_recvd: 529331348489
> dev.em.3.mac_stats.good_octets_txd: 1389567129164
> dev.em.3.mac_stats.total_pkts_txd: 1302125119
> dev.em.3.mac_stats.good_pkts_txd: 1302125119
> dev.em.3.mac_stats.bcast_pkts_txd: 412749
> dev.em.3.mac_stats.mcast_pkts_txd: 301
> dev.em.3.mac_stats.tx_frames_64: 156524010
> dev.em.3.mac_stats.tx_frames_65_127: 134491341
> dev.em.3.mac_stats.tx_frames_128_255: 25754249
> dev.em.3.mac_stats.tx_frames_256_511: 46463156
> dev.em.3.mac_stats.tx_frames_512_1023: 57886605
> dev.em.3.mac_stats.tx_frames_1024_1522: 881005758
> dev.em.3.mac_stats.tso_txd: 0
> dev.em.3.mac_stats.tso_ctx_fail: 0
> dev.em.3.interrupts.asserts: 1017261076
> dev.em.3.interrupts.rx_pkt_timer: 110374
> dev.em.3.interrupts.rx_abs_timer: 0
> dev.em.3.interrupts.tx_pkt_timer: 62746
> dev.em.3.interrupts.tx_abs_timer: 100063
> dev.em.3.interrupts.tx_queue_empty: 0
> dev.em.3.interrupts.tx_queue_min_thresh: 0
> dev.em.3.interrupts.rx_desc_min_thresh: 0
> dev.em.3.interrupts.rx_overrun: 0
>
>
> On Sun, Mar 6, 2011 at 9:48 PM, Jack Vogel  wrote:
> > Missed packets just mean that some temporary resource shortage or error
> > caused the packet to be dropped. I don't believe this is indicative of a
> > problem, just let it keep running, 2 days is good but 2 weeks is better :)
> >
> > Thanks for testing it!
> >
> > Jack
> >
> >
> > On Sun, Mar 6, 2011 at 4:37 AM, Özkan KIRIK wrote:
> >>
> >> Hello,
> >>
> >> I've been testing the em.7.2.2 driver as kld. The system is up about 2
> >> days 6 hours.
> >> System has 4 em interfaces, Throughput is about 200Mbit/s. System
> >> didn't hang, but em2 has Input Errors.
> >>
> >> I saw that, dev.em.2.mac_stats.missed_packets is not zero? What could
> >> be the problem?
> >>
> >> # uname -r
> >> 8.2-RELEASE
> >>
> >> # sysctl dev.em.| grep miss
> >> dev.em.0.mac_stats.missed_packets: 0
> >> dev.em.1.mac_stats.missed_packets: 0
> >> dev.em.2.mac_stats.missed_packets: 5886
> >> dev.em.3.mac_stats.missed_packets: 0
> >>
> >> # netstat -nWI em2 | grep Link
> >> Name  Mtu Network   Address               Ipkts Ierrs Idrop     Opkts Oerrs  Coll
> >> em2  1500           00:23:8b:89:e4:9e 267256324  5886     0 273081628     0     0
> >>
> >> # sysctl dev.em.2.
> >> dev.em.2.%desc: Intel(R) PRO/1000 Network Connection 7.2.2
> >> dev.em.2.%driver: em
> >> dev.em.2.%location: slot=0 function=0 handle=\_SB_.PCI0.P0P4.BR1E
> >> dev.em.2.%pnpinfo: vendor=0x8086 device=0x105e subvendor=0x108e
> >> subdevice=0x125e class=0x02
> >> dev.em.2.%parent: pci12
> >> dev.em.2.nvm: -1
> >> dev.em.2.debug: -1
> >> dev.em.2.rx_int_delay: 0
> >> dev.em.2.tx_int_delay: 66
> >> dev.em.2.rx_abs_int_delay: 66
> >> dev.em.2.tx_abs_int_delay: 66
> >> dev.em.2.rx_processing_limit: 100
> >> dev.em.2.flow_control: 3
> >> dev.em.2.eee_control: 0
> >> dev.em.2.link_irq: 0
> >> d

Re: ixgbe(4) and "Could not setup receive structures"

2011-04-14 Thread Jack Vogel
If you get this message it's only for one reason: you don't have enough mbufs
to fill your rings. You must do one of two things: either reduce the number of
queues, or increase the relevant mbuf pool.

Increase the 9K mbuf cluster pool.

Jack


On Thu, Apr 14, 2011 at 6:05 AM, Leon Meßner wrote:

> Hi,
>
> I tried setting the MTU on one of my ixgbe(4) Intel NICs to support
> jumbo frames. This is on a box with RELENG_8 from today.
>
> # ifconfig ix0 mtu 9198
>
> I then get the following error:
>
> # tail -n 1 /var/log/messages
> Apr 14 12:48:43 siloneu kernel: ix0: Could not setup receive structures
>
> I already tried the following patch because of Jack Vogel's advice given
> in the following thread on -stable in Oct. last year, which still
> produces the same error message and leaves the box unpingable:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-October/059541.html
>
> # cat ~/patches/ixgbe.num_queues_to_4.patch
> --- /root/.vimbackup/ixgbe.c~   2011-04-12 22:14:27.0 +
> +++ sys/dev/ixgbe/ixgbe.c   2011-04-12 22:14:27.0 +
> @@ -273,7 +273,7 @@ TUNABLE_INT("hw.ixgbe.hdr_split", &ixgbe
>   * number of cpus. Each queue is a pair
>   * of RX and TX rings with a msix vector
>   */
> -static int ixgbe_num_queues = 0;
> +static int ixgbe_num_queues = 4;
>  TUNABLE_INT("hw.ixgbe.num_queues", &ixgbe_num_queues);
>
> /*
>


Re: ixgbe(4) and "Could not setup receive structures"

2011-04-14 Thread Jack Vogel
So, what do you have in mind as the real problem then?

Jack


On Thu, Apr 14, 2011 at 11:55 AM, K. Macy  wrote:

> That isn't guaranteed to work if he is KVA limited.
>
> On Thu, Apr 14, 2011 at 6:44 PM, Jack Vogel  wrote:
> > If you get this message its only for one reason, you don't have enough
> > mbufs to fill your rings. You must do one of two things, either reduce the
> > number of queues, or increase the relevant mbuf pool.
> >
> > Increase the 9K mbuf cluster pool.
> >
> > Jack
> >
> >
> > On Thu, Apr 14, 2011 at 6:05 AM, Leon Meßner wrote:
> >
> >> Hi,
> >>
> >> i tried setting the mtu on one of my ixgbe(4) intel NICs to support
> >> jumbo frames. This is on a box with RELENG_8 from today.
> >>
> >> # ifconfig ix0 mtu 9198
> >>
> >> I then get the following error:
> >>
> >> # tail -n 1 /var/log/messages
> >> Apr 14 12:48:43 siloneu kernel: ix0: Could not setup receive structures
> >>
> >> I already tried the following patch because of Jack Vogel's advice given
> >> in the following thread on -stable in Oct. last year, which still
> >> produces the same error message and leaves the box unpingable:
> >>
> >>
> >> http://lists.freebsd.org/pipermail/freebsd-stable/2010-October/059541.html
> >>
> >> # cat ~/patches/ixgbe.num_queues_to_4.patch
> >> --- /root/.vimbackup/ixgbe.c~   2011-04-12 22:14:27.0 +
> >> +++ sys/dev/ixgbe/ixgbe.c   2011-04-12 22:14:27.0 +
> >> @@ -273,7 +273,7 @@ TUNABLE_INT("hw.ixgbe.hdr_split", &ixgbe
> >>   * number of cpus. Each queue is a pair
> >>   * of RX and TX rings with a msix vector
> >>   */
> >> -static int ixgbe_num_queues = 0;
> >> +static int ixgbe_num_queues = 4;
> >>  TUNABLE_INT("hw.ixgbe.num_queues", &ixgbe_num_queues);
> >>
> >> /*
> >>


Re: ixgbe(4) and "Could not setup receive structures"

2011-04-14 Thread Jack Vogel
If you are using the latest code, then the RX ring size is set to be 2K
descriptors, so you will use that many 9k jumbos per queue to initialize
things. Having a spare amount free to use as you clean/refresh is needed also.

I upped the ring size for performance reasons on 10G; it's possible to try
dropping it to 1K.

But, for 10G, I don't think it's unreasonable to have enough memory around to
handle this.
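
The arithmetic behind that can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: one port, 4 queues (the value
from the patch quoted below), one 9k cluster per RX descriptor, and a default
pool limit inferred as half of the 12800 maximum reported after the pool was
doubled. The function and constant names are hypothetical.

```python
def clusters_to_fill_rings(num_queues: int, rx_ring_size: int) -> int:
    """Clusters consumed when every RX descriptor gets one jumbo cluster."""
    if num_queues < 1 or rx_ring_size < 1:
        raise ValueError("queue count and ring size must be positive")
    return num_queues * rx_ring_size


# "max" field of the netstat -m line quoted below, after the pool was doubled.
POOL_MAX_AFTER_DOUBLING = 12800
# Inferred from "twice the default"; the thread does not show this value.
POOL_MAX_DEFAULT = POOL_MAX_AFTER_DOUBLING // 2

for ring_size in (2048, 1024):
    needed = clusters_to_fill_rings(num_queues=4, rx_ring_size=ring_size)
    print(
        f"ring={ring_size} needed={needed} "
        f"fits_default={needed <= POOL_MAX_DEFAULT} "
        f"fits_doubled={needed <= POOL_MAX_AFTER_DOUBLING}"
    )
```

With 4 queues and 2K descriptors the product is 8192, which matches the
"current" figure in the netstat -m output quoted below; the same product for a
1K ring is 4096. Spare clusters for refresh come on top of that.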

Cheers,

Jack


On Thu, Apr 14, 2011 at 12:44 PM, Leon Meßner  wrote:

> On Thu, Apr 14, 2011 at 08:55:17PM +0200, K. Macy wrote:
> > That isn't guaranteed to work if he is KVA limited.
> >
> > On Thu, Apr 14, 2011 at 6:44 PM, Jack Vogel  wrote:
> > > If you get this message its only for one reason, you don't have enough
> > > mbufs to fill your rings. You must do one of two things, either reduce
> > > the number of queues, or increase the relevant mbuf pool.
> > >
> > > Increase the 9K mbuf cluster pool.
>
> I did set it to twice the default, and now it works and netstat -m
> shows:
>
> 8192/391/8583/12800 9k jumbo clusters in use (current/cache/total/max)
>
> What's a reasonable amount to set kern.ipc.nmbjumbo9 to, and is there
> any form of auto-tuning? (I have absolutely no load on this machine and
> mbufs are higher than the default pool size.)
>
> Thanks to all,
> Leon
>
> > > On Thu, Apr 14, 2011 at 6:05 AM, Leon Meßner wrote:
> > >
> > >> Hi,
> > >>
> > >> i tried setting the mtu on one of my ixgbe(4) intel NICs to support
> > >> jumbo frames. This is on a box with RELENG_8 from today.
> > >>
> > >> # ifconfig ix0 mtu 9198
> > >>
> > >> I then get the following error:
> > >>
> > >> # tail -n 1 /var/log/messages
> > >> Apr 14 12:48:43 siloneu kernel: ix0: Could not setup receive structures
> > >>
> > >> I already tried the following patch because of Jack Vogel's advice given
> > >> in the following thread on -stable in Oct. last year, which still
> > >> produces the same error message and leaves the box unpingable:
> > >>
> > >> http://lists.freebsd.org/pipermail/freebsd-stable/2010-October/059541.html
> > >>
> > >> # cat ~/patches/ixgbe.num_queues_to_4.patch
> > >> --- /root/.vimbackup/ixgbe.c~   2011-04-12 22:14:27.0 +
> > >> +++ sys/dev/ixgbe/ixgbe.c   2011-04-12 22:14:27.0 +
> > >> @@ -273,7 +273,7 @@ TUNABLE_INT("hw.ixgbe.hdr_split", &ixgbe
> > >>   * number of cpus. Each queue is a pair
> > >>   * of RX and TX rings with a msix vector
> > >>   */
> > >> -static int ixgbe_num_queues = 0;
> > >> +static int ixgbe_num_queues = 4;
> > >>  TUNABLE_INT("hw.ixgbe.num_queues", &ixgbe_num_queues);
> > >>
> > >> /*
> > >>


Re: No data received with Intel Corporation Gigabit CT Desktop Adapter (82574L)

2011-04-28 Thread Jack Vogel
Notice this:  em0: Using MSIX interrupts with 0 vectors

ZERO vectors are not a good sign :) You need to look at your system: do you
have MSIX disabled or something? Maybe there is some message in
/var/log/messages?
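
As a rough aid for spotting this in saved boot logs, the following Python
sketch scans text for the attach message quoted above and reports devices that
came up with zero vectors. The pattern is an assumption based only on the
wording seen in this thread; other driver versions may print something
different, and the helper name is hypothetical.

```python
import re
from typing import List

# Matches lines such as "em0: Using MSIX interrupts with 0 vectors",
# with or without a syslog prefix in front of the device name.
MSIX_RE = re.compile(r"\b(?P<dev>\w+): Using MSIX interrupts with (?P<vectors>\d+) vectors?")


def zero_vector_devices(log_text: str) -> List[str]:
    """Return device names whose MSIX attach message reports 0 vectors."""
    devices = []
    for line in log_text.splitlines():
        match = MSIX_RE.search(line)
        if match and int(match.group("vectors")) == 0:
            devices.append(match.group("dev"))
    return devices


sample = (
    "em0: Using MSIX interrupts with 0 vectors\n"
    "em0: [FILTER]\n"
)
print(zero_vector_devices(sample))
```

The text to scan could come from a saved copy of dmesg output or from
/var/log/messages; the sketch itself does not run any system command.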

Jack


On Thu, Apr 28, 2011 at 1:20 PM, Wiktor Niesiobedzki  wrote:

> Hi,
>
> I've installed Intel Gigabit CT Desktop Adapter in my FreeBSD 8.2 box
> and I can't see any incoming traffic on this card. Even ARP resolution
> doesn't work. Though I see the outgoing traffic on the other end.
>
> Relevant info:
> kadlubek# uname -a
> FreeBSD kadlubek 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #20: Sat Feb 12
> 21:22:19 CET 2011 root@kadlubek:/usr/obj/usr/src/sys/KADLUB  i386
>
> kadlubek# dmesg | grep em0
> em0:  port 0xdc00-0xdc1f
> mem 0xfe9e-0xfe9f,0xfe90-0xfe97,0xfe9dc000-0xfe9d
> irq 24 at device 0.0 on pci2
> em0: Using MSIX interrupts with 0 vectors
> em0: [FILTER]
>
> kadlubek# ping 192.168.115.1
> PING 192.168.115.1 (192.168.115.1): 56 data bytes
> ping: sendto: Host is down
> ping: sendto: Host is down
>
> In the meantime, tcpdump shows:
> kadlubek# tcpdump -i em0
> 22:03:55.962118 ARP, Request who-has 192.168.115.1 tell 192.168.115.220, length 28
> 22:03:56.967107 ARP, Request who-has 192.168.115.1 tell 192.168.115.220, length 28
> 22:03:57.972094 ARP, Request who-has 192.168.115.1 tell 192.168.115.220, length 28
>
> I've checked the firewall rules, but there are none there:
> kadlubek# pfctl -s rules
> No ALTQ support in kernel
> ALTQ related functions disabled
>
> pciconf -lv shows the card as:
> em0@pci0:2:0:0: class=0x02 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>     class      = network
>     subclass   = ethernet
>
>
> kadlubek# sysctl dev.em.0
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.1.9
> dev.em.0.%driver: em
> dev.em.0.%location: slot=0 function=0 handle=\_SB_.PCI0.NBPG.NPGS
> dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086
> subdevice=0xa01f class=0x02
> dev.em.0.%parent: pci2
> dev.em.0.nvm: -1
> dev.em.0.debug: -1
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 66
> dev.em.0.rx_abs_int_delay: 66
> dev.em.0.tx_abs_int_delay: 66
> dev.em.0.rx_processing_limit: 100
> dev.em.0.flow_control: 3
> dev.em.0.link_irq: 0
> dev.em.0.mbuf_alloc_fail: 0
> dev.em.0.cluster_alloc_fail: 0
> dev.em.0.dropped: 0
> dev.em.0.tx_dma_fail: 0
> dev.em.0.rx_overruns: 0
> dev.em.0.watchdog_timeouts: 0
> dev.em.0.device_control: 1477444168
> dev.em.0.rx_control: 67141634
> dev.em.0.fc_high_water: 18432
> dev.em.0.fc_low_water: 16932
> dev.em.0.queue0.txd_head: 35
> dev.em.0.queue0.txd_tail: 35
> dev.em.0.queue0.tx_irq: 0
> dev.em.0.queue0.no_desc_avail: 0
> dev.em.0.queue0.rxd_head: 117
> dev.em.0.queue0.rxd_tail: 1023
> dev.em.0.queue0.rx_irq: 0
> dev.em.0.mac_stats.excess_coll: 0
> dev.em.0.mac_stats.single_coll: 0
> dev.em.0.mac_stats.multiple_coll: 0
> dev.em.0.mac_stats.late_coll: 0
> dev.em.0.mac_stats.collision_count: 0
> dev.em.0.mac_stats.symbol_errors: 0
> dev.em.0.mac_stats.sequence_errors: 0
> dev.em.0.mac_stats.defer_count: 0
> dev.em.0.mac_stats.missed_packets: 0
> dev.em.0.mac_stats.recv_no_buff: 0
> dev.em.0.mac_stats.recv_undersize: 0
> dev.em.0.mac_stats.recv_fragmented: 0
> dev.em.0.mac_stats.recv_oversize: 0
> dev.em.0.mac_stats.recv_jabber: 0
> dev.em.0.mac_stats.recv_errs: 0
> dev.em.0.mac_stats.crc_errs: 0
> dev.em.0.mac_stats.alignment_errs: 0
> dev.em.0.mac_stats.coll_ext_errs: 0
> dev.em.0.mac_stats.xon_recvd: 0
> dev.em.0.mac_stats.xon_txd: 0
> dev.em.0.mac_stats.xoff_recvd: 0
> dev.em.0.mac_stats.xoff_txd: 0
> dev.em.0.mac_stats.total_pkts_recvd: 117
> dev.em.0.mac_stats.good_pkts_recvd: 117
> dev.em.0.mac_stats.bcast_pkts_recvd: 41
> dev.em.0.mac_stats.mcast_pkts_recvd: 42
> dev.em.0.mac_stats.rx_frames_64: 71
> dev.em.0.mac_stats.rx_frames_65_127: 0
> dev.em.0.mac_stats.rx_frames_128_255: 35
> dev.em.0.mac_stats.rx_frames_256_511: 11
> dev.em.0.mac_stats.rx_frames_512_1023: 0
> dev.em.0.mac_stats.rx_frames_1024_1522: 0
> dev.em.0.mac_stats.good_octets_recvd: 15499
> dev.em.0.mac_stats.good_octets_txd: 2240
> dev.em.0.mac_stats.total_pkts_txd: 35
> dev.em.0.mac_stats.good_pkts_txd: 35
> dev.em.0.mac_stats.bcast_pkts_txd: 35
> dev.em.0.mac_stats.mcast_pkts_txd: 0
> dev.em.0.mac_stats.tx_frames_64: 35
> dev.em.0.mac_stats.tx_frames_65_127: 0
> dev.em.0.mac_stats.tx_frames_128_255: 0
> dev.em.0.mac_stats.tx_frames_256_511: 0
> dev.em.0.mac_stats.tx_frames_512_1023: 0
> dev.em.0.mac_stats.tx_frames_1024_1522: 0
> dev.em.0.mac_stats.tso_txd: 0
> dev.em.0.mac_stats.tso_ctx_fail: 0
> dev.em.0.interrupts.asserts: 2
> dev.em.0.interrupts.rx_pkt_timer: 0
> dev.em.0.interrupts.rx_abs_timer: 0
> dev.em.0.interrupts.tx_pkt_timer: 0
> dev.em.0.interrupts.tx_abs_timer: 0
> dev.em.0.interrupts.tx_queue_empty: 0
> dev.em.0.interrupts.tx_queue_min_thresh: 0
> dev.em.0.interrupts.rx_desc_min_thresh: 0
> dev.em.0.interrup

Re: No data received with Intel Corporation Gigabit CT Desktop Adapter (82574L)

2011-04-28 Thread Jack Vogel
Well, rebuild your kernel so the driver is not static; then you can load and
unload the driver to see what happens. You only have one interface, no em1?

Jack


On Thu, Apr 28, 2011 at 2:17 PM, Wiktor Niesiobedzki  wrote:

> Hi,
>
> I really don't know (I haven't done that intentionally). There is
> nothing special in /var/log/messages:
> kadlubek# grep -i msix /var/log/messages
> Apr 28 21:37:03 kadlubek kernel: em0: Using MSIX interrupts with 0 vectors
>
> Though sysctl suggests, that I haven't disabled MSIX:
> kadlubek# sysctl -a | grep -i msix
> hw.pci.enable_msix: 1
>
> I've checked further pciconf output (now with -c option also) and there is:
> em0@pci0:2:0:0: class=0x02 card=0xa01f8086 chip=0x10d38086 rev=0x00
> hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>class  = network
>subclass   = ethernet
> cap 01[c8] = powerspec 2  supports D0 D3  current D0
>cap 05[d0] = MSI supports 1 message, 64 bit
>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>
> So it looks, like the card supports MSI-X and has them enabled.
>
> Though my PCI Express bridges report as:
> pcib2@pci0:0:2:0:   class=0x060400 card=0xc3231106 chip=0xa3641106
> rev=0x80 hdr=0x01
>vendor = 'VIA Technologies, Inc.'
>device = 'P4M900 PCI to PCI Bridge Controller'
>class  = bridge
>subclass   = PCI-PCI
>cap 10[40] = PCI-Express 1 root port max data 128(256) link x1(x1)
>cap 01[68] = powerspec 2  supports D0 D3  current D0
>cap 05[70] = MSI supports 1 message, 64 bit
>cap 08[88] = HT MSI fixed address window disabled at 0xfee0
>cap 0d[98] = PCI Bridge card=0xc3231106
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> ecap 0002[140] = VC 1 max VC1
> ecap 0005[180] = unknown 1
> pcib3@pci0:0:3:0:   class=0x060400 card=0xc3231106 chip=0xc3641106
> rev=0x80 hdr=0x01
>vendor = 'VIA Technologies, Inc.'
>device = 'P4M900 PCI to PCI Bridge Controller'
>class  = bridge
>subclass   = PCI-PCI
>cap 10[40] = PCI-Express 1 root port max data 128(256) link x1(x1)
>cap 01[68] = powerspec 2  supports D0 D3  current D0
>cap 05[70] = MSI supports 1 message, 64 bit
>cap 08[88] = HT MSI fixed address window disabled at 0xfee0
>cap 0d[98] = PCI Bridge card=0xc3231106
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> ecap 0002[140] = VC 1 max VC1
> ecap 0005[180] = unknown 1
>
> pcib5@pci0:128:0:0: class=0x060400 card=0x chip=0x287c1106
> rev=0x00 hdr=0x01
>vendor = 'VIA Technologies, Inc.'
>device = 'VT8251 Standard PCIe Root Port'
>class  = bridge
>subclass   = PCI-PCI
>cap 10[40] = PCI-Express 1 root port max data 128(256) link x0(x2)
>cap 01[68] = powerspec 2  supports D0 D3  current D0
>cap 05[70] = MSI supports 1 message, 64 bit, vector masks
>cap 08[88] = HT MSI fixed address window disabled at 0xfee0
>cap 0d[90] = PCI Bridge card=0x
> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> ecap 0002[140] = VC 1 max VC1
> ecap 0005[180] = unknown 1
> pcib6@pci0:128:0:1: class=0x060400 card=0x0004 chip=0x287d1106
> rev=0x00 hdr=0x01
>vendor = 'VIA Technologies, Inc.'
>device = 'VT8251 Standard PCIe Root Port'
>class  = bridge
>subclass   = PCI-PCI
>cap 10[40] = PCI-Express 1 root port max data 128(256) link x0(x1)
>cap 01[68] = powerspec 2  supports D0 D3  current D0
>cap 05[70] = MSI supports 1 message, 64 bit, vector masks
>cap 08[88] = HT MSI fixed address window disabled at 0xfee0
>cap 0d[90] = PCI Bridge card=0x0004
>
> Though they mention that HT MSI windows is disabled. I'm not sure,
> whether this matters.
>
> Cheers,
>
> Wiktor
>
> 2011/4/28 Jack Vogel :
> > Notice this:  em0: Using MSIX interrupts with 0 vectors
> >
> > ZERO vectors are not a good sign :) You need to look at your system, you
> > have MSIX
> > disabled or something? Maybe some message in /var/log/messages??
> >
> > Jack
> >
> >
> > On Thu, Apr 28, 2011 at 1:20 PM, Wiktor Niesiobedzki 
> wrote:
> >>
> >> Hi,
> >>
> >> I've installed Intel Gigabit CT Desktop Adapter in my FreeBSD 8.2 box
> >> and I can't see any incoming traffic on this card. Even ARP resolution
> >> doesn't work. Though I see the outgoing traffic on the other 

Re: No data received with Intel Corporation Gigabit CT Desktop Adapter (82574L)

2011-04-28 Thread Jack Vogel
On Thu, Apr 28, 2011 at 2:28 PM, John Baldwin  wrote:

> On Thursday, April 28, 2011 5:17:11 pm Wiktor Niesiobedzki wrote:
> > Hi,
> >
> > I really don't know (I haven't done that intentionally). There is
> > nothing special in /var/log/messages:
> > kadlubek# grep -i msix /var/log/messages
> > Apr 28 21:37:03 kadlubek kernel: em0: Using MSIX interrupts with 0 vectors
> >
> > Though sysctl suggests, that I haven't disabled MSIX:
> > kadlubek# sysctl -a | grep -i msix
> > hw.pci.enable_msix: 1
> >
> > I've checked further pciconf output (now with -c option also) and there is:
> > em0@pci0:2:0:0: class=0x02 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
> > vendor = 'Intel Corporation'
> > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> > class  = network
> > subclass   = ethernet
> > cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > cap 05[d0] = MSI supports 1 message, 64 bit
> > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
> >
> > So it looks, like the card supports MSI-X and has them enabled.
> >
> > Though my PCI Express bridges report as:
> > pcib2@pci0:0:2:0:   class=0x060400 card=0xc3231106 chip=0xa3641106
> > rev=0x80 hdr=0x01
> > vendor = 'VIA Technologies, Inc.'
> > device = 'P4M900 PCI to PCI Bridge Controller'
> > class  = bridge
> > subclass   = PCI-PCI
> > cap 10[40] = PCI-Express 1 root port max data 128(256) link x1(x1)
> > cap 01[68] = powerspec 2  supports D0 D3  current D0
> > cap 05[70] = MSI supports 1 message, 64 bit
> > cap 08[88] = HT MSI fixed address window disabled at 0xfee0
> > cap 0d[98] = PCI Bridge card=0xc3231106
> > ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> > ecap 0002[140] = VC 1 max VC1
> > ecap 0005[180] = unknown 1
> > pcib3@pci0:0:3:0:   class=0x060400 card=0xc3231106 chip=0xc3641106
> > rev=0x80 hdr=0x01
> > vendor = 'VIA Technologies, Inc.'
> > device = 'P4M900 PCI to PCI Bridge Controller'
> > class  = bridge
> > subclass   = PCI-PCI
> > cap 10[40] = PCI-Express 1 root port max data 128(256) link x1(x1)
> > cap 01[68] = powerspec 2  supports D0 D3  current D0
> > cap 05[70] = MSI supports 1 message, 64 bit
> > cap 08[88] = HT MSI fixed address window disabled at 0xfee0
> > cap 0d[98] = PCI Bridge card=0xc3231106
> > ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> > ecap 0002[140] = VC 1 max VC1
> > ecap 0005[180] = unknown 1
> >
> > pcib5@pci0:128:0:0: class=0x060400 card=0x chip=0x287c1106
> > rev=0x00 hdr=0x01
> > vendor = 'VIA Technologies, Inc.'
> > device = 'VT8251 Standard PCIe Root Port'
> > class  = bridge
> > subclass   = PCI-PCI
> > cap 10[40] = PCI-Express 1 root port max data 128(256) link x0(x2)
> > cap 01[68] = powerspec 2  supports D0 D3  current D0
> > cap 05[70] = MSI supports 1 message, 64 bit, vector masks
> > cap 08[88] = HT MSI fixed address window disabled at 0xfee0
> > cap 0d[90] = PCI Bridge card=0x
> > ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
> > ecap 0002[140] = VC 1 max VC1
> > ecap 0005[180] = unknown 1
> > pcib6@pci0:128:0:1: class=0x060400 card=0x0004 chip=0x287d1106
> > rev=0x00 hdr=0x01
> > vendor = 'VIA Technologies, Inc.'
> > device = 'VT8251 Standard PCIe Root Port'
> > class  = bridge
> > subclass   = PCI-PCI
> > cap 10[40] = PCI-Express 1 root port max data 128(256) link x0(x1)
> > cap 01[68] = powerspec 2  supports D0 D3  current D0
> > cap 05[70] = MSI supports 1 message, 64 bit, vector masks
> > cap 08[88] = HT MSI fixed address window disabled at 0xfee0
> > cap 0d[90] = PCI Bridge card=0x0004
> >
> > Though they mention that HT MSI windows is disabled. I'm not sure,
> > whether this matters.
>
> Yes, that is probably what breaks this.
>
> --
> John Baldwin
>

Oops, missed that, thanks John. So, disable MSIX and MSI using sysctl;
the driver should then use legacy interrupts when it loads.

Still, I'd get a different motherboard; it sucks to not have MSIX :(

Jack
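
A rough sketch of the check discussed in this thread, assuming a Bourne shell on the affected box. The function names are made up for illustration; only the log message format and the hw.pci.enable_msix / hw.pci.enable_msi names come from the messages above.

```sh
# Hypothetical helper: read kernel log text on stdin and print the vector
# count from any "Using MSIX interrupts with N vectors" line.
msix_vectors() {
    sed -n 's/.*Using MSIX interrupts with \([0-9][0-9]*\) vectors.*/\1/p'
}

# Hypothetical helper: turn the first count found into a one-line verdict,
# flagging the "0 vectors" case seen in this thread.
msix_verdict() {
    count=$(msix_vectors | head -n 1)
    case "$count" in
        "") echo "no MSIX message found" ;;
        0)  echo "MSIX enabled but 0 vectors: try hw.pci.enable_msix=0 and hw.pci.enable_msi=0" ;;
        *)  echo "MSIX looks fine ($count vectors)" ;;
    esac
}
```

It could be fed from the same grep used above, for example: grep -i msix /var/log/messages | msix_verdict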


Re: No data received with Intel Corporation Gigabit CT Desktop Adapter (82574L)

2011-05-02 Thread Jack Vogel
Thanks John. Was gonna say... the code has been as it is forever, with
everyone else in the world working fine, figured something odd was going
on.

Jack


On Mon, May 2, 2011 at 7:01 AM, John Baldwin  wrote:

> On Saturday, April 30, 2011 2:42:11 am Wiktor Niesiobedzki wrote:
> > 2011/4/29 Wiktor Niesiobedzki :
> > > 2011/4/28 Jack Vogel :
> > >> On Thu, Apr 28, 2011 at 2:28 PM, John Baldwin wrote:
> > >>>
> > >>> On Thursday, April 28, 2011 5:17:11 pm Wiktor Niesiobedzki wrote:
> > >>> > Though they mention that HT MSI windows is disabled. I'm not sure,
> > >>> > whether this matters.
> > >>>
> > >>> Yes, that is probably what breaks this.
> > >>>
> > >>> --
> > >>> John Baldwin
> > >>
> > >> Oops, missed that, thanks John. So, disable MSIX and MSI using sysctl;
> > >> the driver should then use legacy interrupts when it loads.
> > >>
> > >> Still, I'd get a different motherboard; it sucks to not have MSIX :(
> > >>
> > >
> > > Thanks for hints. I've disabled MSIX and MSI:
> > > kadlubek# sysctl hw.pci | grep msi
> > > hw.pci.honor_msi_blacklist: 1
> > > hw.pci.enable_msix: 0
> > > hw.pci.enable_msi: 0
> > >
> >
> > Ok, I found another way around this. I did some source code
> > reading and found the following tunable:
> > hw.em.enable_msix=0
> >
> > When set to 0 in loader.conf, the card magically starts to work
> > properly. The only thing in our code in em_setup_msix() that raises
> > my doubts is the following code path:
> >
> >
> > int rid = PCIR_BAR(EM_MSIX_BAR);
> > adapter->msix_mem = bus_alloc_resource_any(dev,
> > SYS_RES_MEMORY, &rid, RF_ACTIVE);
> > ...
> > bus_release_resource(dev, SYS_RES_MEMORY,
> > PCIR_BAR(EM_MSIX_BAR), adapter->msix_mem);
> >
> > The manpage for bus_release_resource specifies, though, that rid needs
> > to be exactly the same as the one returned by bus_alloc_resource.
>
> In practice the rid is not changed for PCI resources, so the code is fine.
>
> > Changing bus_release_resource to use rid instead of
> > PCIR_BAR(EM_MSIX_BAR) makes the card work with these sysctl settings:
> > hw.pci.enable_msix: 0
> > hw.pci.enable_msi: 0
> >
> > instead of hw.em.enable_msix=0
> >
> > The only thing that worries me is that when I don't have MSIX disabled
> > (anyway), the driver succeeds with the MSI-X allocation. Shouldn't we
> > check in em_setup_msix how many vectors pci_alloc_msix allocated,
> > and fall back to MSI/Legacy if that is 0?
> >
> > Or maybe pci_alloc_msix should report an error when no
> > PCIB_ALLOC_MSIX succeeded?
>
> Well, the problem here is that PCIB_ALLOC_MSIX() worked fine.  There is
> another bug that is breaking MSI in your system that I need to finish the
> fix for in 9 before it can be MFC'd.
>
> --
> John Baldwin
>
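
The hw.em.enable_msix=0 workaround quoted above is set in loader.conf. A minimal sketch of adding it without duplicating an existing entry, assuming a Bourne shell; set_tunable is a hypothetical helper, not something shipped with FreeBSD, and the file path is passed in by the caller.

```sh
# Hypothetical helper: add name="value" to a loader.conf-style file unless
# a line for that name already exists. The dots in the name are treated as
# regex wildcards by grep, which is good enough for a sketch.
set_tunable() {
    file=$1 name=$2 value=$3
    if grep -q "^${name}=" "$file" 2>/dev/null; then
        echo "$name is already set in $file"
    else
        printf '%s="%s"\n' "$name" "$value" >> "$file"
    fi
}
```

As root this might be run as: set_tunable /boot/loader.conf hw.em.enable_msix 0, followed by a reboot so the loader picks the value up.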


Re: Intel "em" driver sleeps with non-sleepable lock.

2011-05-05 Thread Jack Vogel
So, this happens EVERY time after an install of 8.2?

Give me details about the hardware please.

Jack


On Thu, May 5, 2011 at 11:11 AM, Zaphod Beeblebrox wrote:

> The motherboard in question is made by Intel and contains a Xeon 3440
> (4 core x 2 HT per core).  16 Gig of RAM is installed and we are
> installing the 64 bit FreeBSD 8.2 using the PC-BSD installer (to
> install zfs root faster).  The motherboard has 4 "igb" ethernet and
> one "em" ethernet.  The "em" ethernet is shared with an internal
> "RMM3" remote management card and/or the onboard ILOM.  This error
> happens when rebooting after installation and is repeatable with at
> least FreeBSD 8.1 and FreeBSD 8.2.
>
> The last boot message is "Starting devd" ... so I assume that the
> active link on em0 might be making devd start dhclient.  After this
> last boot message, the screen reads:
>
> Sleeping thread (tid 100195, pid 619) owns a non-sleepable lock
> panic: sleeping thread
> cpuid = 2
> KDB: stack backtrace:
> #0 0x805f4e03 at kdb_backtrace+0x5e
> #1 0x805c2d07 at panic+0x187
> #2 0x80601a5d at propagate_priority+0x1cd
> #3 0x8060278a at turnstile_wait+0x1aa
> #4 0x805b34c0 at _mtx_lock_sleep+0xb0
> #5 0x8032fd97 at em_init_locked+0xce7
> #6 0x80331b8e at em_ioctl+0x5fe
> #7 0x80671114 at ifioctl+0x9e4
> #8 0x806043c2 at kern_ioctl+0x102
> #9 0x806045fd at ioctl+0xfd
> #10 0xff80600dd5 at syscallenter+0x1e5
> #11 0xff808aca5b at syscall+0x4b
> #12 0xff80895292 at Xfast_syscall+0xe2
>
> Now. I assume that booting without link on em0 (inconvenient) or
> booting without "em" in the kernel will fix things.  I'll be checking
> this out shortly.
>
> I can provide (for a limited time) full console access and/or I can
> test code if someone sees a patch for this.


Re: em0 watchdog timeouts on 8-STABLE

2011-06-15 Thread Jack Vogel
I have hardware now and am working on reproducing this. Just curious, do you
have the em driver defined in the kernel, or as a module?

Jack


On Wed, Jun 15, 2011 at 2:09 AM, Joshua Boyd  wrote:

> On Wed, Jun 15, 2011 at 3:57 AM, Jeremy Chadwick
> wrote:
>
> > On Wed, Jun 15, 2011 at 03:14:43AM -0400, Joshua Boyd wrote:
> > > I recently updated my server to the latest 8-STABLE, and upgraded to
> v28
> > > ZFS. I have not had these problems on any other version of 8-STABLE or
> > > 7-STABLE, which this box was upgraded from some time ago.
> > >
> > > Now, during my weekly scrub, I get the following messages and em0 is
> > > unresponsive:
> > >
> > > Jun 12 03:07:58 foghornleghorn kernel: em0: Watchdog timeout -- resetting
> > > Jun 12 03:07:58 foghornleghorn kernel: em0: link state changed to DOWN
> > > Jun 12 03:08:01 foghornleghorn kernel: em0: link state changed to UP
> > > Jun 12 03:08:47 foghornleghorn kernel: em0: Watchdog timeout -- resetting
> > > Jun 12 03:08:47 foghornleghorn kernel: em0: link state changed to DOWN
> > > Jun 12 03:08:50 foghornleghorn kernel: em0: link state changed to UP
> > >
> > > My scrub is scheduled to start at 03:00:00, so it looks like watchdog
> > > timeouts start occurring pretty quickly once I/O ramps up.
> > >
> > > Here's some possibly relevant information, let me know if anything else
> > > would be helpful to troubleshoot.
> > >
> > > FreeBSD foghornleghorn.res.openband.net 8.2-STABLE FreeBSD 8.2-STABLE
> > #17:
> > > Mon Jun  6 19:40:19 EDT 2011
> > > r...@foghornleghorn.res.openband.net:
> /usr/obj/usr/src/sys/FOGHORNLEGHORN
> > >  amd64
> > >
> > > em0:  port
> > 0xe800-0xe83f
> > > mem 0xfebe-0xfebf,0xfebc-0xfebd irq 20 at device 5.0 on
> > pci7
> > >
> > > em0@pci0:7:5:0: class=0x02 card=0x13768086 chip=0x107c8086
> rev=0x05
> > > hdr=0x00
> > > vendor = 'Intel Corporation'
> > > device = 'Gigabit Ethernet Controller (Copper) rev 5 (82541PI)'
> > > class  = network
> > > subclass   = ethernet
> > >
> > > And, the SAS cards:
> > >
> > > dev.mpt.0.%desc: LSILogic SAS/SATA Adapter
> > > dev.mpt.0.%driver: mpt
> > > dev.mpt.0.%location: slot=0 function=0
> > > dev.mpt.0.%pnpinfo: vendor=0x1000 device=0x0058 subvendor=0x15d9
> > > subdevice=0xa580 class=0x01
> > > dev.mpt.0.%parent: pci1
> > > dev.mpt.0.debug: 3
> > > dev.mpt.0.role: 1
> > > dev.mpt.1.%desc: LSILogic SAS/SATA Adapter
> > > dev.mpt.1.%driver: mpt
> > > dev.mpt.1.%location: slot=0 function=0
> > > dev.mpt.1.%pnpinfo: vendor=0x1000 device=0x0058 subvendor=0x15d9
> > > subdevice=0xa580 class=0x01
> > > dev.mpt.1.%parent: pci2
> > > dev.mpt.1.debug: 3
> > > dev.mpt.1.role: 1
> > > dev.mpt.2.%desc: LSILogic SAS/SATA Adapter
> > > dev.mpt.2.%driver: mpt
> > > dev.mpt.2.%location: slot=0 function=0
> > > dev.mpt.2.%pnpinfo: vendor=0x1000 device=0x0058 subvendor=0x1000
> > > subdevice=0x30a0 class=0x01
> > > dev.mpt.2.%parent: pci6
> > > dev.mpt.2.debug: 3
> > > dev.mpt.2.role: 1
> >
> > Please provide output from the following commands (as root):
> >
> > # pciconf -lvcb
> >
>
> hostb0@pci0:0:0:0: class=0x06 card=0x59561002 chip=0x59561002 rev=0x00
> hdr=0x00
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 GFX Dual Slot'
>class  = bridge
>subclass   = HOST-PCI
> pcib1@pci0:0:2:0: class=0x060400 card=0x59561002 chip=0x59781002 rev=0x00
> hdr=0x01
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 PCI to PCI bridge (external gfx0 port A)'
>class  = bridge
>subclass   = PCI-PCI
> pcib2@pci0:0:3:0: class=0x060400 card=0x59561002 chip=0x59791002 rev=0x00
> hdr=0x01
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 PCI to PCI bridge (external gfx0 port B)'
>class  = bridge
>subclass   = PCI-PCI
> pcib3@pci0:0:4:0: class=0x060400 card=0x59561002 chip=0x597a1002 rev=0x00
> hdr=0x01
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 PCI to PCI bridge (PCIe gpp port A)'
>class  = bridge
>subclass   = PCI-PCI
> pcib4@pci0:0:6:0: class=0x060400 card=0x59561002 chip=0x597c1002 rev=0x00
> hdr=0x01
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 PCI to PCI bridge (PCIe gpp port C)'
>class  = bridge
>subclass   = PCI-PCI
> pcib5@pci0:0:7:0: class=0x060400 card=0x59561002 chip=0x597d1002 rev=0x00
> hdr=0x01
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 PCI to PCI bridge (PCIe gpp port D)'
>class  = bridge
>subclass   = PCI-PCI
> pcib6@pci0:0:11:0: class=0x060400 card=0x59561002 chip=0x59801002 rev=0x00
> hdr=0x01
>vendor = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
>device = 'RD790 PCI to PCI bridge (external gfx1 port A)'
>class  = bridge
>subclass   = PCI-PCI

Re: em0 watchdog timeouts on 8-STABLE

2011-06-21 Thread Jack Vogel
I cannot repro this. I used your kernel config (this is on a Dell 1850, btw)
and ran netperf stress from 3 clients, and have seen no watchdogs :(

Jack


On Tue, Jun 21, 2011 at 7:59 PM, Joshua Boyd  wrote:

> If needed, I can reproduce this on demand. Just need to know what sort of
> statistics are needed when the problem is occurring. I've had to turn off my
> weekly scrubs until I can figure out how to fix this problem.
>
>
> On Wed, Jun 15, 2011 at 8:37 PM, Joshua Boyd  wrote:
>
>> In the kernel. Here's my kernel configuration:
>>
>> http://pastebin.com/raw.php?i=4JL814m3
>>
>>  On Wed, Jun 15, 2011 at 8:20 PM, Jack Vogel  wrote:
>>
>>> I have hardware now, am working on reproducing this. Just curious, do you
>>> have
>>> the em driver defined in the kernel, or as a module?
>>>
>>> Jack
>>>
>>>
>>> On Wed, Jun 15, 2011 at 2:09 AM, Joshua Boyd  wrote:
>>>
>>>> On Wed, Jun 15, 2011 at 3:57 AM, Jeremy Chadwick
>>>> wrote:
>>>>
>>>> > On Wed, Jun 15, 2011 at 03:14:43AM -0400, Joshua Boyd wrote:
>>>> > > I recently updated my server to the latest 8-STABLE, and upgraded to
>>>> v28
>>>> > > ZFS. I have not had these problems on any other version of 8-STABLE
>>>> or
>>>> > > 7-STABLE, which this box was upgraded from some time ago.
>>>> > >
>>>> > > Now, during my weekly scrub, I get the following messages and em0 is
>>>> > > unresponsive:
>>>> > >
>>>> > > Jun 12 03:07:58 foghornleghorn kernel: em0: Watchdog timeout -- resetting
>>>> > > Jun 12 03:07:58 foghornleghorn kernel: em0: link state changed to DOWN
>>>> > > Jun 12 03:08:01 foghornleghorn kernel: em0: link state changed to UP
>>>> > > Jun 12 03:08:47 foghornleghorn kernel: em0: Watchdog timeout -- resetting
>>>> > > Jun 12 03:08:47 foghornleghorn kernel: em0: link state changed to DOWN
>>>> > > Jun 12 03:08:50 foghornleghorn kernel: em0: link state changed to UP
>>>> > >
>>>> > > My scrub is scheduled to start at 03:00:00, so it looks like
>>>> watchdog
>>>> > > timeouts start occurring pretty quickly once I/O ramps up.
>>>> > >
>>>> > > Here's some possibly relevant information, let me know if anything
>>>> else
>>>> > > would be helpful to troubleshoot.
>>>> > >
>>>> > > FreeBSD foghornleghorn.res.openband.net 8.2-STABLE FreeBSD
>>>> 8.2-STABLE
>>>> > #17:
>>>> > > Mon Jun  6 19:40:19 EDT 2011
>>>> > > r...@foghornleghorn.res.openband.net:
>>>> /usr/obj/usr/src/sys/FOGHORNLEGHORN
>>>> > >  amd64
>>>> > >
>>>> > > em0:  port
>>>> > 0xe800-0xe83f
>>>> > > mem 0xfebe-0xfebf,0xfebc-0xfebd irq 20 at device 5.0
>>>> on
>>>> > pci7
>>>> > >
>>>> > > em0@pci0:7:5:0: class=0x02 card=0x13768086 chip=0x107c8086
>>>> rev=0x05
>>>> > > hdr=0x00
>>>> > > vendor = 'Intel Corporation'
>>>> > > device = 'Gigabit Ethernet Controller (Copper) rev 5
>>>> (82541PI)'
>>>> > > class  = network
>>>> > > subclass   = ethernet
>>>> > >
>>>> > > And, the SAS cards:
>>>> > >
>>>> > > dev.mpt.0.%desc: LSILogic SAS/SATA Adapter
>>>> > > dev.mpt.0.%driver: mpt
>>>> > > dev.mpt.0.%location: slot=0 function=0
>>>> > > dev.mpt.0.%pnpinfo: vendor=0x1000 device=0x0058 subvendor=0x15d9
>>>> > > subdevice=0xa580 class=0x01
>>>> > > dev.mpt.0.%parent: pci1
>>>> > > dev.mpt.0.debug: 3
>>>> > > dev.mpt.0.role: 1
>>>> > > dev.mpt.1.%desc: LSILogic SAS/SATA Adapter
>>>> > > dev.mpt.1.%driver: mpt
>>>> > > dev.mpt.1.%location: slot=0 function=0
>>>> > > dev.mpt.1.%pnpinfo: vendor=0x1000 device=0x0058 subvendor=0x15d9
>>>> > > subdevice=0xa580 class=0x01
>>>> > > dev.mpt.1.%parent: pci2
>>>> > > dev.mpt.1.debug: 3
>>>> > > dev.mpt.1.role: 1
>>>> >
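
On the question of what to collect while the problem is occurring, one cheap starting point is simply counting the resets in the log. This is only a sketch, assuming a Bourne shell and the "Watchdog timeout" message format quoted above; watchdog_resets is a made-up name.

```sh
# Hypothetical helper: count watchdog resets for one interface.
# $1 is the interface name (for example em0); log text arrives on stdin.
watchdog_resets() {
    # grep -c prints 0 but exits non-zero when nothing matches, so the
    # "|| true" keeps the helper usable under "set -e".
    grep -c "kernel: $1: Watchdog timeout" || true
}
```

For example: watchdog_resets em0 < /var/log/messages. If the driver exposes a dev.em.0 sysctl tree, saving "sysctl dev.em.0" output before and after a scrub would complement the count.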

Re: 10G Inter adapter

2011-08-23 Thread Jack Vogel
What OS release are you going to be using, 8.2? The driver in HEAD is the
latest code; the internal tarball goes through release machinery, so it is
lagging a bit (2.3.8 vs 2.3.11). You should be OK in either case, but I'd
recommend the newer.

Jack


On Tue, Aug 23, 2011 at 12:39 AM, Sami Halabi  wrote:

> Hi everyone,
> I have an 82599EB network card.
> The ixgbe driver on 8.x supports 82598 cards. It identified my card,
> but I'm not sure it will work properly and not cause kernel panics. Since
> it's a production server, I want to install a good driver that will work
> without problems.
>
> I just found this on the Intel website.
> It looks like they have added support for the 82598 and 82599, which isn't
> in the ixgbe driver on 8 stable/release.
>
> http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=14688&lang=eng
>
> Is it safe to download this driver, put its files in /sys/dev/ixgbe,
> and then install it?
>
> Is it simply: make && make install && make clean?
> I guess I also need to recompile the kernel, because ixgbe is in the GENERIC
> kernel.
>
> What do you advise?
>
>
>
> --
> Sami Halabi
> Information Systems Engineer
> NMS Projects Expert


Re: rsync corrupted MAC

2011-10-10 Thread Jack Vogel
Well, for a start I'd get both interfaces at the same speed. Sounds like a
hardware issue of some sort, cable or switch maybe?

Jack


On Mon, Oct 10, 2011 at 5:42 PM, Larry Rosenman  wrote:

> On Mon, 10 Oct 2011, Jeremy Chadwick wrote:
>
>  On Mon, Oct 10, 2011 at 04:15:25PM -0500, Larry Rosenman wrote:
>>
>>> On 10/10/2011 3:57 PM, Louis Mamakos wrote:
>>>
>>>> On Oct 10, 2011, at 2:38 PM, Larry Rosenman wrote:
>>>>
>>>>  On 10/10/2011 10:47 AM, John Baldwin wrote:
>>>>>
>>>>>> On Sunday, October 09, 2011 5:06:26 pm Larry Rosenman wrote:
>>>>>>
>>>>>>> Any ideas on which side or what might be broke here?
>>>>>>>
>>>>>>> ler/MAIL-ARCHIVE/2008/12/INBOX
>>>>>>> Corrupted MAC on input.
>>>>>>> Disconnecting: Packet corrupt
>>>>>>> rsync: connection unexpectedly closed (33845045 bytes received so
>>>>>>> far)
>>>>>>>
>>>>>> [receiver]
>>>>>>
>>>>>>> rsync error: error in rsync protocol data stream (code 12) at
>>>>>>> io.c(605)
>>>>>>>
>>>>>> [receiver=3.0.9]
>>>>>>
>>>>>>> rsync: connection unexpectedly closed (1450 bytes received so far)
>>>>>>>
>>>>>> [generator]
>>>>>>
>>>>>>> rsync error: unexplained error (code 255) at io.c(605)
>>>>>>> [generator=3.0.9]
>>>>>>>
>>>>>> I've had somewhat similar issues (ssh getting corruption in its data
>>>>>> stream)
>>>>>> when a NIC in my netbook was corrupting packet data when it ran at 1G
>>>>>> (it
>>>>>> worked fine at 10/100).  Pyun eventually fixed the issue by applying
>>>>>> enough
>>>>>> workarounds (it was likely a hardware bug in the NIC's chipset).
>>>>>>  However, it
>>>>>> wasn't easy to debug unfortunately. :(
>>>>>>
>>>>>>  Any ideas on where to start?
>>>>>
>>>>> from the 8.2 box (tbh.lerctr.org in the script):
>>>>>
>>>>> 8.2->PIX->Provider->Internet->Motorola SBG6580
>>>>> (Time-Warner)->Trendnet TEG-160WS Gig switch->9.0 box
>>>>> (borg.lerctr.org).
>>>>> So, where do I start?
>>>>>
>>>> I'd turn off IP / TCP / UDP checksum offloading on your NIC if it
>>>> supports it, and see if you are getting network layer checksum errors.  If
>>>> the IP checksum is wrong, then it happened on the last hops between the NIC
>>>> and memory or across the previous network hop.
>>>>
>>>>
>>>>
>>>>  Good idea, but, it didn't show ANY errors on EITHER side (both are
>>> em nics).
>>>
>>> Next?
>>> $ ifconfig em0
>>> em0: flags=8843 metric 0 mtu
>>> 1500
>>>options=2098
>>>ether 00:30:48:2e:99:ba
>>>inet 192.147.25.65 netmask 0xff00 broadcast 192.147.25.255
>>>inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
>>>inet 192.147.25.45 netmask 0xff00 broadcast 192.147.25.255
>>>inet 192.147.25.11 netmask 0xff00 broadcast 192.147.25.255
>>>nd6 options=3
>>>media: Ethernet autoselect (100baseTX )
>>>status: active
>>> $
>>> $ uname -a
>>> FreeBSD thebighonker.lerctr.org 8.2-STABLE FreeBSD 8.2-STABLE #45:
>>> Sat Oct  8 10:57:43 CDT 2011
>>> r...@thebighonker.lerctr.org:/usr/obj/usr/src/sys/THEBIGHONKER
>>> amd64
>>> $
>>>
>>>
>>>
>>> $ ifconfig em0
>>> em0: flags=8843 metric 0 mtu
>>> 1500
>>>options=2088
>>>ether 00:30:48:8e:9f:f3
>>>inet 192.168.200.4 netmask 0xff00 broadcast 192.168.200.255
>>>inet6 fe80::230:48ff:fe8e:9ff3%em0 prefixlen 64 scopeid 0x1
>>>nd6 options=29
>>>media: Ethernet autoselect (1000baseT )
>>>status: active
>>> $ uname -a
>>> FreeBSD borg.lerctr.org 9.0-BETA3 FreeBSD 9.0-BETA3 #1: Sun Oct  9
>>> 10:03:42 CDT 2011
>>> r...@borg.lerctr.org:/usr/obj/usr/src/sys/BORG-DTRACE  amd64
>>> $
>>>

Re: rsync corrupted MAC

2011-10-11 Thread Jack Vogel
Oh, I see.  So, did you have a previous working state?

Jack


On Tue, Oct 11, 2011 at 12:06 AM, Larry Rosenman  wrote:

> ** They are not local to each other. See the diagram. They are across the
> internet from each other.
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>
>
> Jack Vogel  wrote:
>>
>> Well, for a start I'd get both interfaces at the same speed, sounds like a
>> hardware
>> issue of some sort, cable or switch maybe?
>>
>> Jack
>>
>>
>> On Mon, Oct 10, 2011 at 5:42 PM, Larry Rosenman  wrote:
>>
>>> On Mon, 10 Oct 2011, Jeremy Chadwick wrote:
>>>
>>>  On Mon, Oct 10, 2011 at 04:15:25PM -0500, Larry Rosenman wrote:
>>>>
>>>>> On 10/10/2011 3:57 PM, Louis Mamakos wrote:
>>>>>
>>>>>> On Oct 10, 2011, at 2:38 PM, Larry Rosenman wrote:
>>>>>>
>>>>>>  On 10/10/2011 10:47 AM, John Baldwin wrote:
>>>>>>>
>>>>>>>> On Sunday, October 09, 2011 5:06:26 pm Larry Rosenman wrote:
>>>>>>>>
>>>>>>>>> Any ideas on which side or what might be broke here?
>>>>>>>>>
>>>>>>>>> ler/MAIL-ARCHIVE/2008/12/INBOX
>>>>>>>>> Corrupted MAC on input.
>>>>>>>>> Disconnecting: Packet corrupt
>>>>>>>>> rsync: connection unexpectedly closed (33845045 bytes received so
>>>>>>>>> far)
>>>>>>>>>
>>>>>>>> [receiver]
>>>>>>>>
>>>>>>>>> rsync error: error in rsync protocol data stream (code 12) at
>>>>>>>>> io.c(605)
>>>>>>>>>
>>>>>>>> [receiver=3.0.9]
>>>>>>>>
>>>>>>>>> rsync: connection unexpectedly closed (1450 bytes received so far)
>>>>>>>>>
>>>>>>>> [generator]
>>>>>>>>
>>>>>>>>> rsync error: unexplained error (code 255) at io.c(605)
>>>>>>>>> [generator=3.0.9]
>>>>>>>>>
>>>>>>>> I've had somewhat similar issues (ssh getting corruption in its data
>>>>>>>> stream)
>>>>>>>> when a NIC in my netbook was corrupting packet data when it ran at
>>>>>>>> 1G (it
>>>>>>>> worked fine at 10/100).  Pyun eventually fixed the issue by applying
>>>>>>>> enough
>>>>>>>> workarounds (it was likely a hardware bug in the NIC's chipset).
>>>>>>>>  However, it
>>>>>>>> wasn't easy to debug unfortunately. :(
>>>>>>>>
>>>>>>>>  Any ideas on where to start?
>>>>>>>
>>>>>>> from the 8.2 box (tbh.lerctr.org in the script):
>>>>>>>
>>>>>>> 8.2->PIX->Provider->Internet->Motorola SBG6580
>>>>>>> (Time-Warner)->Trendnet TEG-160WS Gig switch->9.0 box
>>>>>>> (borg.lerctr.org).
>>>>>>>
>>>>>>> So, where do I start?
>>>>>>>
>>>>>> I'd turn off IP / TCP / UDP checksum offloading on your NIC if it
>>>>>> supports it, and see if you are getting network layer checksum errors.
>>>>>> If the IP checksum is wrong, then it happened on the last hops between
>>>>>> the NIC and memory or across the previous network hop.
>>>>>>
>>>>>>
>>>>>>
>>>>>>  Good idea, but, it didn't show ANY errors on EITHER side (both are
>>>>> em nics).
>>>>>
>>>>> Next?
>>>>> $ ifconfig em0
>>>>> em0: flags=8843 metric 0 mtu
>>>>> 1500
>>>>>options=2098
>>>>>ether 00:30:48:2e:99:ba
>>>>>inet 192.147.25.65 netmask 0xff00 broadcast 192.147.25.255
>>>>>inet6 fe80::230:48ff:fe2e:99ba%em0 prefixlen 64 scopeid 0x1
>>>>>inet 192.147.25.45 netmask 0xff00 broadcast 192.147.25.255
>>>>>inet 192.147.25.11 netmask 0xff00 broadcast 192.147.25.255
>>>>>nd6 options=3
>>>>> 
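
The checksum suggestion quoted above could be tried along these lines. This is a sketch, assuming FreeBSD's ifconfig accepts -rxcsum and -txcsum for the interface and that "netstat -s -p ip" reports a "bad header checksums" counter; bad_ip_checksums is a hypothetical helper name.

```sh
# Turn off receive and transmit checksum offload first (as root), e.g.:
#   ifconfig em0 -rxcsum -txcsum

# Hypothetical helper: pull the IP "bad header checksums" counter out of
# "netstat -s -p ip" style output supplied on stdin.
bad_ip_checksums() {
    awk '/bad header checksum/ { print $1; exit }'
}
```

For example: netstat -s -p ip | bad_ip_checksums, run on both ends before and after a failing rsync so the two readings can be compared.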

Re: em0 watchdog timeout

2011-11-13 Thread Jack Vogel
On Sun, Nov 13, 2011 at 10:22 AM, Willem Jan Withagen wrote:

> On 2011-11-10 23:25, Joshua Boyd wrote:
>
>> On Thu, Nov 10, 2011 at 6:51 AM, Willem Jan Withagen wrote:
>>
>>em0@pci0:0:25:0:class=0x02 card=0x10bd15d9
>>chip=0x10bd8086 rev=0x02 hdr=0x00
>>vendor = 'Intel Corporation'
>>device = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
>>class  = network
>>subclass   = ethernet
>>bar   [10] = type Memory, range 32, base 0xdf90, size
>>131072, enabled
>>bar   [14] = type Memory, range 32, base 0xdf924000, size 4096,
>>enabled
>>bar   [18] = type I/O Port, range 32, base 0x1820, size 32, enabled
>>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>>cap 13[e0] = PCI Advanced Features: FLR TP
>>
>>
>>And note that this problem only rears its nasty head every few weeks...
>>
>>
>> I have had the same problem, as shown here:
>>
>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/063092.html
>>
>> According to your pciconf output, your card either doesn't support
>> MSI-X, or you have MSI-X disabled.
>>
>> Check the hw.pci.enable_msix sysctl and make sure that it is set to 1.
>> Also check to make sure there aren't any BIOS settings blocking MSI-X.
>>
>> Apparently the older Intel gigabit cards don't support MSI-X, and as
>> such get starved.
>>
>
> Upgraded to a new BIOS, but that does not help either.
>
> Now the trick question will be:
>If I get a new server-type PCI-E ethernet card, would that get me
>an MSI-X ethernet device?
>
>
There is no 'trick' to it :)  The only MSI-X capable device that uses the em
driver is the 82574. But if you go with igb (82575 and later), they are all
MSI-X and multiqueue capable.

Jack
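
One rough way to check this on a given machine is to look at the capability
lines that pciconf -lvc prints. The Python sketch below is only an
illustration: it assumes each capability shows up as a "cap NN[..] = ..."
line, and it relies on the standard PCI capability IDs (05 for MSI, 11 for
MSI-X). The function name and the sample text are made up for the example.

```python
import re

# Standard PCI capability IDs (hex): 0x05 = MSI, 0x11 = MSI-X.
CAP_NAMES = {0x05: "MSI", 0x11: "MSI-X"}

# Matches lines such as "cap 05[d0] = MSI supports 1 message ..."
CAP_LINE = re.compile(r"cap\s+([0-9a-fA-F]{2})\[[0-9a-fA-F]+\]")


def interrupt_caps(pciconf_text):
    """Return the set of interrupt capabilities named in pciconf -lvc text."""
    found = set()
    for line in pciconf_text.splitlines():
        match = CAP_LINE.search(line)
        if match:
            cap_id = int(match.group(1), 16)
            if cap_id in CAP_NAMES:
                found.add(CAP_NAMES[cap_id])
    return found


# Capability lines copied from the 82566DM output quoted above.
sample = """\
cap 01[c8] = powerspec 2  supports D0 D3  current D0
cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
cap 13[e0] = PCI Advanced Features: FLR TP
"""

caps = interrupt_caps(sample)
print("MSI-X capable" if "MSI-X" in caps else "no MSI-X capability listed")
```

For the 82566DM above this reports no MSI-X capability, which matches the
pciconf output in the thread (only cap 05, plain MSI, is listed).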


Re: igb hang when cable unplugged

2011-11-25 Thread Jack Vogel
On Fri, Nov 25, 2011 at 1:18 PM, Daniel Kalchev  wrote:

> I am observing a transmit hang of the igb driver when the cable is
> unplugged. It only recovers after a unit reset, such as
>
> ifconfig igb0 down up
>
> This is with kernel
>
> FreeBSD xxx 8.2-STABLE FreeBSD 8.2-STABLE #0: Fri Sep 30 16:17:47 EEST
> 2011 root@xxx:/usr/obj/usr/src/sys/GENERIC  amd64
>
> igb0:  port
> 0x3020-0x303f mem
> 0xb1d6-0xb1d7,0xb1d4-0xb1d5,0xb1e04000-0xb1e07fff irq 37 at
> device 0.0 on pci13
> igb0: Using MSIX interrupts with 9 vectors
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: Ethernet address: 00:25:90:36:ee:7c
>
> The interface is quad port Supermicro branded PCI-E card with
>
> pciconf -vl
>
> igb0@pci0:13:0:0:   class=0x02 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
> igb1@pci0:13:0:1:   class=0x02 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
> igb2@pci0:16:0:0:   class=0x02 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
> igb3@pci0:16:0:1:   class=0x02 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
>
>
> Has anyone experienced something like this? Is there a solution? It is very
> inconvenient to have to down/up the interfaces manually via the IPMI
> console when such a thing happens.
>
>
Ya, don't unplug the cable  :)

Just a bit of holiday humor; I will look into the issue after the long
weekend.

Jack


nmbclusters: how do we want to fix this for 8.3 ?

2012-02-22 Thread Jack Vogel
Using igb and/or ixgbe on a reasonably powered server requires 1K mbuf
clusters per MSI-X vector, since that is how many are in a ring. Either
driver will configure 8 queues on a system with that many cores or more,
so that is 8K clusters per port.

My test engineer has a system with 2 igb ports and 2 10G ixgbe ports. This
is hardly heavy duty, and yet it exceeds the default mbuf pool on the
installed kernel (1024 + maxusers * 64).

Now, this can be fixed immediately by a sysadmin after the first boot, but
it does result in the second driver that gets started complaining about
inadequate buffers.

I think the default calculation is dated and should be changed, but I am not
sure of the best way. Are there suggestions or opinions about this, and
might we get it fixed before 8.3 is baked?

Cheers,

Jack
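
The arithmetic in this message can be written out as a short Python sketch.
The ring size, queue count, port count, and default formula come from the
text above; the maxusers value of 384 is an assumption chosen for
illustration (a common auto-tuned ceiling), and the function names are
hypothetical.

```python
CLUSTERS_PER_RING = 1024  # "1K mbuf clusters per MSIX vector"
QUEUES_PER_PORT = 8       # queues configured when 8 or more cores are present


def clusters_needed(ports, queues=QUEUES_PER_PORT, per_ring=CLUSTERS_PER_RING):
    """Clusters required to fill every RX ring on every port."""
    return ports * queues * per_ring


def default_nmbclusters(maxusers):
    """Default pool size quoted in the message: 1024 + maxusers * 64."""
    return 1024 + maxusers * 64


ports = 2 + 2    # 2 igb ports plus 2 ixgbe ports
maxusers = 384   # assumption, not stated in the message

need = clusters_needed(ports)
have = default_nmbclusters(maxusers)
print(f"needed={need} default={have} short_by={max(0, need - have)}")
```

With these numbers the four ports want 32768 clusters while the default pool
holds 25600, so whichever driver attaches last comes up short.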


Re: nmbclusters: how do we want to fix this for 8.3 ?

2012-02-22 Thread Jack Vogel
On Wed, Feb 22, 2012 at 1:44 PM, Luigi Rizzo  wrote:

> On Wed, Feb 22, 2012 at 09:09:46PM +, Ben Hutchings wrote:
> > On Wed, 2012-02-22 at 21:52 +0100, Luigi Rizzo wrote:
> ...
> > > I have hit this problem recently, too.
> > > Maybe the issue mostly/only exists on 32-bit systems.
> >
> > No, we kept hitting mbuf pool limits on 64-bit systems when we started
> > working on FreeBSD support.
>
> ok never mind then, the mechanism would be the same, though
> the limits (especially VM_LIMIT) would be different.
>
> > > Here is a possible approach:
> > >
> > > 1. nmbclusters consume the kernel virtual address space so there
> > >must be some upper limit, say
> > >
> > > VM_LIMIT = 256000 (translates to 512MB of address space)
> > >
> > > 2. also you don't want the clusters to take up too much of the
> > >available memory. This one would only trigger for minimal-memory
> > >systems, or virtual machines, but still...
> > >
> > > MEM_LIMIT = (physical_ram / 2) / 2048
> > >
> > > 3. one may try to set a suitably large, desirable number of buffers
> > >
> > > TARGET_CLUSTERS = 128000
> > >
> > > 4. and finally we could use the current default as the absolute minimum
> > >
> > > MIN_CLUSTERS = 1024 + maxusers*64
> > >
> > > Then at boot the system could say
> > >
> > > nmbclusters = min(TARGET_CLUSTERS, VM_LIMIT, MEM_LIMIT)
> > >
> > > nmbclusters = max(nmbclusters, MIN_CLUSTERS)
> > >
> > >
> > > In turn, i believe interfaces should do their part and by default
> > > never try to allocate more than a fraction of the total number
> > > of buffers,
> >
> > Well what fraction should that be?  It surely depends on how many
> > interfaces are in the system and how many queues the other interfaces
> > have.
>
> > > if necessary reducing the number of active queues.
> >
> > So now I have too few queues on my interface even after I increase the
> > limit.
> >
> > There ought to be a standard way to configure numbers of queues and
> > default queue lengths.
>
> Jack raised the problem that there is a poorly chosen default for
> nmbclusters, causing one interface to consume all the buffers.
> If the user explicitly overrides the value then
> the number of clusters should be what the user asks (memory permitting).
> The next step is on devices: if there are no overrides, the default
> for a driver is to be lean. I would say that topping the request between
> 1/4 and 1/8 of the total buffers is surely better than the current
> situation. Of course if there is an explicit override, then use
> it whatever happens to the others.
>
> cheers
> luigi
>

Hmmm, well, I could make the default use only 1 queue or something like
that, but I was thinking more of what actual users of the hardware would
want.

Once the installed kernel is booted and the admin makes whatever
post-install modifications they wish, the queue count could be changed
along with nmbclusters.

This is why I sought opinions, both on the algorithm itself and from anyone
using ixgbe and igb under heavy load: what would you find most convenient?

Jack
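
For reference, Luigi's proposed boot-time calculation quoted above can be
sketched in Python as follows. The constants are the ones from his message;
the function name and the example RAM sizes and maxusers values are
assumptions for illustration, and this is not what any FreeBSD kernel
actually does.

```python
CLUSTER_SIZE = 2048       # bytes per cluster, the divisor used in MEM_LIMIT
VM_LIMIT = 256000         # cap from kernel virtual address space
TARGET_CLUSTERS = 128000  # "suitably large, desirable number of buffers"


def proposed_nmbclusters(physical_ram_bytes, maxusers):
    """min() of the three limits, then raised to the current default."""
    mem_limit = (physical_ram_bytes // 2) // CLUSTER_SIZE
    min_clusters = 1024 + maxusers * 64
    nmbclusters = min(TARGET_CLUSTERS, VM_LIMIT, mem_limit)
    return max(nmbclusters, min_clusters)


GIB = 2 ** 30
MIB = 2 ** 20

# Plenty of RAM: the target wins.
print(proposed_nmbclusters(4 * GIB, maxusers=384))
# Small VM: the memory limit wins.
print(proposed_nmbclusters(256 * MIB, maxusers=32))
```

On a 4 GB machine the result is the 128000 target; on a 256 MB virtual
machine the memory limit brings it down to 65536, which is still well above
the current default for that maxusers value.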


Re: ixgbe v2.3.11 won't negotiate LACP, v2.4.4 does

2012-03-06 Thread Jack Vogel
Never rains but it pours, this is the second request today :)

Yes, I will do an MFC as quickly as I am able.

Jack


On Tue, Mar 6, 2012 at 3:00 PM, Chris Forgeron  wrote:

> I have a few systems with Intel X520-DA2 PCIe network cards (10 Gig).
>
> The problem I've been running into is with a fresh 9.0-STABLE or
> 9.0-RELEASE install. I can't get a LACP connection established over the ix0
> and ix1 ports. It's showing COLLECTING and DISTRIBUTING, but not ACTIVE.
>
> I've noticed that older 9.0-BETA copies with the 2.3.10 ixgbe driver are
> working with the same switch without problems.
>
> The 9.0-STABLE that I was doing the most work with had an ixgbe of 2.3.11
>
>  After some digging around, I downloaded the ixgbe 2.4.4 from the Intel
> site, compiled the .ko (a little editing due to the bool typedef), and now
> my 9.0-STABLE systems can properly set up a LACP link over ixgbe devices.
>
>  I'm sure others will run into this in time - Can we get the 2.4.4 into
> 9-STABLE?
>
>  Thanks.


Re: Support for Intel 82599ES?

2012-06-01 Thread Jack Vogel
Yes, it is supported in the ixgbe driver.

Jack


On Fri, Jun 1, 2012 at 8:36 AM, Rick Miller wrote:

> Hi All,
>
> I did not see the Intel 82599ES chipset in the hardware release notes
> for 8.3 or 9.0.  Are these controllers supported at this time?
>
> --
> Take care
> Rick Miller


Re: Intel X520-DA2 Supported in stable/8?

2012-06-22 Thread Jack Vogel
Increase your system mbuf pool size; you do not want that "RX Descriptors
exceed system mbuf max" failure to happen.

Jack


On Fri, Jun 22, 2012 at 2:01 PM, Rick Miller wrote:

> dmesg and ifconfig output below...
>
> On Fri, Jun 22, 2012 at 4:02 PM, Rick Miller 
> wrote:
> > On Fri, Jun 22, 2012 at 3:54 PM, Andrew Boyer 
> wrote:
> >> The ixgbe driver creates devices named ix0, etc.
> >>
> >> I believe you need to run 'ifconfig ix0 up' before it will attempt to
> get link.
> >
> > Thanks for clarifying that tidbit.  At least I know the driver loading
> > is the correct driver :)
> >
> > I did try ifup'ing the interface...it shows the interface up, status
> > is still no carrier.  I've had confirmation that the cable itself is
> > good.  I wonder if it matters that the upstream switch has VLAN
> > tagging enabled?
>
> ix0: 
> port 0x7000-0x701f mem 0xf6b8-0xf6bf,0xf6b7-0xf6b73fff irq
> 40 at device 0.0 on pci7
> ix0: Using MSIX interrupts with 9 vectors
> ix0: RX Descriptors exceed system mbuf max, using default instead!
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: [ITHREAD]
> ix0: Ethernet address: 90:e2:ba:15:e2:60
> ix0: PCI Express Bus: Speed 5.0Gb/s Width x8
> ix1: 
> port 0x7020-0x703f mem 0xf6a8-0xf6af,0xf6a7-0xf6a73fff irq
> 44 at device 0.1 on pci7
> ix1: Using MSIX interrupts with 9 vectors
> ix1: RX Descriptors exceed system mbuf max, using default instead!
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: [ITHREAD]
> ix1: Ethernet address: 90:e2:ba:15:e2:61
> ix1: PCI Express Bus: Speed 5.0Gb/s Width x8
>
>
> ix0: flags=8843 metric 0 mtu 1500
>options=401bb
>ether 90:e2:ba:XX:XX:XX
>inet 10.1.2.50 netmask 0xfe00 broadcast 10.1.3.255
>media: Ethernet autoselect
>status: no carrier
> ix1: flags=8802 metric 0 mtu 1500
>options=401bb
>ether 90:e2:ba:XX:XX:XX
>media: Ethernet autoselect
>status: no carrier
>
>
> --
> Take care
> Rick Miller


Re: Intel X520-DA2 Supported in stable/8?

2012-06-22 Thread Jack Vogel
It would probably be good to take care of the storm threshold if you haven't
already: set hw.intr_storm_threshold to 0 and you disable the check, which
is what we do internally. As for the queues and number of descriptors,
that's kind of up to you; different workloads and environments work best
with different setups.

Hopefully, once you get rid of the RX ring setup failure you will get
things working.

Jack


On Fri, Jun 22, 2012 at 3:19 PM, Rick Miller wrote:

> On Fri, Jun 22, 2012 at 5:21 PM, Jack Vogel  wrote:
> > Increase your system mbuf pool size, you do not want that failure to
> happen.
>
> Thanks, Jack.  I saw a thread where you discussed this.  You are
> referring to kern.ipc.nmbclusters, correct?
>
> Should I also adjust the following?
>
> hw.ixgbe.rxd
> hw.ixgbe.txd
> hw.ixgbe.num_queues
> hw.intr_storm_threshold
>


Re: Intel X520-DA2 Supported in stable/8?

2012-06-25 Thread Jack Vogel
Glad you figured it out.

Cheers,

Jack


On Mon, Jun 25, 2012 at 3:52 PM, Rick Miller wrote:

> Turns out the gbic in the switch was bad...I didn't think there was a
> problem on the host, but you all still gave me some good info.  I
> appreciate it!
>
>
>
> On 6/25/12, Rick Miller  wrote:
> > On Fri, Jun 22, 2012 at 7:23 PM, Jack Vogel  wrote:
> >> Would probably be good to take care of the storm threshold if you
> >> haven't,
> >> set it to 0
> >> and you disable the check, that's what we do internally. As for the
> >> queues
> >> and number
> >> of descriptors, that's kind of up to you, different work loads and
> >> environments work best
> >> with different setups.
> >>
> >> Hopefully, when you get rid of the rx ring setup failure you will get
> >> things
> >> working.
> >
> > Thanks, Jack.  I did get rid of the rx ring failure.  Link status
> > still shows no carrier.  I think everything looks right from the
> > host's perspective.
> >
> > --
> > Take care
> > Rick Miller
> >
>
> --
> Sent from my mobile device
>
> Take care
> Rick Miller
>


Re: FreeBSD 8-STABLE on R620 w/ X520-DA2/Intel 82599

2012-06-29 Thread Jack Vogel
Be patient, a new version will hit HEAD soon with the ID added.

Jack


On Fri, Jun 29, 2012 at 9:35 AM, Rick Miller wrote:

> On Fri, Jun 29, 2012 at 11:56 AM, Gary Palmer  wrote:
> > On Fri, Jun 29, 2012 at 10:50:52AM -0400, Rick Miller wrote:
> >> Hi All,
> >>
> >> I have 2 hosts, HP DL360 G8 and Dell R620.  Both have the
> >> X520-DA2/Intel 82599 10G Fiber NIC.  Both also have the same FreeBSD
> >> 8-STABLE image.  The Dell displays the following in dmesg and we are
> >> unable to configure the ix0 or ix1 interfaces where the HP works just
> >> fine.  Wondering if anyone else has experienced this?
> >>
> >> pci4:  at device 0.0 (no driver attached)
> >> pci4:  at device 0.1 (no driver attached)
> >
> > Please see
> >
> > http://lists.freebsd.org/pipermail/freebsd-net/2012-June/032579.html
> >
> > it may be of some assistance.  It looks like adding the Dell-specific
> > PCI IDs may be all that's required.
>
> Hrmm, very interesting indeed.
>
> How do I identify if/when/where the source has been updated?
>
> --
> Take care
> Rick Miller


Re: network probs rxcsum

2010-05-19 Thread Jack Vogel
Can you send the output of vmstat -i?

You mention custom kernels. If you use a stock kernel, do you still see this
problem? If you use 8.0-RELEASE, do you see the problem?

Jack


On Wed, May 19, 2010 at 11:06 AM, Mark Stapper  wrote:

> On 05/19/10 12:44, Jeremy Chadwick wrote:
> > On Wed, May 19, 2010 at 12:34:17PM +0200, Mark Stapper wrote:
> >
> >> I have two machines running FreeBSD amd64 8.0-Stable with custom kernels.
> >> My newer box has had troubles with ssh from day one.
> >> I hoped a kernel upgrade would help, but it didn't.
> >> When I'd ssh into the box ssh would exit with errors:
> >> Bad packet length xx
> >> Disconnecting: Packet corrupt.
> >>
> >> after issuing: "ifconfig em0 -rxcons" everything was stable again.
> >> First I figured it'd be a driver issue. However, I use the same NIC in
> >> my other box!
> >> What could be causing this problem?
> >>
> > I think you mean "-rxcsum", not "-rxcons".
> >
> > Could you please provide output from the following commands?  Jack Vogel
> > will probably respond later about this, but said output would help him.
> >
> > - uname -a
> > - dmesg | grep em0
> > - pciconf -lvc
> >
> > Thanks.
> >
> >
> Could it be a shared interrupt problem?
> Even though ssh worked with rxcsum disabled, network performance was
> horrible!
> Using my onboard NIC instead of em0 cleared it right up!
> em0 is a PCI add-on card.
> Here are the outputs you requested:
>
> [r...@mario ~]# uname -a
> FreeBSD mario 8.0-STABLE FreeBSD 8.0-STABLE #0: Tue May 18 19:37:30 CEST
> 2010 root@:/usr/obj/usr/src/sys/mario  amd64
> [r...@mario ~]# dmesg |grep em0
> em0:  port
> 0x9c00-0x9c3f mem 0xfdfa-0xfdfb,0xfdfc-0xfdfd irq 18 at
> device 6.0 on pci2
> em0: [FILTER]
> em0: Ethernet address: 00:1b:21:4b:8b:85
> em0: link state changed to UP
> em0: link state changed to DOWN
> em0: link state changed to UP
> em0: link state changed to DOWN
> em0: link state changed to UP
> em0: link state changed to DOWN
> em0: link state changed to UP
> em0: link state changed to DOWN
> em0: link state changed to UP
> em0: link state changed to DOWN
> [r...@mario ~]# pciconf -lvc
> no...@pci0:0:0:0:   class=0x05 card=0x02f010de chip=0x02f410de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Host Bridge'
>class  = memory
>subclass   = RAM
>cap 08[44] = HT slave
>cap 08[e0] = HT MSI address window disabled at 0xfee0
> no...@pci0:0:0:1:   class=0x05 card=0x02fa10de chip=0x02fa10de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Memory Controller 0'
>class  = memory
>subclass   = RAM
> no...@pci0:0:0:2:   class=0x05 card=0x02fe10de chip=0x02fe10de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Memory Controller 1'
>class  = memory
>subclass   = RAM
> no...@pci0:0:0:3:   class=0x05 card=0x02f810de chip=0x02f810de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Memory Controller 5'
>class  = memory
>subclass   = RAM
> no...@pci0:0:0:4:   class=0x05 card=0x02f910de chip=0x02f910de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Memory Controller 4'
>class  = memory
>subclass   = RAM
> no...@pci0:0:0:5:   class=0x05 card=0x02ff10de chip=0x02ff10de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Host Bridge'
>class  = memory
>subclass   = RAM
>cap 00[44] = unknown
> no...@pci0:0:0:6:   class=0x05 card=0x027f10de chip=0x027f10de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Memory Controller 3'
>class  = memory
>subclass   = RAM
> no...@pci0:0:0:7:   class=0x05 card=0x027e10de chip=0x027e10de
> rev=0xa2 hdr=0x00
>vendor = 'NVIDIA Corporation'
>device = 'C51 Memory Controller 2'
>class  = memory
>subclass   = RAM
> pc...@pci0:0:4:0:   class=0x060400 card=0x10de chip=0x02fb10de
> rev=0xa1 hdr=0x01
>vendor = 'NVIDIA Corporation'
>device = 'C51 PCIe Bridge'
>class  = bridge
>subclass   = PCI-PCI
>cap 0d[40] = PCI Bridge card=0x10de
>cap 01[48] = powerspec 2  supports D0 D3  current 

Re: Strange igb befavior

2010-05-27 Thread Jack Vogel
The panic is due to a failure to get enough mbufs; when you make your ring
that big you hit the problem. I have been experimenting with a change to fix
it but am not yet completely confident in it, so for the moment don't make
your ring so big :)

Jack


On Thu, May 27, 2010 at 1:08 AM, Kirill Yelizarov  wrote:

> Hi
>
> I'm having reproducible panics with 8-Stable of May 13 2010. The panic occurs
> in igb code. It started to happen when I set hw.igb.rxd="4096" and
> hw.igb.txd="4096" in /boot/loader.conf. The panic happens immediately after
> boot in igb1 code in my case. igb1 is connected to a 100Mbit 3COM switch and
> the switch is not connected to anything else.
>
> Here is dmesg for igb
> # dmesg | grep igb
> igb0:  port
> 0x2020-0x203f mem 0xb1a2-0xb1a3,0xb1a44000-0xb1a47fff irq 40 at
> device 0.0 on pci1
> igb0: Using MSIX interrupts with 5 vectors
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: [ITHREAD]
> igb0: Ethernet address: 00:15:17:ba:2e:00
> igb1:  port
> 0x2000-0x201f mem 0xb1a0-0xb1a1,0xb1a4-0xb1a43fff irq 28 at
> device 0.1 on pci1
> igb1: Using MSIX interrupts with 5 vectors
> igb1: [ITHREAD]
> igb1: [ITHREAD]
> igb1: [ITHREAD]
> igb1: [ITHREAD]
> igb1: [ITHREAD]
> igb1: Ethernet address: 00:15:17:ba:2e:01
> igb1: link state changed to UP
> igb0: link state changed to UP
>
> border2# ifconfig
> igb0: flags=8843 metric 0 mtu 1500
>options=13b
>ether 00:15:17:ba:2e:00
>inet 192.168.10.2 netmask 0xff00 broadcast 192.168.10.255
>inet 192.168.10.201 netmask 0x broadcast 192.168.10.201
>inet 192.168.10.202 netmask 0x broadcast 192.168.10.202
>inet 192.168.10.203 netmask 0x broadcast 192.168.10.203
>inet 192.168.10.204 netmask 0x broadcast 192.168.10.204
>media: Ethernet autoselect (1000baseT )
>status: active
> igb1: flags=8843 metric 0 mtu 1500
>options=13b
>ether 00:15:17:ba:2e:01
>inet XXX.74.229.230 netmask 0xfff0 broadcast XXX.74.229.239
>inet XXX.74.229.226 netmask 0x broadcast XXX.74.229.226
>inet XXX.74.229.227 netmask 0x broadcast XXX.74.229.227
>media: Ethernet autoselect (100baseTX )
>status: active
> lo0: flags=8049 metric 0 mtu 16384
>options=3
>inet 127.0.0.1 netmask 0xff00
> pfsync0: flags=0<> metric 0 mtu 1460
>syncpeer: 224.0.0.240 maxupd: 128
> pflog0: flags=141 metric 0 mtu 33152
>
> #pciconf -lv
> i...@pci0:1:0:0:class=0x02 card=0x34de8086 chip=0x10a78086
> rev=0x02 hdr=0x00
>vendor = 'Intel Corporation'
>device = '82575EB Gigabit Network Connection'
>class  = network
>subclass   = ethernet
> i...@pci0:1:0:1:class=0x02 card=0x34de8086 chip=0x10a78086
> rev=0x02 hdr=0x00
>vendor = 'Intel Corporation'
>device = '82575EB Gigabit Network Connection'
>class  = network
>subclass   = ethernet
>
> # netstat -i
> NameMtu Network   Address  Ipkts Ierrs IdropOpkts
> Oerrs  Coll
> igb0   1500   00:15:17:ba:2e:00 2315 0 01415
>   0 0
> igb0   1500 192.168.10.0  border2   1664 - - 1412
>   - -
> igb0   1500 192.168.10.20 mysql-border20 - -0
>   - -
> igb0   1500 192.168.10.20 apache-border2   0 - -0
>   - -
> igb0   1500 192.168.10.20 squid-border20 - -0
>   - -
> igb0   1500 192.168.10.20 postgresql-border1 - -0
>   - -
> igb1   1500   00:15:17:ba:2e:01  129 0 00
>   0 0
> igb1   1500 XXX.74.229.22 border2  0 - -0
>   - -
> igb1   1500 XXX.74.229.22 apache-border2   0 - -0
>   - -
> igb1   1500 XXX.74.229.22 squid-border20 - -0
>   - -
> lo0   163841 0 01
>   0 0
> lo0   16384 your-net  localhost0 - -1
>   - -
> pfsyn  14600 0 00
>   0 0
> pflog 331520 0 00
>   0 0
>
> There are several jails on this server.
>
> When i set hw.igb.rxd="2048" and hw.igb.txd="2048" i don't have panic
> anymore.
>
> The reason i tried to add more buffers than default is because i have NFS
> export on this server. But it is on igb0. igb1 is currently doing nothing
> but it will soon once the server will be ready for production.
>
> I didn't get core dump because my system is rather old. But i can do it if
> needed.
>
> Regards,
> Kirill
>
>
>

Re: Strange igb befavior

2010-05-27 Thread Jack Vogel
Well, it might make sense to increase the mbuf pool. Check the current value
with:

  sysctl kern.ipc.nmbclusters

This is a parameter that can be set in /etc/sysctl.conf. Our testers set it
to 262144 on all their machines.

As for the AIM stuff, and that includes those latency tunables, they have
all been replaced in my latest code with a simplified, more automatic
approach.

Jack
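
To put the numbers in this thread side by side, here is a small Python
sketch. It assumes 2048-byte clusters and assumes that of the 5 MSI-X
vectors each igb port reported, 4 are queue vectors and 1 is for link; both
are assumptions for illustration, as are the function names.

```python
CLUSTER_BYTES = 2048  # assumed size of one mbuf cluster


def pool_bytes(nmbclusters):
    """Upper bound on memory the cluster pool can consume."""
    return nmbclusters * CLUSTER_BYTES


def rx_clusters(ports, queues_per_port, rxd):
    """Clusters needed to fill every RX ring (one cluster per descriptor)."""
    return ports * queues_per_port * rxd


# The value Jack's testers use.
print(pool_bytes(262144) // 2 ** 20, "MiB at most for nmbclusters=262144")

# Kirill's box: 2 igb ports, assumed 4 queues each.
for rxd in (2048, 4096):
    print(f"hw.igb.rxd={rxd}: {rx_clusters(2, 4, rxd)} clusters for RX rings")
```

Under these assumptions the 4096-entry rings want 32768 clusters for RX
alone, twice what the 2048-entry rings need, while a pool of 262144 clusters
tops out at 512 MiB.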


On Thu, May 27, 2010 at 3:30 AM, Kirill Yelizarov  wrote:

> Thank You Jack
>
> i'll keep it at 2048 now. I have plans to add two more igb interfaces.
> Should i decrease values to 1024 in case i will have four interfaces in
> server?
>
> I found there are additional tweaks for igb card available:
> hw.igb.enable_aim=1 this one enabled by default and i understand its
> meaning
> Where can i read about the rest?
> hw.igb.low_latency=1000
> hw.igb.ave_latency=2000
> hw.igb.bulk_latency=4000
> hw.igb.rx_process_limit=400
> hw.igb.fc_setting=0
> hw.igb.lro=0
>
> Some years ago i downloaded an article about em card from intel site.
> Perhaps there is one for igb as well?
>
> Kirill
>

Re: em(4) duplex problems with 82541EI on RELENG_8, -CURRENT on PowerEdge 1850

2010-06-18 Thread Jack Vogel
> >>* This parameter control whether or not the driver will wait for
> >>* autonegotiation to complete.
> >>* 1 - Wait for autonegotiation to complete
> >>* 0 - Don't wait for autonegotiation to complete
> >>   */
> >>
> >> Also seems odd that some ICs are affected but not others.
> >>
> >> It's also possible that my problems are pf(4) + setfib(8) related and
> >> that this is a separate issue.
> >>
> >> Two new notes since the original post:
> >>
> >>  - I have confirmed this problem on two revisions of the Dell
> >>8th gen hardware in two different datacenters
> >>  - The problem persists on -CURRENT from 05/2010
> >>  - RELENG_7 does not seem to be impacted
> >>  - More stats below.
> >>
> >>
> >> Thanks,
> >> ~BAS
> >>
> >> ---
> >>
> >>
> >>
> >> em1: link state changed to DOWN
> >> em1: link state changed to UP
> >> em1: link state changed to DOWN
> >> em1: link state changed to UP
> >> em1: link state changed to DOWN
> >> em1: link state changed to UP
> >> em1: link state changed to DOWN
> >> em1: link state changed to UP
> >> em1: link state changed to DOWN
> >> em1: link state changed to UP
> >> em1: link state changed to DOWN
> >>
> >> em0: Excessive collisions = 0
> >> em0: Sequence errors = 0
> >> em0: Defer count = 0
> >> em0: Missed Packets = 0
> >> em0: Receive No Buffers = 0
> >> em0: Receive Length Errors = 0
> >> em0: Receive errors = 0
> >> em0: Crc errors = 0
> >> em0: Alignment errors = 0
> >> em0: Collision/Carrier extension errors = 0
> >> em0: RX overruns = 0
> >> em0: watchdog timeouts = 0
> >> em0: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
> >> em0: XON Rcvd = 0
> >> em0: XON Xmtd = 0
> >> em0: XOFF Rcvd = 0
> >> em0: XOFF Xmtd = 0
> >> em0: Good Packets Rcvd = 1319916
> >> em0: Good Packets Xmtd = 1070646
> >> em0: TSO Contexts Xmtd = 0
> >> em0: TSO Contexts Failed = 0
> >> em1: Excessive collisions = 0
> >> em1: Sequence errors = 0
> >> em1: Defer count = 0
> >> em1: Missed Packets = 0
> >> em1: Receive No Buffers = 0
> >> em1: Receive Length Errors = 0
> >> em1: Receive errors = 0
> >> em1: Crc errors = 0
> >> em1: Alignment errors = 0
> >> em1: Collision/Carrier extension errors = 0
> >> em1: RX overruns = 0
> >> em1: watchdog timeouts = 0
> >> em1: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
> >> em1: XON Rcvd = 0
> >> em1: XON Xmtd = 0
> >> em1: XOFF Rcvd = 0
> >> em1: XOFF Xmtd = 0
> >> em1: Good Packets Rcvd = 251348
> >> em1: Good Packets Xmtd = 204160
> >> em1: TSO Contexts Xmtd = 0
> >> em1: TSO Contexts Failed = 0
> >>
> >> 
> >>
> >>
> >> as0# sh int fa0/43
> >> FastEthernet0/43 is up, line protocol is up (connected)
> >> Hardware is Fast Ethernet, address is 0015.c683.51ab (bia
> >> 0015.c683.51ab)
> >> Description: X-Server EM1
> >> MTU 1500 bytes, BW 10 Kbit, DLY 100 usec,
> >> reliability 255/255, txload 1/255, rxload 1/255
> >> Encapsulation ARPA, loopback not set
> >> Keepalive set (10 sec)
> >> Full-duplex, 100Mb/s, media type is 100BaseTX
> >> input flow-control is unsupported output flow-control is unsupported
> >> ARP type: ARPA, ARP Timeout 04:00:00
> >> Last input never, output 00:00:08, output hang never
> >> Last clearing of "show interface" counters 6d03h
> >> Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
> >> Queueing strategy: fifo
> >> Output queue: 0/40 (size/max)
> >> 5 minute input rate 0 bits/sec, 0 packets/sec
> >> 5 minute output rate 1000 bits/sec, 3 packets/sec
> >> 291422 packets input, 131521274 bytes, 0 no buffer
> >> Received 798 broadcasts (0 multicast)
> >> 0 runts, 0 giants, 0 throttles
> >> 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
> >> 0 watchdog, 99 multicast, 0 pause input
> >> 0 input packets with dribble condition detected
> >> 651929 packets output, 73550092 bytes, 0 underruns
> >> 0 output errors, 0 collisions, 4 interface resets
> >> 0 babbles, 0 late collision, 0 deferred
> >> 0 lost carrier, 0 no carrier, 0 PAUSE output
> >> 0 output buffer failures, 0 output buffers swapped out
> >
> > Brian, could you please provide the following output?
> >
> > - uname -a  (you can X-out the machine name if need be)
> > - dmesg | egrep 'em0|em3'  (provides em driver version number)
> > - pciconf -lvc  (this will differ from what you provided above)
> >
> > Next, some of the stats you provided are for em1 when most of your post
> > focuses around em0 and em3.  Is there some correlation or was it a
> > mistake?
> >
> > Adding Jack Vogel of Intel to the CC list, as he's been working on em(4)
> > as of late.
>
> Brian, I have no idea if this will help or not, but...
>
> Jack just committed bits to the Intel drivers (em(4) ixgbe(4)), will
> you have a chance to test a new build? I'm trying to find an unused
> system ATM to test on myself, but it may take me a day or two.
>
> BTW, it appears Jack may be trying to get the fixes (and features)
> into 8.1-RELEASE, let's hope that it happens :)
>
> -Brandon
>


Re: RELENG_8 em(4) -- input errors ("Missed Packets")

2010-06-19 Thread Jack Vogel
I do not believe this is a problem. The numbers in that netstat output are a
bit hard to parse, but missed packets will happen when an interface gets a
lot of traffic. Keep an eye on things, though.
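
To put the counters quoted below in perspective, here is a small Python sketch that works out what fraction of received frames were missed. The two numbers come from the dev.em.1.stats output quoted further down; the function name is made up for illustration, and the sketch assumes that missed frames are not included in the "Good Packets Rcvd" count.

```python
def missed_fraction(missed, good_rcvd):
    """Fraction of frames that arrived but could not be accepted."""
    total = missed + good_rcvd
    return missed / total if total else 0.0

# Counters taken from the dev.em.1.stats output quoted in this thread.
missed = 17
good_rcvd = 16117349

frac = missed_fraction(missed, good_rcvd)
print(f"missed fraction: {frac:.2e}")
```

Under that assumption the loss is on the order of one frame in a million, which fits the view that this is worth watching but not alarming.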

Thanks,

Jack


On Sat, Jun 19, 2010 at 12:23 PM, Jeremy Chadwick
wrote:

> Something I came across today on a RELENG_8 (8.1-PRERELEASE, amd64,
> built Jun 8th) system we have:
>
> $ netstat -i -n -d -I em1
> Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll Drop
> em1    1500               xx:xx:xx:xx:xx:xx 16117371    17     0 11011087     0     0    0
> em1    1500 x/xx          xxx               16108452     -     - 11024972     -     -    -
>
> The input errors are what concerned me.  I poked dev.em.1.stats and got
> the following:
>
> em1: Excessive collisions = 0
> em1: Sequence errors = 0
> em1: Defer count = 0
> em1: Missed Packets = 17
> em1: Receive No Buffers = 0
> em1: Receive Length Errors = 0
> em1: Receive errors = 0
> em1: Crc errors = 0
> em1: Alignment errors = 0
> em1: Collision/Carrier extension errors = 0
> em1: watchdog timeouts = 0
> em1: XON Rcvd = 0
> em1: XON Xmtd = 0
> em1: XOFF Rcvd = 0
> em1: XOFF Xmtd = 0
> em1: Good Packets Rcvd = 16117349
> em1: Good Packets Xmtd = 11046837
> em1: TSO Contexts Xmtd = 17609
> em1: TSO Contexts Failed = 0
>
> What exactly does "Missed Packets" mean here?  How is a packet "missed"?
> Said port on our switch doesn't show any sign of problems:
>
> hp2510g# show interfaces 2
>
>  Status and Counters - Port Counters for port 2
>
>  Name  : port2
>  Link Status : Up
>  Totals (Since boot or last clear) :
>   Bytes Rx        : 3,548,949,136        Bytes Tx        : 2,613,086,119
>   Unicast Rx      : 182,088,023          Unicast Tx      : 255,981,685
>   Bcast/Mcast Rx  : 14,674               Bcast/Mcast Tx  : 81,852
>  Errors (Since boot or last clear) :
>   FCS Rx          : 0                    Drops Rx        : 0
>   Alignment Rx    : 0                    Collisions Tx   : 0
>   Runts Rx        : 0                    Late Colln Tx   : 0
>   Giants Rx       : 0                    Excessive Colln : 0
>   Total Rx Errors : 0                    Deferred Tx     : 0
>  Rates (5 minute weighted average) :
>   Total Rx  (bps) : 1457984              Total Tx  (bps) : 1494568
>   Unicast Rx (Pkts/sec) : 0              Unicast Tx (Pkts/sec) : 0
>   B/Mcast Rx (Pkts/sec) : 0              B/Mcast Tx (Pkts/sec) : 0
>   Utilization Rx  : 00.04 %              Utilization Tx  : 00.04 %
>
> Relevant userland and kernel stuff:
>
> $ ifconfig em1
> em1: flags=8843 metric 0 mtu 1500
>
>  
> options=219b
>ether xx:xx:xx:xx:xx:xx
>inet xxx netmask xx broadcast xxx
>media: Ethernet autoselect (1000baseT )
>status: active
>
> $ netstat -m
> 2162/1678/3840 mbufs in use (current/cache/total)
> 2048/1030/3078/25600 mbuf clusters in use (current/cache/total/max)
> 2048/896 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/104/104/12800 4k (page size) jumbo clusters in use
> (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 4636K/2895K/7532K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> $ vmstat -i
> interrupt  total   rate
> irq4: uart0  868  0
> irq6: fdc0 1  0
> irq23: uhci3 ehci1+  332  0
> cpu0: timer   1818467717   2000
> irq256: em0   375532  0
> irq257: em1  5004095  5
> irq258: ahci05879898  6
> cpu1: timer   1818466783   2000
> cpu3: timer   1818466831   2000
> cpu2: timer   1818466828   2000
> Total 7285128885   8012
>
> $ dmesg | grep em1
> em1:  port 0x3000-0x301f mem
> 0xd030-0xd031 irq 17 at device 0.0 on pci15
> em1: Using MSI interrupt
> em1: [FILTER]
>
> $ pciconf -lvc
> em1@pci0:15:0:0:class=0x02 card=0x109a15d9 chip=0x109a8086
> rev=0x00 hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Intel PRO/1000 PL Network Adaptor (82573L)'
>class  = network
>subclass   = ethernet
>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>
> --
> | Jeremy Chadwick   j...@parodius.com |
> | Parodius Networking   http://www.parodius.com/ |
> | UNIX S

Re: RELENG_7 em problems (and RELENG_8)

2010-07-02 Thread Jack Vogel
I got the email. There are server outages around here today and people are
leaving for a long weekend, so not much is getting done. I'll take some time
and look into this after the weekend, ok?

Jack


On Fri, Jul 2, 2010 at 10:39 AM, Mike Tancsa  wrote:

> Hi Jack,
>Just a followup to the email below. I now saw what appears to be the
> same problem on RELENG_8, but on a different nic and with VLANs.  So not
> sure if this is a general em problem, a problem specific to some em NICs, or
> a TSO problem in general.  The issue seemed to be triggered when I added a
> new vlan based on
>
> em3@pci0:14:0:0:class=0x02 card=0x109a15d9 chip=0x109a8086
> rev=0x00 hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Intel PRO/1000 PL Network Adaptor (82573L)'
>class  = network
>subclass   = ethernet
>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>
> pci14:  on pcib5
> em3:  port 0x6000-0x601f mem
> 0xe830-0xe831 irq 17 at device 0.0 on pci14
> em3: Using MSI interrupt
> em3: [FILTER]
> em3: Ethernet address: 00:30:48:9f:eb:81
>
> em3: flags=8943 metric 0
> mtu 1500
>options=2098
>ether 00:30:48:9f:eb:81
>inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
>media: Ethernet autoselect (1000baseT )
>status: active
>
> I had to disable tso, rxcsum and txcsum in order to see the devices on the
> other side of the two vlans trunked off em3.  Unfortunately, the other sides
> were switches 100km and 500km away so I didnt have any tcpdump capabilities
> to diagnose the issue.  I had already created one vlan off this NIC and all
> was fine.  A few weeks later, I added a new one and I could no longer telnet
> into the remote switches from the local machine. But I could telnet into
> the switches from machines not on the problem box. Hence, it would appear to
> be a general TSO issue no ? I disabled tso on the nic (I didnt disable
> net.inet.tcp.tso as I forgot about that).. Still nothing. I could always
> ping the remote devices, but no tcp services.  I then remembered this issue
> from before, so I tried disabling tso on the NIC. Still nothing. Then I
> disabled rxcsum and txcsum and I could then telnet into the remote devices.
>
> This newly observed issue was from a buildworld on Mon Jun 14 11:29:12 EDT
> 2010.
>
> I will try and recreate the issue locally again to see if I can trigger the
> problem on demand.  Any thoughts on what it might be ? Perhaps an issue
> specific to certain em nics ?
>
>---Mike
>
>
> At 04:31 PM 6/10/2010, Mike Tancsa wrote:
>
>> Hi Jack,
>>I am seeing some issues on RELENG_7 with a specific em nic
>>
>> em2@pci0:13:0:0:class=0x02 card=0x108c15d9 chip=0x108c8086
>> rev=0x03 hdr=0x00
>>vendor = 'Intel Corporation'
>>device = 'Intel Corporation 82573E Gigabit Ethernet Controller
>> (Copper) (82573E)'
>>class  = network
>>subclass   = ethernet
>>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>
>> If I disable tso, I am not able to make a tcp connection into the host
>>
>> eg
>> 0[psbgate1]# ifconfig em2
>> em2: flags=8843 metric 0 mtu 1500
>>
>>
>> options=219b
>>ether 00:30:48:9f:eb:80
>>inet 192.168.128.200 netmask 0xfffffff0 broadcast 192.168.128.207
>>media: Ethernet autoselect (100baseTX )
>>status: active
>> 0[psbgate1]# ifconfig em2 -tso
>> 0[psbgate1]#
>>
>>
>> Looking at the pcap, the checksum is bad on the syn-ack.  If I re-enable
>> tso, it seems to be ok
>>
>> 16:18:01.113297 IP (tos 0x10, ttl 64, id 6339, offset 0, flags [DF], proto
>> TCP (6), length 60) 192.168.128.196.54172 > 192.168.128.200.22: S, cksum
>> 0x4e79 (correct), 3313156149:3313156149(0) win 65535 > 3,sackOK,timestamp 3376174416 0>
>> 16:18:01.123676 IP (tos 0x0, ttl 64, id 3311, offset 0, flags [DF], proto
>> TCP (6), length 60) 192.168.128.200.22 > 192.168.128.196.54172: S, cksum
>> 0x81c9 (incorrect (-> 0x51f2), 1373042663:1373042663(0) ack 3313156150 win
>> 65535 
>>
>>
>> em2:  port 0x5000-0x501f mem
>> 0xe820-0xe821 irq 16 at device 0.0 on pci13
>> em2: Using MSI interrupt
>> em2: [FILTER]
>> em2: Ethernet address: 00:30:48:9f:eb:80
>> pcib5:  irq 16 at device 28.5 on pci0
>> pci14:  on pcib5
>> em3:  port 0x6000-0x601f mem
>> 0xe830-0xe831 irq 17 at device 0.0 on pci14
>> em3: Using MSI interrupt
>> em3: [FILTER]
>> em3: Ethernet address: 00:30:48:9f:eb:81
>>
>>
>> Also there is still the issue with
>>
>>
>> http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052842.html
>>
>> in RELENG_7 ?
>>
>>---Mike
>>
>>
>> 
>> Mike T

Re: em(4) duplex problems with 82541EI on RELENG_8, -CURRENT on PowerEdge 1850

2010-07-15 Thread Jack Vogel
The fact that I WISH it to be MFC'd doesn't mean that I am actually given
permission to do so.

Jack


On Thu, Jul 15, 2010 at 10:48 AM, Steve Polyack  wrote:

> On 07/15/10 13:31, Michael Tuexen wrote:
>
>> On Jul 15, 2010, at 6:50 PM, Brian A. Seklecki wrote:
>>
>>
>>
>>>
>>>
 It may have gone in before the RELENG_8_1 tag/branch occurred?  SVN
 r209309

 Jacks's change went into stable/8 on June 18:



>>> Also, did anyone provide feedback on SVN r209959 to
>>> head/sys/dev/e1000/if_em.c ?
>>>
>>> It's saying "8.1 MFC", so you might want to ask people to test that on
>>> stable/8 then expedite the 8.1 MFC as well.
>>>
>>>
>> Maybe Jack can MFC it to stable/8. MFCing to releng/8.1 must come from
>> stable/8 anyway.
>> There is no need for re@ approval to MFC it into stable/8.
>>
>>
>
> As Brian stated, the change has already been MFC'd into stable/8 (June
> 18th) with the following comment from Jack:
>
> "MFC to RELENG8.1 asap"
>
>
> Steve
>
>> Best regards
>> Michael
>>
>>
>>> ~BAS
>>>
>>>
>> ___
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>>
>>
>>
>
>
>


Watchdog resets on 82575

2010-08-06 Thread Jack Vogel
If you have this adapter and have been getting watchdogs, you need to pick up
the small update I checked into HEAD today. When I added the SR-IOV support
for the 82576 adapter I removed a call to set the MAC type in an early
routine, thinking it was unnecessary, since a slightly later shared code init
does the same thing. I also saw no problem when I did this on the 82576.
Well, it did have a bad effect that I did not notice: the slightly later
call, igb_setup_msix(), did not have the MAC type set, and this resulted in
the 82575 creating more queues than it is really able to handle.
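
The ordering problem can be sketched with a toy model. This is only an illustration in Python, not the driver code: the Adapter class, the per-MAC queue limits, and the fallback value are all assumptions chosen to show how a queue count picked before the MAC type is known can come out too large.

```python
# Toy model only: names and queue limits are illustrative assumptions,
# not taken from if_igb.c.
MAX_QUEUES = {"82575": 4, "82576": 8}   # assumed per-MAC limits
DEFAULT_LIMIT = 8                        # assumed limit when the MAC is unknown


class Adapter:
    def __init__(self, device):
        self.device = device
        self.mac_type = None             # unknown until set_mac_type() runs

    def set_mac_type(self):
        self.mac_type = self.device

    def setup_msix(self, ncpus):
        # Pick the queue count from whatever MAC type is known right now.
        limit = MAX_QUEUES.get(self.mac_type, DEFAULT_LIMIT)
        return min(ncpus, limit)


# Broken order: queues are sized before the MAC type has been set.
broken = Adapter("82575")
broken_queues = broken.setup_msix(ncpus=8)
broken.set_mac_type()

# Fixed order: MAC type first, then queue setup.
fixed = Adapter("82575")
fixed.set_mac_type()
fixed_queues = fixed.setup_msix(ncpus=8)

print(broken_queues, fixed_queues)  # prints "8 4"
```

In the model the broken order hands the 82575 eight queues while the fixed order caps it at four, which is the kind of mismatch described above.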

So, bottom line, this is a critical fix for 82575:   SVN rev 210968

Cheers,

Jack


Re: svn commit: r209611 - head/sys/dev/e1000

2010-08-17 Thread Jack Vogel
Cool, the first person to actually try and use it :)

Yes, there's one key thing you have to do right now that's not
documented: because of the simplistic PCI structure the guest
has, the kernel blacklists it from using MSI-X. So, what you need
to do is find the honor_blacklist tunable (that's not the complete
string, use sysctl -a | grep blacklist to find it) and set it to 0.
It needs to be set at boot.
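
A rough Python sketch of the lookup described above. It filters sysctl output for names containing "blacklist" and formats a boot-time setting for each hit. The helper names are made up, the sample output is only an assumption about what a FreeBSD 8.x guest might print (check the real name on your own system), and writing the result to /boot/loader.conf is my reading of "set at boot", not something stated above.

```python
import subprocess


def find_blacklist_tunables(sysctl_output):
    """Return the names of sysctl entries whose name contains 'blacklist'."""
    names = []
    for line in sysctl_output.splitlines():
        name, sep, _value = line.partition(":")
        if sep and "blacklist" in name:
            names.append(name.strip())
    return names


def loader_conf_lines(names):
    """Format each tunable as a line that disables it at boot."""
    return [f'{name}="0"' for name in names]


def query_live_system():
    """Run 'sysctl -a' on the local machine and filter its output."""
    out = subprocess.run(["sysctl", "-a"], capture_output=True,
                         text=True, check=True).stdout
    return find_blacklist_tunables(out)


# Assumed sample output, for illustration only.
sample = "hw.pci.enable_msix: 1\nhw.pci.honor_msi_blacklist: 1\n"
found = find_blacklist_tunables(sample)
print(loader_conf_lines(found))
```

On a real guest, call query_live_system() instead of using the sample text.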

That should get you running.

Jack


On Tue, Aug 17, 2010 at 8:18 AM, pluknet  wrote:

> On 1 July 2010 02:13, Jack Vogel  wrote:
> > On Wed, Jun 30, 2010 at 2:50 PM, Julian Elischer  >wrote:
> >
> >> On 6/30/10 10:26 AM, Jack F Vogel wrote:
> >>
> >>> Author: jfv
> >>> Date: Wed Jun 30 17:26:47 2010
> >>> New Revision: 209611
> >>> URL: http://svn.freebsd.org/changeset/base/209611
> >>>
> >>> Log:
> >>>   SR-IOV support added to igb
> >>>
> >>>   What this provides is support for the 'virtual function'
> >>>   interface that a FreeBSD VM may be assigned from a host
> >>>   like KVM on Linux, or newer versions of Xen with such
> >>>   support.
> >>>
> >>>   When the guest is set up with the capability, a special
> >>>   limited function 82576 PCI device is present in its virtual
> >>>   PCI space, so with this driver installed in the guest that
> >>>   device will be detected and function nearly like the bare
> >>>   metal, as it were.
> >>>
> >>>   The interface is only allowed a single queue in this configuration
> >>>   however initial performance tests have looked very good.
> >>>
> >>>   Enjoy!!
> >>>
> >>>
> >> do these extra devices turn up in a standard ifconfig output?
> >> in other words, can we assign them to jails using vimage?
> >>
> >>
> > They only show up if configured in the PF host, for instance if using
> Linux
> > and KVM (I did develop and test
> > with Fedora 13) you must load the igb driver there specifying that you
> want
> > vf's created and how many.
> > Next in the management of the guest you need to assign one of these vf
> > devices to the guest. After you
> > do all that and load this igb driver then yes, it will look just like a
> > standard igbX device.
> >
>
> Hi, Jack.
>
> I set up qemu-kvm on openSUSE 11.3
>  with 82576 PCI device as you described.
>
> Guest fails to attach with:
> igb0:  mem
> 0xf206-0xf2063fff,0xf2064000-0xf2067fff at device 5.0 on pci0
> igb0: Unable to allocate bus resource: interrupt
> device_attach: igb0 attach returned 6
>
> i...@pci0:0:5:0:class=0x02 card=0xa03c8086 chip=0x10ca8086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
>cap 11[40] = MSI-X supports 3 messages in map 0x1c
>
> Did  I missed something?
>
> --
> wbr,
> pluknet
>


Re: RELENG_7 em problems (and RELENG_8)

2010-08-17 Thread Jack Vogel
Hmmm, interesting, I'll have to have some testing done, maybe for the 574 it
should automagically disable CSUM?

Jack


On Tue, Aug 17, 2010 at 11:52 AM, Pyun YongHyeon  wrote:

> On Mon, Aug 16, 2010 at 05:07:11PM -0400, Mike Tancsa wrote:
> > Hi Jack,
> > FYI, I am still seeing this same problem on RELENG_8 (code
> > as of today).  Unfortunately I cant try Pyun's patch since the
> > underlying code has changed since then.
> >
> > e...@pci0:3:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086
> > rev=0x00 hdr=0x00
> > vendor = 'Intel Corporation'
> > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> > class  = network
> > subclass   = ethernet
> > cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > cap 11[a0] = MSI-X supports 5 messages in map 0x1c
> >
> > pci3:  on pcib3
> > em4:  port 0x1000-0x101f
> > mem 0xb190-0xb191,0xb192-0xb1923fff irq 16 at device 0.0 on
> pci3
> > em4: Using MSI interrupt
> > em4: [FILTER]
> > em4: Ethernet address: 00:15:17:ed:3e:c4
> >
>
> Here is updated patch for HEAD and stable/8.
> http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
>
> It seems to work as expected under my limited environments. If
> you're using multiple Tx queues with em(4) it would be better to
> disable Tx checksum offloading as driver always have to create a
> new checksum context for each frame. This will effectively disable
> pipelined Tx data DMA which in turn greatly slows down Tx
> performance for small sized frames. The reason driver have to
> create a new checksum context when it uses multiple Tx queues comes
> from hardware limitation. The controller tracks only for the last
> context descriptor that was written such that driver does not know
> the state of checksum context configured in other Tx queue.
> Hope this helps.
>
> >
> >
> > ---Mike
> >
> >
> > At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
> > >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
> > >> Hi Jack,
> > >> Just a followup to the email below. I now saw what appears
> > >> to be the same problem on RELENG_8, but on a different nic and with
> > >> VLANs.  So not sure if this is a general em problem, a problem
> > >> specific to some em NICs, or a TSO problem in general.  The issue
> > >> seemed to be triggered when I added a new vlan based on
> > >>
> > >> e...@pci0:14:0:0:class=0x02 card=0x109a15d9
> > >> chip=0x109a8086 rev=0x00 hdr=0x00
> > >> vendor = 'Intel Corporation'
> > >> device = 'Intel PRO/1000 PL Network Adaptor (82573L)'
> > >> class  = network
> > >> subclass   = ethernet
> > >> cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > >> cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> > >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > >>
> > >> pci14:  on pcib5
> > >> em3:  port 0x6000-0x601f
> > >> mem 0xe830-0xe831 irq 17 at device 0.0 on pci14
> > >> em3: Using MSI interrupt
> > >> em3: [FILTER]
> > >> em3: Ethernet address: 00:30:48:9f:eb:81
> > >>
> > >> em3: flags=8943
> > >> metric 0 mtu 1500
> > >> options=2098
> > >> ether 00:30:48:9f:eb:81
> > >> inet 10.255.255.254 netmask 0xfffc broadcast
> 10.255.255.255
> > >> media: Ethernet autoselect (1000baseT )
> > >> status: active
> > >>
> > >> I had to disable tso, rxcsum and txcsum in order to see the devices on
> > >> the other side of the two vlans trunked off em3.  Unfortunately, the
> > >> other sides were switches 100km and 500km away so I didnt have any
> > >> tcpdump capabilities to diagnose the issue.  I had already created
> > >> one vlan off this NIC and all was fine.  A few weeks later, I added a
> > >> new one and I could no longer telnet into the remote switches from
> > >> the local machine. But I could telnet into the switches from
> > >> machines not on the problem box. Hence, it would appear to be a
> > >> general TSO issue no ? I disabled tso on the nic (I didnt disable
> > >> net.inet.tcp.tso as I forgot about that).. Still nothing. I could
> > >> always ping the remote devices, but no tcp services.  I then
> > >> remembered this issue from before, so I tried disabling tso on the
> > >> NIC. Still nothing. Then I disabled rxcsum and txcsum and I could
> > >> then telnet into the remote devices.
> > >>
> > >> This newly observed issue was from a buildworld on Mon Jun 14
> > >> 11:29:12 EDT 2010.
> > >>
> > >> I will try and recreate the issue locally again to see if I can
> > >> trigger the problem on demand.  Any thoughts on what it might be ?
> > >> Perhaps an issue specific to certain em nics ?
> > >>
> > >
> > >http://www.freebsd.org/cgi/query-pr.cgi?pr=141843
> > >I

Re: RELENG_7 em problems (and RELENG_8)

2010-08-17 Thread Jack Vogel
I believe the requirement of a context descriptor for most frames in the igb
driver is just the way the hardware works. I've looked over the Linux driver
again and it looks like they require the same. I don't believe it's a big
deal, just the added descriptor for the frame.
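
For anyone following the context-descriptor discussion quoted below, here is a toy Python model of the two policies being compared: writing a context descriptor for every frame, versus reusing a queue's previous context when the offload parameters have not changed. The function, the frame list, and the parameter labels are all made up for illustration. It only counts descriptors and says nothing about what the hardware actually permits or what the DMA pipeline cost is.

```python
def count_context_descriptors(frames, reuse_per_queue):
    """Count context descriptors written for frames given in transmit order.

    frames: list of (queue_id, offload_params) tuples.
    reuse_per_queue: if True, skip the context descriptor when the queue's
    last written context already matches the frame's offload parameters.
    """
    written = 0
    last = {}  # queue id -> offload parameters of the last context written
    for queue, params in frames:
        if reuse_per_queue and last.get(queue) == params:
            continue  # context still valid, only data descriptors needed
        last[queue] = params
        written += 1
    return written


# Hypothetical traffic mix across two Tx queues.
frames = [(0, "tcp4"), (0, "tcp4"), (1, "udp4"),
          (0, "tcp4"), (1, "udp4"), (1, "tcp4")]

always = count_context_descriptors(frames, reuse_per_queue=False)
reused = count_context_descriptors(frames, reuse_per_queue=True)
print(always, reused)  # prints "6 3"
```

With one queue the second policy reduces to "reuse while the parameters stay the same", which is the single-queue em(4) case mentioned in the thread.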

Jack


On Tue, Aug 17, 2010 at 12:14 PM, Pyun YongHyeon  wrote:

> On Tue, Aug 17, 2010 at 11:52:08AM -0700, Pyun YongHyeon wrote:
> > On Mon, Aug 16, 2010 at 05:07:11PM -0400, Mike Tancsa wrote:
> > > Hi Jack,
> > > FYI, I am still seeing this same problem on RELENG_8 (code
> > > as of today).  Unfortunately I cant try Pyun's patch since the
> > > underlying code has changed since then.
> > >
> > > e...@pci0:3:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086
> > > rev=0x00 hdr=0x00
> > > vendor = 'Intel Corporation'
> > > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> > > class  = network
> > > subclass   = ethernet
> > > cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> > > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > > cap 11[a0] = MSI-X supports 5 messages in map 0x1c
> > >
> > > pci3:  on pcib3
> > > em4:  port 0x1000-0x101f
> > > mem 0xb190-0xb191,0xb192-0xb1923fff irq 16 at device 0.0 on
> pci3
> > > em4: Using MSI interrupt
> > > em4: [FILTER]
> > > em4: Ethernet address: 00:15:17:ed:3e:c4
> > >
> >
> > Here is updated patch for HEAD and stable/8.
> > http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
> >
> > It seems to work as expected under my limited environments. If
> > you're using multiple Tx queues with em(4) it would be better to
> > disable Tx checksum offloading as driver always have to create a
> > new checksum context for each frame. This will effectively disable
> > pipelined Tx data DMA which in turn greatly slows down Tx
> > performance for small sized frames. The reason driver have to
> > create a new checksum context when it uses multiple Tx queues comes
> > from hardware limitation. The controller tracks only for the last
> > context descriptor that was written such that driver does not know
> > the state of checksum context configured in other Tx queue.
> > Hope this helps.
> >
>
> For igb(4) controllers, it seems we also need a way to avoid
> creating a new checksum context for every Tx frame as we did in
> em(4). Unlike em(4) controllers, igb(4) seems to maintain context
> per queue such that we can safely reuse previously configured
> context for a queue.
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>


Re: RELENG_7 em problems (and RELENG_8)

2010-08-17 Thread Jack Vogel
Well, we do of course. I'll have my test engineer try it both ways and see
what looks better. I'll let you know...

Jack


On Tue, Aug 17, 2010 at 12:35 PM, Pyun YongHyeon  wrote:

> On Tue, Aug 17, 2010 at 12:05:56PM -0700, Jack Vogel wrote:
> > Hmmm, interesting, I'll have to have some testing done, maybe for the 574
> it
> > should automagically disable CSUM?
> >
>
> I don't have 82574 controller to test but it may depend on how
> pipelined Tx data DMA works. If 82574 can still pipeline Tx data
> DMA when a new context is written it would be better to enable
> checksum offloading. If em(4) uses single Tx queue, we can safely
> enable checksum offloading, I guess.
>
> > Jack
> >
> >
> > On Tue, Aug 17, 2010 at 11:52 AM, Pyun YongHyeon 
> wrote:
> >
> > > On Mon, Aug 16, 2010 at 05:07:11PM -0400, Mike Tancsa wrote:
> > > > Hi Jack,
> > > > FYI, I am still seeing this same problem on RELENG_8 (code
> > > > as of today).  Unfortunately I cant try Pyun's patch since the
> > > > underlying code has changed since then.
> > > >
> > > > e...@pci0:3:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086
> > > > rev=0x00 hdr=0x00
> > > > vendor = 'Intel Corporation'
> > > > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> > > > class  = network
> > > > subclass   = ethernet
> > > > cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > > > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1
> message
> > > > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > > > cap 11[a0] = MSI-X supports 5 messages in map 0x1c
> > > >
> > > > pci3:  on pcib3
> > > > em4:  port 0x1000-0x101f
> > > > mem 0xb190-0xb191,0xb192-0xb1923fff irq 16 at device 0.0
> on
> > > pci3
> > > > em4: Using MSI interrupt
> > > > em4: [FILTER]
> > > > em4: Ethernet address: 00:15:17:ed:3e:c4
> > > >
> > >
> > > Here is updated patch for HEAD and stable/8.
> > > http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
> > >
> > > It seems to work as expected under my limited environments. If
> > > you're using multiple Tx queues with em(4) it would be better to
> > > disable Tx checksum offloading as driver always have to create a
> > > new checksum context for each frame. This will effectively disable
> > > pipelined Tx data DMA which in turn greatly slows down Tx
> > > performance for small sized frames. The reason driver have to
> > > create a new checksum context when it uses multiple Tx queues comes
> > > from hardware limitation. The controller tracks only for the last
> > > context descriptor that was written such that driver does not know
> > > the state of checksum context configured in other Tx queue.
> > > Hope this helps.
> > >
> > > >
> > > >
> > > > ---Mike
> > > >
> > > >
> > > > At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
> > > > >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
> > > > >> Hi Jack,
> > > > >> Just a followup to the email below. I now saw what appears
> > > > >> to be the same problem on RELENG_8, but on a different nic and
> with
> > > > >> VLANs.  So not sure if this is a general em problem, a problem
> > > > >> specific to some em NICs, or a TSO problem in general.  The issue
> > > > >> seemed to be triggered when I added a new vlan based on
> > > > >>
> > > > >> e...@pci0:14:0:0:class=0x02 card=0x109a15d9
> > > > >> chip=0x109a8086 rev=0x00 hdr=0x00
> > > > >> vendor = 'Intel Corporation'
> > > > >> device = 'Intel PRO/1000 PL Network Adaptor (82573L)'
> > > > >> class  = network
> > > > >> subclass   = ethernet
> > > > >> cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > > > >> cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1
> message
> > > > >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link
> x1(x1)
>

Re: RELENG_7 em problems (and RELENG_8)

2010-08-17 Thread Jack Vogel
The guy who worries about the Linux driver's performance is on my team, and
he's a very good engineer, and we're talking about the hardware's DMA, so
it's not an OS issue. I'm not saying I'm sure, but I'm dubious about this, at
least when we're talking about igb. But hey, I'm willing to be proven wrong :)

Jack


On Tue, Aug 17, 2010 at 12:47 PM, Pyun YongHyeon  wrote:

> On Tue, Aug 17, 2010 at 12:34:31PM -0700, Jack Vogel wrote:
> > I believe the requirement of a context descriptor for most frames in the
> igb
> > driver
> > is just the way the hardware works, I've looked over the Linux driver
> again
> > and it
> > looks like they require the same. I don't believe its a big deal, just
> the
> > added
> > descriptor for the frame.
> >
>
> Setting up context does not cost much. The real cost comes from
> stopping requesting DMA for next packet whenever a new context
> is written.
> AFAIK Linux always added a new checksum context. I don't know
> whether Linux cares about the cost of refilling pipeline or
> measured the performance differences. FreeBSD noticed the
> difference on em(4) controllers and took appropriate action to take
> full advantage of the hardware feature, I think.
> I have to experiment how igb(4) works when it is told to reuse
> configured context(both checksum and TSO context).
>
>


Re: svn commit: r209611 - head/sys/dev/e1000

2010-08-19 Thread Jack Vogel
On Thu, Aug 19, 2010 at 2:45 AM, pluknet  wrote:

>
> By the way,
>
> Sometimes after boot I have to kldreload if_igb.ko several
> times until watchdog go to sleep, so traffic starts flowing.
>

Hmmm, the intention is that the VF always be single queue, but I
see the code I used to limit it is broken, so you are getting two
queues. For now near the top of if_igb.c set igb_num_queues = 1;

I believe that will get rid of the watchdogs.

Jack


Re: Crashes on X7SPE-HF with em

2010-08-26 Thread Jack Vogel
Hmmm, can you remove ALTQ from the mix and see if that eliminates it?

Jack


On Thu, Aug 26, 2010 at 2:56 PM, Philipp Wuensche wrote:

> Jeremy Chadwick wrote:
> >
> > CC'ing Jack Vogel of Intel and Yong-Hyeon PYUN who might have some
> > ideas.  OP's backtrace is here:
> >
> >
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-August/058425.html
> >
> > Philipp, can you please provide the following output?
> >
> > * dmesg | egrep 'em[0-9]'
>
> em0:  port 0xdc00-0xdc1f mem
> 0xfe9e-0xfe9f,0xfe9dc000-0xfe9d irq 16 at device 0.0 on pci2
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:25:90:04:6e:fa
> em1:  port 0xec00-0xec1f mem
> 0xfeae-0xfeaf,0xfeadc000-0xfead irq 17 at device 0.0 on pci3
> em1: Using MSI interrupt
> em1: [FILTER]
> em1: Ethernet address: 00:25:90:04:6e:fb
>
> > * uname -a (you can XXX out the machine name if need be)
>
> FreeBSD XXX 8.1-STABLE FreeBSD 8.1-STABLE #2: Wed Aug 25 10:38:50 CEST
> 2010 r...@xxx:/usr/obj/usr/src/sys/XXX  amd64
>
> Date of source is Aug 17 14:09 CEST 2010. It happened with 8.1-RELEASE
> too, I can go back to RELEASE or any SVN revision you would like, if it
> is helping in any way.
>
> Kernel-config:
>
> include GENERIC
>
> ident   XXX
>
> options IPSEC
>
> options DEVICE_POLLING
> options ACCEPT_FILTER_HTTP
>
> options ALTQ
>
> options ALTQ_CBQ
> options ALTQ_RED
> options ALTQ_RIO
> options ALTQ_HFSC
> options ALTQ_PRIQ
>
> device  crypto
> device  enc
>
>
> > * pciconf -lvc (only include the em(4) items please)
>
> em0@pci0:2:0:0: class=0x02 card=0x060a15d9 chip=0x10d38086 rev=0x00
> hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>class  = network
>subclass   = ethernet
>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>cap 11[a0] = MSI-X supports 5 messages in map 0x1c
> em1@pci0:3:0:0: class=0x02 card=0x060a15d9 chip=0x10d38086 rev=0x00
> hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>class  = network
>subclass   = ethernet
>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>cap 11[a0] = MSI-X supports 5 messages in map 0x1c
>
> > * vmstat -i
>
> interrupt  total   rate
> irq1: atkbd0   9  0
> cpu0: timer 36544552   1994
> irq256: em0 3801  0
> irq257: em1 32963909   1799
> irq258: ahci0 175662  9
> cpu1: timer 36543525   1994
> cpu2: timer 36543525   1994
> cpu3: timer 36543525   1994
> Total  179318508   9786
>
> There is a shared IPMI interface on em0, but the interface is not used
> by FreeBSD. em1 is used by four VLANs. Polling is only in the
> kernel config, not activated on the devices.
>
> Greetings,
> philipp
>
>


Re: Crashes on X7SPE-HF with em

2010-08-26 Thread Jack Vogel
On Thu, Aug 26, 2010 at 3:15 PM, Jeremy Chadwick
wrote:

> So much complexity here.  Tracking this down might be difficult.
>
> One thing that does concern me is the interrupt rate for em1.  Jack et
> al, is this normal?  I don't see this behaviour on my 8.x systems with
> em(4) driver 7.0.5, but my systems all use 82573E and 82573L, and don't
> have MSI-X support.
>
>
He is only using one vector anyway, it seems, so MSI-X isn't making things
much more complex than on your 82573.

The interrupt rate seems high, but I'm not sure if it's abnormal for a busy
interface.
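
As a rough cross-check of the figures quoted earlier in this thread: the rate
column of vmstat -i is the interrupt total divided by the uptime in seconds,
so total / rate should give about the same uptime on every row. The sketch
below does that arithmetic with awk on three rows copied from the quoted
output. The function name implied_uptime is made up for illustration, and
because the rate column is rounded the result is only approximate.

```sh
#!/bin/sh
# Estimate the uptime implied by each row of `vmstat -i` output,
# using rate = total / uptime. Rows with a rate of 0 are skipped.
# implied_uptime is a hypothetical helper name, not a system tool.
implied_uptime() {
    # stdin rows look like: "irq257: em1 32963909 1799"
    # the last field is the rate, the one before it is the total
    awk 'NF >= 4 && $NF > 0 { printf "%s %s %d\n", $1, $2, $(NF-1) / $NF }'
}

# Sample rows taken from the vmstat -i output quoted in this thread.
implied_uptime <<'EOF'
irq256: em0 3801 0
irq257: em1 32963909 1799
cpu0: timer 36544552 1994
EOF
```

Both non-zero rows come out at a little over 18,300 seconds, i.e. roughly
five hours of uptime, during which em1 averaged about 1,800 interrupts per
second. On a live system the same function could be fed with
`vmstat -i | implied_uptime`.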

I tend to agree with Yongari, let's eliminate all the complicating factors
like IPSEC and ALTQ and see if it still occurs.

From the crash data I do not see a clear cause either.

Jack


Re: Crashes on X7SPE-HF with em

2010-08-27 Thread Jack Vogel
On Fri, Aug 27, 2010 at 6:04 AM, Philipp Wuensche wrote:


Re: page fault in e1000_clear_hw_cntrs_base_generic() during SIOCAIFADDR

2010-09-01 Thread Jack Vogel
LOL, if it's the VF, it's pretty new code. PLEASE, anyone, if this is the
case, make it clear in the title somewhere, ok? Thanks.

Jack


On Wed, Sep 1, 2010 at 10:24 AM, John Baldwin  wrote:

> On Wednesday, September 01, 2010 1:11:31 pm pluknet wrote:
> > On 1 September 2010 20:06, John Baldwin  wrote:
> > > On Wednesday, September 01, 2010 11:53:09 am pluknet wrote:
> > >> Hi.
> > >>
> > >> This is reproducible from time to time on boot when
> > >> handling SIOCAIFADDR called from ifconfig on igb
> > >> on fresh (and not so fresh) 8-STABLE.
> > >>
> > >> How can I help with debugging?
> > >>
> > >> Kernel page fault with the following non-sleepable locks held:
> > >> exclusive sleep mutex igb0 (IGB Core Lock) r = 0 (0xc2655534) locked @
> > >> /usr/src/sys/modules/igb/../../dev/e1000/if_igb.c:965
> > >> KDB: stack backtrace:
> > >> db_trace_self_wrapper(c08b5055,cce577b8,c060db15,3c5,0,...) at
> > >> db_trace_self_wrapper+0x26
> > >> kdb_backtrace(3c5,0,,c0a94864,cce577f0,...) at
> > >> kdb_backtrace+0x29
> > >> _witness_debugger(c08b74fe,cce57804,4,1,0,...) at
> > >> _witness_debugger+0x25
> > >> witness_warn(5,0,c08e3140,cce5782c,c2956000,...) at witness_warn+0x1fe
> > >> trap(cce57890) at trap+0x195
> > >> calltrap() at calltrap+0x6
> > >> --- trap 0xc, eip = 0xc3192477, esp = 0xcce578d0, ebp = 0xcce578e0 ---
> > >> e1000_clear_hw_cntrs_base_generic(c2651004,64,c3185850,c2651000,0,...)
> > >> at e1000_clear_hw_cntrs_base_generic+0x3e7
> > >
> > > Can you use gdb on your kernel.debug to map this to a source file and
> > > line?
> > >
> >
> > Here it is (btw, it took about 10-15 reboots to reproduce after adding
> > swap and dumpon setup).
> > Hmm.. don't see where it might access an invalid pointer.
> >
> > #0  doadump () at pcpu.h:231
> > #1  0xc04a3679 in db_fncall (dummy1=1, dummy2=0, dummy3=-1062122144,
> > dummy4=0xcce636a8 "") at /usr/src/sys/ddb/db_command.c:548
> > #2  0xc04a3a71 in db_command (last_cmdp=0xc093d19c, cmd_table=0x0,
> dopager=1)
> > at /usr/src/sys/ddb/db_command.c:445
> > #3  0xc04a3bca in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> > #4  0xc04a5aed in db_trap (type=12, code=0) at
> /usr/src/sys/ddb/db_main.c:229
> > #5  0xc05fa64e in kdb_trap (type=12, code=0, tf=0xcce63890)
> > at /usr/src/sys/kern/subr_kdb.c:535
> > #6  0xc084dcdf in trap_fatal (frame=0xcce63890, eva=3428511744)
> > at /usr/src/sys/i386/i386/trap.c:929
> > #7  0xc084e553 in trap (frame=0xcce63890) at
> /usr/src/sys/i386/i386/trap.c:328
> > #8  0xc082f66c in calltrap () at /usr/src/sys/i386/i386/exception.s:166
> > #9  0xc318c477 in e1000_clear_hw_cntrs_base_generic (hw=0xc2655004)
> > at /usr/src/sys/modules/igb/../../dev/e1000/e1000_mac.c:643
> > #10 0xc317ec82 in igb_init_locked (adapter=0xc2655000)
> > at /usr/src/sys/modules/igb/../../dev/e1000/if_igb.c:1202
> > #11 0xc31801e5 in igb_ioctl (ifp=0xc2943c00, command=2149607692,
> > data=0xc29db600 "╢╤\235бд╤\235бт╤\235б")
> > at /usr/src/sys/modules/igb/../../dev/e1000/if_igb.c:966
> > #12 0xc0696c4e in in_ifinit (ifp=0xc2943c00, ia=0xc29db600,
> > sin=Variable "sin" is not available.
> > )
> > at /usr/src/sys/netinet/in.c:848
> > #13 0xc06980cb in in_control (so=0xc2a5d9a8, cmd=2151704858,
> > data=0xc2649400 "igb0", ifp=0xc2943c00, td=0xc29b8280)
> > at /usr/src/sys/netinet/in.c:563
> > #14 0xc067c860 in ifioctl (so=0xc2a5d9a8, cmd=2151704858,
> > data=0xc2649400 "igb0", td=0xc29b8280) at /usr/src/sys/net/if.c:2523
> > #15 0xc0617395 in soo_ioctl (fp=0xc29ce310, cmd=2151704858,
> data=0xc2649400,
> > active_cred=0xc254b100, td=0xc29b8280)
> > at /usr/src/sys/kern/sys_socket.c:212
> > #16 0xc06113dd in kern_ioctl (td=0xc29b8280, fd=3, com=2151704858,
> > data=0xc2649400 "igb0") at file.h:262
> > #17 0xc0611564 in ioctl (td=0xc29b8280, uap=0xcce63cf8)
> > at /usr/src/sys/kern/sys_generic.c:678
> > #18 0xc084e160 in syscall (frame=0xcce63d38)
> > at /usr/src/sys/i386/i386/trap.c:
> > #19 0xc082f6d1 in Xint0x80_syscall ()
> > at /usr/src/sys/i386/i386/exception.s:264
> > #20 0x0033 in ?? ()
> > Previous frame inner to this frame (corrupt stack?)
> >
> > (kgdb) f 9
> > #9  0xc318c477 in e1000_clear_hw_cntrs_base_generic (hw=0xc2655004)
> > at /usr/src/sys/modules/igb/../../dev/e1000/e1000_mac.c:643
> > 643 E1000_READ_REG(hw, E1000_SYMERRS);
> > (kgdb) list
> > 638 void e1000_clear_hw_cntrs_base_generic(struct e1000_hw *hw)
> > 639 {
> > 640 DEBUGFUNC("e1000_clear_hw_cntrs_base_generic");
> > 641
> > 642 E1000_READ_REG(hw, E1000_CRCERRS);
> > 643 E1000_READ_REG(hw, E1000_SYMERRS);
> > 644 E1000_READ_REG(hw, E1000_MPC);
> > 645 E1000_READ_REG(hw, E1000_SCC);
> > 646 E1000_READ_REG(hw, E1000_ECOL);
> > 647 E1000_READ_REG(hw, E1000_MCC);
> >
> > (kgdb) p *(struct e1000_osdep *)hw->back
> > $6 = {mem_bus_space_tag = 1, mem_bus_space_h

Re: MSIX failure

2010-09-06 Thread Jack Vogel
In the future, make sure that you put E1000 or EM in the title, otherwise I
might miss it. Fortunately I looked at this :)

I'm on a holiday weekend, I will investigate this tomorrow.

Jack


On Mon, Sep 6, 2010 at 8:53 AM, Gareth de Vaux  wrote:

> Hi all, I moved from 8.0-RELEASE to last week's -STABLE:
>
> $ uname -v
> FreeBSD 8.1-STABLE #0: Thu Sep  2 16:38:02 SAST 2010 r...@x
> :/usr/obj/usr/src/sys/GENERIC
>
> and all seems well except my network card is unusable. On boot up:
>
> em0:  port 0x3040-0x305f mem
> 0xe320-0xe321,0xe322-0xe3220fff irq 10 at device 25.0 on pci0
> em0: Setup MSIX failure
> em0: [FILTER]
> em0: Ethernet address: 00:27:0e:1e:5e:e3
>
> em1:  port 0x1000-0x103f
> mem 0xe312-0xe313,0xe310-0xe311 irq 9 at device 1.0 on pci5
> em1: [FILTER]
> em1: Ethernet address: 00:1b:21:5b:f2:18
>
>
> em0 is a PCI 'Intel(R) PRO/1000 GT Desktop Adapter' which worked up until
> now.
> em1 is onboard which didn't work with 8.0-RELEASE either.
>
>
> $ ifconfig em0
> em0: flags=8843 metric 0 mtu 1500
>
>  
> options=219b
>ether 00:27:0e:1e:5e:e3
>inet 
>media: Ethernet autoselect
>status: no carrier
>
>
> pciconf -lv:
>
> e...@pci0:0:25:0:class=0x02 card=0x8086 chip=0x10f08086
> rev=0x05 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
>
> e...@pci0:5:1:0: class=0x02 card=0x13768086 chip=0x107c8086 rev=0x05
> hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Gigabit Ethernet Controller (Copper) rev 5 (82541PI)'
>class  = network
>subclass   = ethernet
>
> (no device listing for em0)
>
> Swapping the PCI card with a PCI-X version gives the same behaviour. Setting
> hw.pci.enable_msix and hw.pci.enable_msi to 0 doesn't help in either case.
>


Re: MSIX failure

2010-09-07 Thread Jack Vogel
Email to Gareth de Vaux is bouncing :(

First off, this device was not supported in 8.0-RELEASE. What were you
running when it last worked?

Do you have MSI disabled on this system of yours? The reason for this
message is that both MSI-X and MSI setup failed; your device should succeed
with MSI.

Tell me more about the system, please.

Jack




Re: MSIX failure

2010-09-07 Thread Jack Vogel
I've looked at the code, and this message is misleading. What really happens
is that the driver fails to set up either MSI-X or MSI. When this happens it
falls back to a legacy interrupt, so the failure is non-fatal and the device
should work anyway.

The only real reasons you should see this are a) you used sysctl to turn
MSI and MSI-X off, or b) a real hardware problem in the chipset has caused
the failure. All devices that em drives (as opposed to lem) are PCI Express,
so by definition they have MSI and MSI-X available.
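
One way to see which path an interface took is to scan the boot messages for
the two driver strings quoted in this thread. The sketch below is only an
illustration: the function name em_intr_mode is made up, it recognises just
"Using MSI interrupt" and "Setup MSIX failure", and it assumes (per the
explanation above) that the latter means both MSI-X and MSI setup failed and
the driver fell back to a legacy interrupt. Other driver versions may print
different strings.

```sh
#!/bin/sh
# Classify em(4) interrupt setup from dmesg-style lines on stdin.
# em_intr_mode is a hypothetical helper; only the two message strings
# quoted in this thread are matched, everything else is ignored.
em_intr_mode() {
    awk '
        /^em[0-9]+: Using MSI interrupt/ { print $1, "MSI" }
        /^em[0-9]+: Setup MSIX failure/  { print $1, "legacy interrupt (MSI-X and MSI setup both failed)" }
    '
}

# Sample lines taken from the dmesg excerpts quoted in this thread.
em_intr_mode <<'EOF'
em0: Setup MSIX failure
em0: [FILTER]
em1: Using MSI interrupt
EOF
```

On a live system the input would come from `dmesg | em_intr_mode` instead of
the sample lines.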

I have just checked in a new delta to em in HEAD that corrects some other
issues, and I have changed the message so that it is less confusing.

Regards,

Jack




Re: MSIX failure

2010-09-08 Thread Jack Vogel
This is what I'd expect: the onboard NIC is a PCH chipset part, and support
was not in 8.0. As I said, in 8.1 (and hence stable/8) it is supported, and
it should work.

I do not know why you don't have MSI support, but it should still work with
legacy interrupts.

Jack


On Wed, Sep 8, 2010 at 2:40 AM, Gareth de Vaux  wrote:

> On Tue 2010-09-07 (13:25), Jack Vogel wrote:
> > I've looked at the code, this message was misleading, what really happens
> > is that the driver fails to be able to setup either MSIX OR MSI, when
> > this happens it will fall back and use a Legacy interrupt, so its
> > non-fatal and the device should work anyway.
> >
> > The only real reason you should see this is a) you used sysctl and turned
> > msi and msix off, or b) a real hardware problem in the chipset has caused
> > the failure. All devices em drives (as opposed to lem) are PCI Express
> > and so by definition they have MSI and MSIX available.
>
> Ok I think I got my cards mixed up - in my original mail em1 is the PCI
> card and em0 is the onboard, sorry. I guessed the numbering may not have
> been as expected while trying to fix the issue, but I might not have fully
> tested this at the time.
>
> So here's the situation after looking through older kernel logs:
>
> I installed 8.0-RELEASE, the onboard card didn't work - the kernel didn't
> even pick it up, and ifconfig only showed the lo0 device.
>
> I added the PCI Intel(R) PRO/1000 GT card (Gigabit Ethernet Controller
> (Copper) rev 5 (82541PI)) - this worked and came up as em0.
>
> Last week I moved to -STABLE, GENERIC kernel. The kernel now detects both
> cards, with the kernel messages in my original mail. Whether either works
> I'm not completely sure, I'll need to get to the machine physically and
> switch cables/cards/configurations first.
>
> I didn't turn off msi/msix with sysctl (except when debugging in my
> original mail).
>


Re: MSIX failure

2010-09-09 Thread Jack Vogel
On Thu, Sep 9, 2010 at 7:33 AM, Kurt Jaeger  wrote:

> Hi!
>
> > > Is this within a jail or something else along those lines?  I can't
> > > reproduce the problem otherwise.  Frustrating!  Someone else on the
> > > list might have ideas as to what could cause this.
> >
> > Nope, this's a normal host. I've got securelevel on 1, but doubt that
> > would affect this?
>
> I assume it affects it.
>
> http://www.freebsd.org/doc/en/books/faq/security.html#SECURELEVEL
>
> Basically, when the securelevel is positive, the kernel restricts
> certain tasks; not even the superuser (i.e., root) is allowed to
> do them.
>
> There:
>
> # Write to kernel memory via /dev/mem and /dev/kmem.
>
> So I assume it also restricts reading /dev/kmem ?
>
>
OH YUCK, another root isn't really root, so is it also possibly
the reason for the MSIX failure?? Is this pile, er feature, on by default?

Jack


Re: MSIX failure

2010-09-09 Thread Jack Vogel
On Thu, Sep 9, 2010 at 11:37 AM, John Baldwin  wrote:

> securelevel does not affect any of the MSI/MSI-X bits.
>

Well, then there's something else funny going on with that hardware, as at
least MSI should work with the chipset. I am not able to get that exact
SKU, from what I am told.

Jack


Re: MSIX failure

2010-09-09 Thread Jack Vogel
Is Gareth's email bouncing for anybody else, or is it just me?

Gareth, set hw.pci.honor_msi_blacklist=0. You'll have to do that at boot,
btw.
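
For reference, a minimal sketch of setting this at boot, assuming the usual
/boot/loader.conf mechanism for loader tunables (the same assignment can also
be typed at the loader prompt with "set" before booting):

```sh
# /boot/loader.conf
# Tell the PCI code to ignore its MSI blacklist (takes effect on next boot).
hw.pci.honor_msi_blacklist="0"
```

After the reboot, `sysctl hw.pci.honor_msi_blacklist` should report 0 if the
tunable was picked up.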

Tell me more exactly the make/model of the hardware, so I might try to get
my hands on one.

Jack


On Thu, Sep 9, 2010 at 1:20 PM, John Baldwin  wrote:

> I think the first would be to disable the MSI blacklists via a tunable to
> see
> if that enables MSI.
>
> --
> John Baldwin
>


Re: MSIX failure

2010-09-10 Thread Jack Vogel
On Fri, Sep 10, 2010 at 1:41 AM, Gareth de Vaux  wrote:

> On Thu 2010-09-09 (13:48), Jack Vogel wrote:
> > Gareth's email bouncing for anybody else or is it just me?
>
> Yes sorry I disabled this alias after picking up years of spam on the
> mailman archives. I assumed people would primarily reply to the list.
> I've re-enabled it for now.
>
> > Gareth,  set hw.pci.honor_msi_blacklist=0, you'll have to do that at boot
> > btw.
>
> Ok, I'll have to get back to you in a day or 2 when I reboot.
>
> > Tell me more exactly the make/model of the hardware so I might try to get
> > my hands on one?
>
> I can't tell much more from here, the card came from the datacentre's
> reserve when I was fighting with the onboard. The larger lettering on
> the card itself was just 'Intel(R) PRO/1000 GT Desktop Adapter'. I didn't
> note down the smaller model-type numbers. Is this
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058748.html
> not enough?
>
>
No, not the add-on adapter; I have no trouble finding those. What I want to
know is the details of the system that has the em0 LOM. The only
way to check on that is to have the whole enchilada :)

Jack


Re: MSIX failure

2010-09-12 Thread Jack Vogel
On Sun, Sep 12, 2010 at 1:15 AM, Gareth de Vaux  wrote:

> On Fri 2010-09-10 (10:41), Gareth de Vaux wrote:
> > > Gareth,  set hw.pci.honor_msi_blacklist=0, you'll have to do that at
> > > boot btw.
> >
> > Ok, I'll have to get back to you in a day or 2 when I reboot.
>
> Done:
>
> $ sysctl -a | grep msi
> hw.bce.msi_enable: 1
> hw.pci.honor_msi_blacklist: 0
> hw.pci.enable_msix: 1
> hw.pci.enable_msi: 1
>
> I get the same MSIX failure message though.
>
>
Ya, that's what I suspected. It just means the failure is not because
your system is blacklisted.

You still have not given me what I need to help: the exact details
of the system. I don't need to see the pciconf of the NIC; I need
to know about the motherboard/chipset it's on.

Jack


Re: MSIX failure

2010-09-13 Thread Jack Vogel
We don't deal with desktop systems that much in my group. A coworker pointed
out that the BIOS has settings that could disable MSI; please check how
yours is set.

Jack


On Mon, Sep 13, 2010 at 7:42 AM, Gareth de Vaux  wrote:

> On Fri 2010-09-10 (10:43), Jack Vogel wrote:
> > No, not the add-on adapter, i have no trouble finding those,  what I want
> > to know about is the details about the system that has em0 LOM, only
> > way to check on that is to have the whole enchilada :)
>
> Ah right. These are snippets from dmidecode, is this enough?
>
> Handle 0x, DMI type 4, 35 bytes
> Processor Information
>Socket Designation: LGA 1156
>Type: Central Processor
>Family: Other
>Manufacturer: Intel(R) Corporation
>ID: E5 06 01 00 FF FB EB BF
>Version: Intel(R) Core(TM) i5 CPU 750  @ 2.67GHz
>Voltage: 1.1 V
>External Clock: 133 MHz
>Max Speed: 4000 MHz
>Current Speed: 2668 MHz
>Status: Populated, Enabled
>Upgrade: Other
>L1 Cache Handle: 0x0004
>L2 Cache Handle: 0x0003
>L3 Cache Handle: 0x0001
>Serial Number: Not Specified
>Asset Tag: Not Specified
>Part Number: Not Specified
>
> Handle 0x0007, DMI type 2, 20 bytes
> Base Board Information
>Manufacturer: Intel Corporation
>Product Name: DP55WB
>Version: AAE64798-206
>Serial Number: AZWB005003A3
>Asset Tag: Base Board Asset Tag
>Features:
>Board is a hosting board
>Board is replaceable
>Location In Chassis: Base Board Chassis Location
>Chassis Handle: 0x0008
>Type: Unknown
>Contained Object Handles: 0
>
> Handle 0x0010, DMI type 10, 6 bytes
> On Board Device Information
>Type: Ethernet
>Status: Enabled
>Description: Intel(R) 82578DC Gigabit Network Connection
>


Re: RELENG_7 em problems (and RELENG_8)

2010-09-24 Thread Jack Vogel
There is a new revision of the em driver coming next week. It's going
through some stress pounding over the weekend; if no issues show up I'll
put it into HEAD.

Yongari's changes in TX context handling, which affect checksum and TSO,
are added. I've also decided that multiple queues in 82574 are just a
source of problems without a lot of benefit, so it still uses MSIX but
with only 3 vectors, meaning it separates TX and RX but has a single queue.

It's looking very stable, I hope it fixes everyone's issues.

Jack


On Tue, Sep 14, 2010 at 10:59 AM, Mike Tancsa  wrote:

> Hi Jack,
>Any plans to commit the patch below ? I have been running it on a
> number of boxes and it works as expected with no side effects.
>
>---Mike
>
>
>
> At 04:00 PM 8/17/2010, Pyun YongHyeon wrote:
>
>> On Tue, Aug 17, 2010 at 03:55:12PM -0400, Mike Tancsa wrote:
>> > At 02:52 PM 8/17/2010, Pyun YongHyeon wrote:
>> >
>> > >Here is updated patch for HEAD and stable/8.
>> > >http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
>> > >
>> > >It seems to work as expected under my limited environments. If
>> >
>> > Thanks! The patch applies cleanly and all works as expected now! I am
>> > no longer able to trigger the bug. I just use the stock unmodified
>> > driver normally, so no multi queues
>> >
>>
>> Glad to hear that. Thanks for testing!
>>
>> > # vmstat -i
>> > interrupt  total   rate
>> > irq256: em0  149  0
>> > irq257: em13  0
>> > irq259: em3  971  2
>> > irq260: ahci0   1520  3
>> >
>> >
>> >
>> > em3: flags=8843 metric 0 mtu
>> 1500
>> >
>> options=219b
>> > ether 00:15:17:xx:xx:xx
>> > inet6 fe80::215:17ff:fexx:%em3 prefixlen 64 scopeid 0x4
>> > inet 192.168.xx.xx netmask 0xff00 broadcast 192.168.xx.xx
>> > nd6 options=3
>> > media: Ethernet autoselect (100baseTX )
>> > status: active
>> >
>> >
>> > e...@pci0:3:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086
>> > rev=0x00 hdr=0x00
>> > vendor = 'Intel Corporation'
>> > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>> > class  = network
>> > subclass   = ethernet
>> > cap 01[c8] = powerspec 2  supports D0 D3  current D0
>> > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>> > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>> > cap 11[a0] = MSI-X supports 5 messages in map 0x1c
>> >
>> >
>> >
>> > patch < em.csum_tso.20100817.patch
>> > Hmm...  Looks like a unified diff to me...
>> > The text leading up to this was:
>> > --
>> > |Index: sys/dev/e1000/if_em.c
>> > |===
>> > |--- sys/dev/e1000/if_em.c  (revision 211398)
>> > |+++ sys/dev/e1000/if_em.c  (working copy)
>> > --
>> > Patching file sys/dev/e1000/if_em.c using Plan A...
>> > Hunk #1 succeeded at 237.
>> > Hunk #2 succeeded at 1730.
>> > Hunk #3 succeeded at 1759.
>> > Hunk #4 succeeded at 1930.
>> > Hunk #5 succeeded at 3148.
>> > Hunk #6 succeeded at 3351.
>> > Hunk #7 succeeded at 3533.
>> > Hunk #8 succeeded at 3590.
>> > Hunk #9 succeeded at 3603.
>> > Hmm...  The next patch looks like a unified diff to me...
>> > The text leading up to this was:
>> > --
>> > |Index: sys/dev/e1000/if_em.h
>> > |===
>> > |--- sys/dev/e1000/if_em.h  (revision 211398)
>> > |+++ sys/dev/e1000/if_em.h  (working copy)
>> > --
>> > Patching file sys/dev/e1000/if_em.h using Plan A...
>> > Hunk #1 succeeded at 284.
>> > done
>> >
>> > ---Mike
>> >
>> >
>> > >you're using multiple Tx queues with em(4) it would be better to
>> > >disable Tx checksum offloading as driver always have to create a
>> > >new checksum context for each frame. This will effectively disable
>> > >pipelined Tx data DMA which in turn greatly slows down Tx
>> > >performance for small sized frames. The reason driver have to
>> > >create a new checksum context when it uses multiple Tx queues comes
>> > >from hardware limitation. The controller tracks only for the last
>> > >context descriptor that was written such that driver does not know
>> > >the state of checksum context configured in other Tx queue.
>> > >Hope this helps.
>> > >
>> > >>
>> > >>
>> > >> ---Mike
>> > >>
>> > >>
>> > >> At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
>> > >> >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
>> > >> >> Hi Jack,
>> > >> >> Just a followup to the email below. I now saw what appears
>> > >> >> to be the same problem on RELENG_8, but on a different nic and
>> with
>> > >> >> VLANs.  So not sure if this is a general e

Re: RELENG_7 em problems (and RELENG_8)

2010-09-26 Thread Jack Vogel
The NIC has 5 MSIX vectors and can have 2 queues. I have been trying to
release code with both queues active, but it's been unstable, and I finally
concluded it's not worth the aggravation :)

Your em1 is using MSI, not MSIX, and thus can't have multiple queues. I'm
not sure what's broken from what you show here. I will try to get the new
driver out shortly for you to try.

Jack
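
The vector counts mentioned here (5 with two queues, 3 with one) fit a
simple rule: one RX and one TX vector per queue, plus one for link events.
The sketch below only illustrates that arithmetic. The function name is
hypothetical, and the formula is inferred from the numbers in this thread,
not taken from the driver source.

```sh
# MSI-X vectors an 82574 would need under the inferred rule:
# one RX vector and one TX vector per queue, plus one link vector.
em_82574_vectors() {
    queues=$1
    echo $(( 2 * queues + 1 ))
}

em_82574_vectors 2   # both queues active
em_82574_vectors 1   # single queue, as in the upcoming driver
```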


On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa  wrote:

> At 06:36 PM 9/24/2010, Jack Vogel wrote:
>
>> There is a new revision of the em driver coming next week. It's going
>> through some stress pounding over the weekend; if no issues show up I'll
>> put it into HEAD.
>>
>> Yongari's changes in TX context handling, which affect checksum and TSO,
>> are added. I've also decided that multiple queues in 82574 are just a
>> source of problems without a lot of benefit, so it still uses MSIX but
>> with only 3 vectors, meaning it separates TX and RX but has a single
>> queue.
>>
>
> Thanks, looking forward to trying it out!  With respect to the multiple
> queues, I thought the driver already used just the one on RELENG_8 ?  If
> not, is there a way to force the existing driver to use just the one queue ?
>
> On the box that has the NIC locking up, it shows
>
> e...@pci0:9:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086 rev=0x00
> hdr=0x00
>
>vendor = 'Intel Corporation'
>device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>class  = network
>subclass   = ethernet
>cap 01[c8] = powerspec 2  supports D0 D3  current D0
>cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>
> and
>
> vmstat -i shows
>
> irq256: em0  5129063353
> irq257: em1   531251 36
>
> in a wedged state, stats look like
>
> dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
> dev.em.1.%driver: em
> dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART
> dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086
> subdevice=0x34ec class=0x02
> dev.em.1.%parent: pci9
> dev.em.1.nvm: -1
> dev.em.1.rx_int_delay: 0
> dev.em.1.tx_int_delay: 66
> dev.em.1.rx_abs_int_delay: 66
> dev.em.1.tx_abs_int_delay: 66
> dev.em.1.rx_processing_limit: 100
> dev.em.1.link_irq: 0
> dev.em.1.mbuf_alloc_fail: 0
> dev.em.1.cluster_alloc_fail: 0
> dev.em.1.dropped: 0
> dev.em.1.tx_dma_fail: 0
> dev.em.1.fc_high_water: 18432
> dev.em.1.fc_low_water: 16932
> dev.em.1.mac_stats.excess_coll: 0
> dev.em.1.mac_stats.symbol_errors: 0
> dev.em.1.mac_stats.sequence_errors: 0
> dev.em.1.mac_stats.defer_count: 0
> dev.em.1.mac_stats.missed_packets: 41522
> dev.em.1.mac_stats.recv_no_buff: 19
> dev.em.1.mac_stats.recv_errs: 0
> dev.em.1.mac_stats.crc_errs: 0
> dev.em.1.mac_stats.alignment_errs: 0
> dev.em.1.mac_stats.coll_ext_errs: 0
> dev.em.1.mac_stats.rx_overruns: 41398
> dev.em.1.mac_stats.watchdog_timeouts: 0
> dev.em.1.mac_stats.xon_recvd: 0
> dev.em.1.mac_stats.xon_txd: 0
> dev.em.1.mac_stats.xoff_recvd: 0
> dev.em.1.mac_stats.xoff_txd: 0
> dev.em.1.mac_stats.total_pkts_recvd: 95229129
> dev.em.1.mac_stats.good_pkts_recvd: 95187607
> dev.em.1.mac_stats.bcast_pkts_recvd: 79244
> dev.em.1.mac_stats.mcast_pkts_recvd: 0
> dev.em.1.mac_stats.rx_frames_64: 93680
> dev.em.1.mac_stats.rx_frames_65_127: 1516349
> dev.em.1.mac_stats.rx_frames_128_255: 4464941
> dev.em.1.mac_stats.rx_frames_256_511: 4024
> dev.em.1.mac_stats.rx_frames_512_1023: 2096067
> dev.em.1.mac_stats.rx_frames_1024_1522: 87012546
> dev.em.1.mac_stats.good_octets_recvd: 0
> dev.em.1.mac_stats.good_octest_txd: 0
> dev.em.1.mac_stats.total_pkts_txd: 66775098
> dev.em.1.mac_stats.good_pkts_txd: 66775098
> dev.em.1.mac_stats.bcast_pkts_txd: 509
> dev.em.1.mac_stats.mcast_pkts_txd: 7
> dev.em.1.mac_stats.tx_frames_64: 48038472
> dev.em.1.mac_stats.tx_frames_65_127: 13402833
> dev.em.1.mac_stats.tx_frames_128_255: 5324413
> dev.em.1.mac_stats.tx_frames_256_511: 957
> dev.em.1.mac_stats.tx_frames_512_1023: 319
> dev.em.1.mac_stats.tx_frames_1024_1522: 8104
> dev.em.1.mac_stats.tso_txd: 1069
> dev.em.1.mac_stats.tso_ctx_fail: 0
> dev.em.1.interrupts.asserts: 0
> dev.em.1.interrupts.rx_pkt_timer: 0
> dev.em.1.interrupts.rx_abs_timer: 0
> dev.em.1.interrupts.tx_pkt_timer: 0
> dev.em.1.interrupts.tx_abs_timer: 0
> dev.em.1.interrupts.tx_queue_empty: 0
> dev.em.1.interrupts.tx_queue_min_thresh: 0
> dev.em.1.interrupts.rx_desc_min_thresh: 0
> dev.em.1.interrupts.rx_overrun: 0
> dev.em.1.host.breaker_tx_pkt: 0
> dev.em.1.host.host_tx_p

Re: RELENG_7 em problems (and RELENG_8)

2010-09-26 Thread Jack Vogel
The system I've had stress tests running on has 82574 LOMs, so I hope it
will solve the problem. I will see tomorrow morning how things have held
up...

Jack


On Sun, Sep 26, 2010 at 4:43 PM, Mike Tancsa  wrote:

> At 06:19 PM 9/26/2010, Jack Vogel wrote:
>
>> Your em1 is using MSI, not MSIX, and thus can't have multiple queues. I'm
>> not sure what's broken from what you show here. I will try to get the new
>> driver out shortly for you to try.
>>
>
> With this particular NIC, it will wedge under high load.  I tried 2
> different motherboards and chipsets, with the same behaviour.
>
>---Mike
>
>
>  Jack
>

Re: Bogus "igb1: Could not setup receive structures" in 8-STABLE

2010-10-14 Thread Jack Vogel
The problem is mbuf resources. The driver autoconfigures the number of
queues based on the number of cores, and on newer systems with lots of
cores this outstrips the mbuf resource pool.

I have decided to hard limit the queues to 8. For now you can fix the
number manually by searching for num_queues in if_igb.c and setting it to
something other than 0.

I am at work on a number of issues with igb and em right now, which is why
there has not been an MFC yet.

Questions to me,

Jack
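
To see how the queue count can outstrip the cluster pool, here is a rough
back-of-the-envelope sketch. It assumes one jumbo-page cluster per receive
descriptor on every queue of both ports. The ring size of 1024 descriptors
is an assumption for illustration and is not stated in the thread; the
limit of 12800 is the mbuf_jumbo_page LIMIT from the quoted vmstat -z
output. The queue counts 8 and 3 are illustrative inputs.

```sh
# Clusters needed if every RX descriptor on every queue of every port
# holds one cluster.
rx_clusters_needed() {
    ports=$1 queues=$2 rxd=$3
    echo $(( ports * queues * rxd ))
}

limit=12800   # mbuf_jumbo_page LIMIT reported by vmstat -z in the thread
rxd=1024      # ASSUMPTION: descriptors per RX ring, not given in the thread

for q in 8 3; do
    need=$(rx_clusters_needed 2 "$q" "$rxd")
    if [ "$need" -gt "$limit" ]; then
        echo "queues=$q: needs $need clusters, over the limit of $limit"
    else
        echo "queues=$q: needs $need clusters, within the limit of $limit"
    fi
done
```

The real numbers depend on the driver's actual ring size and on what else
is using the pool, so treat this as an estimate only.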


On Thu, Oct 14, 2010 at 3:26 PM, Terry Kennedy  wrote:

>  I've run across a strange problem with the igb driver in 8-STABLE - when
> I try to do anything with the second igb interface, I get one or more
> "igb1:
> Could not setup receive structures" error messages.
>
>  This can be reproduced as simply as booting in single-user mode with an
> empty /boot/loader.conf and doing:
>
>ifconfig igb0 up
>ifconfig igb1 up
>
>  I've tried to track this down, and as far as I can see, this is from some
> change introduced between 8.1-RELEASE (igb 1.9.5) and the current 8-STABLE
> (igb 2.0.1). When I try it when booting from the 8.1-RELEASE amd64 DVD, I
> can bring up both interfaces. When I try it with an 8-STABLE kernel, I get
> the error and igb1 is missing the "RUNNING" flag:
>
> igb0: flags=8843 metric 0 mtu 9000
>
>  options=1bb
>ether 00:25:90:xx:xx:bc
>inet6 fe80::225:90ff:fe02:xxbc%igb0 prefixlen 64 scopeid 0x1
>nd6 options=3
>media: Ethernet 1000baseT 
>status: active
> igb1: flags=8803 metric 0 mtu 9000
>
>  options=1bb
>ether 00:25:90:xx:xx:bd
>inet6 fe80::225:90ff:fe02:xxbd%igb1 prefixlen 64 tentative scopeid
> 0x2
>nd6 options=3
>media: Ethernet 1000baseT 
>status: active
>
>  I see 3 mbuf_jumbo_page allocation failures from:
>
> (1:20) rz1m:/sys/dev/e1000# vmstat -z | grep -v 0\$
> ITEM SIZE LIMIT  USED  FREE  REQUESTS
>  FAILURES
>
> 64 Bucket:536,0,  263,3,  263,
>  106
> 128 Bucket:  1048,0,  523,2,  525,
>  139
> mbuf_jumbo_page: 4096,12800,12307,  493,30343,
>3
>
>  which correspond to the 3 "igb1: Could not setup receive structures" mess-
> ages. If I try another "ifconfig igb1 up", I get another console message
> and
> the counter goes to 4. If I bump the kern.ipc.nmbjumbop sysctl to a larger
> value, like 15000, I get the same error message when trying to work with
> the
> igb1 device, so I don't think it is a "real" error but indicates a problem
> in the driver.
>
>  This is on a Supermicro X8DTH-iF, BIOS 2.0a (latest) with a dual on-board
> 82576. The dev.igb sysctl's for the two ports (excluding 0 values) are at-
> tached. Note that most of the igb1 values are zero:
>
> (0:34) rz1m:/sys/dev/e1000# sysctl -a | grep igb.1 | grep -v ": 0"
> dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 2.0.1
> dev.igb.1.%driver: igb
> dev.igb.1.%location: slot=0 function=1
> dev.igb.1.%pnpinfo: vendor=0x8086 device=0x10c9 subvendor=0x15d9
> subdevice=0x0400 class=0x02
> dev.igb.1.%parent: pci1
> dev.igb.1.nvm: -1
> dev.igb.1.flow_control: 3
> dev.igb.1.enable_aim: 1
> dev.igb.1.rx_processing_limit: 100
> dev.igb.1.link_irq: 1
> dev.igb.1.device_control: 13632065
> dev.igb.1.extended_int_mask: 2147483648
> dev.igb.1.fc_high_water: 47488
> dev.igb.1.fc_low_water: 47472
>
> (0:35) rz1m:/sys/dev/e1000# sysctl -a | grep igb.0 | grep -v ": 0"
> dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.0.1
> dev.igb.0.%driver: igb
> dev.igb.0.%location: slot=0 function=0
> dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10c9 subvendor=0x15d9
> subdevice=0x0400 class=0x02
> dev.igb.0.%parent: pci1
> dev.igb.0.nvm: -1
> dev.igb.0.flow_control: 3
> dev.igb.0.enable_aim: 1
> dev.igb.0.rx_processing_limit: 100
> dev.igb.0.link_irq: 4
> dev.igb.0.device_control: 1087373889
> dev.igb.0.rx_control: 67338274
> dev.igb.0.interrupt_mask: 4
> dev.igb.0.extended_int_mask: 2147484671
> dev.igb.0.fc_high_water: 47488
> dev.igb.0.fc_low_water: 47472
> dev.igb.0.queue0.txd_head: 823
> dev.igb.0.queue0.txd_tail: 823
> dev.igb.0.queue0.tx_packets: 1402
> dev.igb.0.queue0.rxd_head: 319
> dev.igb.0.queue0.rxd_tail: 318
> dev.igb.0.queue0.rx_packets: 319
> dev.igb.0.queue0.rx_bytes: 69075
> dev.igb.0.queue1.txd_head: 2
> dev.igb.0.queue1.txd_tail: 2
> dev.igb.0.queue1.tx_packets: 1
> dev.igb.0.queue1.rxd_head: 52
> dev.igb.0.queue1.rxd_tail: 51
> dev.igb.0.queue1.rx_packets: 52
> dev.igb.0.queue1.rx_bytes: 6369
> dev.igb.0.queue2.txd_head: 27
> dev.igb.0.queue2.txd_tail: 27
> dev.igb.0.queue2.tx_packets: 13
> dev.igb.0.queue2.rxd_head: 72
> dev.igb.0.queue2.rxd_tail: 71
> dev.igb.0.queue2.rx_packets: 72
> dev.igb.0.queue2.rx_bytes: 8789
> dev.igb.0.queue3.txd_head: 177
> dev.igb.0.queue3.txd_tail: 177
> dev.igb.0.queue3.tx_packets: 64
> dev.igb.0.queue3.rxd_head: 88
> dev.igb.0.queue3.rxd_tail: 87
> 

Re: Bogus "igb1: Could not setup receive structures" in 8-STABLE

2010-10-15 Thread Jack Vogel
The number of MSIX vectors it uses is the number of queues PLUS
one vector for link. I would use two or four rather than 3, but it should
be ok with that if that's what you wish.

Jack
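
The rule above (vectors = queues + 1 for link) can be checked against the
boot messages quoted below. A small sketch, with hypothetical function
names:

```sh
# igb MSI-X vector count as described above: one per queue, plus one
# vector for link.
igb_msix_vectors() {
    echo $(( $1 + 1 ))
}

# Inverse: how many queues a reported vector count implies.
igb_queues_from_vectors() {
    echo $(( $1 - 1 ))
}

igb_msix_vectors 8          # num_queues set to 8
igb_msix_vectors 3          # num_queues set to 3
igb_queues_from_vectors 10  # the autoconfigured case in the boot log
```

This matches the log lines in the quoted message: 9 vectors with
num_queues set to 8, and 4 vectors with it set to 3.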


On Thu, Oct 14, 2010 at 6:58 PM, Terry Kennedy  wrote:

> > The problem is mbuf resources, the driver is autoconfiguring the number
> of
> > queues based on the number of cores, on newer systems with lots of them
> > this is outstripping the mbuf resource pool.
>
>   That would make sense, as these systems have 16 cores (dual E5520's).
>
> > I have decided to hard limit the queues to 8, you can fix the number
> > manually
> > by searching for num_queues in if_igb.c and setting it to something other
> > than
> > 0 for now.
>
>   I changed it to 8, and saw the same problem. I noted that the igb boot
> messages changed from:
>
> Oct 14 18:28:02 rz1m kernel: igb0: Using MSIX interrupts with 10 vectors
> Oct 14 18:28:02 rz1m kernel: igb1: Using MSIX interrupts with 10 vectors
>
>  to:
>
> Oct 14 21:53:44 rz1m kernel: igb0: Using MSIX interrupts with 9 vectors
> Oct 14 21:53:44 rz1m kernel: igb1: Using MSIX interrupts with 9 vectors
>
>  So I dropped the value to 3 (on the assumption that the system uses one
> more than the specified value per interface), and got:
>
> igb0: Using MSIX interrupts with 4 vectors
> igb1: Using MSIX interrupts with 4 vectors
>
>  and both igb interfaces came up. I didn't try to find the maximum
> number of queues that would work.
>
> > I am at work on a number of issues with igb and em right now which is why
> > there has not been an MFC yet.
>
>   Understood. Thanks for the quick response and workaround.
>
>Terry Kennedy http://www.tmk.com
>te...@tmk.com New York, NY USA
>


Re: repeating crashes with 8.1

2010-10-21 Thread Jack Vogel
pciconf -l

I need to know which hardware it is, and that doesn't show me.

Jack
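
What identifies the hardware in pciconf -l output is the chip= field,
which packs the device ID and vendor ID into one value. As an
illustration, the sketch below splits such a value, using the
chip=0x10d38086 quoted elsewhere in this archive for an 82574L. The
function name pci_chip_ids is hypothetical.

```sh
# Split a pciconf "chip=0xDDDDVVVV" value into device and vendor IDs.
pci_chip_ids() {
    chip=${1#chip=}       # drop an optional "chip=" prefix
    chip=${chip#0x}       # drop the leading 0x
    device=${chip%????}   # first four hex digits: device ID
    vendor=${chip#????}   # last four hex digits: vendor ID
    echo "vendor=0x$vendor device=0x$device"
}

pci_chip_ids chip=0x10d38086
```

On a live system the input would come from the em0 line of pciconf -l.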


On Thu, Oct 21, 2010 at 3:28 PM, Randy Bush  wrote:

> em0:  port 0x2000-0x201f mem
> 0xe800-0xe801 irq 16 at device 0.0 on pci13
>
> randy
>
> Jeremy Chadwick wrote:
> >
> > On Thu, Oct 21, 2010 at 12:08:23PM -0700, Randy Bush wrote:
> > > FreeBSD 8.1-STABLE #2: Thu Oct 21 15:30:45 UTC 2010
> > > r...@rip.psg.com:/usr/obj/usr/src/sys/RIP amd64
> > >
> > > console recording
> > >
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > em0: discard frame w/o packet header
> > > panic: sbflush_internal: cc 4294965301 || mb 0 || mbcnt 0
> > > cpuid = 0
> > > panic: bufwrite: buffer is not busy???
> > >
> > >
> > > cpuid = 0
> > > Fatal trap 12: page fault while in kernel mode
> > > Uptime: cpuid = 2; 48mapic id = 02
> > > 36s
> > > fault virtual address   = 0x8040
> > > Physical memory: 4086 MB
> > > fault code  = supervisor read data, page not present
> > > Dumping 1647 MB:instruction pointer = 0x20:0x804c22ae
> > >  (CTRL-C to abort) stack pointer=
> 0x28:0xff8de9a0
> > > frame pointer   = 0x28:0xff8de9b0
> > > code segment= base 0x0, limit 0xf, type 0x1b
> > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags= interrupt enabled, resume, IOPL = 0
> > > current process = 0 (em0 taskq)
> > > trap number = 12
> > >  1632 1616 1600 1584 1568 1552 1536 1520 1504 1488 1472 1456 1440 1424
> 1408 1392 1376 1360 1344 1328 1312 1296 1280 1264 1248 1232 1216 1200 1184
> 1168 1152 1136 1120 1104 1088 1072 1056 1040 1024 1008 992 976 960 944 928
> 912 896 880 864 848 832 816 800 784 768 752 736 720 704 688 672 656 640 624
> 608 592 576 560 544 528 512 496 480 464 448 432 416 400 384 368 352 336 320
> 304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16Attempt
> to write outside dump device boundaries.
> > >
> > > ** DUMP FAILED (ERROR 6) **
> > > Automatic reboot in 15 seconds - press a key on the console to abort
> > > em0: Watchdog timeout -- resetting
> > >
> > > and locked up.  required power cycle to reboot
> >
> > CC'ing Jack Vogel of Intel, who is currently re-working portions of the
> > em(4) driver.  I think taskq issue might be the thing he's fixing and
> > thus might have a workaround for you.
> >
> > But we're going to need to know exactly what em(4) model you have.
> >
> > Please provide "dmesg" output relevant to em0, and also "pciconf -lvc"
> > output for the em0@ device.
> >
> > --
> > | Jeremy Chadwick   j...@parodius.com |
> > | Parodius Networking   http://www.parodius.com/ |
> > | UNIX Systems Administrator  Mountain View, CA, USA |
> > | Making life hard for others since 1977.  PGP: 4BD6C0CB |
>


Re: repeating crashes with 8.1

2010-10-22 Thread Jack Vogel
Odd, can you make any connection between this and the em complaints??

Jack


On Fri, Oct 22, 2010 at 6:59 PM, Mike Tancsa  wrote:

> At 09:11 PM 10/22/2010, Mike Tancsa wrote:
>
>> At 08:01 PM 10/22/2010, Chris Morrow wrote:
>>
>>> Note, Warren and I attempted to test this this evening on a 10.04 Ubuntu
>>> box, no crashy-crashy...
>>>
>>
>>
> I was able to trigger the issue on box (c).  I was ping6ing box (a) when I
> did a hard down of (d)'s connected interface. The box then dropped to
> debugger
>
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x20:0x80740a50
> stack pointer   = 0x28:0xff85a890
> frame pointer   = 0x28:0xff85a930
>
> code segment= base 0x0, limit 0xf, type 0x1b
>= DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 12 (swi4: clock)
> [thread pid 12 tid 17 ]
> Stopped at  in6_cksum+0x410:movzwl  (%rsi),%r10d
> db> bt
> Tracing pid 12 tid 17 td 0xff00025083e0
> in6_cksum() at in6_cksum+0x410
> icmp6_reflect() at icmp6_reflect+0x312
> icmp6_error() at icmp6_error+0x1ec
> nd6_llinfo_timer() at nd6_llinfo_timer+0x208
> softclock() at softclock+0x2a6
> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> ithread_loop() at ithread_loop+0xb2
> fork_exit() at fork_exit+0x12a
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xff85ad30, rbp = 0 ---
> db>
>
>
>
>
>  I was able to do it, but not the box I expected
>>
>> 4 boxes
>>
>> (a) Attacking host 2001:db8:1:1/64
>> (b) victim, not on a connected interface with a). Outside interface - em0
>> - 2001:db8::2:1/64, inside interface - em1 - 2001:db8::3:1/64
>> (c) a host behind (b) 2001:db8::3:c/64
>> (d) a host behind (b), 2001:db8::3:d/64
>>
>>
>> hosts (c) and (d) have default gateways to b).  (c) however, has a next
>> hop for (a) via (d).  So rather than go out its normal default gateway, it
>> takes an extra hop via (d).
>>
>> Start a ping6 from (a) to (c).  Then down (d)'s interface so that the
>> ping6 fails.  Let the ping keep running for an hour or two.  Eventually (b)
>> gets error messages like
>>
>> Oct 22 18:38:32 zoo kernel: em1: discard frame w/o packet header
>>
>> and crashes.
>>
>> Unfortunately, I thought it would be (c) that crapped out, not (b) and I
>> didnt have crash dumps enabled on the host.  Just in the process of setting
>> up a better environment.
>>
>>---Mike
>>
>>  -chris
>>>
>>> On 10/22/10 16:27, Joel Jaeggli wrote:
>>> > Ok I'll try testing that on some box I can reach with both hands.
>>> >
>>> > fyi nagasaki is:
>>> >
>>> > [r...@nagasaki ~]# uname -a
>>> > FreeBSD nagasaki.bogus.com 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #13:
>>> > Sun May 30 22:19:23 UTC 2010
>>> > r...@nagasaki.bogus.com:/usr/obj/usr/src/sys/GENERIC  i386
>>> > [r...@nagasaki ~]#
>>> >
>>> >
>>> > On 10/22/10 1:17 PM, Randy Bush wrote:
>>> >>> Do you know how this panic is triggered ? Are you able to
>>> >>> create it on demand ?
>>> >>
>>> >> no i do not.  bring server up and it'll happen in half an hour.
>>> >> and the server was happy for two months.  so i am thinking
>>> hardware.
>>> >
>>> > Perhaps. The reason I ask is that I had a box go down last night
>>> with
>>> > the same set of errors.  The box has a number of ipv6 routes, but
>>> its
>>> > next hop was down and the problems started soon after. So I wonder
>>> if
>>> > it has something to do with that.  Do you have ipv6 on this box and
>>> > are all the next hop addresses correct / reachable ?
>>> >
>>> > Oct 22 02:06:02 i4 kernel: em1: discard frame w/o packet header
>>> > Oct 22 02:06:10 i4 kernel: em2: discard frame w/o packet header
>>> > Oct 22 02:06:21 i4 kernel: em1: discard frame w/o packet header
>>> 
>>>  it was co-incident with a border router being taken down for new
>>> router
>>>  install.  that router was the v6 exit the servers was using.  i have
>>> now
>>>  pointed default6 to a different exit.  the server seems happy.
>>> >>>
>>> >>>
>>> >>> Are you servers still up ?  I guess the question now is how to
>>> >>> trigger this problem on demand.  Perhaps lots of inbound ipv6 traffic
>>> >>> with a bad next hop out ?  How recent are you sources ?  The kernel
>>> >>> said Oct 21st. Were the sources from then too ?
>>> >>
>>> >> yes, kernel and world from 21 oct
>>> >>
>>> >> chris had an idea on retrigger, install a static for a small dest that
>>> >> points to a hole.  send a packet to the small dest.
>>> >>
>>> >> randy
>>> >>
>>>
>>
>> 
>> Mike Tancsa,  tel +1 519 651 3400
>> Sentex Communications,m...@sentex.net
>> Providing Internet since 1994 

Re: icmp packets on em larger than 1472

2010-11-10 Thread Jack Vogel
Try the code from HEAD, I've run that on an 82546 and it worked ok.

Jack
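
For context on why 1472 is the boundary in the report below: with a 1500
byte MTU, a 20 byte IPv4 header (no options) and an 8 byte ICMP header
leave 1472 bytes of payload, so ping -s 1473 is the first size that has to
be fragmented. A short sketch of the arithmetic, with a hypothetical
function name:

```sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet.
max_icmp_payload() {
    mtu=$1
    # 20 bytes: IPv4 header without options; 8 bytes: ICMP echo header.
    echo $(( mtu - 20 - 8 ))
}

max_icmp_payload 1500
```

So the symptom described is a failure to answer fragmented echo requests,
not a problem with one particular packet size.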


On Wed, Nov 10, 2010 at 5:55 AM, Kirill Yelizarov  wrote:

>
>
> --- On Wed, 11/10/10, Jeremy Chadwick  wrote:
>
> > From: Jeremy Chadwick 
> > Subject: Re: icmp packets on em larger than 1472
> > To: "Kirill Yelizarov" 
> > Cc: freebsd-stable@freebsd.org, "Jack Vogel" 
> > Date: Wednesday, November 10, 2010, 3:59 PM
> > On Wed, Nov 10, 2010 at 04:21:12AM
> > -0800, Kirill Yelizarov wrote:
> > > Hi,
> > >
> > > All my em cards running 8.1 stable don't reply to icmp
> > echo requests packets larger than 1472 bytes.
> > >
> > > On stable 7.2 the same hardware works as expected:
> > > # ping -s 1500 192.168.64.99
> > > PING 192.168.64.99 (192.168.64.99): 1500 data bytes
> > > 1508 bytes from 192.168.64.99: icmp_seq=0 ttl=63
> > time=1.249 ms
> > > 1508 bytes from 192.168.64.99: icmp_seq=1 ttl=63
> > time=1.158 ms
> > >
> > > Here is the dump on em interface
> > > 15:06:31.452043 IP 192.168.66.65 > *: ICMP echo
> > request, id 28729, seq 5, length 1480
> > > 15:06:31.452047 IP 192.168.66.65 > : icmp
> > > 15:06:31.452069 IP  > 192.168.66.65: ICMP echo
> > reply, id 28729, seq 5, length 1480
> > > 15:06:31.452071 IP *** > 192.168.66.65: icmp
> > >
> > > Same ping from same source (it's a 8.1 stable with fxp
> > interface) to em card running 8.1 stable
> > > #pciconf -lv
> > > e...@pci0:3:4:0:class=0x02
> > card=0x10798086 chip=0x10798086 rev=0x03 hdr=0x00
> > > vendor
> >= 'Intel Corporation'
> > > device
> >= 'Dual Port Gigabit Ethernet Controller
> > (82546EB)'
> > > class  =
> > network
> > > subclass   =
> > ethernet
> > >
> > > # ping -s 1472 192.168.64.200
> > > PING 192.168.64.200 (192.168.64.200): 1472 data bytes
> > > 1480 bytes from 192.168.64.200: icmp_seq=0 ttl=63
> > time=0.848 ms
> > > ^C
> > >
> > > # ping -s 1473 192.168.64.200
> > > PING 192.168.64.200 (192.168.64.200): 1473 data bytes
> > > ^C
> > > --- 192.168.64.200 ping statistics ---
> > > 4 packets transmitted, 0 packets received, 100.0%
> > packet loss
> > >
> > > And here is it's dump on em card
> > > 5:11:15.191496 IP 192.168.66.65 > *: ICMP echo
> > request, id 33593, seq 0, length 1480
> > > 15:11:15.191534 IP 192.168.66.65 > *: icmp
> > > 15:11:16.192119 IP 192.168.66.65 > *: ICMP echo
> > request, id 33593, seq 1, length 1480
> > > 15:11:16.192156 IP 192.168.66.65 > **: icmp
> > >
> > > igb cards on 8.1 stable are not affected
> >
> > Please provide uname -a output from the machine with the
> > emX devices, as
> > well as relevant emX information from "dmesg" (e.g. driver
> > version).
> > "sysctl dev.em.X" might also be helpful.
> >
>
> Here are the two examples
>
> uname -a
> FreeBSD border1 8.1-STABLE FreeBSD 8.1-STABLE #0: Thu Aug 26 16:54:15 MSD
> 2010 r...@border1:/usr/obj/usr/src/sys/BORDER1  amd64
>
> Oct 22 14:36:18 border1 kernel: em0:  Connection 1.0.1> port 0xdc00-0xdc3f mem 0xfcfc-0xfcfd irq 54 at
> device 4.0 on pci3
> Oct 22 14:36:18 border1 kernel: em0: [FILTER]
> Oct 22 14:36:18 border1 kernel: em0: Ethernet address: 00:04:23:cc:df:ea
> Oct 22 14:36:18 border1 kernel: em1:  Connection 1.0.1> port 0xdc80-0xdcbf mem 0xfcfe-0xfcff irq 55 at
> device 4.1 on pci3
> Oct 22 14:36:18 border1 kernel: em1: [FILTER]
> Oct 22 14:36:18 border1 kernel: em1: Ethernet address: 00:04:23:cc:df:eb
>
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
> dev.em.0.%driver: em
> dev.em.0.%location: slot=0 function=0 handle=\_SB_.PCI0.MRP1.HART
> dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086
> subdevice=0x34da class=0x02
> dev.em.0.%parent: pci1
> dev.em.0.nvm: -1
> dev.em.0.rx_int_delay: 66
> dev.em.0.tx_int_delay: 66
> dev.em.0.rx_abs_int_delay: 250
> dev.em.0.tx_abs_int_delay: 250
> dev.em.0.rx_processing_limit: -1
> dev.em.0.link_irq: 0
> dev.em.0.mbuf_alloc_fail: 0
> dev.em.0.cluster_alloc_fail: 0
> dev.em.0.dropped: 0
> dev.em.0.tx_dma_fail: 0
> dev.em.0.rx_overruns: 0
> dev.em.0.watchdog_timeouts: 0
> dev.em.0.device_control: 1477444168
> dev.em.0.rx_control: 67141634
> dev.em.0.fc_high_water: 18432
> dev.em.0.fc_low_water: 16932
> dev.em.0.queue0.txd_head:

Re: problems with network on em

2010-11-20 Thread Jack Vogel
Did you mean the 7.1.7 version from HEAD?

Jack


On Sat, Nov 20, 2010 at 11:18 AM, Naujikas Rolandas <
rolandas.nauji...@mif.vu.lt> wrote:

> I'm trying to test with newest version of /sys/dev/e1000 from FreeBSD
> 8-STABLE.
> For that I'm using loadable module option, because it is easier to build
> with minimal changes in kernel source.
> Only /sys/dev/e1000 and /sys/modules/em need to be updated.
> Without changes in /sys/modules/em/Makefile it compiles, but have missing
> symbol or if you compile static kernel - the same problem.
> Now I'm testing and it looks promising (except I see a little bigger kernel
> thread netisr cpu load, but it's acceptable).
>
> Regards, Rolandas Naujikas
>
> On 2010.11.20, at 19:05, Jeremy Chadwick wrote:
>
> > On Sat, Nov 20, 2010 at 06:38:19PM +0200, Naujikas Rolandas wrote:
> >> I just got another lockup.
> >> It looks like in the time of lockup the number of Ierrs is increasing:
> >> NameMtu Network   Address  Ipkts Ierrs Idrop
>  Opkts Oerrs  Coll
> >> em21500   00:14:4f:XX:XX:XX 13060395 18438 0
>  6579984 1 0
> >>
> >> After "ifconfig em2 down;ifconfig em2 up" Ierrs stays at 0 rate for long
> time.
> >> Without DEVICE_POLLING it was similar situation.
> >>
> >> Regards, Rolandas Naujikas
> >>
> >> On 2010.11.20, at 18:24, rol...@gmail.com wrote:
> >>
> >>> On 2010.11.20, at 17:54, Jeremy Chadwick wrote:
> >>>
> >>>> On Sat, Nov 20, 2010 at 05:09:28PM +0200, rol...@gmail.com wrote:
> >>>>> I'm experiencing network interface stalls on em in FreeBSD
> 8.1-RELEASE (-p1).
> >>>>> It looks like the problem could be solved in 8-STABLE, but should I
> upgrade to it ?
> >>>>> Is it OK to try to get only em driver code and recompile as module
> and try to run it ?
> >>>>>
> >>>>> sysctl dev.em.2.stats=1:
> >>>>> ...
> >>>>> em2: Missed Packets = 101334
> >>>>> em2: Receive No Buffers = 488
> >>>>> ...
> >>>>> em2: RX overruns = 1356
> >>>>> em2: watchdog timeouts = 1
> >>>>> ...
> >>>>>
> >>>>> Only "ifconfig em2 down;ifconfig em2 up" helps for some time.
> >>>>> The same happens on em0 interface only, but not in the same time.
> >>>>> It is production (NAT) router with pf+pfsync+carp and failover over
> another router.
> >>>>> They are old "SunFire X4100" boxes (4GB RAM, 2*2 AMD Opteron 2.2GHz).
> >>>>
> >>>> You're going to need to provide output from the following, run as
> root.
> >>>> For the pciconf command, please only include the entry that's relevant
> >>>> to the device in question (em2).  You can also XXX-out the MAC address
> >>>> and/or IP addresses if you're worried about security.
> >>>>
> >>>> $ pciconf -lvc
> >>>
> >>> e...@pci0:1:2:0: class=0x02 card=0x10118086 chip=0x10108086
> rev=0x03 hdr=0x00
> >>>   vendor = 'Intel Corporation'
> >>>   device = 'Dual Port Gigabit Ethernet Controller (Copper)
> (82546EB)'
> >>>   class  = network
> >>>   subclass   = ethernet
> >>>   cap 01[dc] = powerspec 2  supports D0 D3  current D0
> >>>   cap 07[e4] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split
> transaction
> >>>   cap 05[f0] = MSI supports 1 message, 64 bit
> >>>
> >>>> $ dmesg | grep em2
> >>>
> >>> em2:  port
> 0x9400-0x943f mem 0xfbfa-0xfbfb irq 24 at device 2.0 on pci1
> >>> em2: [FILTER]
> >>> em2: Ethernet address: 00:14:4f:XX:XX:XX
> >>>
> >>>> $ sysctl dev.em.2
> >>>
> >>> dev.em.2.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.1
> >>> dev.em.2.%driver: em
> >>> dev.em.2.%location: slot=2 function=0
> >>> dev.em.2.%pnpinfo: vendor=0x8086 device=0x1010 subvendor=0x8086
> subdevice=0x1011 class=0x02
> >>> dev.em.2.%parent: pci1
> >>> dev.em.2.debug: -1
> >>> dev.em.2.stats: -1
> >>> dev.em.2.rx_int_delay: 0
> >>> dev.em.2.tx_int_delay: 66
> >>> dev.em.2.rx_abs_int_delay: 66
> >>> dev.em.2.tx_abs_int_delay: 66
> >>> dev.em.2.rx_processing_limit: 100
> >>>
> >>>

Re: problems with network on em

2010-11-20 Thread Jack Vogel
I'd appreciate it if you could try the driver from HEAD. I will be putting it
into STABLE next week, and it would be nice to see whether it fixes your
problem. It will build in your STABLE environment just fine. Do you know how
to do this? If not, just say so and I can give you further details.

Regards,

Jack


On Sat, Nov 20, 2010 at 1:53 PM, Naujikas Rolandas <
rolandas.nauji...@mif.vu.lt> wrote:

> I don't know about version, but I'm using RELENG_8 branch only. It is
> FreeBSD 8-STABLE also.
>
> Regards, Rolandas Naujikas
>
> P.S. I just got ~1Gbit/s (125MB/s,115Kpps) forwarding traffic in testing
> (24 nodes was downloading a file with wget from server from another side of
> router), but finally there was some deadlock. I'm recovering the data on it.
>
> On 2010.11.20, at 22:37, Jack Vogel wrote:
>
> > Did you mean the 7.1.7 version from HEAD ?
> >
> > Jack
> >
> >
> > On Sat, Nov 20, 2010 at 11:18 AM, Naujikas Rolandas <
> > rolandas.nauji...@mif.vu.lt> wrote:
> >
> >> I'm trying to test with newest version of /sys/dev/e1000 from FreeBSD
> >> 8-STABLE.
> >> For that I'm using loadable module option, because it is easier to build
> >> with minimal changes in kernel source.
> >> Only /sys/dev/e1000 and /sys/modules/em need to be updated.
> >> Without changes in /sys/modules/em/Makefile it compiles, but have
> missing
> >> symbol or if you compile static kernel - the same problem.
> >> Now I'm testing and it looks promising (except I see a little bigger
> kernel
> >> thread netisr cpu load, but it's acceptable).
> >>
> >> Regards, Rolandas Naujikas
> >>
> >> On 2010.11.20, at 19:05, Jeremy Chadwick wrote:
> >>
> >>> On Sat, Nov 20, 2010 at 06:38:19PM +0200, Naujikas Rolandas wrote:
> >>>> I just got another lockup.
> >>>> It looks like in the time of lockup the number of Ierrs is increasing:
> >>>> NameMtu Network   Address  Ipkts Ierrs Idrop
> >> Opkts Oerrs  Coll
> >>>> em21500   00:14:4f:XX:XX:XX 13060395 18438 0
> >> 6579984 1 0
> >>>>
> >>>> After "ifconfig em2 down;ifconfig em2 up" Ierrs stays at 0 rate for
> long
> >> time.
> >>>> Without DEVICE_POLLING it was similar situation.
> >>>>
> >>>> Regards, Rolandas Naujikas
> >>>>
> >>>> On 2010.11.20, at 18:24, rol...@gmail.com wrote:
> >>>>
> >>>>> On 2010.11.20, at 17:54, Jeremy Chadwick wrote:
> >>>>>
> >>>>>> On Sat, Nov 20, 2010 at 05:09:28PM +0200, rol...@gmail.com wrote:
> >>>>>>> I'm experiencing network interface stalls on em in FreeBSD
> >> 8.1-RELEASE (-p1).
> >>>>>>> It looks like the problem could be solved in 8-STABLE, but should I
> >> upgrade to it ?
> >>>>>>> Is it OK to try to get only em driver code and recompile as module
> >> and try to run it ?
> >>>>>>>
> >>>>>>> sysctl dev.em.2.stats=1:
> >>>>>>> ...
> >>>>>>> em2: Missed Packets = 101334
> >>>>>>> em2: Receive No Buffers = 488
> >>>>>>> ...
> >>>>>>> em2: RX overruns = 1356
> >>>>>>> em2: watchdog timeouts = 1
> >>>>>>> ...
> >>>>>>>
> >>>>>>> Only "ifconfig em2 down;ifconfig em2 up" helps for some time.
> >>>>>>> The same happens on em0 interface only, but not in the same time.
> >>>>>>> It is production (NAT) router with pf+pfsync+carp and failover over
> >> another router.
> >>>>>>> They are old "SunFire X4100" boxes (4GB RAM, 2*2 AMD Opteron
> 2.2GHz).
> >>>>>>
> >>>>>> You're going to need to provide output from the following, run as
> >> root.
> >>>>>> For the pciconf command, please only include the entry that's
> relevant
> >>>>>> to the device in question (em2).  You can also XXX-out the MAC
> address
> >>>>>> and/or IP addresses if you're worried about security.
> >>>>>>
> >>>>>> $ pciconf -lvc
> >>>>>
> >>>>> e...@pci0:1:2:0: class=0x02 card=0x10118086 chip=0x10108086
> >> rev=0x03 hdr=0x

Re: repeating crashes with 8.1

2010-11-23 Thread Jack Vogel
I'm a bit dubious about this. If a descriptor still has an mbuf, it was due
to a discard. Go look at em_rx_discard(): you will notice that all these
things are already being done at that point. So do you have a scenario where
we can have an unused mbuf that didn't come thru that path?

Jack


On Tue, Nov 23, 2010 at 10:47 AM, Bjoern A. Zeeb <
bzeeb-li...@lists.zabbadoz.net> wrote:

> On Sat, 23 Oct 2010, Bjoern A. Zeeb wrote:
>
> Hi,
>
> just to get this out.  Jack you might want to review and if ok, include
> in HEAD before we get feedback maybe.  To my understanding worst it
> would be overhead but not really harm.
>
>
> > Oct 22 02:06:02 i4 kernel: em1: discard frame w/o packet header
> > Oct 22 02:06:10 i4 kernel: em2: discard frame w/o packet header
>
> The following is a random-guess by code reading that hasn't been tested
> yet but believed to be correct; also ran it by gnn.
>
> http://people.freebsd.org/~bz/20101122-03-em-pkthdr.diff
>
> --- sys/dev/e1000/if_em.c.orig  2010-11-01 20:57:53.0 -0400
> +++ sys/dev/e1000/if_em.c   2010-11-16 01:28:00.0 -0500
> @@ -3754,8 +3769,13 @@ em_refresh_mbufs(struct rx_ring *rxr, in
>                  ** they can only be due to an error
>                  ** and are to be reused.
>                  */
> -               if (rxbuf->m_head != NULL)
> +               if (rxbuf->m_head != NULL) {
> +                       rxbuf->m_head->m_len = rxbuf->m_head->m_pkthdr.len = adapter->rx_mbuf_sz;
> +                       rxbuf->m_head->m_flags |= M_PKTHDR;
> +                       rxbuf->m_head->m_data = rxbuf->m_head->m_ext.ext_buf;
> +                       rxbuf->m_head->m_next = NULL;
>                         goto reuse;
> +               }
>                 m = m_getjcl(M_DONTWAIT, MT_DATA,
>                     M_PKTHDR, adapter->rx_mbuf_sz);
>                 /*
>
> I am not sure if igb and the others need a similar fix.  Haven't
> looked in detail. gnn mentioned similarities though good ones imho
> in there.
>
> If you were able to reproduce the pkthdr issue it would be great to
> test it.  If you always paniced it with IPv6 you may also want to test
> the applicable patch (use the direct URLs mentioned) from the very end of
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/148857
> to make sure you are not running into that race.
>
>
> /bz
>
> --
> Bjoern A. Zeeb  Welcome a new stage of life.
> Going to jail sucks --  All my daemons like it!
>  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html
>


Re: em(4) on FreeBSD is sometimes annoying

2008-08-08 Thread Jack Vogel
"me too" 's are of little help. Please elaborate on your "exact same",  since
each person's perception will be slightly different.

So far I have heard nothing that sounds like a driver issue.

Jack

On Fri, Aug 8, 2008 at 5:50 AM, Markus Vervier <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I just stumbled upon this thread. I experience the exact same behaviour as
> Martin on my Thinkpad X60:
>
> Thinkpad X60 Model: 1706GMG
> BIOS-Version 2.15 (7BETD4WW)
> FreeBSD 7.0 STABLE amd64 (from about two weeks ago) - same situtation on
> 7.0-RELEASE i386
>
> --
> Markus
>


Re: em(4) on FreeBSD is sometimes annoying

2008-08-08 Thread Jack Vogel
OK, I just got access to a machine. I'm going to install and see if I can
repro this this afternoon.

Jack


On Fri, Aug 8, 2008 at 10:56 AM, Markus Vervier <[EMAIL PROTECTED]> wrote:
> Jack Vogel schrieb:
>>
>> "me too" 's are of little help. Please elaborate on your "exact same",
>>  since
>> each person's perception will be slightly different.
>>
>>
>
> Hi Jack,
>
> maybe read it like: Thinkpad X60 1706GMG affected too, so the problem is not
> specific to Martins machine.
>
> I can write the same steps to reproduce the behaviour as Martin here:
>
> <-->
> Once again, steps to reproduce this behavior:
> 1) Power the laptop OFF. Really OFF, I mean. No reboots!
> 2) Detach the cable from NIC.
> 3) Boot FreeBSD. Let it pass the DHCP phase (ifconfig_em0="DHCP") until
> login appears.
> 4) Attach the cable to the NIC.
> 5) Voila... no link.
> <-->
>
> My perception is that if the em driver gets loaded without a cable being
> plugged in, no link can be established.
> I can workaround the problem when em was not build into the kernel, by
> unloading the em-kmod and reloading it
> again with the cable plugged in. If the cable is not plugged in the
> interface will always stay in state "no carrier".
> The NIC works fine under Windows / Linux on the same machine.
>
> --
> Markus
>


Re: em(4) on FreeBSD is sometimes annoying

2008-08-11 Thread Jack Vogel
On Mon, Aug 11, 2008 at 7:02 AM, Jeremy Chadwick <[EMAIL PROTECTED]> wrote:
> On Mon, Aug 11, 2008 at 08:19:46AM +, Josh Paetzel wrote:
>> On Friday 08 August 2008 06:31:24 pm Jack Vogel wrote:
>> > OK, I just got access to a machine, am going to install and see if I
>> > can repro this
>> > this afternoon.
>> >
>> > Jack
>>
>> For what it's worth, I have a T60 that dual boots 6.3-R/amd64 and 7.0-R/i386
>> and neither install has this problem.  I can cold boot it with the NIC
>> unplugged, plug in a cable, I get a link light and ifconfig em0 goes to
>> active, dhclient em0 gets an IP successfully.
>
> As promised, I tested said issue out on my T60p (widescreen) tonight,
> using both FreeBSD 7.0-STABLE and 7.0-RELEASE.
>
> I wasn't able to reproduce the issue; so my experience was the same as
> Josh.  I can also boot it with the CAT5 inserted, dhclient fetch an IP,
> no LED oddities -- then yank the cable (LED and link light go off),
> re-insert the cable, and within a moment or so dhclient again gets an
> IP.
>
> I'm left wondering if maybe there's an EEPROM setting that's doing this
> (purely speculative on my part), or possibly some odd BIOS quirk.  My
> T60p (widescreen) is running BIOS 1.14.  It's worth noting that the
> non-widescreen T60p uses a different BIOS.

Cool. It turned out that the laptop I was told I could use was an X61, and it
had an ICH8 NIC rather than an 82573 anyway. They were supposed to get me one
today, but given that the two of you have already gone thru this verification
I see little point in doing the same.

Seems possibly a BIOS thing; if not that, a bad cable or bad link partner maybe?

Jack


Re: em(4) on FreeBSD is sometimes annoying

2008-08-11 Thread Jack Vogel
On Mon, Aug 11, 2008 at 1:32 PM, Markus Vervier <[EMAIL PROTECTED]> wrote:
> Jack Vogel wrote:
>>
>> Seems possibly a BIOS thing, if not that bad cable, bad link partner
>> maybe??
>>
>
> I had the problem with all sorts of switches / cables. How can I dump my
> EEPROM settings if that helps?

I didn't mean the NIC EEPROM, but the system BIOS. Make sure you are
running the version that Jeremy said he was; if that matches, you might go
look at the settings in the BIOS that are about management.

I thought you told us that when you had a back to back connection it
worked, no?

Jack


Re: em(4) on FreeBSD is sometimes annoying

2008-08-13 Thread Jack Vogel
On Wed, Aug 13, 2008 at 3:04 AM, Markus Vervier <[EMAIL PROTECTED]> wrote:
> Jack Vogel wrote:
>>
>>  I didn't mean the NIC EEPROM, but the system BIOS, make sure you are
>>  running the version that Jeremy said he was, if that matches you might go
>>  look at settings in the BIOS that are about management.
>>
> I'm now running the latest BIOS for my X60 version 2.22 with the same
> results. Jeremy runs version 1.15 but on a T60.
>>
>>  I thought you told us that when you had a back to back connection that it
>>  worked, no??
>
> Sorry, it does not work when having a b2b connection, never said that. But I
> noticed another thing:
>
> It is important that the device was up without a cable connected:
>
> 1. power off completely
> 2. boot freebsd without a cable connected
> 3. in a rootshell do ifconfig em0 up
> 4. connect the cable
> 5. no link

Hmmm, well let me see if I can get ahold of an X60.

Jack


Re: em(4) on FreeBSD is sometimes annoying

2008-08-13 Thread Jack Vogel
On Wed, Aug 13, 2008 at 8:22 AM, Jack Vogel <[EMAIL PROTECTED]> wrote:
> On Wed, Aug 13, 2008 at 3:04 AM, Markus Vervier <[EMAIL PROTECTED]> wrote:
>> Jack Vogel wrote:
>>>
>>>  I didn't mean the NIC EEPROM, but the system BIOS, make sure you are
>>>  running the version that Jeremy said he was, if that matches you might go
>>>  look at settings in the BIOS that are about management.
>>>
>> I'm now running the latest BIOS for my X60 version 2.22 with the same
>> results. Jeremy runs version 1.15 but on a T60.
>>>
>>>  I thought you told us that when you had a back to back connection that it
>>>  worked, no??
>>
>> Sorry, it does not work when having a b2b connection, never said that. But I
>> noticed another thing:
>>
>> It is important that the device was up without a cable connected:
>>
>> 1. power off completely
>> 2. boot freebsd without a cable connected
>> 3. in a rootshell do ifconfig em0 up
>> 4. connect the cable
>> 5. no link
>
> Hmmm, well let me see if I can get ahold of an X60.
>
> Jack
>

Markus,

I have reproduced the problem; you are correct. Thank you for persisting thru
my doubts :)
There is a flip side to the problem: once the interface is up and active, if
you remove the cable it never shows inactive :(

It must be some interrupt handling/media status issue. I'll be looking into a
fix this afternoon.

OH, I should note that as long as you put in a cable before you ifconfig up,
it's fine, so it's not that hard to work around the issue.

Stay tuned

Jack
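
A minimal sketch for checking what the driver reports before and after
plugging in the cable, assuming the "status:" line that ifconfig prints for
Ethernet interfaces (for example "status: active" or "status: no carrier").
The function names link_status and link_is_active are made up for
illustration. On affected hardware the reported status may be stale, as
described above, so treat the result as the driver's view rather than the
real state of the wire.

```shell
#!/bin/sh
# Sketch: extract the value of the "status:" line from `ifconfig <if>`
# output supplied on stdin.
link_status() {
    sed -n 's/^[[:space:]]*status:[[:space:]]*//p'
}

# Hypothetical helper: succeed only if the reported status is "active".
link_is_active() {
    [ "$(link_status)" = "active" ]
}

# On a live system (not run here):
#   ifconfig em0 | link_is_active || echo "em0: driver reports no link"
```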


HEADS UP: E1000 networking changes in STABLE/7.1 RELEASE

2008-08-13 Thread Jack Vogel
There is a change that has been in the STABLE tree for a few months, but many
might have missed it. It's an important change to note, and for some people
one to prepare for ahead of the 7.1 RELEASE.

There is a new E1000 network driver, igb, which supports two families of
adapters so far: the 82575 and the new 82576. In FreeBSD 7 the support for
82575 was in the em driver. However, due to support issues across all the OSes
we do drivers for, the decision was made to split the newer drivers off from
the older ones. There are big differences in the register set, and in things
like the descriptor format, that made this expedient.

I made the support for the 82575 available very early in FreeBSD, earlier even
than Linux, so that certain vendors could have a working driver early. At
first I thought to avoid splitting the driver, but support issues are going to
make that impractical. So there is a split: it's been in HEAD since February,
and now 7.1 will release with this change.

What this means is that if you have 82575 adapters in a system, they are going
to change from being 'emX' to being 'igbX'. In the RELEASE the install kernel
will have both drivers in it, but on an upgrade path, or if you use a set of
scripts that get postinstalled, you need to be ready to make this change.

How can you tell if you have such a device? Simple: use pciconf. There are
only 3 IDs that are affected: 0x10A7, 0x10A9, and 0x10D6.

If you have questions feel free to email me.

Cheers,

Jack Vogel
Intel Lan Access Division
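
A minimal sketch of that pciconf check, assuming the chip=0xDDDDVVVV field
layout seen in the pciconf output quoted elsewhere in these threads (device ID
in the upper 16 bits, Intel's vendor ID 0x8086 in the lower 16 bits). The
function name find_82575 is made up for illustration; only the three device
IDs come from the message above.

```shell
#!/bin/sh
# Sketch: filter `pciconf -l` output (on stdin) for the three 82575
# device IDs: 0x10A7, 0x10A9 and 0x10D6.
# Assumes the "chip=0x<device><vendor>" field layout, vendor 0x8086.
find_82575() {
    grep -i -E 'chip=0x(10a7|10a9|10d6)8086'
}

# On a live system (not run here):
#   pciconf -l | find_82575
```

Any line this prints names an adapter that moves from emX to igbX, so
rc.conf entries and scripts that refer to it by the old name need updating.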


Re: HEADS UP: E1000 networking changes in STABLE/7.1 RELEASE

2008-08-13 Thread Jack Vogel
On Wed, Aug 13, 2008 at 8:13 PM, Paul <[EMAIL PROTECTED]> wrote:
> Hi Jack.  Will the em driver ever support the multiple hardware queues of
> the 82571 or are we just stuck with the standard em driver?
> Has there been any significant change in the em driver itself ?
> I have a feeling that the 82571 should be in the igb driver but for some
> reason it isn't. I am curious because we have a LOT of 4 port 82571 PCI-E
> cards and they are not cheap.  :]

Hey Paul,

   You are right, quad port cards aren't cheap; Intel has sold them even back
in PCI-X based systems. And the one you are talking about was released since I
have been in this job, so pretty new :) However, they will never support
multiple queues, the reason being that in order to do this you need multiple
MSI vectors, in other words, MSI-X. Still, they are very handy: they do
support MSI, but just one vector per interface. Oh, and I debated about where
to make the cutoff line on the em/igb split and decided the best thing to do
was to follow Linux. The biggest difference between the two drivers is that
the adapters in igb use a different descriptor format, called 'advanced
descriptors'.

   The split does NOT mean that em is now frozen. Quite the contrary, the em
driver that was just MFC'd into STABLE has support for what I think is going
to be the coolest new consumer adapter out there, the 82574, code name
Hartwell, and also for ICH10. Both of these new interfaces have IEEE 1588
Precision Time Protocol support, something that is becoming important for
networked multimedia applications. Oh, and Hartwell is the first adapter in
the em driver that actually does multiqueue, although in a limited fashion; it
does MSI-X. This adapter is made at a lower cost point, but it has really nice
features. I only wish the new motherboard that I bought had them rather than
ICH9, but oh well :) I predict they are going to be selling like hotcakes
before the end of the year.

I will continue to share fixes between the em and igb drivers as they are
applicable. I still think a real legacy driver for the oldest adapters would
be a good idea, just so they don't get broken by new development; with as many
as we support, regression testing is already not being done adequately. I just
have not had time to think about this yet. It may be coming, and it would
probably be for everything that is not PCI Express.

Hope the info was helpful, I'm always happy to answer questions,

Jack


Re: em(4) on FreeBSD is sometimes annoying

2008-08-14 Thread Jack Vogel
I fought with this issue all day today, trying to root cause it, and while I
don't have a solution I do have a better understanding of it.

I was wrong about it being the interrupt handler; at least, if there's any
issue with it, it's not the primary cause. I actually found out using a
Fedora Live CD that Linux seems to have the same issue, but its
symptoms are slightly different due to driver architecture differences.
If you boot Linux with no cable in, and then modprobe the e1000
driver, you get no errors. However, when you follow that with an
'ifconfig eth0' it will fail, saying that it cannot find the device!!

Do the same with the cable in and it all works.

The same 82573 NIC on a standalone adapter has NONE of these
problems; it all works fine.

So, right now my theory is that the power system of the laptop is
putting the PHY into a sleep state due to having no link, and
neither our driver nor the Linux driver has a way to bring it back
out of that state.

Until this gets worked out all I can tell you is "keep that cable
IN"  :)

Jack


On Thu, Aug 14, 2008 at 4:33 AM, Markus Vervier <[EMAIL PROTECTED]> wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Jack Vogel wrote:
> | I have reproduced the problem, you are correct. Thank you for
> | persisting thru my doubts :)
> Always persisting to help improving FreeBSD. Another odd thing I noticed
> today:
> When dual-booting Windows on the same machine and doing a warm-reboot from
> Windows to FreeBSD,
> you _do_ get a link with the procedure I described yesterday.. So there
> seems to be some setting which is lost
> after cold-boot or some EEPROM setting which is changed somehow.
>
> | OH, I should note that as long as you put in a cable before you
> | ifconfig up its fine so
> | its not that hard to work around the issue.
> I completly forgot about the issue until now, after I noticed it some time
> ago when being overworked. :-(
> The situation just does not occur very often in times of wireless LAN /
> Docking Stations. :-)
>
> - --
> Markus
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v2.0.9 (FreeBSD)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkikGBMACgkQFhK2gHeM2QOLpACfdX4IyNSivy+TgAJBhKgZUwP2
> iiIAoNrPUTE0veViP7Zklm7jD25m7Aad
> =DrBs
> -END PGP SIGNATURE-
>
>


Re: fxp multicast forwarding problems

2008-09-23 Thread Jack Vogel
LOL, sorry to disappoint you, but I'm not responsible for fxp. Intel didn't write
it, and I've never touched it :)  Now that doesn't mean that I can't look at it,
but I am very busy right now, so unless there's no alternative I'd rather not.

Jack


On Tue, Sep 23, 2008 at 3:06 AM, Jeremy Chadwick <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 23, 2008 at 10:46:25AM +0100, Bruce M Simpson wrote:
>> Hi,
>>
>> Whilst doing some QA work on XORP on my desktop, which has fxp0 and
>> msk0, fxp0 got totally hosed.
>> I was running PIM-SM and IGMPv2 router-mode on the box at the time.
>>
>> I wonder if this is related to the problems with fxp multicast
>> transmission I saw back in April.
>> I'm a bit concerned about this as fxp is still a very widespread and
>> useful network chip.
>>
>> I am running 7.0-RELEASE-p4/amd64.
>> sysctls for dev.fxp.0 are set to their default values.
>>
>> I'm not expert on the fxp driver internals, but perhaps someone else has
>> seen this kind of problem before. Multicast-promiscuous mode (aka
>> ALLMULTI) was enabled on the interface. I know some NICs have problems
>> with this, or don't even support it.
>>
>> The errors look like this:
>> fxp0: SCB timeout: 0x10 0x0 0x80 0x0
>> fxp0: SCB timeout: 0x10 0x0 0x80 0x0
>> fxp0: DMA timeout
>> ... repeated ...
>>
>> Attempted workarounds which don't work to un-wedge the chip:
>> Reload the fxp0 microcode with "ifconfig fxp0 link0"
>> Forcibly unloading the kernel module and reloading it
>> Unpatching and repatching at the switch (a cheap 10/100 one)
>> Enabling and disabling promiscuous mode
>> Twiddling dev.fxp.0.noflow
>>
>> The link status looks fine, but the card will not send or receive traffic.
>> A warm reboot was enough to get things back up again.
>>
>> regards,
>> BMS
>
> Adding Jack Vogel, who's responsible for fxp(4).
>
> --
> | Jeremy Chadwickjdc at parodius.com |
> | Parodius Networking   http://www.parodius.com/ |
> | UNIX Systems Administrator  Mountain View, CA, USA |
> | Making life hard for others since 1977.  PGP: 4BD6C0CB |
>
>


7.1 RC E1000 fix testing

2008-11-25 Thread Jack Vogel
Anyone running 7.1 and using E1000 hardware that has the time, I would
appreciate any testing you can do. This has an important fix for SuperMicro
servers, but any regression test of the code would be helpful.

The email was getting rejected due to the tarball size, so contact me and I
will send you the code directly.

Jack


7.1 RC E1000 fix

2008-11-25 Thread Jack Vogel
Anyone running 7.1 and using E1000 hardware that has the time, I would
appreciate any testing you can do. This has an important fix for SuperMicro
servers, but any regression test of the code would be helpful.

Back up the contents of /usr/src/sys/dev/e1000 and then overwrite them with
this tarball.

Send feedback to me,

Jack
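
The backup-and-overwrite step could be sketched as below. The function name
replace_driver_src and both arguments are placeholders, since the message does
not name the tarball. The sketch also assumes that the tarball's paths are
relative to the driver directory itself, which is worth checking with
"tar -tf" before unpacking anything.

```shell
#!/bin/sh
# Sketch: copy a driver source directory aside, then unpack a replacement
# tarball over the original. Refuses to run if a backup already exists.
replace_driver_src() {
    src_dir=$1              # e.g. /usr/src/sys/dev/e1000
    tarball=$2              # placeholder: path to the tarball you were sent
    backup="${src_dir}.orig"

    if [ ! -d "$src_dir" ]; then
        echo "no such directory: $src_dir" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi

    # Keep an untouched copy so the old driver can be restored.
    cp -Rp "$src_dir" "$backup" || return 1

    # Assumption: archive members are relative to the driver directory.
    tar -xf "$tarball" -C "$src_dir"
}
```

The kernel or the driver module still has to be rebuilt afterwards for the
new sources to take effect.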

Re: igb on a Nehalem system, buildworld stats

2009-01-08 Thread Jack Vogel
I have never seen a problem like this. What is the link partner
of each NIC, and if you switch the ports what happens?

We have Nehalems in the validation lab but I have not had an
excuse to install on one so far; I guess now I do :)

Jack

On Thu, Jan 8, 2009 at 6:16 AM, Mars G Miro  wrote:

> Hi guys,
>
>   I just got on my hands today a NEHALEM system:
>
> 2 x 5560 Nehalem CPU (2.8GHz, 8MB cache memory, 6.4GT/sec [QPI])
> 12GB 1333Mhz DDR3 Memory
> 1 x 500GB SATA HDD
>
>  FreeBSD 7.1-RELEASE/amd64 install fine, however I seemed to be
> having problems w/ its built-in Intel NICs:
>
> igb0: flags=8843 metric 0 mtu 1500
>options=19b
>ether 00:30:48:c5:db:e2
>inet6 fe80::230:48ff:fec5:dbe2%igb0 prefixlen 64 scopeid 0x1
>media: Ethernet autoselect (100baseTX )
>status: active
> igb1: flags=8843 metric 0 mtu 1500
>options=19b
>ether 00:30:48:c5:db:e3
>inet6 fe80::230:48ff:fec5:dbe3%igb1 prefixlen 64 scopeid 0x2
>inet 172.17.32.32 netmask 0x broadcast 172.17.255.255
>media: Ethernet autoselect (1000baseTX )
>status: active
>
> The first NIC would always want 100baseTX no matter how I'd ifconfig
> down/up it, so I just had to use the 2nd NIC. Unfortunately, this too
> is having problems. Like being unable to 'see' some machines on the
> same network segment. Some other machines are accessible. And yes I've
> double-checked the network stuff (cables, switch, IP settings) and my
> conclusion is b0rky NICs.
>
> pciconf -lvc:
> i...@pci0:1:0:0:class=0x02 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
>cap 01[40] = powerspec 3  supports D0 D3  current D0
>cap 05[50] = MSI supports 1 message, 64 bit, vector masks
>cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
>cap 10[a0] = PCI-Express 2 endpoint
> i...@pci0:1:0:1:class=0x02 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>vendor = 'Intel Corporation'
>class  = network
>subclass   = ethernet
>cap 01[40] = powerspec 3  supports D0 D3  current D0
>cap 05[50] = MSI supports 1 message, 64 bit, vector masks
>cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
>cap 10[a0] = PCI-Express 2 endpoint
>
> So anyone else having igb problems? I'm downloading 200812-CURRENT now
> (is there gonna be a 200901-CURRENT ISO soon? :-p), I'd like to try
> that, but checking cvs there seem to be only a handful of changes.
>
> Also I did some buildworlds:
>  make -j8 buildworld
>    2846.900u 2266.188s 15:50.43 537.9% 6375+2082k 10084+7937io 1482pf+0w
>  make -j16 buildworld
>    3518.254u 2175.593s 14:23.29 659.5% 6656+2147k 26165+8546io 4300pf+0w
>  make -j32 buildworld
>    3582.897u 4437.710s 18:03.88 739.9% 6528+2125k 5725+7930io 1555pf+0w
>
> Verbose dmesg: http://pastebin.com/f5f799561
>
> Thanks!
>
>
> --
> cheers
> mars
>
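
The three buildworld timings quoted above are easier to compare once the
csh-style time summaries are converted to seconds. The sketch below is only
an illustration: the helper names are made up, and it assumes the third field
is wall-clock time in [h:]mm:ss form.

```python
def parse_time_line(line):
    """Parse '2846.900u 2266.188s 15:50.43 ...' into (user, system, wall) seconds."""
    fields = line.split()
    user = float(fields[0].rstrip("u"))
    system = float(fields[1].rstrip("s"))
    wall = 0.0
    for part in fields[2].split(":"):   # handles mm:ss and h:mm:ss
        wall = wall * 60 + float(part)
    return user, system, wall


# Timings copied from the quoted message, keyed by the -j value.
RUNS = {
    8: "2846.900u 2266.188s 15:50.43 537.9% 6375+2082k 10084+7937io 1482pf+0w",
    16: "3518.254u 2175.593s 14:23.29 659.5% 6656+2147k 26165+8546io 4300pf+0w",
    32: "3582.897u 4437.710s 18:03.88 739.9% 6528+2125k 5725+7930io 1555pf+0w",
}


def summarize(runs):
    """Return (jobs, wall seconds, CPU utilisation in percent) per run."""
    rows = []
    for jobs, line in sorted(runs.items()):
        user, system, wall = parse_time_line(line)
        rows.append((jobs, wall, 100.0 * (user + system) / wall))
    return rows


for jobs, wall, cpu_pct in summarize(RUNS):
    print(f"-j{jobs}: wall {wall:.2f}s, CPU {cpu_pct:.1f}%")
```

Read this way, -j16 has the shortest wall time of the three, while -j32 roughly
doubles the system time relative to -j16 and finishes slowest.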


Re: igb on a Nehalem system, buildworld stats

2009-01-08 Thread Jack Vogel
So it wasn't identified during install but was in the kernel you built
afterward, is that what you're saying? Even if that's true, I don't think it's
relevant to the failure.

I have made a couple of queries internally. There are a lot of variations on
Nehalem systems, and at least one other engineer in my group had an encounter
with one like yours. I have two managers looking for me, so hopefully I can
find one.

Jack


On Thu, Jan 8, 2009 at 10:50 AM, Mars G Miro  wrote:

> On Fri, Jan 9, 2009 at 2:33 AM, Jack Vogel  wrote:
> > Well, I am at Intel you know, and even we don't seem to have any systems
> > with 82576 down in my group here. The way link works I can be about 99.9%
> > sure in saying it's not the driver. It's preproduction so there are lots of
> > possibilities, and the biggest problem is it's going to be difficult to
> > help when I don't have any such hardware :(
> >
> > I've heard from the 1G product team that they have seen EEPROM mismatches
> > on systems that will result in things not working in funny ways.
>
>
> Jahh, I've seen those but not w/ Intel NICs. I believe it was from
> Broadcom on some IBM x3455? (IIRC) and it was indeed quite amusing ;-)
>
>
> >
> > If you have a back to back connection to another NIC on Port 0, no switch,
> > does it still autoneg to 100?
> >
>
> I will have to do that tomorrow as I am @home now ;-)
>
> btw, another data point, during sysinstall, we encountered:
>
>   on both the igbs.
>
> Thanks.
>
> > Jack
> >
> > On Thu, Jan 8, 2009 at 10:19 AM, Mars G Miro 
> wrote:
> >>
> >> On Fri, Jan 9, 2009 at 12:44 AM, Jack Vogel  wrote:
> >> > I have not seen a problem like this ever, what is the link partner
> >> > of each NIC and if you switch the ports what happens?
> >> >
> >>
> >> Hi Jack,
> >>
> >>   They're connected to a GigE switch. It was just one w/ the first
> >> NIC, but having seen that it only connects at 100baseTX, I wired the
> >> 2nd and saw that it can now do 1000baseTX. Unfortunately w/ problems
> >> as it can 'see' some machines but unable to see others (in the same
> >> physical network segment). I've changed cables, and plugged them in
> >> different ports in the switch but still the same behavior.
> >>
> >>  IIRC, this is the first time I had igb problems and only on this
> >> box. I believe I encountered igb NICs in the newer HP DL380/385 but
> >> those work fine.
> >>
> >>  btw, this is a Supermicro Intel Engineering sample box (major
> >> vendors don't have Nehalems in the market yet) so there prolly are
> >> hardware/driver bugs lurking? I dunno.
> >>
> >>  Thanks.
> >>
> >>
> >> > We have Nehalem's in the validation lab but I have not had an
> >> > excuse to install on one so far, I guess now I do :)
> >> >
> >> > Jack
> >> >
> >> > On Thu, Jan 8, 2009 at 6:16 AM, Mars G Miro 
> >> > wrote:
> >> >>
> >> >> Hi guys,
> >> >>
> >> >>   I just got on my hands today a NEHALEM system:
> >> >>
> >> >> 2 x 5560 Nehalem CPU (2.8GHz, 8MB cache memory, 6.4GT/sec [QPI])
> >> >> 12GB 1333Mhz DDR3 Memory
> >> >> 1 x 500GB SATA HDD
> >> >>
> >> >>  FreeBSD 7.1-RELEASE/amd64 install fine, however I seemed to be
> >> >> having problems w/ its built-in Intel NICs:
> >> >>
> >> >> igb0: flags=8843 metric 0 mtu
> >> >> 1500
> >> >>
> >> >>  options=19b
> >> >>ether 00:30:48:c5:db:e2
> >> >>inet6 fe80::230:48ff:fec5:dbe2%igb0 prefixlen 64 scopeid 0x1
> >> >>media: Ethernet autoselect (100baseTX )
> >> >>status: active
> >> >> igb1: flags=8843 metric 0 mtu
> >> >> 1500
> >> >>
> >> >>  options=19b
> >> >>ether 00:30:48:c5:db:e3
> >> >>inet6 fe80::230:48ff:fec5:dbe3%igb1 prefixlen 64 scopeid 0x2
> >> >>inet 172.17.32.32 netmask 0x broadcast 172.17.255.255
> >> >>media: Ethernet autoselect (1000baseTX )
> >> >>status: active
> >> >>
> >> >> The first NIC would always want 100baseTX no matter how I'd ifconfig
> >> >> down/up it, so I just 

Re: igb on a Nehalem system, buildworld stats

2009-01-09 Thread Jack Vogel
On Fri, Jan 9, 2009 at 12:02 AM, Mars G Miro  wrote:

> On Fri, Jan 9, 2009 at 2:50 AM, Mars G Miro 
> wrote:
> > On Fri, Jan 9, 2009 at 2:33 AM, Jack Vogel  wrote:
> >> Well, I am at Intel you know, and even we don't seem to have any systems
> >> with 82576 down in my group here. The way link works I can be about 99.9%
> >> sure in saying it's not the driver. It's preproduction so there are lots
> >> of possibilities, and the biggest problem is it's going to be difficult to
> >> help when I don't have any such hardware :(
> >>
> >> I've heard from the 1G product team that they have seen EEPROM mismatches
> >> on systems that will result in things not working in funny ways.
> >
> >
> > Jahh, I've seen those but not w/ Intel NICs. I believe it was from
> > Broadcom on some IBM x3455? (IIRC) and it was indeed quite amusing ;-)
> >
> >
> >>
> >> If you have a back to back connection to another NIC on Port 0, no switch,
> >> does it still autoneg to 100?
> >>
>
> Connected back to back w/ another box w/ a GigE NIC, it now does
> 1000baseTX:
>
> igb0: flags=8843 metric 0 mtu 1500
>options=19b
>ether 00:30:48:c5:db:e2
>inet6 fe80::230:48ff:fec5:dbe2%igb0 prefixlen 64 scopeid 0x1
> inet 192.168.70.2 netmask 0xff00 broadcast 192.168.70.255
> media: Ethernet autoselect (1000baseTX )
>status: active
>
> But still not without problems. I hafta ifconfig down/up it several
> times until I can see the other end. W/c is the same for igb1.
>


OK, so you have some switch issue.  What do you mean by "see the other end"?
If it's back to back and boots up, I assume it gets link. If you have the
address assigned in rc.conf and you run tcpdump on the partner, do you see the
ARP when it comes online, and at that point can the other side ping it?

Oh, and what is the link partner hardware?

Jack
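
As an aside on reading the quoted ifconfig output: flags=8843 and options=19b
are hex bitmasks. The sketch below decodes them with a partial table of bit
names as I recall them from FreeBSD's net/if.h (IFF_* and IFCAP_*); treat the
table as an assumption and check it against your own source tree.

```python
# Partial tables only; bit values are assumed from FreeBSD's net/if.h.
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0040: "RUNNING",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}
IFCAP_OPTIONS = {
    0x0001: "RXCSUM",
    0x0002: "TXCSUM",
    0x0008: "VLAN_MTU",
    0x0010: "VLAN_HWTAGGING",
    0x0080: "VLAN_HWCSUM",
    0x0100: "TSO4",
}


def decode_bits(hex_text, names):
    """Return (names of the set bits, value of any bits missing from the table)."""
    value = int(hex_text, 16)
    found = []
    leftover = value
    for bit in sorted(names):
        if value & bit:
            found.append(names[bit])
            leftover &= ~bit
    return found, leftover


print(decode_bits("8843", IFF_FLAGS))
print(decode_bits("19b", IFCAP_OPTIONS))
```

A non-zero second element would mean the value carries a bit that this
partial table does not name.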


Re: Driver for Intel 10GbE adapter

2009-02-11 Thread Jack Vogel
On Wed, Feb 11, 2009 at 3:21 PM, pluknet  wrote:

> Hi.
>
> 2009/2/12 Greg Rivers:
> > I'm trying to light an Intel 10GbE adapter in an HP DL380 G5 using very
> > recent 7.1-STABLE amd64 with GENERIC kernel.  I expected the ixgb(4)
> > driver to attach, but it does not.
> >
> > The labels on the card show:
> >INTEL(R) 10GbE XF SR 2 PORT SERVER ADAPTER
> >893135
> >EXPX9502FXSRGP5
> >
> >001B211170BE 028AD E15728-003
> >
> >
> > A verbose boot shows the card on the PCI bus, but no driver attaches:
> >
> > pcib11:  at device 6.0 on pci0
> > pcib11:   domain0
> > pcib11:   secondary bus 23
> > pcib11:   subordinate bus   23
> > pcib11:   I/O decode0x6000-0x6fff
> > pcib11:   memory decode 0xfde0-0xfdff
> > pcib11:   no prefetched decode
> > pci23:  on pcib11
> > pci23: domain=0, physical bus=23
> > found-> vendor=0x8086, dev=0x10c6, revid=0x01
> >domain=0, bus=23, slot=0, func=0
> >class=02-00-00, hdrtype=0x00, mfdev=1
> >cmdreg=0x0047, statreg=0x0010, cachelnsz=16 (dwords)
> >lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
> >intpin=a, irq=10
> >powerspec 3  supports D0 D3  current D0
> >MSI supports 1 message, 64 bit
> >MSI-X supports 18 messages in map 0x1c
> >map[10]: type Memory, range 32, base 0xfdfe, size 17, enabled
> > pcib11: requested memory range 0xfdfe-0xfdff: good
> >map[14]: type Memory, range 32, base 0xfdf8, size 18, enabled
> > pcib11: requested memory range 0xfdf8-0xfdfb: good
> >map[18]: type I/O Port, range 32, base 0x6000, size  5, enabled
> > pcib11: requested I/O range 0x6000-0x601f: in range
> >map[1c]: type Memory, range 32, base 0xfdf7, size 14, enabled
> > pcib11: requested memory range 0xfdf7-0xfdf73fff: good
> > pcib11: matched entry for 23.0.INTA
> > pcib11: slot 0 INTA hardwired to IRQ 19
> > found-> vendor=0x8086, dev=0x10c6, revid=0x01
> >domain=0, bus=23, slot=0, func=1
> >class=02-00-00, hdrtype=0x00, mfdev=1
> >cmdreg=0x0047, statreg=0x0010, cachelnsz=16 (dwords)
> >lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
> >intpin=b, irq=5
> >powerspec 3  supports D0 D3  current D0
> >MSI supports 1 message, 64 bit
> >MSI-X supports 18 messages in map 0x1c
> >map[10]: type Memory, range 32, base 0xfdf4, size 17, enabled
> > pcib11: requested memory range 0xfdf4-0xfdf5: good
> >map[14]: type Memory, range 32, base 0xfdf0, size 18, enabled
> > pcib11: requested memory range 0xfdf0-0xfdf3: good
> >map[18]: type I/O Port, range 32, base 0x6020, size  5, enabled
> > pcib11: requested I/O range 0x6020-0x603f: in range
> >map[1c]: type Memory, range 32, base 0xfdef, size 14, enabled
> > pcib11: requested memory range 0xfdef-0xfdef3fff: good
> > pcib11: matched entry for 23.0.INTB
> > pcib11: slot 0 INTB hardwired to IRQ 16
> > pci23:  at device 0.0 (no driver attached)
> > pci23:  at device 0.1 (no driver attached)
> >
> >
> > pciconf shows:
> >
> > no...@pci0:23:0:0:  class=0x02 card=0xa15f8086 chip=0x10c68086
> > rev=0x01 hdr=0x00
> >vendor = 'Intel Corporation'
> >class  = network
> >subclass   = ethernet
> > no...@pci0:23:0:1:  class=0x02 card=0xa15f8086 chip=0x10c68086
> > rev=0x01 hdr=0x00
> >vendor = 'Intel Corporation'
> >class  = network
> >subclass   = ethernet
>
> You probably want to load ixgbe(4), not ixgb(4) (latter is afaik an
> older PCI-X version driver).
> The labels on the card are close to the description of ixgbe.
> Note, it's not in GENERIC.
>

Yes, it's an Oplin (82598); it uses my ixgbe driver rather than ixgb.

Cheers,

Jack
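
For readers matching pciconf output to a driver: the chip= field packs the
device ID in the upper 16 bits and the vendor ID in the lower 16. The sketch
below splits it and looks the pair up in a tiny table built only from device
IDs that appear in these threads. The table and function names are
illustrative, not a complete or authoritative ID list.

```python
def split_pci_id(field):
    """Split a pciconf 'chip=' or 'card=' value into (vendor, device)."""
    value = int(field, 16)
    return value & 0xFFFF, value >> 16


# Only IDs mentioned in these threads; anything else reports as unknown.
THREAD_DEVICES = {
    (0x8086, 0x10C6): ("ixgbe", "82598 (Oplin)"),
    (0x8086, 0x10C9): ("igb", "82576"),
    (0x8086, 0x108C): ("em", "82573"),
    (0x8086, 0x1019): ("em", "82547EI"),
}


def describe(chip_field):
    vendor, device = split_pci_id(chip_field)
    driver, name = THREAD_DEVICES.get((vendor, device), ("unknown", "unknown"))
    return f"vendor=0x{vendor:04x} device=0x{device:04x} driver={driver} chip={name}"


print(describe("0x10c68086"))
```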


Re: Driver for Intel 10GbE adapter

2009-02-11 Thread Jack Vogel
Somehow that error was corrected, but just AFTER the release. It's a simple
fix; look at ixgbe.h in CVS to see it. You just get rid of the "tcp_lro.h" and
change it to 

There will be a new code drop soon also.

Jack


On Wed, Feb 11, 2009 at 5:16 PM, Greg Rivers wrote:

> On Thu, 12 Feb 2009, pluknet wrote:
>
>  You probably want to load ixgbe(4), not ixgb(4) (latter is afaik an older
>> PCI-X version driver). The labels on the card are close to the description
>> of ixgbe. Note, it's not in GENERIC.
>>
>>
> On Wed, 11 Feb 2009, Jack Vogel wrote:
>
>  Yes, its an Oplin, 82598, it uses my ixgbe driver rather than ixgb.
>>
>>
> On Wed, 11 Feb 2009, Kip Macy wrote:
>
>  see ixgbe(4)
>>
>>
> I saw ixgbe in the source tree and would have tried it, but I found that no
> kernel module is built for ixgbe on RELENG_7.  Neither is the device listed
> even as a comment in any stock config file.  This and the lack of a manual
> page led me to believe it wasn't available.
>
>
> On Thu, 12 Feb 2009, pluknet wrote:
>
>  BTW I'm afraid ixgbe manpage still to be merged to 7.
>>
>>
> It may be that more than just the man page remains to be merged.  When I
> add "device ixgbe" to the GENERIC config and try to build a new kernel I
> get:
>
> In file included from /usr/src/sys/dev/ixgbe/ixgbe.c:39:
> /usr/src/sys/dev/ixgbe/ixgbe.h:87:21: error: tcp_lro.h: No such file or
> directory
> mkdep: compile failed
> *** Error code 1
>
> Stop in /usr/obj/usr/src/sys/GENERIC.
> *** Error code 1
>
> Stop in /usr/src.
> *** Error code 1
>
> Stop in /usr/src.
>
>
> Is there any hope of seeing ixgbe fully merged into RELENG_7 soon, or is my
> best bet to switch to CURRENT?  Thank you all for your help.
>
> --
> Greg Rivers
>


Re: 82573 xfers pause, no watchdog timeouts, DCGDIS ineffective (7.2-R)

2009-11-12 Thread Jack Vogel
It is critically important on these systems that you get the latest BIOS on
them, so maybe that's the difference between you two.  I am going to be putting
out a new em driver to CURRENT soon; it might be an option to try that as well.
It sounds like a hang, so a management/OS race in the driver is a possibility.

Jack


On Thu, Nov 12, 2009 at 12:47 PM, Jeremy Chadwick wrote:

> On Thu, Nov 12, 2009 at 10:36:16AM -0900, Royce Williams wrote:
> > We have servers with dual 82573 NICs that work well during low-throughput
> > activity, but during high-volume activity, they pause shortly after
> > transfers start and do not recover.  Other sessions to the system are not
> > affected.
>
> Please define "low-throughput" and "high-volume" if you could; it might
> help folks determine where the threshold is for problems.
>
> > These systems are being repurposed, jumping from 6.3 to 7.2.  The same
> > system and its kin do not exhibit the symptom under 6.3-RELEASE-p13.  The
> > symptoms appear under freebsd-updated 7.2-RELEASE GENERIC kernel with no
> > tuning.
> >
> > Previously, we've been using DCGDIS.EXE (from Jack Vogel) for this
> > symptom.  The first system to be repurposed accepts DCGDIS with 'Updated'
> > and subsequent 'update not needed', with no relief.
> >
> > Notably, there are no watchdog timeout errors - unlike our various
> > Supermicro models still running FreeBSD 6.x.  All of our other 7.x
> > Supermicro flavors had already received the flash update and haven't shown
> > the symptom.
> >
> > Details follow.
> >
> > Kernel:
> >
> > rand# uname -a
> > FreeBSD rand.acsalaska.net 7.2-RELEASE-p4 FreeBSD 7.2-RELEASE-p4 #0: Fri
> > Oct  2 12:21:39 UTC 2009
> > r...@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
> >  i386
> >
> > sysctls:
> >
> > rand# sysctl dev.em
> > dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 6.9.6
> > dev.em.0.%driver: em
> > dev.em.0.%location: slot=0 function=0
> > dev.em.0.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9 subdevice=0x108c class=0x02
> > dev.em.0.%parent: pci13
> > dev.em.0.debug: -1
> > dev.em.0.stats: -1
> > dev.em.0.rx_int_delay: 0
> > dev.em.0.tx_int_delay: 66
> > dev.em.0.rx_abs_int_delay: 66
> > dev.em.0.tx_abs_int_delay: 66
> > dev.em.0.rx_processing_limit: 100
> > dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 6.9.6
> > dev.em.1.%driver: em
> > dev.em.1.%location: slot=0 function=0
> > dev.em.1.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9 subdevice=0x108c class=0x02
> > dev.em.1.%parent: pci14
> > dev.em.1.debug: -1
> > dev.em.1.stats: -1
> > dev.em.1.rx_int_delay: 0
> > dev.em.1.tx_int_delay: 66
> > dev.em.1.rx_abs_int_delay: 66
> > dev.em.1.tx_abs_int_delay: 66
> > dev.em.1.rx_processing_limit: 100
> >
> > kenv:
> >
> > rand# kenv | grep smbios | egrep -v 'socket|serial|uuid|tag|0123456789'
> > smbios.bios.reldate="03/05/2008"
> > smbios.bios.vendor="Phoenix Technologies LTD"
> > smbios.bios.version="6.00"
> > smbios.chassis.maker="Supermicro"
> > smbios.planar.maker="Supermicro"
> > smbios.planar.product="PDSMi "
> > smbios.planar.version="PCB Version"
> > smbios.system.maker="Supermicro"
> > smbios.system.product="PDSMi"
> >
> >
> > The system is not yet production, so I can invasively abuse it if needed.
> > The other systems are in production under 6.3-RELEASE-p13 and can also be
> > inspected.
> >
> > Any pointers appreciated.
> >
> > Royce
>
> For what it's worth as a comparison base:
>
> We use the following Supermicro SuperServers, and can confirm that no
> such issues occur for us using RELENG_6 nor RELENG_7 on the following
> hardware:
>
> Supermicro SuperServer 5015B-MTB - amd64 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - amd64 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - amd64 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - i386  - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - i386  - Intel 82573V + Intel 82573L
>
> The 5015B-MTB system presently runs RELENG_8 -- no issues there either.
>
> Relevant server configuration and network setup details:
>
> - All machines use pf(4).
> - All emX devices are configured for autoneg.
> - All emX devices use RXCSUM, TXCSUM, and TSO4.
> - We do not use polling.
> - All machines use both NICs simultaneously at all times.
> - All machines connected 

Re: 82573 xfers pause, no watchdog timeouts, DCGDIS ineffective (7.2-R)

2009-11-12 Thread Jack Vogel
LOL, glad the problem has been resolved, and no thanks, I do not need
to pursue this any further.

I also want to thank Jeremy for his help and data!!

Thanks guys and good evening,

Jack


On Thu, Nov 12, 2009 at 6:56 PM, Royce Williams wrote:

> On Thu, Nov 12, 2009 at 2:18 PM, Royce Williams
>  wrote:
> > On Thu, Nov 12, 2009 at 11:47 AM, Jeremy Chadwick
> >> - All machines connected to an HP ProCurve 2626 switch (100mbit,
> >>  full-duplex ports, all autoneg).
>
> > No firewall is active on the problem system, and none of this back
> > have been DCGDIS-ified, but otherwise, our setup is identical.
>
> Er, s/back/batch/g, and it's not a ProCurve. ;-)  But we are also
> usually full-duplex and autoneg on both sides.
>
> Based on new (embarrassing) information, I'll leave it to Jack to
> decide whether or not he wants to pursue this further.
>
> The problem box is sitting in my grotty mini-lab, with a subnet
> partially serviced by a 10M hub.  Guess which Ethernet cable I picked
> up.  Guess what happens when I move the system to a 100M/full
> connection.
>
> As my cow-orker put it, "You and the other four people on Earth using
> that NIC on 10M hubs" can probably find workarounds.  My apologies for
> the noise, though it's theoretically possible that the root cause
> might still need addressing.
>
> Jack, let me know if you want me to do any testing for you.  Or I can
> always send you my hub. ;-)
>
> Royce
>


Re: bug with some em nics on RELENG_7

2009-11-18 Thread Jack Vogel
Hey Mike,

Can you check if you see the same behavior on RELENG_8?

There is a systemic problem having to do with when to enable interrupts that
might be behind this. The em driver does not enable them until
em_init_locked(), because until then it's not ready to deal with a TX or RX
interrupt. However, this means that a Link interrupt also will not be seen.
BUT, and here is where it gets a bit funny, a call to check link happens in
attach; it will be either true or false, AND, even if you remove or add a
cable after that point, until interrupts are enabled the state will not change.
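
The ordering described above can be pictured with a toy model. This is not
driver code: the class and method names are invented for illustration, and it
deliberately makes no claim about whether init itself re-checks link, since
that is not stated here.

```python
class EmLinkModel:
    """Toy model of the attach/init ordering (not the em driver)."""

    def __init__(self, cable_plugged=False):
        self.cable_plugged = cable_plugged   # physical state of the port
        self.interrupts_enabled = False      # off until init_locked() runs
        self.link_active = False             # what the driver believes

    def attach(self):
        # One-time link check at attach; the result is True or False.
        self.link_active = self.cable_plugged

    def set_cable(self, plugged):
        self.cable_plugged = plugged
        # A link interrupt only reaches the driver once interrupts are on.
        if self.interrupts_enabled:
            self.link_active = plugged

    def init_locked(self):
        # Assumption: the model only enables interrupts at this point.
        self.interrupts_enabled = True


nic = EmLinkModel(cable_plugged=False)
nic.attach()
nic.set_cable(True)      # cable added before init: the state stays stale
stale = nic.link_active
nic.init_locked()
nic.set_cable(False)
nic.set_cable(True)      # link changes are seen from here on
print(stale, nic.link_active)
```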

In the days before MSIX one interrupt was for everything, so it was impossible
to change this without a radical rework of the driver design, but I suppose it
would be possible with MSIX to selectively enable the link one earlier. I seem
to recall discussions with our Linux crew that made me decide not to pursue
that (it's of limited value really).

Not sure why this happens on Hartwell (82574) and not on 82571, that's
an interesting bit. The 82574 is the ONLY interface in the em driver that
has MSIX support; unfortunately it's kinda hacked in, but it did not really
fit into the igb driver either for various technical reasons.

What if you boot up, then do NOT ping or anything until the interface is
assigned an address (and so init is run), and the cable is plugged in. If
that happens first does it work?

Do let me know if you can check on 8, if not I can have my validation
engineer try this.

Best regards,

Jack


On Wed, Nov 18, 2009 at 1:30 PM, Mike Tancsa  wrote:

>
> On two Intel chipset Supermicro boards (X8STi and X8STE-0) using the
> onboard em nics (dmesg info below), I seem to have run into an issue where
> if I boot the box up with the cables unplugged, I cannot get the NICS to
> properly work post boot up.  This is quite repeatable for me. So at boot
> time, I have
>
> # ifconfig em5
> em5: flags=8843 metric 0 mtu 1500
>options=19b
>ether 00:30:48:d6:ef:13
>inet 3.3.3.3 netmask 0xff00 broadcast 3.3.3.255
>media: Ethernet autoselect
>status: no carrier
>
>
> I then ping something that would be across the wire while the nic is down.
> e.g. ping 3.3.3.1
>
> I then plug in the cable so that the other side has 3.3.3.1
>
> ifconfig shows all looks good
>
> # ifconfig em5
> em5: flags=8843 metric 0 mtu 1500
>options=19b
>ether 00:30:48:d6:ef:13
>inet 3.3.3.3 netmask 0xff00 broadcast 3.3.3.255
>media: Ethernet autoselect (1000baseTX )
>status: active
>
> I try and ping 3.3.3.1 which is on xover (via a switch shows the same
> behaviour), and no response to the pings BUT, I do see the MAC addr show
> up
> # ping -c 2 -S 3.3.3.3 3.3.3.1
> PING 3.3.3.1 (3.3.3.1) from 3.3.3.3: 56 data bytes
>
> --- 3.3.3.1 ping statistics ---
> 2 packets transmitted, 0 packets received, 100.0% packet loss
> # arp -na
> ? (3.3.3.1) at 00:30:48:94:88:20 on em5 [ethernet]
> ? (3.3.3.3) at 00:30:48:d6:ef:13 on em5 permanent [ethernet]
>
> I can see its mac addr ?!?
>
> Furthermore, if I do
>
> # ifconfig em5 3.3.3.55/32 alias
>
> On the other side, I see
>
> 0(ich10)# tcpdump -nei igb0
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on igb0, link-type EN10MB (Ethernet), capture size 96 bytes
> 16:16:03.380886 00:30:48:d6:ef:13 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 60: Request who-has 3.3.3.55 tell 3.3.3.55, length 46
>
>
> and I can ping if I specify the alias as the source IP
>
> # ping -S 3.3.3.55 3.3.3.1
> PING 3.3.3.1 (3.3.3.1) from 3.3.3.55: 56 data bytes
> 64 bytes from 3.3.3.1: icmp_seq=0 ttl=64 time=0.184 ms
> 64 bytes from 3.3.3.1: icmp_seq=1 ttl=64 time=0.051 ms
> 64 bytes from 3.3.3.1: icmp_seq=2 ttl=64 time=0.055 ms
>
>
>
> 16:17:01.603345 00:30:48:d6:ef:13 > 00:30:48:94:88:20, ethertype ARP
> (0x0806), length 60: Reply 3.3.3.55 is-at 00:30:48:d6:ef:13, length 46
> 16:17:01.603349 00:30:48:94:88:20 > 00:30:48:d6:ef:13, ethertype IPv4
> (0x0800), length 98: 3.3.3.1 > 3.3.3.55: ICMP echo reply, id 7946, seq 0,
> length 64
> 16:17:02.603497 00:30:48:d6:ef:13 > 00:30:48:94:88:20, ethertype IPv4
> (0x0800), length 98: 3.3.3.55 > 3.3.3.1: ICMP echo request, id 7946, seq
> 1, length 64
> 16:17:02.603502 00:30:48:94:88:20 > 00:30:48:d6:ef:13, ethertype IPv4
> (0x0800), length 98: 3.3.3.1 > 3.3.3.55: ICMP echo reply, id 7946, seq 1,
> length 64
> 16:17:03.604510 00:30:48:d6:ef:13 > 00:30:48:94:88:20, ethertype IPv4
> (0x0800), length 98: 3.3.3.55 > 3.3.3.1: ICMP echo request, id 7946, seq
> 2, length 64
> 16:17:03.604516 00:30:48:94:88:20 > 00:30:48:d6:ef:13, ethertype IPv4
> (0x0800), length 98: 3.3.3.1 > 3.3.3.55: ICMP echo reply, id 7946, seq 2,
> length 64
>
>
>
> but not using the initial IP addr
>
> 0[iolite3A]# ping -S 3.3.3.3 3.3.3.1
> PING 3.3.3.1 (3.3.3.1) from 3.3.3.3: 56 data bytes
> ^C
> --- 3.3.3.1 ping statistics ---
> 2 packets transmitted, 0 packets received, 100.0% packet loss

Re: bug with some em nics on RELENG_7

2009-11-19 Thread Jack Vogel
Cool, so stable/7 will just need to be updated :) I need to catch up all the
drivers in that stream actually.

Thanks for testing!!

Jack


On Thu, Nov 19, 2009 at 8:58 AM, Mike Tancsa  wrote:

> At 07:29 PM 11/18/2009, Jack Vogel wrote:
>
>> Hey Mike,
>>
>> Can you check if you see the same behavior on RELENG 8?
>>
>
> For RELENG_8. I installed an fxp card and netbooted off it.
>
> I assigned an IP address to the onboard nic (em5). Pinged itself, got a MAC
> and response. Plugged in the cable, pinged the other side, all ok
>
> Did the same, but pinged the other side, and plugged the cable in, all
> worked as expected.
> # ping 10.177.194.18
> PING 10.177.194.18 (10.177.194.18): 56 data bytes
> ping: sendto: Host is down
> ping: sendto: Host is down
> ping: sendto: Host is down
> 64 bytes from 10.177.194.18: icmp_seq=30 ttl=64 time=1329.918 ms
> 64 bytes from 10.177.194.18: icmp_seq=31 ttl=64 time=324.925 ms
> 64 bytes from 10.177.194.18: icmp_seq=32 ttl=64 time=0.054 ms
> 64 bytes from 10.177.194.18: icmp_seq=33 ttl=64 time=0.055 ms
> 64 bytes from 10.177.194.18: icmp_seq=34 ttl=64 time=0.047 ms
> 64 bytes from 10.177.194.18: icmp_seq=35 ttl=64 time=0.050 ms
> 64 bytes from 10.177.194.18: icmp_seq=36 ttl=64 time=0.047 ms
> 64 bytes from 10.177.194.18: icmp_seq=37 ttl=64 time=0.049 ms
> 64 bytes from 10.177.194.18: icmp_seq=38 ttl=64 time=0.043 ms
>
>
> em4:  port 0xcc00-0xcc1f mem
> 0xfaee-0xfaef,0xfaedc000-0xfaed irq 16 at device 0.0 on pci6
>
> em4: Using MSIX interrupts
> em4: [ITHREAD]
> em4: [ITHREAD]
> em4: [ITHREAD]
> em4: Ethernet address: 00:30:48:d6:ef:12
> pcib7:  irq 16 at device 28.1 on pci0
> pci7:  on pcib7
> em5:  port 0xdc00-0xdc1f mem
> 0xfafe-0xfaff,0xfafdc000-0xfafd irq 17 at device 0.0 on pci7
>
> em5: Using MSIX interrupts
> em5: [ITHREAD]
> em5: [ITHREAD]
> em5: [ITHREAD]
> em5: Ethernet address: 00:30:48:d6:ef:13
>
> So the problem is _not_ there under RELENG_8.  I also tested the 2 PCIe
> nics to make sure they are still working, and they are.
>
> Full dmesg below
>
> Copyright (c) 1992-2009 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.0-RC2 #0: Wed Nov 11 09:54:52 EST 2009
>mdtan...@ich10.sentex.ca:/usr/obj/usr/src/sys/alix i386
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM) i7 CPU 920  @ 2.67GHz (2660.00-MHz 686-class
> CPU)
>  Origin = "GenuineIntel"  Id = 0x106a5  Stepping = 5
>
>  
> Features=0xbfebfbff
>
>  
> Features2=0x98e3bd
>  AMD Features=0x2810
>  AMD Features2=0x1
>  TSC: P-state invariant
> real memory  = 6446645248 (6148 MB)
> avail memory = 3137355776 (2992 MB)
> ACPI APIC Table: <011209 APIC2037>
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  2
>  cpu2 (AP): APIC ID:  4
>  cpu3 (AP): APIC ID:  6
> ioapic0: Changing APIC ID to 1
> ioapic0  irqs 0-23 on motherboard
> kbd1 at kbdmux0
> cryptosoft0:  on motherboard
> acpi0: <011209 XSDT2037> on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a (3) failed
> acpi0: reservation of 10, bff0 (3) failed
> Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> pci0:  on pcib0
> pcib1:  at device 1.0 on pci0
> pci1:  on pcib1
> em0:  port 0xac00-0xac1f mem
> 0xface-0xfacf,0xfacc-0xfacd irq 16 at device 0.0 on pci1
>
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:15:17:78:e6:e0
> em1:  port 0xa880-0xa89f mem
> 0xfac8-0xfac9,0xfac6-0xfac7 irq 17 at device 0.1 on pci1
>
> em1: Using MSI interrupt
> em1: [FILTER]
> em1: Ethernet address: 00:15:17:78:e6:e1
> pcib2:  at device 3.0 on pci0
> pci2:  on pcib2
> pcib3:  at device 5.0 on pci0
> pci3:  on pcib3
> em2:  port 0xbc00-0xbc1f mem
> 0xfade-0xfadf,0xfadc-0xfadd irq 16 at device 0.0 on pci3
>
> em2: Using MSI interrupt
> em2: [FILTER]
> em2: Ethernet address: 00:15:17:cf:26:de
> em3:  port 0xb880-0xb89f mem
> 0xfad8-0xfad9,0xfad6-0xfad7 irq 17 at device 0.1 on pci3
>
> em3: Using MSI interrupt
> em3: [FILTER]
> em3: Ethernet address: 00:15:17:cf:26:df
> pcib4:  at device 7.0 on pci0
> pci4:  on pcib4
> pcib5:  at device 9.0 on pc

Re: em interface slow down on 8.0R

2009-11-30 Thread Jack Vogel
I will look into this, Hiroki. As time goes on, the older hardware does not
always get test cycles like one might wish.

Jack


On Mon, Nov 30, 2009 at 12:04 AM, Hiroki Sato  wrote:

> Hi,
>
>  I noticed that network connection of one of my boxes got
>  significantly slow just after upgrading it to 8.0R.  The box has an
>  em0 (82547EI) and worked fine with 7.2R.
>
>  The symptoms are:
>
>  - A ping to a host on the same LAN takes 990ms RTT, it reduces
>   gradually to around 1ms, and then it returns to around 1s.  The
>   rate was about 2ms/ping.
>
>  - The response is quite slow, but there is no packet loss and network
>   services on the box seem to work fine as far as I can check.  There does
>   not seem to be an interrupt storm according to "vmstat -i".  No error
>   message such as "watchdog timeout" appears.
>
>  Any ideas to narrow down the cause?  It may be a linkup problem with a
>  specific model of hub like full-duplex/half-duplex mismatch, but the
>  link is "1000baseT " and setting it manually did not
>  solve it.  I think it is certain that upgrading to 8.0R triggered it,
>  at least.
>
>  Another box with an em interface works fine after upgrading to 8.0R.
>  It has a different chip (82573E).
>
>  Details of the em interface and vmstat -i are the following:
>
>  e...@pci0:1:1:0: class=0x02 card=0x302c8086 chip=0x10198086 rev=0x00
> hdr=0x00
>vendor = 'Intel Corporation'
>device = 'Gigabit Ethernet Controller (LOM) (82547EI)'
>class  = network
>subclass   = ethernet
>
>  Adapter hardware address = 0xc42e1424
>  em0: CTRL = 0x183c0241 RCTL = 0x8002
>  em0: Packet buffer = Tx=10k Rx=30k
>  em0: Flow control watermarks high = 28672 low = 27172
>  em0: tx_int_delay = 66, tx_abs_int_delay = 66
>  em0: rx_int_delay = 0, rx_abs_int_delay = 66
>  em0: fifo workaround = 0, fifo_reset_count = 0
>  em0: hw tdh = 49, hw tdt = 49
>  em0: hw rdh = 238, hw rdt = 187
>  em0: Num Tx descriptors avail = 250
>  em0: Tx Descriptors not avail1 = 0
>  em0: Tx Descriptors not avail2 = 0
>  em0: Std mbuf failed = 0
>  em0: Std mbuf cluster failed = 0
>  em0: Driver dropped packets = 0
>  em0: Driver tx dma failure in encap = 0
>
>  dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 6.9.14
>  dev.em.0.%driver: em
>  dev.em.0.%location: slot=1 function=0 handle=\_SB_.PCI0.P0P2.TANA
>  dev.em.0.%pnpinfo: vendor=0x8086 device=0x1019 subvendor=0x8086
> subdevice=0x302c class=0x02
>  dev.em.0.%parent: pci1
>  dev.em.0.debug: -1
>  dev.em.0.stats: -1
>  dev.em.0.rx_int_delay: 0
>  dev.em.0.tx_int_delay: 66
>  dev.em.0.rx_abs_int_delay: 66
>  dev.em.0.tx_abs_int_delay: 66
>  dev.em.0.rx_processing_limit: 100
>  dev.em.0.wake: 0
>
>  % vmstat -i
>  interrupt                          total       rate
>  irq4: uart0                         3585          3
>  irq14: ata0                         1811          1
>  irq15: ata1                          112          0
>  irq16: uhci0 uhci3                    15          0
>  irq18: em0 uhci2+                  92457         99
>  irq19: uhci1                           1          0
>  irq23: ehci0                           2          0
>  cpu0: timer                      1849981       1997
>  cpu1: timer                      1849961       1997
>  Total                            3797925       4101
>
> -- Hiroki
>
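
The vmstat -i figures quoted above can be sanity-checked with a few lines
of Python. This is only a sketch: the sample text is copied from the quoted
output, while the helper name parse_vmstat_i and the variable names are
made up for illustration. It sums the per-source totals, compares them with
the reported Total row, and works out what share of all interrupts landed
on the irq18 line that em0 shares with uhci2.

```python
# Sample copied from the "vmstat -i" output quoted in the message above.
VMSTAT_I = """\
interrupt                          total       rate
irq4: uart0                         3585          3
irq14: ata0                         1811          1
irq15: ata1                          112          0
irq16: uhci0 uhci3                    15          0
irq18: em0 uhci2+                  92457         99
irq19: uhci1                           1          0
irq23: ehci0                           2          0
cpu0: timer                      1849981       1997
cpu1: timer                      1849961       1997
Total                            3797925       4101
"""


def parse_vmstat_i(text):
    """Return {source: (total, rate)} for each data row (hypothetical helper)."""
    rows = {}
    for line in text.splitlines():
        # The last two whitespace-separated fields are total and rate;
        # everything before them is the interrupt source name.
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue  # skips the header line and blank lines
        rows[parts[0].strip()] = (int(parts[1]), int(parts[2]))
    return rows


rows = parse_vmstat_i(VMSTAT_I)
reported_total, _ = rows.pop("Total")
summed_total = sum(total for total, _ in rows.values())
em0_total, em0_rate = rows["irq18: em0 uhci2+"]
em0_share = em0_total / reported_total

print("sum of rows matches Total:", summed_total == reported_total)
print("em0/uhci2 share of all interrupts: {:.2%}".format(em0_share))
```

In this sample the timer interrupts dominate, so the shared em0 line
accounts for only a small fraction of the total.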


Re: em interface slow down on 8.0R

2009-12-02 Thread Jack Vogel
We've run into a snag on this problem. The 82547 is a LOM only interface
and my validation engineer has only found two old systems that have it,
and neither of them will even install FreeBSD 8 they are so old :(

I might suggest that you continue using the 7.2 driver with that hardware
if it was working.

To me this is further data on the need to have a frozen legacy version of
em but the problem is which code to use and how to approach it.

Can you give me more specifics on the box you have this installed on??

Regards,

Jack


On Mon, Nov 30, 2009 at 5:29 PM, Hiroki Sato  wrote:

> Jack Vogel  wrote
>  in <2a41acea0911301119j1449be58y183f2fe1d1112...@mail.gmail.com>:
>
> jf> I will look into this Hiroki, as time goes the older hardware does not
> jf> always
> jf> get test cycles like one might wish.
>
>  Thanks!  Please let me know if you need more information.
>
> -- Hiroki
>


Re: em interface slow down on 8.0R

2009-12-02 Thread Jack Vogel
Update: the claim to be unable to install was hasty; I went in and looked
into it myself and was able to get an install. Here's what I've found so far:

First, the 82547EI will fail due to Invalid Mac Address, so I guess you
hacked around this problem yourself?  I had someone here test all
legacy adapters for this problem and I was told nothing else was exhibiting
it besides the 82542; obviously this is false :)  In any case I will be
making an official patch to fix that problem soon.

Second, once I had the device working I do indeed see substandard
performance, I am continuing to debug, but wanted you to know that I
have reproduced this.

Jack


On Wed, Dec 2, 2009 at 12:49 PM, Jack Vogel  wrote:

> We've run into a snag on this problem. The 82547 is a LOM only interface
> and my validation engineer has only found two old systems that have it,
> and neither of them will even install FreeBSD 8 they are so old :(
>
> I might suggest that you continue using the 7.2 driver with that hardware
> if it was working.
>
> To me this is further data on the need to have a frozen legacy version of
> em but the problem is which code to use and how to approach it.
>
> Can you give me more specifics on the box you have this installed on??
>
> Regards,
>
> Jack
>
>
>
> On Mon, Nov 30, 2009 at 5:29 PM, Hiroki Sato  wrote:
>
>> Jack Vogel  wrote
>>  in <2a41acea0911301119j1449be58y183f2fe1d1112...@mail.gmail.com>:
>>
>> jf> I will look into this Hiroki, as time goes the older hardware does not
>> jf> always
>> jf> get test cycles like one might wish.
>>
>>  Thanks!  Please let me know if you need more information.
>>
>> -- Hiroki
>>
>
>


Re: em interface slow down on 8.0R

2009-12-05 Thread Jack Vogel
The 82573, when onboard (LOM), is usually special: it is used by the system
management firmware.  Go to the system BIOS and turn off management, and see
if that eliminates the periodic hang.

Jack


On Sat, Dec 5, 2009 at 9:27 PM, Hiroki Sato  wrote:

> John Nielsen  wrote
>  in <1e3c66ea-a6d3-44d7-b28e-bf068fff1...@jnielsen.net>:
>
> jo> On Dec 5, 2009, at 4:40 AM, Hiroki Sato  wrote:
> jo>
> jo> > Hiroki Sato  wrote
> jo> >  in <20091203.182931.129751456@allbsd.org>:
> jo> >
> jo> > hr> And another thing, I noticed a box with 82573E and 82573L sometimes
> jo> > hr> got stuck after upgrading to 8.0-STABLE.  It has moderate network
> jo> > hr> load (average 5-10Mbps) on both NICs.  It worked for a day or two and
> jo> > hr> then got stuck suddenly.  Rebooting the box solved the situation, but
> jo> > hr> it got stuck again after a day or so.  After it happens, the
> jo> > hr> interface does not respond.  The other functionalities of FreeBSD
> jo> > hr> seemed working.  Doing an up/down cycle for the NICs seemed to send
> jo> > hr> some packets, but it did not recover completely; rebooting was needed
> jo> > hr> for recovery.  This box does not have the RTT problem.  I am still
> jo> > hr> not sure what is the trigger, there seems something wrong.
> jo> >
> jo> > Things turned out for this symptom so far are:
> jo> >
> jo> > - This occurs around once per 1-2 days.
> jo> >
> jo> > - Once it occurs, all of communications including ARP and IPv4 stop.
> jo> >
> jo> > - "ifconfig em0 down/up" can recover the interface. However, on doing
> jo> >   "up" after "down" the following message was displayed:
> jo> >
> jo> >   # ifconfig em0 up
> jo> >   em0: Could not setup receive structures
> jo> >
> jo> >   After trying it several times it worked.
> jo> >
> jo> >   Then, the interface seemed back to normal for a couple of minutes,
> jo> >   but it stopped again.
> jo> >
> jo> > I guess there is a kind of deadlock somewhere but not sure it is
> jo> > really related to the em(4) driver.  I will continue to investigate
> jo> > anyway.
> jo>
> jo> I'm curious, what speed/duplex is your interface using and is it
> jo> statically set or using autoselect?
>
>  No manual configuration.  Two em's are set as the following:
>
>  | media: Ethernet autoselect (1000baseT )
>
>  It is mainly used for NFS server.  The actual communication speed was
>  around 700Mbps at peak.
>
> -- Hiroki
>


Re: State of igb on FreeBSD 8 stable?

2010-01-08 Thread Jack Vogel
I am incorporating some of Pyun's ideas into my new version, although some
changes I am unconvinced about... There will always be later :)

Jack


On Thu, Jan 7, 2010 at 12:26 PM, Mike Tancsa  wrote:

> At 03:19 PM 1/7/2010, Pyun YongHyeon wrote:
>
>> On Thu, Jan 07, 2010 at 09:46:14AM -0800, alan bryan wrote:
>> > I did some searching last night and found others using igb on Intel
>> Cards having high interrupts and other strange issues and some comments to
>> the effect that igb is soon going to have a lot of work done to it (I
>> believe Jack Vogel is working on it).  So, can someone give an estimation as
>> to how soon that may be and how soon it may make it to 8-stable?  If it's
>> going to be a while I may look into adding a card using a different driver
>> to test.
>> >
>> http://people.freebsd.org/~yongari/igb/igb.buf.patch6
>>
>> The patch would fix unresponsive system under high network load as
>> well as reducing number of interrupts. The patch also contains
>> bus_dma(9) and watchdog timeout fix.
>>
>
> FYI, with the above patch, the driver is quite stable for me on a box I use
> in my lab.
>
>---Mike
>
>
>
> 
> Mike Tancsa,                      tel +1 519 651 3400
> Sentex Communications,            m...@sentex.net
> Providing Internet since 1994     www.sentex.net
> Cambridge, Ontario Canada         www.sentex.net/mike
>
>
>


Re: em interface slow down on 8.0R

2010-01-26 Thread Jack Vogel
No, it hasn't, I need time to look it over and be convinced of what he was
doing.

Jack


On Tue, Jan 26, 2010 at 9:14 AM, Nick Rogers  wrote:

> looks like the patch mentioned in kern/141843 has not been applied to the
> tree?
>
> On Tue, Jan 26, 2010 at 9:00 AM, Nick Rogers  wrote:
>
> > Is it advisable to patch 8.0-RELEASE kernel sources with the latest
> > (CURRENT) em driver (i.e., src/sys/dev/e1000)? It looks like there are
> > some updates to the driver since 8.0-RELEASE that may fix some problems?
> >
> >
>


Re: em interface slow down on 8.0R

2010-01-26 Thread Jack Vogel
I've tried this patch, and it completely breaks IPv6 offloads, which DO work
btw; our testers have a netperf stress test that does both ipv4 and ipv6,
and that test fails 100% after this change.

I could go hacking at it myself, but as it's your code, Pyun, would you like
to resolve this issue?

Regards,

Jack


On Tue, Jan 26, 2010 at 9:40 AM, Jack Vogel  wrote:

> No, it hasn't, I need time to look it over and be convinced of what he was
> doing.
>
> Jack
>
>
>
> On Tue, Jan 26, 2010 at 9:14 AM, Nick Rogers  wrote:
>
>> looks like the patch mentioned in kern/141843 has not been applied to the
>> tree?
>>
>> On Tue, Jan 26, 2010 at 9:00 AM, Nick Rogers  wrote:
>>
>> > Is it advisable to patch 8.0-RELEASE kernel sources with the latest
>> > (CURRENT) em driver (i.e., src/sys/dev/e1000)? It looks like there are
>> some
>> > updates to the driver since 8.0-RELEASE that may fix some problems?
>> >
>> >
>>
>
>


Re: em interface slow down on 8.0R

2010-01-26 Thread Jack Vogel
Well, what our testers do is assign BOTH an ipv4 and ipv6 address to an
interface, then netperf runs over both. I don't know the internal details,
but I assume both TCP and UDP are going over ipv6.

Prior to your change there is IPv6 handling code in the tx checksum routine,
so I assume the hardware offload for that works. With your patch, if I
disable TXCSUM on the interface then it will work... but before your change
it works with that on.

So, am I missing something?

Cheers,

Jack
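
For readers following the checksum discussion: the value a tx checksum
routine (or the hardware it programs) has to produce for UDP over IPv6 is
the Internet checksum of RFC 1071, taken over the IPv6 pseudo-header of
RFC 2460 section 8.1 followed by the UDP header and payload. The Python
below is only an illustrative sketch of that arithmetic, not em(4) driver
code; the function names and the 2001:db8:: example addresses are
assumptions made for the example.

```python
import ipaddress
import struct

UDP_PROTO = 17  # IPv6 Next Header value for UDP


def ones_complement_sum(data):
    """16-bit one's-complement sum of data (RFC 1071), padded to even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src, dst, upper_len):
    """IPv6 pseudo-header: src, dst, 32-bit length, 3 zero bytes, next header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", upper_len, UDP_PROTO))


def udp6_checksum(src, dst, segment):
    """Checksum for a UDP segment whose checksum field is currently zero."""
    csum = ~ones_complement_sum(pseudo_header(src, dst, len(segment)) + segment)
    csum &= 0xFFFF
    return csum or 0xFFFF  # a computed zero is transmitted as all ones


def build_udp6(src, dst, sport, dport, payload):
    """Return UDP header + payload with the checksum field filled in."""
    length = 8 + len(payload)
    unsummed = struct.pack("!HHHH", sport, dport, length, 0) + payload
    csum = udp6_checksum(src, dst, unsummed)
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def verify_udp6(src, dst, segment):
    """A correctly checksummed segment sums to 0xFFFF with its pseudo-header."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


segment = build_udp6("2001:db8::1", "2001:db8::2", 12345, 53, b"hello")
print("checksum verifies:", verify_udp6("2001:db8::1", "2001:db8::2", segment))
```

If the stack has already filled in this field and the driver also asks the
hardware to insert a checksum, the frame can leave with a wrong value,
which would fit a test that fails with TXCSUM on and passes with it off.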


On Tue, Jan 26, 2010 at 12:12 PM, Pyun YongHyeon  wrote:

> On Tue, Jan 26, 2010 at 11:55:00AM -0800, Jack Vogel wrote:
> > I've tried this patch, and it completely breaks IPv6 offloads, which DO
> work
> > btw,
> > our testers have a netperf stress test that does both ipv4 and ipv6, and
> > that test
> > fails 100% after this change.
> >
> > I could go hacking at it myself but as its your code Pyun would you like
> to
> > resolve this issue?
> >
>
> I wonder how you could test IPv6 checksum offloading/TSO as FreeBSD
> does not have that capability yet. Do we already have that
> capability? I vaguely remember there was an effort to bring the
> support in but I don't know current status. If we have the
> capability I would have to update all other drivers that can do
> IPv6 checksum offloading/TSO for IPv6.
>
> > Regards,
> >
> > Jack
> >
> >
> > On Tue, Jan 26, 2010 at 9:40 AM, Jack Vogel  wrote:
> >
> > > No, it hasn't, I need time to look it over and be convinced of what he
> was
> > > doing.
> > >
> > > Jack
> > >
> > >
> > >
> > > On Tue, Jan 26, 2010 at 9:14 AM, Nick Rogers 
> wrote:
> > >
> > >> looks like the patch mentioned in kern/141843 has not been applied to
> the
> > >> tree?
> > >>
> > >> On Tue, Jan 26, 2010 at 9:00 AM, Nick Rogers 
> wrote:
> > >>
> > >> > Is it advisable to patch 8.0-RELEASE kernel sources with the latest
> > >> > (CURRENT) em driver (i.e., src/sys/dev/e1000)? It looks like there
> are
> > >> some
> > >> > updates to the driver since 8.0-RELEASE that may fix some problems?
> > >> >
> > >> >
> > >>
> > >
> > >
>


Re: em interface slow down on 8.0R

2010-01-26 Thread Jack Vogel
Great, if you can get the changes to me quickly I'd like to incorporate
them.

BTW, I have merged your igb changes into my code and it's very stable; you
should see that checked in for 7.3 shortly.

Thanks for your hard work Pyun!

Jack


On Tue, Jan 26, 2010 at 12:33 PM, Pyun YongHyeon  wrote:

> On Tue, Jan 26, 2010 at 12:22:01PM -0800, Jack Vogel wrote:
> > Well, what our testers do is assign BOTH an ipv4 and ipv6 address to an
> > interface,
> > then netperf runs over both, I don't know the internal details but I
> assume
> > both TCP
> > and UDP are going over ipv6.
> >
> > Prior to your change there is IPv6 handling code in the tx checksum
> > routine,  so I
> > assume the hardware offload for that works. With your patch if I disable
> > TXCSUM
> > on the interface then it will work... but before your change it works
> with
> > that on.
> >
> > So, am I missing something?
> >
>
> Hmm, then I guess there is a bug in the patch. Apparently the upper stack
> already computed the checksum for IPv6, so the patch should not try to
> offload IPv6 traffic again. I'll look at the patch again.
> Thanks for valuable input. :-)
>
> > Cheers,
> >
> > Jack
> >
> >
> > On Tue, Jan 26, 2010 at 12:12 PM, Pyun YongHyeon 
> wrote:
> >
> > > On Tue, Jan 26, 2010 at 11:55:00AM -0800, Jack Vogel wrote:
> > > > I've tried this patch, and it completely breaks IPv6 offloads, which
> DO
> > > work
> > > > btw,
> > > > our testers have a netperf stress test that does both ipv4 and ipv6,
> and
> > > > that test
> > > > fails 100% after this change.
> > > >
> > > > I could go hacking at it myself but as its your code Pyun would you
> like
> > > to
> > > > resolve this issue?
> > > >
> > >
> > > I wonder how you could test IPv6 checksum offloading/TSO as FreeBSD
> > > does not have that capability yet. Do we already have that
> > > capability? I vaguely remember there was an effort to bring the
> > > support in but I don't know current status. If we have the
> > > capability I would have to update all other drivers that can do
> > > IPv6 checksum offloading/TSO for IPv6.
> > >
> > > > Regards,
> > > >
> > > > Jack
> > > >
> > > >
> > > > On Tue, Jan 26, 2010 at 9:40 AM, Jack Vogel 
> wrote:
> > > >
> > > > > No, it hasn't, I need time to look it over and be convinced of what
> he
> > > was
> > > > > doing.
> > > > >
> > > > > Jack
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jan 26, 2010 at 9:14 AM, Nick Rogers 
> > > wrote:
> > > > >
> > > > >> looks like the patch mentioned in kern/141843 has not been applied
> to
> > > the
> > > > >> tree?
> > > > >>
> > > > >> On Tue, Jan 26, 2010 at 9:00 AM, Nick Rogers 
> > > wrote:
> > > > >>
> > > > >> > Is it advisable to patch 8.0-RELEASE kernel sources with the
> latest
> > > > >> > (CURRENT) em driver (i.e., src/sys/dev/e1000)? It looks like
> there
> > > are
> > > > >> some
> > > > >> > updates to the driver since 8.0-RELEASE that may fix some
> problems?
> > > > >> >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > >
>

