Re: FreeBSD I/OAT (QuickData now?) driver

2011-06-06 Thread Jack Vogel
My prototype code is ancient now, probably about 4 years old, and this is
the most interest I've ever seen in it :) The hardware has evolved, so it
really needs to be updated.

If there's real interest, then perhaps I should get something together that
can actually be checked in? Yes?

Cheers,

Jack


On Mon, Jun 6, 2011 at 1:23 PM, Matthew Jacob  wrote:

>
> At Panasas we were looking at using that for some background parity
> calculation.
>


Re: intel checksum offload

2011-09-19 Thread Jack Vogel
On Sun, Sep 18, 2011 at 7:48 PM, Arnaud Lacombe  wrote:

> Hi,
>
> On Sun, Sep 18, 2011 at 10:01 PM, Luigi Rizzo  wrote:
> > On Sun, Sep 18, 2011 at 06:05:33PM -0400, Arnaud Lacombe wrote:
> >> Hi,
> >>
> >> On Sun, Sep 18, 2011 at 5:06 PM, Luigi Rizzo 
> wrote:
> >> > On Sun, Sep 18, 2011 at 03:19:46PM -0400, Arnaud Lacombe wrote:
> >> >> Hi,
> >> >>
> >> >> On Sat, Sep 17, 2011 at 4:32 PM, YongHyeon PYUN 
> wrote:
> >> >> > On Sat, Sep 17, 2011 at 11:57:10AM +0430, Hooman Fazaeli wrote:
> >> >> >> Hi list,
> >> >> >>
> >> >> >> The data sheet for the Intel 82576 advertises IP TX/RX checksum
> >> >> >> offload, but the driver does not set CSUM_IP in ifp->if_hwassist.
> >> >> >> Does this mean that the driver (and chip) do not support IP TX
> >> >> >> checksum offload, or that TX support is not yet included in the
> >> >> >> driver?
> >> > ...
> >> >> This is slightly off-topic, but still..
> >> >>
> >> >> FWIW, I'm not really impressed by what chips claim to support vs.
> >> >> what has been implemented in the driver. As per the product brief, the
> >> > ...
> >> >> [0]: the commit message says "performance was not good", but it is
> >> >> not up to the driver's developer to decide whether or not a feature
> >> >> is good. The developer's job is to implement the chip capabilities
> >> >> and let the user enable or disable them. At best, the developer can
> >> >> decide whether or not to enable the feature by default.
> >> >
> >> > actually, this is a perfect example where the developer has done the
> >> > right thing: implemented the feature, verified that performance is
> >> > bad, hence presumably removed support for the feature from the code
> >> > (which also means that the normal code path will run faster because
> >> > there are no run-time decisions to be made).
> >> >
> >> > "optional" features are often costly even when disabled.
> >> >
> >> I forgot to mention that in this case, the code is still full of
> >> EM_MULTIQUEUE #ifdefs and the shared code is still fully compatible
> >> with the multiqueue architecture. The only thing removed is a
> >> conditional and an assignment in the driver's attach routine which was
> >> enabling the feature, i.e. the cost you point out is still paid today,
> >> without any benefit.
> >
> > the above suggests that you have a wonderful opportunity: with just
> > a little bit of time and effort you should be able to complete/re-enable
> > the missing code, run tests that you believe significant (given
> > your statement below) and prove or disprove the comment about
> > performance.
> >
> Which I did about a week ago, only to discover that the NIC had just
> 3 MSI-X vectors configured in its EEPROM[0], and thus the MSI-X PCI
> capability field also ends up being assigned those 3 vectors. However,
> the 82574 datasheet clearly says that up to 5 vectors can be configured,
> but I obviously did not find the magic trick to make it so. Maybe I'll
> find some time and try to reprogram the EEPROM. Besides that, it was
> clear that the old multiqueue code did not handle only 3 vectors being
> available and thus fell back to MSI. It is not clear from jfv@'s comment
> whether he really tested multiqueue, or whether he tested the fall-back
> MSI mode.
>
> As the PCI spec is not public, I've not been able to find out from the
> few public datasheets how the PCI MSI-X capability field is first
> programmed. I'd assume that the BIOS uses the data in the NVM to
> program it at power up.
>
>  - Arnaud
>
> [0]: at least, the MSI_X_NUM field of the NVM at offset 0x1b is 2,
> thus 3 vectors.
>
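
For reference, the "2, thus 3 vectors" arithmetic matches the PCI MSI-X
encoding: the capability's table-size field stores N-1, so a stored value of
2 yields 3 usable vectors. A minimal sketch of how a FreeBSD driver might
probe that count at attach time and fall back to MSI is below; the function
name and the queue-count policy are invented for illustration, and only
pci_msix_count()/pci_alloc_msix()/pci_alloc_msi()/pci_release_msi() are the
real kernel helpers:

#include <sys/param.h>
#include <sys/bus.h>
#include <sys/errno.h>
#include <dev/pci/pcivar.h>

/*
 * Illustrative sketch only, not the actual em/igb attach code.  Query how
 * many MSI-X vectors the device advertises (the N-1 encoded table size
 * that the NVM value above feeds into) and fall back to MSI when there
 * are not enough vectors for per-queue operation.
 */
static int
example_setup_interrupts(device_t dev, int queues_wanted)
{
	int msix_avail, count;

	msix_avail = pci_msix_count(dev);	/* vectors in the MSI-X capability */

	if (msix_avail >= queues_wanted + 1) {	/* one per queue plus link/admin */
		count = queues_wanted + 1;
		if (pci_alloc_msix(dev, &count) == 0 && count == queues_wanted + 1)
			return (0);		/* multiqueue with MSI-X */
		pci_release_msi(dev);
	}

	/* Not enough vectors (e.g. only 3 here): single queue, plain MSI. */
	count = 1;
	if (pci_alloc_msi(dev, &count) == 0)
		return (0);

	return (ENXIO);				/* caller falls back to INTx */
}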

I give answers to those who treat me with respect; I view them as
collaborators, and we improve the drivers for everyone's benefit, rather
than jumping in to throw a critical remark here and a negative innuendo
there...

If you notice, the Linux driver did not enable multiqueue on the hardware
either, so do you think a whole department of software engineers backed
by the hardware engineers who designed the damn thing might have had
a reason?

IN FACT, as I have a bit more freedom with FreeBSD, I went ahead and
tried it for a while just because I could; implementing the code was not
difficult. Over time, however, that code proved to be a source of
instability and thus was disabled.

I have heard a rumor that the Linux crew may actually be trying a second
time to make it work, and that might give me cause to look at it again too,
but it's not clear if I'll have time with other priorities.

Jack


Re: 9.1-RC3 IGB dropping connections.

2012-11-27 Thread Jack Vogel
This sounds like something in your environment; we have lots of users with
this driver in very demanding environments, and I have not been seeing
reports of this sort.

Jack


On Tue, Nov 27, 2012 at 2:27 PM, Zaphod Beeblebrox wrote:

> I've got an Intel server motherboard with 4x igb (and 1x em) on it.
> The motherboard in question is the S3420GPRX and the IGB's show up as:
>
> igb0:  port
> 0x3020-0x303f mem 0xb1b2-0xb1b3,0xb1bc4000-0xb1bc7fff irq 19
> at device 0.0 on pci3
> igb0: Using MSIX interrupts with 9 vectors
> igb0: Ethernet address: 00:1e:67:3a:d5:40
> igb0: Bound queue 0 to cpu 0
> igb0: Bound queue 1 to cpu 1
> igb0: Bound queue 2 to cpu 2
> igb0: Bound queue 3 to cpu 3
> igb0: Bound queue 4 to cpu 4
> igb0: Bound queue 5 to cpu 5
> igb0: Bound queue 6 to cpu 6
> igb0: Bound queue 7 to cpu 7
>
> ... now... I have this machine (right now) on the local LAN with my
> Windows 7 workstation, and PuTTY sees the SSH connection as dropped
> often.  I say often --- in that it can happen in a minute or two... it
> often seems to happen when there is active output going to the window
> (like a download counter running), but I also say "often" in that...
> it seems slightly random... but it _is_ incessant... as in very
> "often."
>
> This seems like something that we should ship with 9.1...
>


Re: 9.1-RC3 IGB dropping connections.

2012-11-27 Thread Jack Vogel
On Tue, Nov 27, 2012 at 4:04 PM, Zaphod Beeblebrox wrote:

> To Jack Vogel's comment, this problem only seems to occur on systems
> that are exceedingly lightly loaded (in this case, not yet in
> production and I'm the only one using it).
>
> On Tue, Nov 27, 2012 at 6:21 PM, Andre Oppermann 
> wrote:
>
> > r243570 in CURRENT should likely fix this issue.  It's only 27 hours old
> > and hasn't been MFC'd yet.
>
> I'm not sure this addresses what I'm seeing.  It's a pause in the
> traffic in the shell that is "fixed" by causing some traffic on the
> return channel (watching for the pause --- and then hitting enter a
> few times seems to fix it).  I'd expect that TCP retransmission should
> take care of this regularly  ... but in this case, it doesn't... for
> whatever reason ...
>
>
You say it drops the connection but show no specifics; may I see the
system message file from boot until it happens? Also, how about a
pciconf -lv while you're at it?

Jack


Re: IFCAP_LRO on FreeBSD 7

2008-11-27 Thread Jack Vogel
On Thu, Nov 27, 2008 at 4:51 AM, Robert Watson <[EMAIL PROTECTED]> wrote:

>
> On Thu, 27 Nov 2008, Yony Yossef wrote:
>
>> Is there a native interface for LRO in FreeBSD 7? I can't find any use
>> for IFCAP_LRO other than notifying the driver whether or not to use this
>> offload.
>>
>> If not, is it planned for FreeBSD 8?
>>
>
> IFCAP_LRO is a capability/policy flag allowing drivers to declare support
> for LRO, and for administrators to enable/disable it if present.  Drivers
> can either provide their own implementation (mxge, nxge) or use the system
> implementation (cxgb, igb).  I'm slightly surprised to see that igb
> references tcp_lro_init() but not IFCAP_LRO -- perhaps lro isn't yet fully
> hooked up, or perhaps there's a bug?  I believe all of the above applies to
> 7.1 but not 7.0, except possibly mxge supporting LRO in 7.0.
>
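
For anyone following along, wiring a driver to the in-kernel LRO code looks
roughly like the sketch below. This is illustrative only, patterned on how
cxgb/igb use tcp_lro at the time; the example_* names and the RX-ring
structure are invented, and the lro_ctrl internals have changed across
releases, so check the real driver code before copying anything:

#include <sys/param.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <sys/queue.h>
#include <net/if.h>
#include <net/if_var.h>
#include <netinet/tcp_lro.h>

/* Invented placeholder for a driver RX ring; only the LRO bits are shown. */
struct example_rx_ring {
	struct ifnet	*ifp;
	struct lro_ctrl	 lro;
};

/* Attach time: advertise IFCAP_LRO and initialize the LRO context. */
static int
example_lro_attach(struct example_rx_ring *rxr)
{
	rxr->ifp->if_capabilities |= IFCAP_LRO;
	rxr->ifp->if_capenable |= IFCAP_LRO;
	return (tcp_lro_init(&rxr->lro));
}

/* RX path: try to merge each frame, otherwise hand it up normally. */
static void
example_lro_input(struct example_rx_ring *rxr, struct mbuf *m)
{
	if ((rxr->ifp->if_capenable & IFCAP_LRO) == 0 ||
	    tcp_lro_rx(&rxr->lro, m, 0) != 0)
		(*rxr->ifp->if_input)(rxr->ifp, m);
}

/* End of the RX loop: flush whatever LRO is still aggregating. */
static void
example_lro_flush(struct example_rx_ring *rxr)
{
	struct lro_entry *queued;

	while ((queued = SLIST_FIRST(&rxr->lro.lro_active)) != NULL) {
		SLIST_REMOVE_HEAD(&rxr->lro.lro_active, next);
		tcp_lro_flush(&rxr->lro, queued);
	}
}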

I've been so busy internally that I did not even realize that this
capability had been created; I need to change both igb and
ixgbe to use it.

Thanks for pointing this out,

Jack


Re: Sudden mbuf demand increase and shortage under the load (igb issue?)

2010-02-19 Thread Jack Vogel
This thread is confusing: first he says it's an igb problem, then you offer
an em patch :)

I have an important rev of igb that I am about ready to release. Anyone who
wishes to test it against a problem they have would be welcome to have early
access, just let me know.

I am not sure about this ICH10 change; there are client NICs that
specifically do NOT support jumbo frames. I'll need to look into it tomorrow
at work.

Jack


On Thu, Feb 18, 2010 at 7:42 PM, Pyun YongHyeon  wrote:

> On Thu, Feb 18, 2010 at 05:05:16PM -0800, Maxim Sobolev wrote:
> > Folks,
> >
> > Indeed, it looks like an igb(4) issue. Replacing the card with a
> > desktop-grade em(4)-supported card has fixed the problem for us. The
> > system has been happily pushing 110 Mbps worth of RTP traffic and 2000
> > concurrent calls without any problems for two days now.
> >
> > e...@pci0:7:0:0: class=0x02 card=0xa01f8086 chip=0x10d38086 rev=0x00
> > hdr=0x00
> > vendor = 'Intel Corporation'
> > class  = network
> > subclass   = ethernet
> >
> > em0:  port 0xec00-0xec1f mem
> > 0xfbee-0xfbef,0xfbe0-0xfbe7,0xfbedc000-0xfbed irq 24
> > at device 0.0 on pci7
> > em0: Using MSIX interrupts
> > em0: [ITHREAD]
> > em0: [ITHREAD]
> > em0: [ITHREAD]
> > em0: Ethernet address: 00:1b:21:50:02:49
> >
> > I really think that this has to be addressed before the 7.3 release is
> > out. FreeBSD used to be famous for its excellent network performance and
> > it's a shame to see that deteriorating due to sub-standard quality
> > drivers. Especially when there is a multi-billion-dollar vendor
> > supporting the driver in question. No finger pointing, but it really
> > looks like either somebody is not doing his job or the said vendor
> > doesn't care so much about supporting FreeBSD. I am pretty sure the
> > vendor in question has access to numerous load-testing tools that should
> > have caught this issue.
> >
> > This is the second time during the past 6 months that I have had an
> > issue with the quality of the Intel drivers - the first one is filed as
> > kern/140326, which has apparently stalled despite me providing all the
> > necessary debug information.
> >
>
> I can reproduce this bug on my box, and I guess the root cause comes
> from the PBA (Packet Buffer Allocation) configuration. Some controllers
> seem to require a larger TX buffer size to use a 9000-byte MTU. The
> datasheet is not clear about how much Packet Buffer storage each
> controller has. This parameter seems to affect performance a lot because
> increasing the TX buffer size decreases the RX buffer size. The
> attached patch seems to fix the issue for me, but Jack may know
> the hardware details better, as the publicly available datasheet seems
> to be useless here.
>
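
For readers unfamiliar with the PBA: the controller's on-chip packet buffer
is a fixed amount of SRAM split between RX and TX, and the em/igb
initialization code repartitions it when jumbo frames are enabled. A rough
sketch of that idea follows, written in the style of the e1000 shared code;
the 8192-byte threshold and the particular PBA_40K/PBA_48K split are
illustrative guesses, not the values from the attached patch, and finding
the right per-controller values is exactly what the patch is about:

/*
 * Illustrative sketch of repartitioning the on-chip packet buffer when
 * jumbo frames are enabled.  "struct adapter" and its hw member come from
 * the driver; the constants and threshold are examples only.
 */
static void
example_set_pba(struct adapter *adapter, uint32_t max_frame_size)
{
	uint32_t pba;

	if (max_frame_size > 8192)
		pba = E1000_PBA_40K;	/* give TX a bigger share for jumbo frames */
	else
		pba = E1000_PBA_48K;	/* default split: larger RX share */

	E1000_WRITE_REG(&adapter->hw, E1000_PBA, pba);
}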


Re: em blues

2006-10-11 Thread Jack Vogel

On 10/11/06, Danny Braniss <[EMAIL PROTECTED]> wrote:

the box is a bit old: a dual-CPU Intel Pentium III (933.07-MHz 686-class
CPU).

running iperf -c (receiving):

freebsd-4.10   0.0-10.0 sec   936 MBytes   785 Mbits/sec
freebsd-5.4    0.0-10.0 sec   413 MBytes   346 Mbits/sec
freebsd-6.1    0.0-10.0 sec   366 MBytes   307 Mbits/sec
freebsd-6.2    0.0-10.0 sec   344 MBytes   289 Mbits/sec

btw, iperf -s (xmitting) is slightly better:

freebsd-4.10   0.0-10.0 sec   664 MBytes   558 Mbits/sec
freebsd-5.4    0.0-10.0 sec   390 MBytes   327 Mbits/sec
freebsd-6.1    0.0-10.0 sec   495 MBytes   415 Mbits/sec
freebsd-6.2    0.0-10.0 sec   487 MBytes   408 Mbits/sec

so, it seems that as the release number increases, the em
throughput gets worse - or iperf does.


You aren't measuring em, you're measuring RELEASES on
your hardware. Is this a surprise on a P3? No.

I still do 930ish Mb/s on a P4 with a PCI-E or PCI-X adaptor
running 6.1; in fact I believe I can do that with a 4-port adaptor.


Regards,

Jack


Re: Network stack changes

2013-08-28 Thread Jack Vogel
Very interesting material, Alexander. I've only had time to glance at it
now and will look at it in more depth later, thanks!

Jack



On Wed, Aug 28, 2013 at 11:30 AM, Alexander V. Chernikov <
melif...@yandex-team.ru> wrote:

> Hello list!
>
> There are a lot of constantly arising discussions related to networking
> stack performance/changes.
>
> I'll try to summarize current problems and possible solutions from my
> point of view.
> (Generally this is one problem: the stack is slooow,
> but we need to know why and what to do about it).
>
> Let's start with current IPv4 packet flow on a typical router:
> http://static.ipfw.ru/images/freebsd_ipv4_flow.png
>
> (I'm sorry I can't provide this as text since Visio doesn't have an
> 'ascii-art' exporter).
>
> Note that we are using a process-to-completion model, i.e. we process any
> packet in the ISR until it is either
> consumed by the L4+ stack, dropped, or put on an egress NIC queue.
>
> (There is also a deferred ISR model implemented inside netisr, but it does
> not change much:
> it can help to do more fine-grained hashing (for GRE or other similar
> traffic), but
> 1) it uses per-packet mutex locking which kills all performance
> 2) it currently does not have _any_ hashing functions (see the absence of
> flags in `netstat -Q`)
> People using http://static.ipfw.ru/patches/netisr_ip_flowid.diff (or a
> modified PPPoE/GRE version) report some profit, but without fixing (1)
> it can't help much.
> )
>
> So, let's start:
>
> 1) Ixgbe uses a mutex to protect each RX ring, which is perfectly fine
> since there is nearly no contention
> (the only thing that can happen is driver reconfiguration, which is rare
> and, more significantly, we take the lock once
> for the batch of packets received in a given interrupt). However, due to
> some (im)possible deadlocks the current code
> does a per-packet ring unlock/lock (see ixgbe_rx_input()).
> There was a discussion that ended with nothing:
> http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033520.html
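
A minimal sketch of the difference being described, as pseudocode (the
ring structure, the dequeue_completed_frame() helper, and the function
names are invented; this is not the actual ixgbe_rxeof()/ixgbe_rx_input()
code):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>

/* Invented placeholder ring; only the locking-relevant fields are shown. */
struct example_rx_ring {
	struct mtx	 rx_mtx;
	struct ifnet	*ifp;
};
static struct mbuf *dequeue_completed_frame(struct example_rx_ring *rxr);

/* Pattern described above: drop and retake the ring lock per packet. */
static void
example_rx_clean_per_packet(struct example_rx_ring *rxr)
{
	struct mbuf *m;

	mtx_lock(&rxr->rx_mtx);
	while ((m = dequeue_completed_frame(rxr)) != NULL) {
		mtx_unlock(&rxr->rx_mtx);	/* per-packet unlock ... */
		(*rxr->ifp->if_input)(rxr->ifp, m);
		mtx_lock(&rxr->rx_mtx);		/* ... and relock */
	}
	mtx_unlock(&rxr->rx_mtx);
}

/* Batched alternative: collect the completed frames under one lock hold,
 * then hand the whole batch to the stack with the lock released. */
static void
example_rx_clean_batched(struct example_rx_ring *rxr)
{
	struct mbuf *m, *batch = NULL, **tail = &batch;

	mtx_lock(&rxr->rx_mtx);
	while ((m = dequeue_completed_frame(rxr)) != NULL) {
		*tail = m;
		tail = &m->m_nextpkt;
	}
	*tail = NULL;
	mtx_unlock(&rxr->rx_mtx);

	while ((m = batch) != NULL) {
		batch = m->m_nextpkt;
		m->m_nextpkt = NULL;
		(*rxr->ifp->if_input)(rxr->ifp, m);
	}
}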
>
> 1*) Possible BPF users. Here we have one rlock if there are any readers
> present
> (and a mutex for any matching packets, but this is more or less OK.
> Additionally, there is WIP to implement multiqueue BPF,
> and there is a chance that we can reduce lock contention there). There is
> also an "optimize_writers" hack permitting applications
> like CDP to use BPF as writers without registering them as receivers
> (which would imply the rlock)
>
> 2/3) Virtual interfaces (laggs/vlans over lagg and other similar
> constructions).
> Currently we simply use an rlock to do s/ix0/lagg0/ and, what is even
> funnier, we use a complex vlan_hash with another rlock to
> get the vlan interface from the underlying one.
>
> This is definitely not how things should be done, and this can be changed
> more or less easily.
>
> There are some useful terms/techniques in the world of software/hardware
> routing: they have a clear 'control plane' and 'data plane' separation.
> The former deals with control traffic (IGP, MLD, IGMP snooping, lagg
> hellos, ARP/NDP, etc.) and some data traffic (packets with TTL=1, with
> options, destined to hosts without an ARP/NDP record, and similar). The
> latter is done in hardware (or an efficient software implementation).
> The control plane is responsible for providing the data for efficient
> data plane operations. This is the point we are missing nearly everywhere.
>
> What I want to say is: lagg is pure control-plane stuff and vlan is nearly
> the same. We can't apply this approach to complex cases like
> lagg-over-vlans-over-vlans-over-(pppoe_ng0-and_wifi0)
> but we definitely can do this for most common setups like (igb* or ix* in
> lagg with or without vlans on top of lagg).
>
> We already have some capabilities like VLANHWFILTER/VLANHWTAG, we can add
> some more. We even have per-driver hooks to program HW filtering.
>
> One small step is to throw the packet to the vlan interface directly (P1);
> proof-of-concept (working in production):
> http://lists.freebsd.org/pipermail/freebsd-net/2013-April/035270.html
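
One way to picture the (P1) idea, as an invented sketch rather than the code
in the patch above: when the NIC has already stripped the 802.1Q tag
(VLANHWTAG), the RX path can map the tag straight to the child vlan ifnet
through a flat per-parent table that is written only at configuration time,
and call its if_input, instead of walking the rlock-protected vlan_hash:

#include <sys/param.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>

#define EXAMPLE_MAX_VID	4096

/* Invented illustration; not the code from the referenced patch. */
struct example_vlan_map {
	struct ifnet	*parent_ifp;
	struct ifnet	*vlan_ifp[EXAMPLE_MAX_VID];	/* written only at config time */
};

static void
example_rx_dispatch(struct example_vlan_map *p, struct mbuf *m)
{
	struct ifnet *vifp;
	uint16_t vid;

	if (m->m_flags & M_VLANTAG) {		/* NIC stripped the tag for us */
		vid = m->m_pkthdr.ether_vtag & 0xFFF;	/* VID is the low 12 bits */
		vifp = p->vlan_ifp[vid];
		if (vifp != NULL) {
			m->m_flags &= ~M_VLANTAG;
			m->m_pkthdr.rcvif = vifp;
			(*vifp->if_input)(vifp, m);	/* straight to the vlan interface */
			return;
		}
	}
	/* Untagged, or a tag we don't know: normal path through the parent. */
	(*p->parent_ifp->if_input)(p->parent_ifp, m);
}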
>
> Another is to change the lagg packet accounting:
> http://lists.freebsd.org/pipermail/svn-src-all/2013-April/067570.html
> Again, this is more like what HW boxes do (aggregate all counters including
> errors) (and I can't imagine what real error we could get from _lagg_).
>
> 4) If we are a router, we can either do the slooow ip_input() ->
> ip_forward() -> ip_output() cycle or use the optimized ip_fastfwd(), which
> falls back to the 'slow' path for multicast/options/local traffic (i.e. it
> works exactly like the 'data plane' part).
> (Btw, we can consider turning net.inet.ip.fastforwarding on by
> default, at least for non-IPSEC kernels)
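
The shape of that fast-path/slow-path decision, as a rough sketch (the
helper name is invented and this is not the real ip_fastfwd.c; the real code
also punts packets destined to one of the host's own addresses back to the
normal path for local delivery):

#include <sys/param.h>
#include <sys/mbuf.h>
#include <netinet/in.h>
#include <netinet/ip.h>

/*
 * Sketch of the eligibility test: anything that needs special handling
 * goes back to the normal ip_input() path, only plain transit unicast
 * stays on the fast path.
 */
static int
example_fastfwd_eligible(struct mbuf *m)
{
	struct ip *ip = mtod(m, struct ip *);

	if (ip->ip_hl != sizeof(struct ip) >> 2)
		return (0);	/* IP options present: slow path */
	if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr)))
		return (0);	/* multicast: slow path */
	if (ip->ip_ttl <= 1)
		return (0);	/* needs an ICMP time-exceeded: slow path */
	/* (the real code also rejects packets destined to a local address) */
	return (1);		/* plain transit packet: fast-forward it */
}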
>
> Here we have to