I've just decided to follow this thread, as it seems to be related to some issues we are seeing as well.
It appears that under heavy packet loads the kernel cannot pull packets off the NIC fast enough, and is therefore slow to free up descriptors into which the NIC can DMA packets. This causes the NIC to drop the packet after its internal queue fills up (and record the packet as missed), because the hardware does not have enough descriptors to write the packets into.

We have this issue with the ixgbe 10 Gb/s card, though the absolute packet rates at which we see a problem are higher than those reported here. In our test scenario the problem gets worse with many simultaneous TCP connections, but the issue is the same: under high packet rates the driver cannot keep up and the NIC reports missed packets. The issue is not related to data throughput, though, as turning on jumbo frames solves our problem for a fixed number of connections, and it seems here that reducing the packet rate makes the misses go away. More importantly, in our tests only the receiver sees a problem; the transmitter is fine.

There was also another thread about problems with UDP throughput that I suspect are caused by the same type of packet rate spikes.

The question is: why is the kernel stack so slow to handle these packet rates? Doing some back-of-the-envelope calculations, they don't seem too bad. Where is the time going? And are our problem, the UDP issue, and this problem all caused by the same source of slowness, or are they three unrelated issues?
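To make the back-of-the-envelope point concrete, here is a rough sketch of the arithmetic (the ~100 kpps figure is taken from Alex's flood below; the 256-entry RX descriptor ring is only an assumed, illustrative size, not a value I have checked against the em or ixgbe defaults):

/*
 * Back-of-the-envelope sketch (not driver code): given a packet rate and an
 * RX descriptor ring size, how much time does the host have per packet, and
 * how long a host stall can the ring absorb before the NIC runs out of
 * descriptors and starts counting missed packets?
 */
#include <stdio.h>

int
main(void)
{
        double pps = 100e3;     /* assumed packet rate: ~100 kpps flood */
        int rxd = 256;          /* assumed RX descriptor ring size */

        double budget_us = 1e6 / pps;           /* time per packet, microseconds */
        double absorb_us = rxd * budget_us;     /* how long a full ring lasts */

        printf("per-packet budget: %.1f us\n", budget_us);
        printf("a %d-entry ring absorbs a host stall of at most %.0f us\n",
            rxd, absorb_us);
        return (0);
}

At 100 kpps the host gets roughly 10 microseconds per packet, and a 256-entry ring only hides a stall of about 2.5 ms, so any hiccup in servicing the ring that lasts longer than that shows up as missed packets. That is why the raw per-packet budget looks comfortable on paper while the counters still climb.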
Manish

On Fri, Sep 4, 2009 at 11:14 AM, <alexpalias-bsd...@yahoo.com> wrote:
> --- On Fri, 9/4/09, Artis Caune <artis.ca...@gmail.com> wrote:
>
>> Is it still actual?
>
> Hello. Yes, this is still actual.
>
> 1> netstat -nbhI em0 ; uptime
> Name  Mtu Network   Address            Ipkts Ierrs Ibytes  Opkts Oerrs Obytes  Coll
> em0  1500 <Link#1>  00:14:22:17:80:dc    31G   93M    18T    36G     0    27T     0
> 7:50PM up 23 days, 15:40, 1 user, load averages: 0.84, 1.05, 1.16
>
> The huge number of input errors is due to an 80-100 kpps flood we received via
> that interface, which got the errors/sec numbers up in the 50k/s range for a
> few minutes.
>
>> You didn't mention if you are using pf or other firewall.
>
> Sorry if I didn't mention it. I am using pf, but have tried "kldunload pf"
> and the errors didn't disappear.
>
>> I have similar problem with two boxes replicating zfs pools, when I
>> noticed input errors.
>> After some investigation turns out it was pf overhead, even though I
>> was skipping on interfaces where zfs send/recv.
>>
>> With pf enabled (and skip) I can copy 50-80 MB/s with 50-80 kpps and
>> 0-100+ input drops per second.
>> With pf disabled I can copy constantly with 102 or 93 MB/s and
>> 110-131 kpps, few drops (because 1 CPU almost eaten).
>
> This is the kind of traffic I am seeing:
>
> Errors/second (5 minute average) per interface:
> http://www.dataxnet.ro/alex/errors.png
> Packets/second (5 minute average) per interface:
> http://www.dataxnet.ro/alex/packets.png
>
> Those graphs were saved a few minutes ago; times are EEST (GMT+3).
>
> I'm sorry I don't have the Mbits/s graphs up, I haven't been collecting that
> data per interface recently (it's collected per vlan).
>
> Alex

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"