On 05.12.2012 13:58, Barney Cordoba wrote:

--- On Tue, 12/4/12, Adrian Chadd <adr...@freebsd.org> wrote:

From: Adrian Chadd <adr...@freebsd.org>
Subject: Re: Latency issues with buf_ring
To: "Andre Oppermann" <opperm...@networx.ch>
Cc: "Barney Cordoba" <barney_cord...@yahoo.com>, "John Baldwin" 
<j...@freebsd.org>, freebsd-net@freebsd.org
Date: Tuesday, December 4, 2012, 4:31 PM
On 4 December 2012 12:02, Andre Oppermann <opperm...@networx.ch> wrote:

Our IF_* stack/driver boundary handoff isn't up to the task anymore.

Right. Well, the current handoff is really "here's a packet, go do
stuff!" and the legacy if_start() method is just plain broken for SMP,
preemption and direct dispatch.

Things are also very special in the net80211 world, with the stack
layer having to get its grubby fingers into things.

I'm sure that the other examples of layered protocols (e.g. doing MPLS,
or even just straight PPPoE-style tunneling) have the same issues.
Anything with sequence numbers and encryption being done by some other
layer is going to have the same issue, unless it's all enforced via
some other queue and a single thread handling the network stack
"stuff".

I bet direct-dispatch netgraph will have similar issues too, if it
ever comes into existence. :-)

Also the interactions are either poorly defined or poorly understood in
many places. I've had a few chats with yongari@ and am experimenting
with a modernized interface in my branch.

The reason I stumbled across it was because I'm extending the hardware
offload feature set and found out that the stack and the drivers (and
the drivers among themselves) are not really in sync with regards to
behavior.

For most if not all Ethernet drivers from 100 Mbit/s up, the TX DMA
rings are so large that buffering at the IFQ level doesn't make sense
anymore and only adds latency. So it could simply put everything
directly into the TX DMA ring and not even try to soft-queue. If the
TX DMA ring is full, ENOBUFS is returned instead of filling yet another
queue. However, there are ALTQ interactions and other mechanisms which
have to be considered too, making it a bit more involved.
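
The direct-to-ring handoff described above can be sketched in a few lines
of userland C. This is only an illustration under stated assumptions, not
FreeBSD driver code: struct tx_ring, tx_ring_enqueue(), tx_ring_complete()
and TX_RING_SIZE are hypothetical names, the ring is tiny so the full case
is easy to hit, and locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TX_RING_SIZE 4          /* hypothetical; real rings are far larger */

/* Hypothetical stand-in for a hardware TX descriptor ring. */
struct tx_ring {
	void     *slot[TX_RING_SIZE];
	unsigned  head;         /* next slot the "hardware" completes */
	unsigned  tail;         /* next slot the driver fills */
	unsigned  used;         /* slots currently occupied */
};

/*
 * Put a packet straight onto the ring. There is no software queue
 * behind it: a full ring is reported to the caller as ENOBUFS.
 */
static int
tx_ring_enqueue(struct tx_ring *r, void *pkt)
{
	assert(r != NULL);
	if (r->used == TX_RING_SIZE)
		return (ENOBUFS);
	r->slot[r->tail] = pkt;
	r->tail = (r->tail + 1) % TX_RING_SIZE;
	r->used++;
	return (0);
}

/* Simulate the hardware finishing the oldest descriptor. */
static void *
tx_ring_complete(struct tx_ring *r)
{
	void *pkt;

	assert(r != NULL);
	if (r->used == 0)
		return (NULL);
	pkt = r->slot[r->head];
	r->head = (r->head + 1) % TX_RING_SIZE;
	r->used--;
	return (pkt);
}
```

A caller that gets ENOBUFS back decides for itself whether to drop or
retry; the sketch deliberately does not hide that decision behind a
second queue.
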
net80211 has slightly different problems. We have requirements for
per-node, per-TID/per-AC state (not just for QoS, but separate
sequence numbers, different state machine handling for things like
aggregation and (later) U-APSD handling, etc.) so we do need to direct
frames into different queues and then correctly serialise that mess.
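
As a rough picture of what per-node, per-TID state means, here is a small
userland C sketch in which every TID of a node keeps its own sequence
counter. It is not net80211 code: struct node_state, struct tid_state and
node_assign_seq() are hypothetical names. The only facts borrowed from
802.11 are that the TID field is 4 bits wide and the sequence number is
12 bits wide. Serialisation is assumed to be handled by the caller, for
example by holding a per-node lock.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TIDS   16           /* 4-bit TID field: 0..15 */
#define SEQ_MODULO 4096         /* 12-bit sequence number space */

/* Hypothetical per-TID transmit state. */
struct tid_state {
	uint16_t next_seq;      /* next sequence number to hand out */
	unsigned queued;        /* frames assigned so far on this TID */
};

/* Hypothetical per-node state: one independent counter per TID. */
struct node_state {
	struct tid_state tid[NUM_TIDS];
};

/*
 * Assign the next sequence number for (node, tid).
 * Assumption: the caller serialises access to this node.
 */
static uint16_t
node_assign_seq(struct node_state *n, unsigned tid)
{
	struct tid_state *ts;
	uint16_t seq;

	assert(tid < NUM_TIDS);
	ts = &n->tid[tid];
	seq = ts->next_seq;
	ts->next_seq = (uint16_t)((ts->next_seq + 1) % SEQ_MODULO);
	ts->queued++;
	return (seq);
}
```

If two threads could assign numbers for the same TID without that
serialisation, frames could go out with duplicate or out-of-order
sequence numbers, which is the ordering problem being discussed.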

I'm coming up with a draft and some benchmark results for an updated
stack/driver boundary in the next weeks before xmas.

Ok. Please don't rush into it though; I'd like time to think about it
after NY (as I may actually _have_ a holiday this xmas!) and I'd like
to try and rope in people from non-ethernet-packet-pushing backgrounds
to comment.

They may have much stricter and/or stranger requirements when it comes
to how the network layer passes, serialises and pushes packets to
other layers.

Thanks,

Adrian

Something I'd like to see is a general modularization of function,
which will make all of the other stuff much easier. A big issue with
multipurpose OSes is that they tend to be bloated with stuff that almost
nobody uses. 99.9% of people are running either bridge/filters or straight
TCP/IP, and the design goals for a single-NIC web server differ from those
for a router or firewall.

By modularization, I mean making the "pieces" threadable. The requirements
for threading vary by application, but the ability to control it can
make a world of difference in performance. Having a dedicated transmit
thread may make no sense on a web server, on a dual core system or
with a single queue adapter, but other times it might. Instead of having
one big honking routine that does everything, modularizing it not only
cleans up the code, but also makes the system more flexible without
making it a mess.

The design for the 99% should not be hindered by the need to support
stuff like ALTQ. The hooks for ALTQ should be possible, but the locking
and queuing only required for such outliers should be separable.
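
One way to picture a separable hook is an optional function pointer that
stays NULL on the common path. The userland C sketch below is an
illustration only and does not reflect how ALTQ is wired into FreeBSD:
struct tx_path, qdisc_hook_t, tx_path_send() and the demo_* functions are
all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct pkt { int len; };

/* Hypothetical queueing-discipline hook; NULL on the common path. */
typedef int (*qdisc_hook_t)(struct pkt *);

struct tx_path {
	qdisc_hook_t qdisc;             /* NULL unless a discipline is attached */
	int (*hw_xmit)(struct pkt *);   /* always present */
};

/* Demo stubs that only count calls, so the two paths can be told apart. */
static int hw_calls, qdisc_calls;

static int
demo_hw_xmit(struct pkt *p)
{
	(void)p;
	hw_calls++;
	return (0);
}

static int
demo_qdisc(struct pkt *p)
{
	(void)p;
	qdisc_calls++;
	return (0);
}

/*
 * Common case: hand the packet straight to the driver.
 * Outlier case: an attached discipline takes over, and any extra
 * locking or queueing it needs lives behind the hook.
 */
static int
tx_path_send(struct tx_path *tp, struct pkt *p)
{
	assert(tp->hw_xmit != NULL);
	if (tp->qdisc != NULL)
		return (tp->qdisc(p));
	return (tp->hw_xmit(p));
}
```

With the hook unset, the cost on the fast path is a single pointer
comparison.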

I'd also like to see a unification of all of the projects. Is it really
necessary to have 34 checks for different "ideas" in if_ethersubr.c?

As a developer I know that you always want to work on the next new thing,
but sometimes you need to stop, think, and clean up your code. The cleaner
code opens up new possibilities, and results in a better overall product.

I hear you.

--
Andre

