On Tue, 8 Jul 2008, Bruce Evans wrote:

On Mon, 7 Jul 2008, Andre Oppermann wrote:

Bruce Evans wrote:
So it seems that the major overheads are not near the driver (as I already
knew), and upper layers are responsible for most of the cache misses.
The packet header is accessed even in monitor mode, so I think most of
the cache misses in upper layers are not related to the packet header.
Maybe they are due mainly to perfect non-locality for mbufs.

Monitor mode doesn't access the payload packet header.  It only looks
at the mbuf (which has a structure called mbuf packet header).  The mbuf
header is hot in the cache because the driver just touched it and filled
in the information.  The packet content (the payload) is cold and just
arrived via DMA in DRAM.

Why does it use ntohs() then? :-).  From if_ethersubr.c:

...
%       eh = mtod(m, struct ether_header *);

Point outside of mbuf header.

%       etype = ntohs(eh->ether_type);

First access outside of mbuf header.
...
%       /* Allow monitor mode to claim this frame, after stats are updated. */
%       if (ifp->if_flags & IFF_MONITOR) {
%               m_freem(m);
%               return;
%       }

Finally return in monitor mode.

I don't see any stats update before here except for the stray if_imcasts
one.

There are some error stats with printfs, but I've never seen these do
anything except with a buggy sk driver.

Testing verifies that accessing eh above gives a cache miss.  Under
~5.2 receiving on bge0 at 397 kpps (cm/p = cache misses per packet):

-monitor: 17% idle 19 cm/p  (18% less idle than under -current)
 monitor: 66% idle  8 cm/p  (17% less idle than under -current)
+monitor: 71% idle  7 cm/p  (idle time under -current not measured)

+monitor is monitor mode with the exit moved to the top of ether_input().
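
A minimal userland sketch of that reordering, to show why it avoids the
miss: the monitor check only needs ifp->if_flags, so doing it first means
the cold payload is never dereferenced.  All names here (sketch_ifnet,
sketch_mbuf, SKETCH_IFF_MONITOR, payload_reads) are hypothetical stand-ins
and not the kernel's definitions; the real ether_input() does much more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IFF_MONITOR 0x1  /* hypothetical stand-in for IFF_MONITOR */

struct sketch_ifnet {
    int if_flags;
};

struct sketch_mbuf {
    uint8_t *data;  /* payload: cold in the cache after DMA */
    size_t len;
    int freed;      /* set instead of calling m_freem() */
};

int payload_reads;  /* counts touches of the payload, for illustration */

/* Read the Ethernet type field (bytes 12-13, network byte order). */
uint16_t sketch_ether_type(const struct sketch_mbuf *m)
{
    assert(m->len >= 14);
    payload_reads++;
    return (uint16_t)((m->data[12] << 8) | m->data[13]);
}

/*
 * Returns 1 if monitor mode claimed the frame, 0 otherwise.
 * The flag test comes before any payload access.
 */
int sketch_input_early_exit(struct sketch_ifnet *ifp, struct sketch_mbuf *m,
    uint16_t *etype)
{
    if (ifp->if_flags & SKETCH_IFF_MONITOR) {
        m->freed = 1;
        return 1;
    }
    *etype = sketch_ether_type(m);
    return 0;
}
```

With the flag set, sketch_input_early_exit() returns without incrementing
payload_reads, which corresponds to the one cache miss per packet that
+monitor avoids.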

If the cache miss takes the time measured by lmbench2 (42 ns), then
397 k of these per second gives 17 ms or 1.7% CPU, which is vaguely
consistent with the improvement of 5% by not taking this cache miss.
Avoiding most of the 19 cache misses should give much more than a
5% improvement.  Maybe -current gets its 17% improvement by avoiding
some.
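
The estimate above can be written out as a small helper.  This is only a
sketch of the arithmetic; the function name and the assumption that every
miss costs the full lmbench2 latency, with no overlap between misses, are
mine and are not measured.

```c
#include <assert.h>

/*
 * Fraction of one CPU (0.0 to 1.0) spent stalled on cache misses,
 * assuming each miss costs miss_ns nanoseconds and misses do not overlap.
 */
double miss_cpu_fraction(double miss_ns, double packets_per_sec,
    double misses_per_packet)
{
    assert(miss_ns >= 0 && packets_per_sec >= 0 && misses_per_packet >= 0);
    return miss_ns * 1e-9 * packets_per_sec * misses_per_packet;
}
```

miss_cpu_fraction(42, 397e3, 1) comes to about 0.017, the 1.7% figure
above.  Under the same assumption, 19 misses per packet would come to
roughly 0.32, which is why avoiding most of them should be worth much more
than 5%.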

More stats weirdness in userland:
- in monitor mode, em0 gives byte counts delayed while bge0 gives byte
  counts always 0.
- netstat -I <interface> 1 seems to be broken in ~5.2 in all modes -- it
  gives output for interfaces with drivers but no hardware.

All this is for UP.  An SMP kernel on the same UP system loses < 5% for at
least tx.

Bruce
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net