"David S. Miller" wrote:
> 
> Andrew Morton writes:
>  > Changing the memory copy function did make some difference
>  > in my setup.  But the performance drop on send(8k) is only approx 10%,
>  > partly because I changed the way I'm testing it - `cyclesoak' is
>  > now penalised more heavily by cache misses, and amount of cache
>  > missing which networking causes cyclesoak is basically the same,
>  > whether or not the ZC patch is applied.
> 
> Ok ok ok, but are we at the point where there are no sizable "over the
> wire" performance anomalies anymore?  That is what is important, what
> are the localhost bandwidth measurements looking like for you now
> with/without the patch applied?

Using 2.4.2-pre3 + zerocopy-2.4.2p3-1.diff

All numbers are in Mbytes/sec

zcc/zcs is doing read(8k)/send(8k) to localhost.
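
The hot loop is nothing clever: read 8k at a time from a file, send 8k
down a TCP socket to 127.0.0.1.  A simplified sketch of the sending side
(the filename, the port number and the absence of timing/reporting code
are illustrative, this isn't the real tool):

        /*
         * zcs-style sender loop, simplified: read 8k at a time from a
         * file and send it to a TCP socket on localhost.  The filename
         * and port are made up; the real program also does the timing
         * and throughput reporting.
         */
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <fcntl.h>
        #include <sys/socket.h>
        #include <netinet/in.h>
        #include <arpa/inet.h>

        #define CHUNK 8192

        int main(int argc, char **argv)
        {
            static char buf[CHUNK];
            struct sockaddr_in sin;
            int fd, sock;
            ssize_t n;

            fd = open(argc > 1 ? argv[1] : "/tmp/zc-test-file", O_RDONLY);
            if (fd < 0) {
                perror("open");
                return 1;
            }

            sock = socket(AF_INET, SOCK_STREAM, 0);
            memset(&sin, 0, sizeof(sin));
            sin.sin_family = AF_INET;
            sin.sin_port = htons(7777);        /* whatever zcc listens on */
            sin.sin_addr.s_addr = inet_addr("127.0.0.1");
            if (sock < 0 ||
                connect(sock, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
                perror("socket/connect");
                return 1;
            }

            for (;;) {
                n = read(fd, buf, CHUNK);
                if (n == 0) {                  /* EOF: wrap around */
                    lseek(fd, 0, SEEK_SET);
                    continue;
                }
                if (n < 0 || send(sock, buf, n, 0) < 0) {
                    perror("read/send");
                    return 1;
                }
            }
        }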

On the dual 500MHz PII:

                   zcc/zcs                 bw_tcp

  Unpatched:       70                       66     
  Patched:         67                       66

Single 500MHz PII:

  Unpatched:       58                       54
  Patched:         49                       52 

Single 650MHz PIII Coppermine:

  Unpatched:       140                      180-250 
  Patched:         107                      159  


With or without ZC, there is Weird Stuff happening with local
networking.  Throughput is all over the place:

- With zcs reporting throughput once per second, the numbers jump around
  by +/-10%.  I had to bump the averaging period to 5 seconds to make
  much sense of them.  Over a real network they're rock solid.

- The difference between the PII and PIII is far beyond anything I see
  with any other workload.

- The difference between zcc/zcs and bw_tcp on the PIII is interesting.
  It's still apparent when zcc/zcs uses a 64k transfer buffer, like bw_tcp.
  zcc/zcs is doing file system reads, whereas bw_tcp isn't.  But the
  discrepancy isn't there on the PII.

- On the unpatched kernel, I saw one bw_tcp run after a reboot report
  410 Mbytes/sec.  Thereafter it's around 210.  err..  make that 180. No,
  make that 254. WTF?

Amongst all the noise it seems there's a problem on the PIII but
not the PII.

It's getting very lonely testing this stuff. It would be useful if
someone else could help out - at least running the bw_tcp tests. It's
pretty simple:

        bw_tcp -s ; bw_tcp 0
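
(bw_tcp is part of lmbench: the `-s' invocation starts the server and
the second command runs the client against it.)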


> I want to reach a known state where we can conclude "over the wire is
> about as good or better than before, but there is a cpu/cache usage
> penalty from the zerocopy stuff".
> 
> This is important.  It lets us get to the next stage which is to
> use your tools, numbers, and some profiling to see if we can get
> some of that cpu overhead back.

It seems that with the 100baseT NIC, the performance drop on the
Coppermine is only half that on the Mendocino.  I _think_ the Mendocino
is only 4-way associative, but reports vary on this.  Coppermine is
8-way.
