Andrew Gallatin wrote:

Between TSO and your sendfile changes, things are looking up!
Here are some Myri10GbE 1500 byte results from a 1.8GHz UP
FreeBSD/amd64 machine (AMD Athlon(tm) 64 Processor 3000+) sending to a
2.0GHz SMP Linux/x86_64 machine (AMD Athlon(tm) 64 X2 Dual Core Processor
3800+) running Linux 2.6.17.7smp and our 1.1.0 Myri10GE driver (with LRO).
I used a linux receiver because LRO is the only way to receive
standard frames at line rate (without a TOE).

These tests are all for sendfile of a 10MB file in /var/tmp:
  % netperf242 -Hrome-my -tTCP_SENDFILE -F /var/tmp/zot -T,1 -c -C  -- -s393216

You should use -m5M as well.  netperf is kinda dumb and only does
socket-buffer-sized sendfile calls, whereas sendfile really works
best (especially new-sendfile) when it can chew on a really big chunk
of the file without having to return to userland for every ~380k, as
in this case.

The -T,1 is required to force the netserver to use a different core
than the interrupt handler is bound to on the linux machine.  BTW,
it would be really nice if FreeBSD supported CPU affinity for processes
and interrupt handlers.

I have a gross version of that in my tree.  The kernel itself supports
it but it's not yet exposed to userland for manual intervention.
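A hedged sketch of the knob being wished for here, using the Linux sched_setaffinity(2) wrapper (this is roughly what pinning netserver with -T,1 relies on); the choice of CPU below is arbitrary, and the original mask is restored afterwards:

```python
import os

def pin_to_cpu(pid, cpu):
    """Restrict `pid` (0 = the calling process) to a single CPU
    and return the resulting affinity mask."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)

original = os.sched_getaffinity(0)
cpu = min(original)            # pick any CPU we are allowed to run on
mask = pin_to_cpu(0, cpu)
os.sched_setaffinity(0, original)  # restore the original mask
```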

I did a number of runs with TSO and the patch applied and found that
setting the send-side socket buffer size to 393216 gave the best
performance in that case.  I used this size for all tests, but it is
possible there is a different sweet spot for other configurations.
Note that linux auto-tunes socket buffer sizes, so I omitted the --
-s393216 for linux.
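What netperf's "-- -s393216" amounts to can be sketched as a plain setsockopt() call that turns off autotuning for that socket.  The value is the one from the runs above; note the kernel is free to adjust what you ask for (Linux caps the request at net.core.wmem_max and doubles it to account for bookkeeping overhead, while FreeBSD honors it up to kern.ipc.maxsockbuf).

```python
import socket

SNDBUF = 393216  # the sweet spot found for the patched TSO case above

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, SNDBUF)
# Read back what the kernel actually granted; this is the value
# netperf prints in the "Send Socket Size" column.
effective = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
s.close()
```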

We're getting there too.  First for the send buffer.  Again some gross
code in my tree.  Not really tested yet though.


Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

Without patch:
87380  393216  393216   10.00    2163.08    100.00   19.35    3.787   1.466

Without patch + TSO:
87380  393216  393216   10.00    4367.18     71.54   42.07    1.342   1.578

With patch:
87380  393216  393216   10.01    1882.73     86.15   18.43    3.749   1.604

With patch + TSO:
87380  393216  393216   10.00    6961.08     47.69   60.11    0.561   1.415

Be a bit careful with the CPU usage figures.  The numbers netperf reports
differ quite a bit, on the high side, from those reported by time(1), and
there are differences in how FreeBSD and Linux do their statistical
sampling of user and system time.  This doesn't change the throughput
numbers, though.  But see the -m5M option: new sendfile is really
optimized for chewing on a large file (larger than the socket buffer
size), as normally happens in practice.
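One way to cross-check netperf's percentages against time(1)-style accounting is getrusage(2), which both tools ultimately draw on.  This is only an illustration of the arithmetic; the busy-loop workload below is made up and has nothing to do with the benchmark.

```python
import resource
import time

def cpu_utilization(fn):
    """Run fn() and return CPU utilization in percent, computed the
    way time(1) would: (user + sys) / wall-clock elapsed."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    t0 = time.monotonic()
    fn()
    wall = time.monotonic() - t0
    after = resource.getrusage(resource.RUSAGE_SELF)
    user = after.ru_utime - before.ru_utime
    syst = after.ru_stime - before.ru_stime
    return 100.0 * (user + syst) / wall

# A purely CPU-bound toy workload should land near 100%.
util = cpu_utilization(lambda: sum(i * i for i in range(2_000_000)))
```

netperf's -c/-C figures come from its own statistical CPU measurement instead, which is one source of the discrepancy noted above.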

For comparison, if I reboot the sender into RHEL (Linux 2.6.9-11.EL x86_64):
87380 65536 65536 10.01 9333.00 28.98 75.23 0.254 1.321

The above results are the median result for 5 runs at each setting.

How large is the variance between the runs?

--
Andre
