On Sunday 21 January 2007 07:25, Bruce Evans wrote:
> nfs writes much less well with bge NICs than with other NICs (sk, fxp,

Do you use hardware checksumming on the bge?  There is an XXX in 
bge_start_locked() that looks a bit suspicious to me.

> xl, even rl).  Sometimes writing a 20K source file from vi seems to
> take about 2 seconds instead of seeming to be instantaneous (this gets
> faster as the system warms up).  Iozone shows the problem more
> reproducibly.  E.g.:
>
> 100Mbps fxp server -> 1Gbps bge 5701 client, udp:
> %%%
>       IOZONE: Performance Test of Sequential File I/O  --  V1.16 (10/28/92)
>               By Bill Norcott
>
>       Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
>       MB      reclen  bytes/sec written   bytes/sec read
>       1       512     1516885             291918639
>       1       1024    1158783             491354263
>       1       2048    1573651             715694105
>       1       4096    1223692             917431957
>       1       8192    729513              1097929467
>       2       512     1694809             281196631
>       2       1024    1379228             507917189
>       2       2048    1659521             789608264
>       2       4096    4606056             1064567574
>       2       8192    1142288             1318131028
>       4       512     1242214             298269971
>       4       1024    1853545             492110628
>       4       2048    2120136             742888430
>       4       4096    1896792             1121799065
>       4       8192    850210              1441812403
>       8       512     1563847             281422325
>       8       1024    1480844             492749552
>       8       2048    1658649             850165954
>       8       4096    2105283             1211348180
>       8       8192    2098425             1554875506
>       16      512     1508821             296842294
>       16      1024    1966239             527850530
>       16      2048    2036609             842656736
>       16      4096    1666138             1200594889
>       16      8192    2293378             1620824908
> Completed series of tests
> %%%
>
> Here bge barely reaches 10Mbps speeds (~1.2 MB/S) for writing.  Reading
> is cached well and fast.  100Mbps xl on the same client with the same
> server goes at full 100Mbps speed (11.77 MB/S for all file sizes
> including larger ones since the disk is not the limit at 100Mbps).
> 1Gbps sk on a different client with the same server goes at full
> 100Mbps speed.
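
As a rough cross-check of the "barely reaches 10Mbps (~1.2 MB/S)" figure quoted above, the sketch below averages the write column for the 1 MB rows of the first iozone table and converts it to both units. The helper names are made up for illustration, and it assumes that "MB" in the mail means 2**20 bytes and "Mbps" means 10**6 bits per second; neither is stated in the original.

```python
# Write throughput (bytes/sec) for the five 1 MB rows of the first iozone
# table (100Mbps fxp server -> 1Gbps bge 5701 client, udp).
written_1mb = [1516885, 1158783, 1573651, 1223692, 729513]


def to_mbps(bytes_per_sec):
    """Convert bytes/sec to megabits/sec (assumes 1 Mbit = 10**6 bits)."""
    return bytes_per_sec * 8 / 1e6


def to_mb_per_sec(bytes_per_sec):
    """Convert bytes/sec to MB/S (assumes 1 MB = 2**20 bytes)."""
    return bytes_per_sec / 2**20


avg_written = sum(written_1mb) / len(written_1mb)

print(f"average write rate: {to_mbps(avg_written):.2f} Mbps, "
      f"{to_mb_per_sec(avg_written):.2f} MB/S")
```

Under those assumptions the average lands just under 10 Mbps, consistent with the estimate in the quoted text.
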
>
> Switching to tcp gives full 100 Mbps speed.  However, when the bge link
> speed is reduced to 100Mbps, udp becomes about 10 times slower than the
> above and tcp becomes about as slow as the above (maybe a bit faster,
> but far below 11.77 MB/S).
>
> bge is also slow at nfs serving:
>
> 1Gbps bge 5701 server -> 1Gbps sk client:
> %%%
>
>       IOZONE: Performance Test of Sequential File I/O  --  V1.16 (10/28/92)
>               By Bill Norcott
>
>       Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
>       MB      reclen  bytes/sec written   bytes/sec read
>       1       512     36255350            242114472
>       1       1024    3051699             413319147
>       1       2048    22406458            632021710
>       1       4096    22447700            851162198
>       1       8192    3522493             1047562648
>       2       512     3270779             48125247
>       2       1024    28992179            46693718
>       2       2048    5956380             753318255
>       2       4096    27616650            1053311658
>       2       8192    5573338             48290208
>       4       512     9004770             47435659
>       4       1024    9576276             45601645
>       4       2048    30348874            85116667
>       4       4096    8635673             86150049
>       4       8192    9356773             47100031
>       8       512     9762446             46424146
>       8       1024    10054027            58344604
>       8       2048    9197430             60253061
>       8       4096    15934077            59476759
>       8       8192    8765470             47647937
>       16      512     5670225             46239891
>       16      1024    9425169             45950990
>       16      2048    9833515             46242945
>       16      4096    14812057            51313693
>       16      8192    9203742             47648722
> Completed series of tests
> %%%
>
> Now the available bandwidth is 10 times larger and about 9/10 of it is
> still not used, with a high variance.  For larger files, the variance
> is lower and the average speed is about 10MB/S.  The disk can only do
> about 40MB/S and the slowest of the 1Gbps NICS (sk) can only sustain
> 80MB/S through udp and about 50MB/S through tcp (it is limited by the
> 33 MHz 32-bit PCI bus and by being less smart than the bge interface).
> When the bge NIC was on the system which is now the server with the fxp
> NIC, bge and nfs worked unsurprisingly, just slower than I would have
> liked.  The write speed was 20-30MB/S for large files and 30-40MB/S for
> medium-sized files, with low variance.  This is the only configuration
> in which nfs/bge worked as expected.
>
> The problem is very old and not very hardware dependent.  Similar
> behaviour happens when some of the following are changed:
>
> OS -> FreeBSD-~5.2 or FreeBSD-6
> hardware -> newer amd64 CPU (Turion X2) with 5705 (iozone output for
> this below) instead of old amd64 CPU with 5701.  The newer amd64
> normally runs an i386-SMP current kernel while the old amd64 was
> running an amd64-UP current kernel in the above tests, but normally
> runs ~5.2 amd64-UP and behaves similarly with that. The combination
> that seemed to work right was an AthlonXP for the server with the same
> 5701 and any kernel.  The only strangeness with that was that current
> kernels gave a 5-10% slower nfs server despite giving a 30-90% larger
> packet rate for small packets.
>
>       IOZONE: Performance Test of Sequential File I/O  --  V1.16 (10/28/92)
>               By Bill Norcott
>
>       Operating System: FreeBSD -- using fsync()
>
> 100Mbps fxp server -> 1Gbps bge 5705 client:
> %%%
> IOZONE: auto-test mode
>
>       MB      reclen  bytes/sec written   bytes/sec read
>       1       512     2994400             185462027
>       1       1024    3074084             337817536
>       1       2048    2991691             576792985
>       1       4096    3074759             884740798
>       1       8192    3078019             1176892296
>       2       512     4262096             186709962
>       2       1024    2994468             339893080
>       2       2048    5112176             584846610
>       2       4096    4754187             909815165
>       2       8192    5100574             1212919611
>       4       512     5298715             187129017
>       4       1024    5302620             344445041
>       4       2048    4985597             590579630
>       4       4096    3703618             927711124
>       4       8192    5236177             1240896243
>       8       512     5142274             186899396
>       8       1024    6207933             345564808
>       8       2048    6162773             593088329
>       8       4096    6031445             936751120
>       8       8192    6072523             1224102288
>       16      512     5427113             186797193
>       16      1024    5065901             345544445
>       16      2048    5462338             595487384
>       16      4096    5256552             937013065
>       16      8192    5097101             1226320870
> Completed series of tests
> %%%
>
> rl on a system with 1/20 as much CPU is faster than this.
>
> The problem doesn't seem to affect much besides writes on nfs.  The
> bge 5701 works very well for most things.  It has a much better bus
> interface than the 5705 and works even better after moving it to the
> old amd64 system (it can now saturate 1Gbps where on the AthlonXP it
> only got 3/4 of the way, while the 5705 only gets 1/4 of the way).
> I've been working on minimising network latency and maximising packet
> rate, and normally have very low network latency (60-80 uS for ping)
> and fairly high packet rates.  The changes for this are not the cause
> of the bug :-), since the behaviour is not affected by running kernels
> without these changes or by sysctl'ing the changes to be null.
> However, the problem looks like ones caused by large latencies combined
> with non-streaming protocols.  To write at just 11.77 MB/S, at least
> 8000 packets/second must be sent from the client to the server.  Working
> clients sustain this rate, but for broken clients the rate is much lower
> and not sustained:
>
> Output from netstat -s 1 on server while writing a ~1GB file via
> 5701/udp:
> %%%
>              input        (Total)           output
>     packets  errs      bytes    packets  errs      bytes colls
>         900     0    1513334        142     0      33532     0
>        1509     0    2564836        236     0      57368     0
>        1647     0    2295802        259     0      51106     0
>        1603     0    1502736        252     0      32926     0
>        1055     0     637014        163     0      13938     0
>         558     0    1542510         86     0      34340     0
>         984     0     989854        155     0      21816     0
>         864     0    1320786        135     0      38152     0
>         883     0    1558060        165     0      34340     0
>        1177     0    3780102        203     0      85850     0
>        2087     0     954212        331     0      21210     0
>        1187     0    1413568        190     0      31310     0
>         650     0    3320604        101     0      75346     0
>        1565     0    1706542        246     0      37976     0
>        2055     0    2360620        329     0      52318     0
>        1554     0    2416996        244     0      54226     0
>        1402     0    2579894        220     0      58176     0
>        1690     0     774488        267     0      16968     0
>        1323     0    3690650        209     0      83830     0
>         591     0    4519858         92     0     103110     0
> %%%
>
> There is no sign of any packet loss or switch problems.  Forcing
> 1000baseTX full-duplex has no effect.  Forcing 100baseTX full-duplex
> makes the problem more obvious.  The mtu is 1500 throughout since
> only bge-5701 and sk support jumbo frames and I want to use udp for
> nfs.
>
> 5705/udp is better:
> %%%
>              input        (Total)           output
>     packets  errs      bytes    packets  errs      bytes colls
>        5209     0    6607758        846     0     151702     0
>        4763     0    6684546        773     0     153520     0
>        4758     0    6618498        769     0     151298     0
>        3582     0    7057568        576     0     162498     0
>        4935     0    5115068        800     0     116756     0
>        4924     0    6622026        798     0     152802     0
>        4095     0    6018462        657     0     137450     0
>        4647     0    5270442        751     0     120594     0
>        4673     0    5451948        758     0     123624     0
>        2340     0    6001986        372     0     138168     0
>        3750     0    6150610        604     0     140996     0
> %%%
>
> sk/udp works right:
> %%%
>              input        (Total)           output
>     packets  errs      bytes    packets  errs      bytes colls
>        8638     0   12384676       1440     0     293062     0
>        8636     0   12415646       1439     0     293708     0
>        8637     0   12415646       1441     0     293708     0
>        8637     0   12415646       1439     0     293708     0
>        8637     0   12417160       1440     0     293708     0
>        8636     0   12413162       1439     0     293506     0
>        8637     0   12414132       1439     0     293708     0
>        8636     0   12417160       1440     0     293708     0
>        8637     0   12415646       1439     0     293708     0
>        8636     0   12417160       1440     0     293708     0
>        8637     0   12414676       1439     0     293506     0
> %%%
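
The "at least 8000 packets/second" figure can be sanity-checked against the netstat samples with a short calculation. This is only a sketch under assumptions that are not in the original mail: 1 MB = 2**20 bytes, and every packet is a full 1500-byte-MTU frame carrying 1480 bytes of IP payload (IPv4 header of 20 bytes, no options). The function names are hypothetical.

```python
MTU = 1500          # bytes; the mail says the mtu is 1500 throughout
IPV4_HEADER = 20    # bytes; assumes no IP options


def min_packets_per_second(mb_per_sec, mtu=MTU):
    """Lower bound on packets/s needed to carry mb_per_sec of data.

    Assumes 1 MB = 2**20 bytes and that every packet is a full-sized
    frame; smaller packets would push the real rate higher.
    """
    payload = mtu - IPV4_HEADER
    return mb_per_sec * 2**20 / payload


def mean(values):
    return sum(values) / len(values)


# First five "input packets" samples from each netstat table above.
bge_5701 = [900, 1509, 1647, 1603, 1055]
bge_5705 = [5209, 4763, 4758, 3582, 4935]
sk = [8638, 8636, 8637, 8637, 8637]

required = min_packets_per_second(11.77)
print(f"required for 11.77 MB/S: {required:.0f} packets/s")
for name, samples in [("5701", bge_5701), ("5705", bge_5705), ("sk", sk)]:
    share = 100 * mean(samples) / required
    print(f"{name:>4}: {mean(samples):.0f} packets/s ({share:.0f}% of required)")
```

With these assumptions only the sk samples reach the required rate; both bge samples fall well short of it, and the 5701 is the furthest off.
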
>
> sk is under ~5.2 with latency/throughput/efficiency optimizations
> that don't have much effect here.
>
> Bruce

-- 
/"\  Best regards,                      | [EMAIL PROTECTED]
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | [EMAIL PROTECTED]
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
