That does seem extreme for data-center IB latency, but this may not be in a data center. An LNet write should take 2 RTT latencies and a read 3, so you could double or triple those times, plus any overhead.
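As a rough illustration (the extender latency here is a made-up figure, not a measured one): with a hypothetical 100 us of added one-way latency from a campus or wide-area IB extender, an RTT becomes ~200 us, so 2 RTTs per write add ~400 us per RPC and 3 RTTs per read add ~600 us, which is on the same order as the unexplained overhead Kevin calculates below. On a ~1 us data-center fabric the same 2-3 RTTs contribute only a few microseconds.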
Carl, can you clarify whether you are using QDR IB and/or any campus or wide-area IB extenders?

Jeremy

On Mon, Feb 20, 2012 at 8:14 PM, Kevin Van Maren <[email protected]> wrote:

While it's possible the default credits (8, as I recall) is not enough for peak performance, it seems to me that something else is wrong: each 1MB RPC should take ~300us (based on MPI/IB transfer rates of 3.2+ GB/s), so that means there is another 400us of overhead per RPC that is not masked with 8 concurrent RPCs, in addition to the overhead masked when he increased concurrency. This is crazy with a 1 us network latency.

Unless the RPCs are being broken into tiny chunks or something -- does LNet do single-page transfers and not use a rendezvous protocol for full-sized RPCs? It definitely seems that something is broken when o2iblnd gets ~1/3 of the MPI bandwidth, given that the LND was designed for high-speed transfers.

max_rpcs_in_flight normally needs tweaking to improve disk concurrency, where a single client needs to drive a high queue depth. I still find it hard to believe that 8 concurrent 1MB RPCs can't fill the network.

Kevin

On Feb 20, 2012, at 5:44 PM, Jeremy Filizetti <[email protected]> wrote:

Am I reading your earlier post correctly that you have a single server acting as both the MDS and OSS? Have you changed peer_credits and credits for the ko2iblnd kernel module on the server and client? You also mentioned changing osc.*.max_dirty_mb; you probably need to adjust osc.*.max_rpcs_in_flight as well. Can you post your RPC stats ("lctl get_param osc.*.rpc_stats")? I would guess they are bunching up around 7-8 if you're running with the default max_rpcs_in_flight=8.

Jeremy

On Mon, Feb 20, 2012 at 4:59 PM, Barberi, Carl E <[email protected]> wrote:

Thank you. This did help. With the concurrency set to 16, I was able to get a max write speed of 1138 MB/s. Any ideas on how we can make that faster, though? Ideally, we'd like to get to 1.5 GB/s.

Carl

From: Liang Zhen [mailto:[email protected]]
Sent: Thursday, February 16, 2012 1:45 AM
To: Barberi, Carl E
Cc: '[email protected]'
Subject: EXTERNAL: Re: [Lustre-discuss] LNET Performance Issue

Hi, I assume you are using "size=1M" for the brw test, right? Performance could increase if you set "concurrency" when adding the brw test, i.e. --concurrency=16.

Liang

On Feb 16, 2012, at 3:30 AM, Barberi, Carl E wrote:

We are having issues with LNET performance over InfiniBand. We have a configuration with a single MDT and six (6) OSTs. The Lustre client I am using to test is configured to use 6 stripes (lfs setstripe -c 6 /mnt/lustre). When I perform a test using the following command:

dd if=/dev/zero of=/mnt/lustre/test.dat bs=1M count=2000

I typically get a write rate of about 815 MB/s, and we never exceed 848 MB/s. When I run obdfilter-survey, we easily get about 3-4 GB/s write speed, but when I run a series of lnet-selftests, the read and write rates range from 850 MB/s to 875 MB/s max.
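For concreteness, a minimal lnet_selftest brw run of the sort described above, including the --concurrency setting Liang suggests, might look roughly like the following (the group names and NIDs are placeholders, not the actual test setup):

    # lnet_selftest must be loaded on every node taking part
    modprobe lnet_selftest
    export LST_SESSION=$$
    lst new_session rw
    lst add_group clients 192.168.1.10@o2ib
    lst add_group servers 192.168.1.20@o2ib
    lst add_batch bulk_rw
    lst add_test --batch bulk_rw --concurrency=16 --from clients --to servers brw write size=1M
    lst run bulk_rw
    lst stat clients servers      # watch throughput, Ctrl-C when done
    lst stop bulk_rw
    lst end_session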
I have performed the following optimizations to increase the data rate:

On the client:

lctl set_param osc.*.checksums=0
lctl set_param osc.*.max_dirty_mb=256

On the OSTs:

lctl set_param obdfilter.*.writethrough_cache_enable=0
lctl set_param obdfilter.*.read_cache_enable=0

echo 4096 > /sys/block/<devices>/queue/nr_requests

I have also loaded the ib_sdp module, which brought a further increase in speed. However, we need to be able to record at no less than 1 GB/s, which we cannot achieve right now. Any thoughts on how I can optimize LNET, which clearly seems to be the bottleneck?

Thank you for any help you can provide,

Carl Barberi
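To make the earlier suggestions concrete, the additional client- and server-side tuning Jeremy and Kevin refer to above would look roughly like this (the values are illustrative starting points rather than recommendations, and the modprobe file name is just one common choice):

    # Client: allow more than the default 8 RPCs in flight per OST,
    # keeping max_dirty_mb large enough to feed them.
    lctl set_param osc.*.max_rpcs_in_flight=32
    lctl set_param osc.*.max_dirty_mb=256

    # Client and servers: raise the o2iblnd credits; takes effect the
    # next time the ko2iblnd module is loaded.
    echo "options ko2iblnd peer_credits=32 credits=256" > /etc/modprobe.d/ko2iblnd.conf

    # Then check how RPCs are batching up under load.
    lctl get_param osc.*.rpc_stats

If rpc_stats still shows requests bunching up at 7-8 in flight, the client-side RPC limit rather than the network is the likely bottleneck.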
_______________________________________________ Lustre-discuss mailing list [email protected] http://lists.lustre.org/mailman/listinfo/lustre-discuss
