Hi, all
I am setting up Lustre on a local cluster with an InfiniBand HDR network (200 Gb/s, single port). I have successfully set up Lustre on 5 servers (1 MDS, 4 OSS).
I verified the IB HDR bandwidth (200 Gb/s) with the 'ib_read_bw' and 'ib_write_bw' tools (using CPU#0 for the test). However, when I run LNet selftest between servers, it only reaches around 100 Gb/s, about half of the maximum bandwidth: ~12 GB/s for the read test and ~13 GB/s for the write test.
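For reference, the unit conversion behind the "about half" estimate can be sketched in shell. The function names are hypothetical helpers, and the calculation uses the nominal 200 Gb/s figure without accounting for any protocol or encoding overhead.

```shell
#!/bin/bash
# Convert a link rate in Gb/s (gigabits) to GB/s (gigabytes): 8 bits per byte.
gbit_to_gbyte() {
    awk -v g="$1" 'BEGIN { printf "%.1f\n", g / 8 }'
}

# Express a measured rate (GB/s) as a percentage of a nominal link rate (Gb/s).
link_utilisation() {
    awk -v measured="$1" -v nominal="$2" \
        'BEGIN { printf "%.0f\n", 100 * measured * 8 / nominal }'
}

gbit_to_gbyte 200          # nominal HDR rate expressed in GB/s
link_utilisation 12 200    # LNet selftest read result quoted above
link_utilisation 13 200    # LNet selftest write result quoted above
```

With the numbers from this post, the nominal rate is 25.0 GB/s, and the read and write results come to 48% and 52% of it.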
So I tried changing the LNet tunables and pinned the IB interface to CPT "[0]" with the following kernel module options, but it makes no big difference in LNet selftest.
It seems as if LNet is not fully compatible with HDR or PCIe Gen4 interfaces. Can anyone advise why LNet performance does not reach HDR bandwidth? Are there specific options or tunables that I have to modify? Please share your experience if you have set up Lustre on an HDR network.
Thank you.
-----------lustre.conf---------------
options lnet networks=o2ib0(ib0)[0]
-----------ko2iblnd.conf-------------
options ko2iblnd peer_credits=256 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=0 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=1
----------lnet tunables--------------
tunables:
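To confirm that the ko2iblnd options above were actually picked up when the module loaded, one option is to read them back from sysfs. This is a minimal sketch: `show_module_params` is a hypothetical helper, and its second argument exists only so the function can be pointed at a different root directory for testing. It assumes the standard Linux `/sys/module/<module>/parameters` layout.

```shell
#!/bin/bash
# Print every parameter of a loaded kernel module as name=value.
# $1: module name, $2: sysfs module root (assumed default: /sys/module).
show_module_params() {
    local module="$1" root="${2:-/sys/module}"
    local dir="$root/$module/parameters"
    if [ ! -d "$dir" ]; then
        echo "module $module not loaded (no $dir)" >&2
        return 1
    fi
    local f
    for f in "$dir"/*; do
        [ -r "$f" ] || continue   # skip unreadable (e.g. write-only) entries
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Usage: compare the output against ko2iblnd.conf.
show_module_params ko2iblnd || true
```

The values the LND actually applies to a network interface may differ from the module parameters; `lnetctl net show -v` reports the per-NI tunables and is worth comparing as well.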
I have listed the HW/SW environment that I used below.
------------------------------------------------------------
[Environment]
- CPUs: Epyc 7302 *2 socket, supports PCIe Gen4
- OS: CentOS 8.3 (kernel: 4.18.0-240.1.1.el8_lustre.x86_64)
- Lustre: 2.14.0 (downloaded from repository https://downloads.whamcloud.com/public/lustre/lustre-2.14.0-ib/ )
- OFED driver: tried 2 different versions: MLNX_OFED_LINUX-5.2-1.0.4.0, MLNX_OFED_LINUX-5.4-1.0.3.0
Finally, I used the following LNet selftest script for the test. I tried changing the concurrency, but the bandwidth saturates when CN >= 4.
----------------------------------------------------------
# Concurrency
# The LST "from" list -- e.g. Lustre clients. Space separated list of NIDs.
### End of customisation.
export LST_SESSION=$$
_______________________________________________ lustre-discuss mailing list [email protected] http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
