Hi Ramiro,

The "Invalid mr size" error looks like the RDMA (o2ib) LND is running into a 
limit with your card while setting up the device when the network is brought 
up. There may be adjustments or workarounds for it, possibly including setting 
map_on_demand=0 as a module option (as far as I know this is a parameter of the 
ko2iblnd module rather than of lnet itself).
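
For reference, a minimal sketch of how that option could be set. The file name 
below is an assumption (any .conf file under /etc/modprobe.d should be picked 
up), and I have not verified that this setting clears the error on an 
InfiniHost III card:

```shell
# /etc/modprobe.d/ko2iblnd.conf  (hypothetical file name)
# map_on_demand is a ko2iblnd (o2ib LND) module parameter, so it goes on the
# ko2iblnd options line, not on the lnet one.
options ko2iblnd map_on_demand=0
```

The Lustre/LNet modules need to be unloaded and loaded again for the option to 
take effect. If the parameter is exported on your build, the active value 
should be visible in /sys/module/ko2iblnd/parameters/map_on_demand.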

And since you are using older IB hardware on a newer OS, just a heads up: we 
recently ran into an issue with ConnectX-3 IB cards after upgrading our 
operating systems, where we found RDMA communication to be unreliable, possibly 
because the cards would often exceed the number of queue pairs they could 
create. For us, the workaround was to use ksocklnd instead of o2iblnd. If 
you have trouble getting the o2ib Lustre network driver to work with this older 
hardware due to RDMA problems, that could be a workaround, although it may not 
be feasible depending on your networking setup.
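
As a rough sketch of what the ksocklnd route could look like on a client, 
assuming IPoIB is configured on ib0 and the servers are also reachable on a 
matching tcp LNet network (both are assumptions about your setup, and the file 
name is hypothetical):

```shell
# /etc/modprobe.d/lnet.conf  (hypothetical file name)
# A tcp network is served by ksocklnd (plain sockets) instead of ko2iblnd.
# Here it runs over the IPoIB interface ib0, so no RDMA verbs are used by LNet.
options lnet networks=tcp0(ib0)
```

The servers would need a NID on the same tcp network for the client to mount, 
which is the part that may not be practical on an existing installation.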

Best,
Jesse


________________________________________
From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of 
Ramiro Alba Queipo <ramiro.a...@upc.edu>
Sent: Thursday, February 6, 2025 3:34 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Lnet not going up with InfiniHost III Lx HCA card


Hi all,

I am testing Ubuntu 24.04 (6.8.0-52-generic) client with Lustre 2.16.1 over 
Infiniband and using an old Mellanox DDR card (InfiniHost III Lx HCA).

- # ip -br a

      options lnet networks=o2ib0(ib0)

- # modprobe lnet
- # lctl network up

     LNET configure error 100: Network is down

- # tail -10 /var/log/kernel.log

     LNetError: 5071:0:(o2iblnd.c:2866:kiblnd_hdev_get_attr()) Invalid mr size: 0xffffffffffffffff
     LNetError: 5071:0:(o2iblnd.c:3103:kiblnd_dev_failover()) Can't get device attributes: -22
     LNetError: 5071:0:(o2iblnd.c:3831:kiblnd_startup()) ko2iblnd: Can't initialize device: rc = -22
     LNetError: Error -100 starting up LNI o2ib

Lustre 2.15.0 with Ubuntu 20.04 (kernel 5.4.0-198-generic) works fine with 
the same hardware.

Can anyone give me some advice or idea to make it work?

Thanks in advance
Best regards

--
Ramiro Alba

Centre Tecnològic de Tranferència de Calor
http://www.cttc.upc.edu

Escola Tècnica Superior d'Enginyeries
Industrial i Aeronàutica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 8928
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
