Hi,
We also have a tcp1 config, and our lnet.conf looks like this:
net:
    - net type: tcp1
      local NI(s):
        - nid: <IP>@tcp1
          status: up
          interfaces:
              0: eth0
Replace <IP> with the NID's IP address. I guess you need "- net type"
instead of just "- net".
Cheers,
Hans Henrik
On 17/09/2024 11.50, Steve Brasier wrote:
Hi.
I've got a Rocky Linux 9.4 client running Lustre 2.15.5-1.el9 with this
/etc/lnet.conf:
[root@stg-login-0 rocky]# cat /etc/lnet.conf
net:
    - net: tcp1
      interfaces:
          0: eth0
Running systemctl start lnet hangs forever, with the syslog showing
only:
Sep 13 15:31:35 stg-login-0 systemd[1]: Starting lnet management...
and it's actually the command below which hangs:
[root@stg-login-0 rocky]# /usr/sbin/lnetctl import /etc/lnet.conf
i.e. module load and lnet configure work OK.
However it looks like it autoconfigured an interface on tcp (not tcp1):
[root@stg-login-0 rocky]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 10.179.2.45@tcp
          status: up
So:
1. How can I debug this hang, please?
2. Do the client and server NIDs need to be in the same IPv4 subnet? I
have a client NID of 10.179.2.45@tcp1 and a server NID of
10.167.128.1@tcp1, with IP routing between them such that ICMP ping
works in both directions. Is that OK?
Many thanks for any help!
http://stackhpc.com/
Please note I work Tuesday to Friday.
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org