Right, when you format a Lustre target, it registers itself with the MGS. Part 
of that registration is telling the MGS what NIDs the target can be reached at 
(the MGS, in turn, passes this information to the clients). If you add or 
delete NIDs then you need to ensure that information is updated with the MGS. 
This is the procedure I linked in the Ops manual.
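
A minimal sketch of what such an update might look like, assuming the procedure meant is the Operations Manual's one built around `lctl replace_nids` (the link itself is not reproduced in this message). The helper name replace_nids_cmd is hypothetical, and the target name scratch-MDT0000 and the NID list are only illustrative, taken from the logs quoted further down. The script prints the command rather than running it; the manual's unmount and remount steps around it still apply.

```shell
#!/bin/sh
# Sketch only: print (do not run) the command that re-registers a target's NIDs.
# The target name and NIDs passed in are placeholders, not a verified config.

# Usage: replace_nids_cmd <target> <nid> [<nid> ...]
replace_nids_cmd() {
    target=$1
    shift
    # Join the remaining arguments with commas: nid1,nid2,...
    nids=$(printf '%s,' "$@")
    nids=${nids%,}
    printf 'lctl replace_nids %s %s\n' "$target" "$nids"
}

replace_nids_cmd scratch-MDT0000 10.149.0.183@o2ib xxx.xxx.5.xxx@tcp
```

Run as-is, this prints `lctl replace_nids scratch-MDT0000 10.149.0.183@o2ib,xxx.xxx.5.xxx@tcp`, which can be reviewed before anything is executed on the MGS.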

lctl list_nids does not tell you which NIDs are registered with the MGS. It 
only tells you what NIDs are currently defined on the local host. There is some 
way to inspect the config log to see what NIDs are in there, but I can’t recall 
the specifics off the top of my head.
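
One way that might work, as a sketch rather than a verified recipe: `lctl llog_print`, run against the MGS device, dumps a configuration log, and the NIDs can be filtered out of its output. The helper name extract_nids is hypothetical, the file system name "scratch" is assumed from the 'scratch-client' log name in the error quoted below, and the pattern only matches IPv4-style NIDs.

```shell
#!/bin/sh
# Print each distinct NID (address@net) found on stdin, one per line.
# Only IPv4-style addresses are matched; this is a deliberate simplification.
extract_nids() {
    grep -oE '[0-9]+(\.[0-9]+){3}@[a-z0-9]+' | sort -u
}

# On the MGS node (assumption: the file system is named "scratch"):
#   lctl --device MGS llog_print scratch-client | extract_nids
```

The filter does not depend on the exact layout of the log dump, so it can also be pointed at dmesg output such as the failed cfg command quoted later in the thread.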

Chris Horn

From: lustre-discuss <[email protected]> on behalf of 
Laura Hild via lustre-discuss <[email protected]>
Date: Thursday, November 30, 2023 at 8:22 AM
To: Philipp Grau <[email protected]>
Cc: Lustre User Discussion Mailing List <[email protected]>
Subject: Re: [lustre-discuss] Lustre mds/ods Server with IB/omnipath and 
Ethernet clients (dual homed?)
Hi Philipp-

I don't do this a ton so I'm hazy, but do you set nids or nets when you 
mkfs.lustre?  So then maybe you have to tunefs those in when you add more?

-Laura


________________________________________
From: lustre-discuss <[email protected]> on behalf of Philipp
Grau <[email protected]>
Sent: Wednesday, November 29, 2023 06:37
To: [email protected]
Subject: [lustre-discuss] Lustre mds/ods Server with IB/omnipath and Ethernet
clients (dual homed?)

Hello,

some questions regarding network connection setup for ethernet based
clients.

We have a working Lustre installation with two MDS servers and seven
OSS systems connected to our cluster via omnipath/ib. This part is
working fine.

Now we want to add some clients that have only an Ethernet connection
to the Lustre servers (via the Ethernet cards in the servers).

Our MDS and OSS servers have the following LNet setup:

net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 10.149.0.XXX@o2ib # IP of the local ib interface
          status: up
          interfaces:
              0: ib0
    - net type: tcp
      local NI(s):
        - nid: xxx.xxx.5.XXX@tcp # IP of the local ethernet interface
          status: up
          interfaces:
              0: eno1


Our test ethernet node:

lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: xxx.xxx.4.XXX@tcp # same subnet as above, it is a /23
          status: up
          interfaces:
              0: enp225s0f0

So far so good.

I'm able to lnetctl ping in both directions:

Ping the client:

lnetctl ping xxx.xxx.4.xxx@tcp
ping:
    - primary nid: xxx.xxx.4.xxx@tcp
      Multi-Rail: True
      peer ni:
        - nid: xxx.xxx.4.xxx@tcp

Ping the server:

lnetctl ping xxx.xxx.5.xxx@tcp
ping:
    - primary nid: xxx.xxx.5.xxx@tcp
      Multi-Rail: True
      peer ni:
        - nid: 10.149.0.183@o2ib
        - nid: xxx.xxx.5.xxx@tcp

But the mount fails, output from dmesg (are there other sources of
debug information?):

LustreError: 25758:0:(ldlm_lib.c:494:client_obd_setup()) can't add initial connection
LustreError: 25758:0:(obd_config.c:559:class_setup()) setup scratch-MDT0000-mdc-ffff8b63003d4000 failed (-2)
LustreError: 25758:0:(obd_config.c:1835:class_config_llog_handler()) MGCxxx.xxx.5.xxx@tcp: cfg command failed: rc = -2
Lustre:    cmd=cf003 0:scratch-MDT0000-mdc  1:scratch-MDT0000_UUID  2:10.149.0.183@o2ib
LustreError: 15c-8: MGC160.45.5.246@tcp: The configuration from log 'scratch-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 25734:0:(obd_config.c:610:class_cleanup()) Device 3 not setup
Lustre: Unmounted scratch-client
LustreError: 25734:0:(obd_mount.c:1604:lustre_fill_super()) Unable to mount  (-2)
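
The failed cfg command above names 10.149.0.183@o2ib, while the test node's `lnetctl net show` lists only the lo and tcp networks. A small sketch for checking that kind of mismatch on a client follows. The helper names local_nets and nid_net_is_local are hypothetical, and the parsing assumes the "- net type:" layout shown earlier in this message.

```shell
#!/bin/sh
# List the LNet network names from `lnetctl net show` output on stdin.
local_nets() {
    sed -n 's/^ *- net type: *//p'
}

# Succeed if the network part of NID $1 (the text after "@") appears,
# as a whole line, in the list of networks read from stdin.
nid_net_is_local() {
    net=${1#*@}
    grep -qxF "$net"
}

# On the client:
#   lnetctl net show | local_nets | nid_net_is_local 10.149.0.183@o2ib \
#       || echo "no local network for this NID"
```

With the test node's configuration as posted, the check would fail for the o2ib NID and succeed for a tcp one.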

Does anyone have ideas or reference documentation on this topic?

Do I need some "lnetctl route" stuff?

Do I need some "lnetctl peer add ..." to make the Lustre servers and
clients known to each other?

Any hints are welcome!

Kind regards,

Philipp

--
 Philipp Grau               | Freie Universitaet Berlin
 [email protected]  | FU-IT - Infrastruktur
 Tel: +49 (30) 838 56583    | Fabeckstr. 32
 Fax: +49 (30) 838 56721    | 14195 Berlin

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org