Hello,
I have some questions about the LNet setup for Ethernet-only
clients.
We have a working Lustre installation with two MDS servers and seven
OSS systems connected to our cluster via Omni-Path/IB. This part is
working fine.
Now we want to add some clients that have only an Ethernet connection
to the Lustre servers (via the Ethernet cards in the servers).
Our MDS and OSS servers have the following LNet setup:
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 10.149.0.XXX@o2ib    # IP of the local ib interface
          status: up
          interfaces:
              0: ib0
    - net type: tcp
      local NI(s):
        - nid: xxx.xxx.5.XXX@tcp    # IP of the local ethernet interface
          status: up
          interfaces:
              0: eno1
Our test Ethernet node:
lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: xxx.xxx.4.XXX@tcp    # same subnet as above, it is a /23
          status: up
          interfaces:
              0: enp225s0f0
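As a quick sanity check of the "/23" remark, here is a minimal Python sketch using the standard ipaddress module. The 10.0.x.x addresses are hypothetical placeholders for the masked xxx.xxx addresses above; only the third octets (4 and 5) and the prefix length are taken from the real setup, and the helper name same_subnet is made up for illustration.

```python
import ipaddress

# Hypothetical stand-ins for the masked addresses; only the third octet
# (4 for the client, 5 for the servers) and the /23 prefix come from the
# setup described above.
CLIENT_IP = "10.0.4.10"
SERVER_IP = "10.0.5.20"
PREFIX_LEN = 23


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses fall into the same network of the given prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


if __name__ == "__main__":
    print(same_subnet(CLIENT_IP, SERVER_IP, PREFIX_LEN))
```

With a /24 the same two addresses would land in different networks, so the tcp NIDs only share a subnet because the prefix is /23.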
So far so good.
I'm able to lnetctl ping in both directions:
Ping the client:
lnetctl ping xxx.xxx.4.xxx@tcp
ping:
    - primary nid: xxx.xxx.4.xxx@tcp
      Multi-Rail: True
      peer ni:
        - nid: xxx.xxx.4.xxx@tcp
Ping the server:
lnetctl ping xxx.xxx.5.xxx@tcp
ping:
    - primary nid: xxx.xxx.5.xxx@tcp
      Multi-Rail: True
      peer ni:
        - nid: 10.149.0.183@o2ib
        - nid: xxx.xxx.5.xxx@tcp
But the mount fails. Here is the output from dmesg (are there other
sources of debug information?):
LustreError: 25758:0:(ldlm_lib.c:494:client_obd_setup()) can't add initial connection
LustreError: 25758:0:(obd_config.c:559:class_setup()) setup scratch-MDT0000-mdc-ffff8b63003d4000 failed (-2)
LustreError: 25758:0:(obd_config.c:1835:class_config_llog_handler()) MGCxxx.xxx.5.xxx@tcp: cfg command failed: rc = -2
Lustre: cmd=cf003 0:scratch-MDT0000-mdc 1:scratch-MDT0000_UUID 2:10.149.0.183@o2ib
LustreError: 15c-8: MGC160.45.5.246@tcp: The configuration from log 'scratch-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 25734:0:(obd_config.c:610:class_cleanup()) Device 3 not setup
Lustre: Unmounted scratch-client
LustreError: 25734:0:(obd_mount.c:1604:lustre_fill_super()) Unable to mount (-2)
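To make my reading of this log explicit, here is a small Python sketch (an illustration only, not a Lustre tool; the helper names are made up). It decodes rc = -2 as an errno and checks whether the NID in the cmd=cf003 line is on an LNet network for which the client has a local NI. The client networks are copied from the "lnetctl net show" output of the test node.

```python
import errno
import os

# LNet networks with a local NI on the test client (from "lnetctl net show").
CLIENT_NETS = {"lo", "tcp"}

# NID the client is given for scratch-MDT0000 (last argument of the cmd=cf003 line).
MDT_NID = "10.149.0.183@o2ib"


def decode_rc(rc: int) -> str:
    """Translate a negative kernel return code into its errno name and message."""
    code = abs(rc)
    return f"{errno.errorcode.get(code, 'UNKNOWN')}: {os.strerror(code)}"


def nid_network(nid: str) -> str:
    """Return the network part of a NID of the form address@network."""
    _addr, _sep, net = nid.rpartition("@")
    return net


def locally_reachable(nid: str, local_nets: set) -> bool:
    """True if the NID is on a network this node has a local NI for."""
    return nid_network(nid) in local_nets


if __name__ == "__main__":
    print(decode_rc(-2))
    print(MDT_NID, "on a local net:", locally_reachable(MDT_NID, CLIENT_NETS))
```

If this reading is right, the client is handed an o2ib NID for the MDT although it only has a tcp NI, which would fit the "can't add initial connection" message. I do not know whether that is the actual cause.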
Does anyone have ideas or pointers to reference documentation on this
topic?
Do I need to set up routing with "lnetctl route"?
Do I need "lnetctl peer add ..." to make the Lustre servers and
clients known to each other?
Any hints are welcome!
Kind regards,
Philipp
--
Philipp Grau | Freie Universitaet Berlin
[email protected] | FU-IT - Infrastruktur
Tel: +49 (30) 838 56583 | Fabeckstr. 32
Fax: +49 (30) 838 56721 | 14195 Berlin