Hi,

We also stumbled into this. It is described here:
https://jira.whamcloud.com/browse/LU-11840

The best workaround we found was to disable discovery on 2.12 clients:

# lnetctl set discovery 0

Cheers,
Hans Henrik

On 04.07.2020 09.26, Tung-Han Hsieh wrote:
> Dear All,
>
> We have Lustre servers (MDS, OSS) with Lustre-2.10.7 installed, with
> both tcp and o2ib interfaces:
>
> [ 193.016516] Lustre: Lustre: Build Version: 2.10.7
> [ 193.486408] LNet: Added LNI 192.168.62.151@o2ib [8/256/0/180]
> [ 193.538200] LNet: Added LNI 192.168.60.151@tcp [8/256/0/180]
> [ 193.538372] LNet: Accept secure, port 988
>
> We have several clients, all with Lustre-2.12.4. Some have both tcp
> and o2ib interfaces. These clients can mount Lustre server with o2ib
> interface without any problem, i.e.,
>
> mount -t lustre -o flock 192.168.62.151@o2ib:/chome /home
> (this is OK)
>
> However, we have another client with Lustre-2.12.4, too, which only
> has tcp interface. It cannot mount server through tcp interface:
>
> mount -t lustre -o flock 192.168.60.151@tcp:/chome /home
> (this is failed with "Input/output error, Is the MGS running ?")
>
> Checking the dmesg message of this client, it reads:
>
> =========================================================================
> [3106477.006512] LNetError:
> 15970:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to
> 192.168.62.151@o2ib from <?>
> [3106483.142436] LustreError:
> 122230:0:(mgc_request.c:249:do_config_log_add()) MGC192.168.60.151@tcp:
> failed processing log, type 1: rc = -5
> [3106492.293968] LustreError: 122238:0:(mgc_request.c:599:do_requeue())
> failed processing log: -5
> [3106513.861586] LustreError: 15c-8: MGC192.168.60.151@tcp: The configuration
> from log 'chome-client' failed (-5). This may be the result of communication
> errors between this node and the MGS, a bad configuration, or other errors.
> See the syslog for more information.
> [3106513.862052] Lustre: Unmounted chome-client
> [3106513.862281] LustreError: 122230:0:(obd_mount.c:1608:lustre_fill_super())
> Unable to mount (-5)
> =========================================================================
>
> Surprisingly that, although I have specified the tcp interface to
> mount, but Lustre itself still tries to mount with o2ib interface.
>
> I also tested whether LNet works or not.
> (Server NID: 192.168.60.151@tcp, Client NID: 192.168.60.30@tcp)
>
> From the server side:
> # /opt/lustre/sbin/lctl ping 192.168.60.30
> 12345-0@lo
> 12345-192.168.60.30@tcp
>
> From the client side:
> # /opt/lustre/sbin/lctl ping 192.168.60.151
> 12345-0@lo
> 12345-192.168.62.151@o2ib
> 12345-192.168.60.151@tcp
>
> Hence it looks fine.
>
> The module options (/etc/modprobe.d/lustre.conf) for server and client are:
> - Server:
> options lnet networks="o2ib0(ib0),tcp0(eth0)"
> - Client:
> options lnet networks="tcp0(eth0)"
>
> The building options for server and client are:
> - Server (Lustre-2.10.7):
> ./configure --prefix=/opt/lustre \
> --with-linux=<linux_kernel_path> \
> --with-o2ib=<compat-rdma-path>
>
> - Client (Lustre-2.12.4):
> ./configure --prefix=/opt/lustre \
> --with-linux=<linux_kernel_path> \
> --with-o2ib=no \
> --disable-server
>
> Could anyone suggest how to solve this problem ?
>
>
> Thanks very much.
>
>
> T.H.Hsieh
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
