What error do you get when you run "modprobe lnet"?

--Rick
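A side note on the commands in the quoted message: as far as I can tell from the depmod man page, the positional argument to depmod is a kernel version, not a module name, so "depmod lnet" would complain about "lnet" as a version string whether or not the module itself is fine. Separately, the question of whether a kernel source tree exists for the running kernel can be checked directly. The following is only a rough sketch under stated assumptions: has_kernel_tree is a made-up helper name, and /usr/src/kernels is assumed as the location because that is the directory listed in the quoted message.

```shell
#!/bin/sh
# Sketch: report whether a kernel source/devel tree exists for a given release.
# has_kernel_tree is a hypothetical helper, not part of Lustre or kmod.
has_kernel_tree() {
    release="$1"                       # e.g. the output of `uname -r`
    src_root="${2:-/usr/src/kernels}"  # assumed location of kernel-devel trees
    if [ -d "$src_root/$release" ]; then
        echo "match: $src_root/$release"
        return 0
    fi
    echo "no tree for $release under $src_root; found:"
    ls -1 "$src_root" 2>/dev/null
    return 1
}

# Usage on the affected host (left commented out so the file can be sourced):
# has_kernel_tree "$(uname -r)"
# depmod -a           # rebuild module dependency data for the running kernel
# modprobe -v lnet    # retry the load and note the exact error text
```

With the versions quoted below (running 4.18.0-477.10.1, source tree 4.18.0-477.27.1), the helper would take the "no tree" branch. Whether that mismatch is what actually stops lnet from loading is something only the modprobe error can confirm.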

On 9/26/23, 12:29 PM, "lustre-discuss on behalf of Jan Andersen" <[email protected] on behalf of [email protected]> wrote:

I have come a bit further with this problem - it seems the lnet module can't load:

    [root@rocky8 lustre-release]# depmod lnet
    depmod: ERROR: Bad version passed lnet

I deleted the VMs and reinstalled Rocky 8.8, then built lustre 2.15.3 and installed it, everything without any error messages. I haven't been able to find any indication of what this message means through google, but I assume it would mean that the kernel source doesn't match the running kernel? But how well must they match? This is my running kernel:

    [root@rocky8 lustre]# uname -r
    4.18.0-477.10.1.el8_8.x86_64

And this is the kernel source:

    [root@rocky8 lustre]# ll /usr/src/kernels
    total 4
    drwxr-xr-x. 23 root root 4096 Sep 26 12:34 4.18.0-477.27.1.el8_8.x86_64/

IOW, they diverge just after '477.' - is that the problem?

/jan

Hi,

I've built and installed lustre on two VirtualBoxes running Rocky 8.8 and formatted one as the MGS/MDS and the other as OSS, following a presentation from Oak Ridge National Laboratory: "Creating a Lustre Test System from Source with Virtual Machines" (sorry, no link; it was a while ago I downloaded them).

I can mount the filesystems on the MDS, but when I try from the OSS, it just times out - from dmesg:

    [root@oss1 log]# dmesg | grep -i lustre
    [ 564.028680] Lustre: Lustre: Build Version: 2.15.58_42_ga54a206
    [ 625.567672] LustreError: 15f-b: lustre-OST0000: cannot register this server with the MGS: rc = -110. Is the MGS running?
    [ 625.567767] LustreError: 1789:0:(tgt_mount.c:2216:server_fill_super()) Unable to start targets: -110
    [ 625.567851] LustreError: 1789:0:(tgt_mount.c:1752:server_put_super()) no obd lustre-OST0000
    [ 625.567894] LustreError: 1789:0:(tgt_mount.c:132:server_deregister_mount()) lustre-OST0000 not registered
    [ 625.588244] Lustre: server umount lustre-OST0000 complete
    [ 625.588251] LustreError: 1789:0:(tgt_mount.c:2365:lustre_tgt_fill_super()) Unable to mount (-110)

Both 'nmap' and 'netstat -nap' show that there is nothing listening on port 988:

    [root@mds ~]# netstat -nap | grep -i listen
    tcp    0    0 0.0.0.0:111    0.0.0.0:*    LISTEN    1/systemd
    tcp    0    0 0.0.0.0:22     0.0.0.0:*    LISTEN    806/sshd
    tcp6   0    0 :::111         :::*         LISTEN    1/systemd
    tcp6   0    0 :::22          :::*         LISTEN    806/sshd

What should be listening on 988?

/jan

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
