Dear Lustre Community,

For some time now, we have not been able to mount ZFS-based snapshots for one of our filesystems. Some details about the setup:
- OS: CentOS 7, kernel 3.10.0-1127.8.2.el7_lustre.x86_64
- Lustre: 2.12.5
- Servers: 2x MDT, 11x OST
- Network: TCP only
- MDTs share machines with OSTs

We have two separate Lustre filesystems (meteo0 and meteo1), both running on the same hardware. One of them (meteo1) works perfectly, including mounting snapshots; for the other one (meteo0), we are no longer able to mount snapshots. The only difference I know of is that we registered a changelog user on the filesystem that can no longer mount snapshots. Removing the changelog user did not change the situation.

Mounting a snapshot on a client just hangs forever, without anything meaningful in the system log. Once the mount process is cancelled with ctrl-c, the following shows up in the logs:

kernel: LustreError: 21966:0:(lmv_obd.c:1415:lmv_statfs()) 7a21a072-MDT0000-mdc-ffff9988697c8800: can't stat MDS #0: rc = -11
kernel: LustreError: 21966:0:(lov_obd.c:839:lov_cleanup()) 7a21a072-clilov-ffff9988697c8800: lov tgt 0 not cleaned! deathrow=0, lovrc=1
kernel: LustreError: 21966:0:(lov_obd.c:839:lov_cleanup()) Skipped 10 previous similar messages
kernel: Lustre: Unmounted 7a21a072-client
kernel: LustreError: 21966:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-11)

On the server side, mounting looks just normal:

kernel: Lustre: 7a21a072-MDT0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
kernel: Lustre: 7a21a072-MDT0000: nosquash_nids set to 10.153.52.[28-41]@tcp ...
kernel: Lustre: 7a21a072-MDT0000: root_squash is set to 65534:65534
kernel: Lustre: 7a21a072-MDT0000: set dev_rdonly on this device

A network problem seems unlikely, because the actual filesystem that the snapshots were made from works normally; clients can mount it without any problems. I tried to look at debug messages on the client to verify that there is no connection problem:

lctl set_param debug=+rpctrace; lctl set_param debug=+net; lctl clear
lctl mark "debug start"
mount -t lustre -o ro met-sv-lustre@tcp:/7a21a072 /mnt
lctl mark "debug finish"
lctl set_param debug=-rpctrace; lctl set_param debug=-net
lctl dk > /tmp/log

The connection to the MGS works, and the connection to the second MDT works as well:

00000100:00080000:8.0:1595229534.555839:0:6848:0:(import.c:86:import_set_state_nolock()) 00000000e70957a2 MGS: changing import state from CONNECTING to FULL
00000100:00080000:0.0:1595229534.934793:0:6848:0:(import.c:86:import_set_state_nolock()) 00000000ba745de3 7a21a072-MDT0001_UUID: changing import state from CONNECTING to FULL

But the connection to the first MDT fails, which is confusing, because the ZFS snapshot dataset is mounted on the server shown in the log message:

00000100:00100000:0.0:1595229534.934627:0:6848:0:(client.c:2719:ptlrpc_free_committed()) 7a21a072-MDT0000-mdc-ffff9981d3d1d800: committing for last_committed 0 gen 1
00000100:00080000:0.0:1595229534.934632:0:6848:0:(import.c:86:import_set_state_nolock()) 00000000d6a94009 7a21a072-MDT0000_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:0.0:1595229534.934635:0:6848:0:(import.c:1382:ptlrpc_connect_interpret()) recovery of 7a21a072-MDT0000_UUID on 10.153.52.30@tcp failed (-11)

If I try to mount the actual filesystem, which of course runs on the very same servers, then all connections are successful. On the server side, all ZFS snapshot datasets are mounted and no errors are shown: every MDT dataset and every OST dataset has a snapshot, and they are all mounted.
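One more data point we could collect on the client, to separate raw LNet reachability from the Lustre-level connect failure. This is only a minimal sketch, assuming the MDS NID 10.153.52.30@tcp from the log above:

# check that LNet itself can reach the MDS node
lctl ping 10.153.52.30@tcp

# while the snapshot mount hangs, inspect the failing import directly
lctl get_param mdc.7a21a072-MDT0000*.import

If lctl ping succeeds but the import stays in DISCONN, the -11 (-EAGAIN) is presumably being returned by the target itself rather than being caused by the network, which would match what we see with the regular filesystem mounting fine.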
Looking at the debug messages on the servers shows some connection problems:

00000100:00080000:19.0:1595238913.879579:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e66a1357800 7a21a072-MDT0001_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:19.0:1595238913.881596:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e46fc751000 7a21a072-OST0004_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.881734:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e46aa7cf000 7a21a072-OST0005_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.881946:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e5f4e923000 7a21a072-OST0006_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.882106:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e5f4e927800 7a21a072-OST0007_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.882444:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e5692d33000 7a21a072-OST0008_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.882587:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e5692d35800 7a21a072-OST0009_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.883326:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e36524e4800 7a21a072-OST000a_UUID: changing import state from CONNECTING to FULL
00000100:00080000:19.0:1595238913.889291:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e36524e6800 7a21a072-MDT0000_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:19.0:1595238918.888697:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e5692d34000 7a21a072-OST0003_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:19.0:1595238918.888703:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e66a1352800 7a21a072-OST0002_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:19.0:1595238918.888707:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e46aa7c9000 7a21a072-OST0001_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:19.0:1595238918.888709:0:9139:0:(import.c:86:import_set_state_nolock()) ffff8e5fb3f2f000 7a21a072-OST0000_UUID: changing import state from CONNECTING to DISCONN

Both MDTs and OST[0-3] are running on one HA server pair; the other OSTs, which connect successfully, are running on different server pairs.

Did anyone experience comparable problems? Any suggestions for what we could try next? We have already tried:

- Removing the changelog user with lctl changelog_deregister. This did not change anything, but I'm not sure it really removed all changelog-related information: if you register a new user, the number of the user is incremented, so after removing cl1, adding a new user results in cl2, not cl1 again. That means not all traces of cl1 have been removed. The size of the changelog, as seen by 'lctl get_param "*.*.changelog_size"', is also larger than zero after removing the changelog user. Any ideas how to completely remove the changelog users? (A sketch of the cleanup steps is shown after this list.)
- Running lfsck. Some things were corrected, but apparently nothing related to the snapshot problem.
- Rewriting the configuration using tunefs.lustre --writeconf. Also no effect.
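Since completely removing the changelog state is exactly the open question, here is a minimal sketch of the fuller cleanup we would try next. The device name meteo0-MDT0000 and the user id cl1 are assumptions standing in for the actual values; lfs changelog_clear runs on a node with the filesystem mounted, the lctl commands on the MDS:

# list registered changelog users and their current record indexes (on the MDS)
lctl get_param mdd.meteo0-MDT0000.changelog_users

# consume all outstanding records for user cl1 before deregistering;
# an endrec of 0 means "up to the current last record"
lfs changelog_clear meteo0-MDT0000 cl1 0

# now deregister the user itself (on the MDS)
lctl --device meteo0-MDT0000 changelog_deregister cl1

# verify: with no users left, this would ideally drop back to (near) zero
lctl get_param mdd.meteo0-MDT0000.changelog_size

If changelog_size still stays above zero after the last user is deregistered, that would at least narrow the question down to leftover changelog records on the MDT.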
Because the filesystem meteo1, which shares all servers with meteo0, works perfectly, and the only difference I know about is the changelog user, I would guess that the changelog user plays a role. But that is only speculation. I would be very happy to get some useful advice from you!

Robert

--
Dr. Robert Redl
Scientific Programmer, "Waves to Weather" (SFB/TRR165)
Meteorologisches Institut
Ludwig-Maximilians-Universität München
Theresienstr. 37, 80333 München, Germany
