+1 to the pacemaker/lustre startup problem after unexpected reboot (power loss in my case).
rocky9.5 + lustre(2.16.52) + pacemaker(2.1.8) + corosync(3.1.8) + pcs(0.11.8)

"pcs status" afer "pcs cluster start --all" shows the following errors: Failed Resource Actions: * lustre-mgt start on mds2-ha could not be executed (Timed Out: Resource agent did not complete within 20s) at Thu Mar 13 08:20:21 2025 after 20.001s * lustre-mdt00 start on mds2-ha could not be executed (Timed Out: Resource agent did not complete within 20s) at Thu Mar 13 08:20:21 2025 after 20.002s * lustre-mdt01 start on mds2-ha could not be executed (Timed Out: Resource agent did not complete within 20s) at Thu Mar 13 08:20:01 2025 after 20.001s * lustre-mgt start on mds1-ha could not be executed (Timed Out: Resource agent did not complete within 20s) at Thu Mar 13 08:20:01 2025 after 20.002s * lustre-mdt00 start on mds1-ha could not be executed (Timed Out: Resource agent did not complete within 20s) at Thu Mar 13 08:20:01 2025 after 20.001s * lustre-mdt01 start on mds1-ha could not be executed (Timed Out: Resource agent did not complete within 20s) at Thu Mar 13 08:20:21 2025 after 20.001s /var/log/messages: Mar 13 08:20:01 mds1 Lustre(lustre-mgt)[4213]: INFO: Starting to mount /dev/mapper/mgt Mar 13 08:20:01 mds1 Lustre(lustre-mdt00)[4224]: INFO: Starting to mount /dev/mapper/mdt00 Mar 13 08:20:01 mds1 kernel: LDISKFS-fs warning (device dm-2): ldiskfs_multi_mount_protect:334: MMP interval 42 higher than expected, please wait. Mar 13 08:20:01 mds1 kernel: LDISKFS-fs warning (device dm-3): ldiskfs_multi_mount_protect:334: MMP interval 42 higher than expected, please wait. Mar 13 08:20:21 mds1 kernel: LDISKFS-fs warning (device dm-2): ldiskfs_multi_mount_protect:338: MMP startup interrupted, failing mount Mar 13 08:20:21 mds1 kernel: LustreError: 4222:0:(osd_handler.c:8348:osd_mount()) MGS-osd: can't mount /dev/mapper/mgt: -110 Mar 13 08:20:21 mds1 kernel: LustreError: 4222:0:(obd_config.c:777:class_setup()) setup MGS-osd failed (-110) Mar 13 08:20:21 mds1 kernel: LustreError: 4222:0:(obd_mount.c:193:lustre_start_simple()) MGS-osd setup error -110 Mar 13 08:20:21 mds1 kernel: LustreError: 4222:0:(tgt_mount.c:2203:server_fill_super()) Unable to start osd on /dev/mapper/mgt: -110 Mar 13 08:20:21 mds1 kernel: LustreError: 4222:0:(super25.c:170:lustre_fill_super()) llite: Unable to mount <unknown>: rc = -110 Mar 13 08:20:21 mds1 pacemaker-controld[3039]: error: Result of start operation for lustre-mgt on mds1-ha: Timed Out after 20s (Resource agent did not complete within 20s) Mar 13 08:20:21 mds1 kernel: LDISKFS-fs warning (device dm-3): ldiskfs_multi_mount_protect:338: MMP startup interrupted, failing mount The only way to proceed is to stop HA-cluster (and sometimes it just didnot stop - had to reset the server), manually mount mgt/mdt/ost, unmount, start HA-cluster. Same for both ldiskfs/zfs(2.2.7) backends. Another problem is the following error: # pcs resource describe ocf:lustre:Lustre Error: Unable to process agent 'ocf:lustre:Lustre' as it implements unsupported OCF version '1.0.1', supported versions are: '1.0', '1.1' Error: Errors have occurred, therefore pcs is unable to continue Was thinking to increase some lustre resource agent timeouts (because it seems start timeout = 20s, MMP interval complains at 42), but it seems not possible because of pcs' error above. Thanks, Alex чт, 6 мар. 2025 г. в 19:20, Cameron Harr via lustre-discuss <lustre-discuss@lists.lustre.org>: > > To add to this, instead of issuing a straight reboot, I prefer running > 'pcs stonith fence <node>' which will fail over resources appropriately > AND reboot the node (if doable) or otherwise power it off. 
Thanks,
Alex

On Thu, 6 Mar 2025 at 19:20, Cameron Harr via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:
>
> To add to this, instead of issuing a straight reboot, I prefer running
> 'pcs stonith fence <node>' which will fail over resources appropriately
> AND reboot the node (if doable) or otherwise power it off. The advantage
> to doing it this way is that it keeps Pacemaker in-the-know about the
> state of the node so it doesn't (usually) shoot it as it's trying to
> boot back up. When you're doing maintenance on a node without letting
> Pacemaker know about it, results can be unpredictable.
>
> Cameron
>
> On 3/5/25 2:12 PM, Laura Hild via lustre-discuss wrote:
> > I'm not sure what to say about how Pacemaker *should* behave, but I *can*
> > say I virtually never try to (cleanly) reboot a host from which I have not
> > already evacuated all resources, e.g. with `pcs node standby` or by putting
> > Pacemaker in maintenance mode and unmounting/exporting everything manually.
> > If I can't evacuate all resources and complete a lustre_rmmod, the host is
> > getting power-cycled.
> >
> > So maybe I can say, my guess would be that in the host's shutdown process,
> > stopping the Pacemaker service happens before filesystems are unmounted,
> > and that Pacemaker doesn't want to make an assumption whether its own
> > shut-down means it should standby or initiate maintenance mode, and
> > therefore the other host ends up knowing only that its partner has
> > disappeared, while the filesystems have yet to be unmounted.
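That makes sense - for reference, with our node names the evacuate-first workflow Cameron and Laura describe would look roughly like this (a sketch only, still to be verified on this cluster):

# drain the node and wait until "pcs status" shows no resources left on it
pcs node standby mds1-ha
pcs status

# on mds1 itself, once all targets are unmounted, unload Lustre modules and reboot
lustre_rmmod
reboot

# when the node is back, allow it to host resources again
pcs node unstandby mds1-ha

# or, if the node cannot be evacuated cleanly, have the cluster fence it
# (run from the surviving node)
pcs stonith fence mds1-ha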