+1 to the pacemaker/lustre startup problem after unexpected reboot
(power loss in my case).
rocky9.5 + lustre(2.16.52) + pacemaker(2.1.8) + corosync(3.1.8) + pcs(0.11.8)

"pcs status" afer "pcs cluster start --all" shows the following errors:
Failed Resource Actions:
  * lustre-mgt start on mds2-ha could not be executed (Timed Out:
Resource agent did not complete within 20s) at Thu Mar 13 08:20:21
2025 after 20.001s
  * lustre-mdt00 start on mds2-ha could not be executed (Timed Out:
Resource agent did not complete within 20s) at Thu Mar 13 08:20:21
2025 after 20.002s
  * lustre-mdt01 start on mds2-ha could not be executed (Timed Out:
Resource agent did not complete within 20s) at Thu Mar 13 08:20:01
2025 after 20.001s
  * lustre-mgt start on mds1-ha could not be executed (Timed Out:
Resource agent did not complete within 20s) at Thu Mar 13 08:20:01
2025 after 20.002s
  * lustre-mdt00 start on mds1-ha could not be executed (Timed Out:
Resource agent did not complete within 20s) at Thu Mar 13 08:20:01
2025 after 20.001s
  * lustre-mdt01 start on mds1-ha could not be executed (Timed Out:
Resource agent did not complete within 20s) at Thu Mar 13 08:20:21
2025 after 20.001s

/var/log/messages:
Mar 13 08:20:01 mds1 Lustre(lustre-mgt)[4213]: INFO: Starting to mount
/dev/mapper/mgt
Mar 13 08:20:01 mds1 Lustre(lustre-mdt00)[4224]: INFO: Starting to
mount /dev/mapper/mdt00
Mar 13 08:20:01 mds1 kernel: LDISKFS-fs warning (device dm-2):
ldiskfs_multi_mount_protect:334: MMP interval 42 higher than expected,
please wait.
Mar 13 08:20:01 mds1 kernel: LDISKFS-fs warning (device dm-3):
ldiskfs_multi_mount_protect:334: MMP interval 42 higher than expected,
please wait.
Mar 13 08:20:21 mds1 kernel: LDISKFS-fs warning (device dm-2):
ldiskfs_multi_mount_protect:338: MMP startup interrupted, failing
mount
Mar 13 08:20:21 mds1 kernel: LustreError:
4222:0:(osd_handler.c:8348:osd_mount()) MGS-osd: can't mount
/dev/mapper/mgt: -110
Mar 13 08:20:21 mds1 kernel: LustreError:
4222:0:(obd_config.c:777:class_setup()) setup MGS-osd failed (-110)
Mar 13 08:20:21 mds1 kernel: LustreError:
4222:0:(obd_mount.c:193:lustre_start_simple()) MGS-osd setup error
-110
Mar 13 08:20:21 mds1 kernel: LustreError:
4222:0:(tgt_mount.c:2203:server_fill_super()) Unable to start osd on
/dev/mapper/mgt: -110
Mar 13 08:20:21 mds1 kernel: LustreError:
4222:0:(super25.c:170:lustre_fill_super()) llite: Unable to mount
<unknown>: rc = -110
Mar 13 08:20:21 mds1 pacemaker-controld[3039]: error: Result of start
operation for lustre-mgt on mds1-ha: Timed Out after 20s (Resource
agent did not complete within 20s)
Mar 13 08:20:21 mds1 kernel: LDISKFS-fs warning (device dm-3):
ldiskfs_multi_mount_protect:338: MMP startup interrupted, failing
mount

The only way to proceed is to stop the HA cluster (sometimes it does
not even stop and the server has to be reset), manually mount the
mgt/mdt/ost targets, unmount them, and then start the HA cluster
again. The behaviour is the same for both ldiskfs and zfs (2.2.7)
backends.
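For completeness, the manual recovery sequence looks roughly like
this (the mount points are just examples from my setup; when run by
hand the mounts simply wait out the MMP delay instead of being killed
after 20s by the resource agent timeout):

# pcs cluster stop --all
# mount -t lustre /dev/mapper/mgt /mnt/mgt
# umount /mnt/mgt
# mount -t lustre /dev/mapper/mdt00 /mnt/mdt00
# umount /mnt/mdt00
(same for mdt01 and the OSTs)
# pcs cluster start --all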

Another problem is the following error:
# pcs resource describe ocf:lustre:Lustre
Error: Unable to process agent 'ocf:lustre:Lustre' as it implements
unsupported OCF version '1.0.1', supported versions are: '1.0', '1.1'
Error: Errors have occurred, therefore pcs is unable to continue

I was thinking of increasing some of the Lustre resource agent
timeouts (the start timeout is 20s, while the MMP warning reports an
interval of 42s, so the mount cannot possibly complete in time), but
that does not seem possible because of the pcs error above.
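For what it's worth, this is roughly what I plan to try next
(untested; the device path and timeout values below are only examples
from my setup). First confirm the on-disk MMP update interval with
tune2fs from Lustre's e2fsprogs:

# tune2fs -l /dev/mapper/mgt | grep -i mmp

and then raise the start/stop timeouts of each resource to well above
twice that interval, e.g.:

# pcs resource update lustre-mgt op start timeout=300s
# pcs resource update lustre-mgt op stop timeout=300s

(and the same for lustre-mdt00/lustre-mdt01). If pcs refuses to touch
the resources because of the OCF version error above, a quick-and-dirty
workaround might be to edit the agent itself (on my nodes it lives at
/usr/lib/ocf/resource.d/lustre/Lustre) and change the version it
reports in its meta-data from 1.0.1 to 1.0 until the agent is fixed
upstream - no idea yet whether that has side effects.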

Thanks,
Alex



On Thu, Mar 6, 2025 at 19:20, Cameron Harr via lustre-discuss
<lustre-discuss@lists.lustre.org> wrote:
>
> To add to this, instead of issuing a straight reboot, I prefer running
> 'pcs stonith fence <node>' which will fail over resources appropriately
> AND reboot the node (if doable) or otherwise power it off. The advantage
> to doing it this way is that it keeps Pacemaker in-the-know about the
> state of the node so it doesn't (usually) shoot it as it's trying to
> boot back up. When you're doing maintenance on a node without letting
> Pacemaker know about it, results can be unpredictable.
>
> Cameron
>
> On 3/5/25 2:12 PM, Laura Hild via lustre-discuss wrote:
> > I'm not sure what to say about how Pacemaker *should* behave, but I *can* 
> > say I virtually never try to (cleanly) reboot a host from which I have not 
> > already evacuated all resources, e.g. with `pcs node standby` or by putting 
> > Pacemaker in maintenance mode and unmounting/exporting everything manually. 
> >  If I can't evacuate all resources and complete a lustre_rmmod, the host is 
> > getting power-cycled.
> >
> > So maybe I can say, my guess would be that in the host's shutdown process, 
> > stopping the Pacemaker service happens before filesystems are unmounted, 
> > and that Pacemaker doesn't want to make an assumption whether its own 
> > shut-down means it should standby or initiate maintenance mode, and 
> > therefore the other host ends up knowing only that its partner has 
> > disappeared, while the filesystems have yet to be unmounted.
> >