On 9/15/20 11:35 AM, Alexandre DERUMIER wrote:
> Hi,
> 
> I have finally reproduced it!
> 
> But this is with a cron job restarting corosync every minute on node1.
>
> Then the lrm was stuck for around 60s, and softdog was
> triggered on multiple other nodes.
> 
> Here are the logs with full corosync debug at the time of the last corosync restart.
> 
> node1 (where corosync is restarted every minute)
> https://gist.github.com/aderumier/c4f192fbce8e96759f91a61906db514e
> 
> node2
> https://gist.github.com/aderumier/2d35ea05c1fbff163652e564fc430e67
> 
> node5
> https://gist.github.com/aderumier/df1d91cddbb6e15bb0d0193ed8df9273
> 
> I'll prepare logs from the previous corosync restart, as the lrm seems to
> have already been stuck before then.

Yeah, that would be good. The lrm does seem to get stuck at around 10:46:21,
i.e. 65 seconds before this message was logged:

> Sep 15 10:47:26 m6kvm2 pve-ha-lrm[3736]: loop take too long (65 seconds)
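
As a rough sanity check on that timestamp, here is a minimal sketch that subtracts the reported loop duration from the time the message was logged. The helper name `stall_start` and the regular expression are only illustrative and not part of pve-ha-manager. The year is an assumption too, since syslog timestamps do not carry one.

```python
import re
from datetime import datetime, timedelta

# Log line quoted above, copied verbatim.
LOG_LINE = "Sep 15 10:47:26 m6kvm2 pve-ha-lrm[3736]: loop take too long (65 seconds)"

# Hypothetical pattern: syslog timestamp, then the reported loop duration.
PATTERN = re.compile(
    r"(\w{3} +\d+ \d{2}:\d{2}:\d{2}) .*loop take too long \((\d+) seconds\)"
)


def stall_start(line, year=2020):
    """Estimate when the slow loop iteration began.

    Returns None if the line is not a 'loop take too long' message.
    The year is assumed, because syslog timestamps omit it.
    """
    match = PATTERN.match(line)
    if match is None:
        return None
    logged_at = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
    # The message is logged at the end of the iteration, so go back by its duration.
    return logged_at - timedelta(seconds=int(match.group(2)))


print(stall_start(LOG_LINE).strftime("%H:%M:%S"))  # 10:46:21
```

This only gives an upper bound on when the loop iteration began; the actual blocking call may have started a bit later within that iteration.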


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
