Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

Alexandre DERUMIER Tue, 15 Sep 2020 03:16:42 -0700

Here are the logs from the previous restart:

node1 -> corosync restart at 10:46:15
-----
https://gist.github.com/aderumier/0992051d20f51270ceceb5b3431d18d7



node2
-----
https://gist.github.com/aderumier/eea0c50fefc1d8561868576f417191ba



node5
------
https://gist.github.com/aderumier/f2ce1bc5a93827045a5691583bbc7a37

----- Original Message -----
From: "Thomas Lamprecht" <t.lampre...@proxmox.com>
To: "aderumier" <aderum...@odiso.com>, "Proxmox VE development discussion" 
<pve-devel@lists.proxmox.com>
Cc: "dietmar" <diet...@proxmox.com>
Sent: Tuesday, 15 September 2020 11:46:51
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

On 9/15/20 11:35 AM, Alexandre DERUMIER wrote: 
> Hi, 
> 
> I have finally reproduced it! 
> 
> But this is with a corosync restart run from cron every minute, on node1. 
> 
> Then: the lrm was stuck for around 60s, and the softdog was 
> triggered on multiple other nodes. 
> 
> Here are the logs with full corosync debug at the time of the last corosync restart. 
> 
> node1 (where corosync is restarted every minute) 
> https://gist.github.com/aderumier/c4f192fbce8e96759f91a61906db514e 
> 
> node2 
> https://gist.github.com/aderumier/2d35ea05c1fbff163652e564fc430e67 
> 
> node5 
> https://gist.github.com/aderumier/df1d91cddbb6e15bb0d0193ed8df9273 
> 
> I'll prepare logs from the previous corosync restart, as the lrm seems to 
> have already been stuck before that. 

Yeah, that would be good, as the lrm does seem to get stuck at around 10:46:21:

> Sep 15 10:47:26 m6kvm2 pve-ha-lrm[3736]: loop take too long (65 seconds) 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
