>>But the pve logs look ok, and there is no indication
>>that we stopped updating the watchdog. So why did the
>>watchdog trigger? Maybe an IPMI bug?

do you mean an ipmi bug on all 13 servers at the same time ?
(I also have 2 supermicro servers in this cluster, but they use same ipmi 
watchdog driver. (ipmi_watchdog)



I had same kind of with bug once (when stopping a server), on another cluster, 
6 months ago.
This was without HA, but different version of corosync, and that time, I was 
really seeing quorum split in the corosync logs of the servers.


I'll try to reproduce with a virtual cluster with 14 nodes (don't have enough 
hardware)


Could I be a bug in proxmox HA code, where watchdog is not resetted by LRM 
anymore?

----- Mail original -----
De: "dietmar" <diet...@proxmox.com>
À: "aderumier" <aderum...@odiso.com>
Cc: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>, 
"pve-devel" <pve-de...@pve.proxmox.com>
Envoyé: Dimanche 6 Septembre 2020 06:21:55
Objet: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

> >>So you are using ipmi hardware watchdog? 
> 
> yes, I'm using dell idrac ipmi card watchdog 

But the pve logs look ok, and there is no indication 
that we stopped updating the watchdog. So why did the 
watchdog trigger? Maybe an IPMI bug? 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

Reply via email to