I have reproduced it again, with the core dump this time.

restart corosync: 17:05:27

http://odisoweb1.odiso.net/pmxcfs-corosync2.log

bt full

https://gist.github.com/aderumier/466dcc4aedb795aaf0f308de0d1c652b

coredump

http://odisoweb1.odiso.net/core.7761.gz

----- Original Message -----
From: "Thomas Lamprecht" <t.lampre...@proxmox.com>
To: "aderumier" <aderum...@odiso.com>, "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Sent: Wednesday, 16 September 2020 16:45:12
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

On 9/16/20 3:15 PM, Alexandre DERUMIER wrote:
> I have reproduced it again, with pmxcfs in debug mode
>
> corosync restart at 15:02:10, and it was already blocked on other nodes at
> 15:02:12
>
> The pmxcfs was still logging after the lock.
>
>
> here is the log on node1, where corosync has been restarted
>
> http://odisoweb1.odiso.net/pmxcfs-corosync.log
>

thanks for those, I need a bit to sift through them. Seems like either dfsm
gets out of sync or we do not get an ACK reply from cpg_send.

A full core dump would still be nice, in gdb:
generate-core-file

PS: instead of manually switching to threads you can do:
thread apply all bt full

to get a backtrace for all threads in one command

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
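
For reference, the two gdb commands mentioned above (thread apply all bt full, generate-core-file) can probably also be run non-interactively in one go. A minimal sketch, assuming gdb is installed and allowed to attach to the process; the PID and the output path below are hypothetical placeholders, not values from this thread:

```python
import shlex


def gdb_dump_command(pid, core_path):
    """Build a gdb batch invocation that attaches to `pid`, prints a full
    backtrace of every thread, then writes a core file to `core_path`."""
    return [
        "gdb", "-p", str(pid), "-batch",
        "-ex", "thread apply all bt full",
        "-ex", f"generate-core-file {core_path}",
    ]


if __name__ == "__main__":
    # Hypothetical PID and path; substitute the real pmxcfs PID.
    print(shlex.join(gdb_dump_command(1234, "/tmp/core.pmxcfs")))
```

This only prints the command line; it can be pasted into a shell on the affected node, or passed to subprocess.run() if it should be executed from a script.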