--- Begin Message ---
Greetings,
I was pointed here to discuss the StorPool storage plugin[0] with the
dev team.
If I understand correctly, there is a concern with our HA watchdog
daemon, and I'd like to explain why it exists and how it works.
As a distributed storage system, StorPool has its own internal
clustering mechanisms. It can run on networks that are independent of
the PVE cluster network, and thus remain unaffected by network
partitions or other problems that would cause the standard PVE watchdog
to reboot a node.
In the case of HCI (compute + storage) nodes, this reboot can interrupt
the normal operation of the StorPool cluster, causing reduced
performance or downtime that could be avoided if the host were not
restarted.
This is why we do our best to avoid such behavior across the different
cloud management platforms.
Currently, when our daemon detects an unexpected exit of a resource
manager, it will SIGKILL the PVE HA services and the running VMs on the
node, which should prevent two instances of the same VM from running at
the same time. PVE services and our block storage client daemon are
restarted as well.
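To make the sequence above easier to discuss, here is a rough sketch in
Python. It is an illustration only, not our daemon's code: the function
names, the way PIDs are passed in, the restart callback, and the
ordering (kill first, restart afterwards) are all assumptions made for
this example.

```python
import os
import signal
from typing import Callable, Iterable, List


def sigkill_all(
    pids: Iterable[int],
    kill: Callable[[int, int], None] = os.kill,
) -> List[int]:
    """Send SIGKILL to every PID and return the PIDs that were signalled."""
    killed = []
    for pid in pids:
        try:
            kill(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            # The process is already gone, so there is nothing left to stop.
            continue
    return killed


def handle_unexpected_exit(
    ha_pids: Iterable[int],
    vm_pids: Iterable[int],
    restart: Callable[[], None],
    kill: Callable[[int, int], None] = os.kill,
) -> List[int]:
    """Hypothetical reaction to an unexpected resource manager exit.

    HA services and VMs are killed first so that no VM keeps running
    locally; only then is the (hypothetical) restart callback invoked.
    """
    killed = sigkill_all(list(ha_pids) + list(vm_pids), kill)
    restart()
    return killed
```

The kill function is injectable so the logic can be exercised without
signalling real processes. How the PIDs are discovered and how the
services are restarted are left out on purpose.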
We're open to discussion and suggestions for our approach and
implementation.
[0] https://github.com/storpool/pve-storpool
--
Ivaylo Markov
Quality & Automation Engineer
StorPool Storage
https://www.storpool.com
--- End Message ---
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel