Sure. I can propose a new user story. Are you thinking of including this user story in the scope of what Masakari would be looking at?
Greg.

From: Adam Spiers <aspi...@suse.com>
Reply-To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
Date: Wednesday, May 17, 2017 at 10:08 AM
To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat / Healthcheck Monitoring

Thanks for the clarification Greg. This sounds like it has the potential to be a very useful capability. May I suggest that you propose a new user story for it, along similar lines to this existing one?

http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html

Waines, Greg <greg.wai...@windriver.com> wrote:

Yes that’s correct. VM Heartbeating / Health-check Monitoring would introduce intrusive / white-box type monitoring of VMs / Instances.

I realize this is somewhat in the gray-zone of what a cloud should be monitoring or not, but I believe it provides an alternative for Applications deployed in VMs that do not have an external monitoring/management entity like a VNF Manager in the MANO architecture. And even for VMs with VNF Managers, it provides a highly reliable alternate monitoring path that does not rely on Tenant Networking.

You’re correct, that VM HB/HC Monitoring would leverage https://wiki.libvirt.org/page/Qemu_guest_agent that would require the agent to be installed in the images for talking back to the compute host. (there are other examples of similar approaches in openstack ... the murano-agent for installation, the swift-agent for object store management)

Although here, in the case of VM HB/HC Monitoring, via the QEMU Guest Agent, the messaging path is internal thru a QEMU virtual serial device. i.e. a very simple interface with very few dependencies ... it’s up and available very early in VM lifecycle and virtually always up.
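As a rough sketch of the messaging path described above (an illustration only, not the implementation proposed in this thread): the QEMU Guest Agent speaks JSON over its channel, and its `guest-ping` command answers with an empty `return` object. The function name `guest_ping`, the timeout value, the newline framing of the reply, and the use of an already-connected socket are assumptions made for this example; how the host obtains the channel is deliberately left out.

```python
import json
import socket
import threading


def guest_ping(chan: socket.socket, timeout: float = 2.0) -> bool:
    """Send one guest-ping on an already-connected channel.

    Returns True only if a well-formed reply containing "return" arrives
    before the timeout. Any timeout, closed channel, or malformed reply
    is treated as a missed heartbeat.
    """
    chan.settimeout(timeout)
    try:
        request = json.dumps({"execute": "guest-ping"}).encode() + b"\n"
        chan.sendall(request)
        buf = b""
        # Assumption: the reply is a single newline-terminated JSON object.
        while b"\n" not in buf:
            chunk = chan.recv(4096)
            if not chunk:
                return False  # channel closed by the other side
            buf += chunk
        reply = json.loads(buf.split(b"\n", 1)[0])
    except (OSError, ValueError):
        # OSError covers socket timeouts; ValueError covers bad JSON.
        return False
    return isinstance(reply, dict) and "return" in reply


if __name__ == "__main__":
    # Stand-in for a guest: a thread that ACKs one request on a socketpair.
    host_end, guest_end = socket.socketpair()

    def fake_agent() -> None:
        guest_end.recv(4096)
        guest_end.sendall(b'{"return": {}}\n')

    worker = threading.Thread(target=fake_agent)
    worker.start()
    print("heartbeat answered:", guest_ping(host_end))
    worker.join()
```

A real monitor would call something like this periodically and only raise an alarm after several consecutive misses; that policy is outside what the thread specifies.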
Wrt failure modes / use-cases

* a VM’s response to a Heartbeat Challenge Request can be as simple as just ACK-ing, this alone allows for detection of:
  o a failed or hung QEMU/KVM instance, or
  o a failed or hung VM’s OS, or
  o a failure of the VM’s OS to schedule the QEMU Guest Agent daemon, or
  o a failure of the VM to route basic IO via linux sockets.
* I have had feedback that this is similar to the virtual hardware watchdog of QEMU/KVM ( https://libvirt.org/formatdomain.html#elementsWatchdog )
* However, the VM Heartbeat / Health-check Monitoring
  o provides a higher-level (i.e. application-level) heartbeating
    - i.e. if the Heartbeat requests are being answered by the Application running within the VM
  o provides more than just heartbeating, as the Application can use it to trigger a variety of audits,
  o provides a mechanism for the Application within the VM to report a Health Status / Info back to the Host / Cloud,
  o provides notification of the Heartbeat / Health-check status to higher-level cloud entities thru Vitrage
    - e.g. VM-Heartbeat-Monitor - to - Vitrage - (EventAlarm) - Aodh - ... - VNF-Manager - (StateChange) - Nova - ... - VNF Manager

Greg.

From: Adam Spiers <aspi...@suse.com>
Reply-To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
Date: Tuesday, May 16, 2017 at 7:29 PM
To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat / Healthcheck Monitoring

Waines, Greg <greg.wai...@windriver.com> wrote:

thanks for the pointers Sam. I took a quick look. I agree that the VM Heartbeat / Health-check looks like a good fit into Masakari.
Currently your instance monitoring looks like it is strictly black-box type monitoring thru libvirt events. Is that correct? i.e. you do not do any intrusive type monitoring of the instance thru the QEMU Guest Agent facility, correct?

That is correct:

https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/instancemonitor/instance.py

I think this is what VM Heartbeat / Health-check would add to Masakari. Let me know if you agree.

OK, so you are looking for something slightly different I guess, based on this QEMU guest agent?

https://wiki.libvirt.org/page/Qemu_guest_agent

That would require the agent to be installed in the images, which is extra work but I imagine quite easily justifiable in some scenarios. What failure modes do you have in mind for covering with this approach - things like the guest kernel freezing, for instance?

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev