Re: [openstack-dev] [vitrage] [nova] [HA] VM Heartbeat / Healthcheck Monitoring

Adam Spiers Tue, 16 May 2017 05:42:37 -0700

Afek, Ifat (Nokia - IL/Kfar Sava) <[email protected]> wrote:

On 16/05/2017, 4:36, "Sam P" <[email protected]> wrote:


   Hi Greg,

    In Masakari [0] for VMHA, we have already implemented a somewhat
   similar function in masakari-monitors.
    Masakari-monitors runs on the nova-compute node and monitors for host,
   process, or instance failures.
    The Masakari instance monitor has functionality similar to what you
   have described.
    Please see [1] for more details on instance monitoring.
    [0] https://wiki.openstack.org/wiki/Masakari
    [1] https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/instancemonitor

    Once masakari-monitors detects a failure, it sends a notification to
   masakari-api, which takes the appropriate recovery actions to recover
   that VM from the failure.


You can also find out more about our architectural plans by watching
this talk which Sampath and I gave in Boston:

  https://www.openstack.org/videos/boston-2017/high-availability-for-instances-moving-to-a-converged-upstream-solution

The slides are here:

  https://aspiers.github.io/openstack-summit-2017-boston-compute-ha/

We didn't go into much depth on monitoring and recovery of individual
VMs, but as Sampath explained, Masakari already handles both of these.

Hi Greg, Sam,

As Vitrage is about correlating alarms that come from different
sources, and is not a monitor by itself – I think that it can benefit
from information retrieved by both Masakari and Zabbix monitors.

Zabbix is already integrated into Vitrage. I don’t know if there are
specific tests for VM heartbeat, but I think it is very likely that
there are.  Regarding Masakari – looking at your documents, I believe
that integrating your monitoring information into Vitrage could be
quite straightforward.


Yes, this makes sense.  Masakari already cleanly decouples
monitoring/alerting from automated recovery, so it could support this
quite nicely.  And the modular converged architecture we explained in
the presentation will maintain that clean separation of
responsibilities whilst integrating Masakari together with other
components such as Pacemaker, Mistral, and maybe Vitrage too.

For example, whilst this thread has so far been about VM instance
monitoring, another area where Vitrage could integrate with Masakari
is compute host monitoring.

If you watch this part of our presentation where we explained the
next-generation architecture, you'll see that we propose a new
"nova-host-alerter" component which has a driver-based mechanism for
alerting different services when a compute host experiences a failure:

   https://youtu.be/YPKE1guti8E?t=32m43s

So one obvious possibility would be to add a driver for Vitrage, so
that Vitrage can be alerted when Pacemaker spots a host failure.

Similarly, we could extend Pacemaker configurations to alert Vitrage
when individual processes such as nova-compute or libvirtd fail.
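
To make the driver idea a bit more concrete, here is a minimal sketch of
what such a fan-out could look like.  Please treat it purely as an
illustration: "nova-host-alerter" is only a proposal at this point, and
every name below (HostFailure, AlertDriver, RecordingDriver, HostAlerter)
is hypothetical rather than taken from any existing code.  A real Vitrage
or Masakari driver would call the respective service's API where the
stand-in driver here just records the event.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class HostFailure:
    """Hypothetical event describing a failed compute host."""
    hostname: str
    reason: str


class AlertDriver(ABC):
    """Hypothetical interface; one implementation per service to be alerted."""

    name = "unnamed"

    @abstractmethod
    def alert(self, failure: HostFailure) -> None:
        """Tell the downstream service about the host failure."""


class RecordingDriver(AlertDriver):
    """Stand-in driver that records alerts instead of calling a real API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[HostFailure] = []

    def alert(self, failure: HostFailure) -> None:
        self.received.append(failure)


class HostAlerter:
    """Fans a single host failure out to every configured driver."""

    def __init__(self, drivers: Iterable[AlertDriver]) -> None:
        self._drivers = list(drivers)

    def host_failed(self, failure: HostFailure) -> List[str]:
        """Alert all drivers; return the names of any that raised."""
        failed = []
        for driver in self._drivers:
            try:
                driver.alert(failure)
            except Exception:
                # One broken driver must not stop the others being alerted.
                failed.append(driver.name)
        return failed


if __name__ == "__main__":
    masakari = RecordingDriver("masakari")
    vitrage = RecordingDriver("vitrage")
    alerter = HostAlerter([masakari, vitrage])
    alerter.host_failed(HostFailure("compute-1", "fenced by Pacemaker"))
    print(vitrage.received)
```

The point of the design is that whatever detects the failure (Pacemaker
in our proposal) only has to talk to one component, and adding Vitrage
as a consumer means adding one more driver rather than touching the
detection side.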

If you would like to discuss any of this further or have any more
questions, in addition to this mailing list we are also available to
talk on the #openstack-ha IRC channel!

Cheers,
Adam

P.S. I've added the [HA] badge to this thread since this discussion is
definitely related to high availability.

