Re: [Openstack-operators] Uptime and SLA's

Melvin Hillsman Thu, 02 Jun 2016 16:05:16 -0700

Hey Matt,

I am looking into Monasca and would appreciate your recommendations for 
resources on a) understanding and b) installing the project, especially 
since there is no install guide on the project wiki. Additionally, can you shed 
some light on whether this setup would run behind a load balancer in an HA 
configuration? I am looking at using three servers, which will house a 
“stack”/“toolchain” for such activities.


Kind regards,
--
Melvin Hillsman
Ops Technical Lead
OpenStack Innovation Center
[email protected]
phone: (210) 312-1267
mobile: (210) 413-1659
Learner | Ideation | Belief | Responsibility | Command
http://osic.org

From:  Matt Fischer
Date:  Thursday, June 2, 2016 at 5:29 PM
To:  "Kingshott, Daniel"
Cc:  OpenStack Operators
Subject:  Re: [Openstack-operators] Uptime and SLA's

We do this a few different ways, some of which may meet your needs.

For API calls we measure a simple, quick, and impactless call for each service 
(like heat stack-list) and we monitor East from West and vice versa. The goal 
here is that nothing gets added to the DBs, so nothing like neutron net-create. 
The downside is that some of these calls work even when the service isn't 100% 
healthy, so keep that in mind.
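
A minimal sketch of such a read-only probe, in plain Python. The names ProbeResult, run_probe, and heat_stack_list are hypothetical and not from any OpenStack library; the only thing taken from the thread is the idea of timing a call such as heat stack-list that writes nothing to the DBs. It assumes the heat CLI is installed and credentials are already in the environment.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    name: str             # label for the probed service
    ok: bool              # True if the call returned without raising
    seconds: float        # wall-clock duration of the call
    error: Optional[str] = None


def run_probe(name: str, call: Callable[[], object]) -> ProbeResult:
    """Time a single read-only call and record whether it succeeded."""
    start = time.perf_counter()
    try:
        call()
    except Exception as exc:  # any failure counts against availability
        return ProbeResult(name, False, time.perf_counter() - start, repr(exc))
    return ProbeResult(name, True, time.perf_counter() - start)


def heat_stack_list() -> None:
    """Assumed probe body: list stacks via the CLI, which creates nothing."""
    subprocess.run(["heat", "stack-list"], check=True,
                   capture_output=True, timeout=30)
```

Calling run_probe("heat", heat_stack_list) from each region against the other gives one success flag and one latency sample per run. As noted above, a passing probe only shows that the API answered, not that the service is fully healthy.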

Then we also have a set of "what would a user do" calls like "spin up a VM and 
attach a FIP and ssh in" or "create and delete a volume". These run less often. 
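
A rough sketch of how one of these user-style checks could be structured, again in plain Python. run_scenario and the (name, action, undo) step layout are assumptions made for illustration; the actions themselves (creating a volume, booting a VM, and so on) would be whatever client calls you already use.

```python
import time
from typing import Callable, Dict, List, Tuple

# Each step is (name, action, undo); undo removes whatever action created.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_scenario(steps: List[Step]) -> Tuple[bool, Dict[str, float]]:
    """Run steps in order, time each one, and always clean up afterwards."""
    timings: Dict[str, float] = {}
    undos: List[Callable[[], None]] = []
    ok = True
    try:
        for name, action, undo in steps:
            start = time.perf_counter()
            action()
            timings[name] = time.perf_counter() - start
            undos.append(undo)  # only completed steps are undone
    except Exception:
        ok = False              # stop at the first failing step
    finally:
        for undo in reversed(undos):
            try:
                undo()
            except Exception:
                ok = False      # leaked resources also count as a failure
    return ok, timings
```

Because these checks create real resources, the cleanup runs in reverse order even when a step fails. A step that fails halfway may still leave something behind, so a separate sweep for orphaned test resources is probably worth having.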

Finally we have a reference cloud application that uses our LBaaS, GSLB, HA 
routers, and multiple front-end/back-end nodes. This has the highest 
expectation of uptime and is used as an example for our customers of how you 
can run an app with "more nines" than the underlying infra.

On any of these, especially the first two I mentioned, time series data is 
super useful. It's good to know that your create volume times (for example) are 
40% slower after your deploy. We use Monasca and Grafana for that.
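
To illustrate the kind of comparison the time series makes possible, here is a small sketch that compares the median duration before and after a deploy. It is plain Python and does not use the Monasca or Grafana APIs; percent_slower and the sample values are made up for the example.

```python
from statistics import median
from typing import Sequence


def percent_slower(before: Sequence[float], after: Sequence[float]) -> float:
    """Percent change in median duration; positive means slower after."""
    base = median(before)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    return 100.0 * (median(after) - base) / base


# Hypothetical create-volume timings in seconds, not real measurements.
before_deploy = [10.0, 9.5, 10.5]
after_deploy = [14.0, 13.0, 15.0]
change = percent_slower(before_deploy, after_deploy)
```

The median is used here instead of the mean so that a single very slow run does not dominate the comparison. That is a design choice for the sketch, not something stated in this thread.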


On Thu, Jun 2, 2016 at 2:37 PM, Kingshott, Daniel 
<[email protected]> wrote:
We're currently in the process of writing up an internal SLA for our
OpenStack cloud. I'd be interested to hear what others have done and what
metrics folks are capturing.

My initial thoughts are success / fail spawning instances, creating and
attaching volumes, API availability and so on.

Can anyone on the list share their insights?

Thanks,

Dan


Daniel Kingshott
Cloud Dude
(425) 623 4359 - Cell

Best Buy Co. Inc.
Technology Development Center
1000 Denny Way | 8th Floor | Seattle, WA | 98109 | USA


_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

