Unfortunately, I can only confirm the sorry state of Ceilometer.
We tried it on a very small setup (6 compute nodes) and ran into so many issues 
that we dropped it and created our own solution based on a mix of scripts that 
read from the nova/neutron DB, iptables and collectd data. That way we need no 
more collection agents than the ones we already run for systems monitoring.

We tried the version in Havana and, later, in Icehouse. For starters, the 
documentation suggested MySQL as the default backend. MySQL will last just a 
few days and then break down under the size of the tables. We tried MongoDB, 
but were still not satisfied with the performance on such a small cluster.
Then there is the metering agent. It is yet another daemon, not integrated into 
Neutron, and there is no documentation about what it is actually measuring. 
What if I have multiple routers? Ingress and egress? From which point of view?
The same applies to Cinder: it requires an external agent (to be run via 
cron!).

Some metrics were not recorded and we couldn't understand why. Again, there was 
no documentation and no tooling to help us understand whether we were just 
missing some config options somewhere in nova-compute or whether there was some 
other problem with KVM/libvirt versions.
And even when we had some data and wanted to generate just a proof-of-concept 
report with some information about tenant resource usage, we found problems 
with the API. The fact that no one had bothered to write a simple proof of 
concept script that uses the API to actually do something useful was really 
off-putting.

We had to dig into libvirt to understand what some of the metrics actually mean.
We found that we could read those same metrics from our (more efficient, 
well-known) monitoring system.

For some time we ran just the agents and aggregated the data in an 
Elasticsearch instance through the UDP msgpack pipeline (more bugs: the message 
format is inconsistent, and different agents generate different fields in 
slightly different formats).
It works. But for our needs it was just too much work. Most of the data is 
already available from other sources with well-known APIs.
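
For illustration only, here is a minimal sketch of the kind of normalization 
step that such inconsistent records call for before they can be indexed. It 
assumes each message has already been decoded into a plain dict. The alias 
table and the canonical field names are hypothetical, chosen for the example, 
and are not a description of the actual Ceilometer message format.

```python
# Hypothetical alias table: canonical field -> names different agents might use.
# These names are assumptions for illustration, not Ceilometer's real schema.
ALIASES = {
    "resource_id": ("resource_id", "resource", "instance_id"),
    "name": ("name", "counter_name", "meter"),
    "volume": ("volume", "counter_volume", "value"),
    "timestamp": ("timestamp", "time"),
}


def normalize(sample):
    """Map one decoded sample dict onto a single canonical set of fields."""
    out = {}
    for canonical, candidates in ALIASES.items():
        # Take the first alias that is present; None if no alias matches.
        out[canonical] = next(
            (sample[key] for key in candidates if key in sample), None
        )
    # Volumes may arrive as strings or numbers; coerce to float when possible.
    if out["volume"] is not None:
        try:
            out["volume"] = float(out["volume"])
        except (TypeError, ValueError):
            out["volume"] = None
    return out
```

Fields that no alias matches come back as None rather than raising, so a 
single odd message does not stop the whole pipeline.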

Ah, also there is a long-standing bug open: Sahara and Ceilometer cannot be 
used together. And we use Sahara.

I opened bugs for some of these issues, but have since lost interest.

In the end, I think it really depends on what kind of data you need and what 
(developer) resources you can throw at the problem.
Unless things changed dramatically in Juno, Ceilometer will not work out of the 
box. You will lose time because of the non-existent documentation, you will 
have to develop code and scripts anyway, and finally you will have to create 
something between your billing system and the Ceilometer API because, to the 
best of my knowledge, there is nothing that uses it.

eBay has the resources to do all that. We don't.



-----Original Message-----
From: George Shuklin [mailto:george.shuk...@gmail.com] 
Sent: Thursday 12 February 2015 02:59
To: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] [Ceilometer] Real world experience with 
Ceilometer deployments - Feedback requested

Ceilometer is in a sad state.

1. The collector leaks memory. We ran it on the same host as mongo, and it 
grabbed 29 GB out of 32, leaving mongo with less than a gigabyte of memory.
2. The metering agent causes a huge load on neutron-server: O(n) in the number 
of metering rules and tenants. A few bugs are reported, one bugfix is in review.
3. The metering agent simply does not work on installations with multiple 
network nodes. It expects all routers to be on the same host. Fixed or not, I 
don't know; we have our own crude fix.
4. Many rough edges. Ceilometer is much less tested than nova. Sometimes it 
throws a traceback and skips counting. A fresh example: if metadata has '.' in 
the name, ceilometer throws a traceback on it and does not count it in the 
glance usage.
5. Very slow on reports (using mongo's mapreduce).
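
On the '.' problem in point 4: the cause is not stated here. Assuming (and this 
is only an assumption) that the traceback comes from a storage backend that 
rejects dotted key names, as MongoDB did for field names at the time, a crude 
workaround can be sketched as below. The function name and the replacement 
character are hypothetical, and this is not Ceilometer code.

```python
def sanitize_keys(value, replacement="_"):
    """Return a copy of value with '.' replaced in every dict key, recursively."""
    if isinstance(value, dict):
        return {
            str(key).replace(".", replacement): sanitize_keys(item, replacement)
            for key, item in value.items()
        }
    if isinstance(value, list):
        # Lists may hold nested dicts, so walk them as well.
        return [sanitize_keys(item, replacement) for item in value]
    # Scalars are returned unchanged.
    return value
```

Note that two keys such as 'a.b' and 'a_b' collapse into one after the 
replacement, so this is only safe when such collisions cannot occur.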

Overall feeling: barely usable, but given my experience with cloud billing 
systems, not the worst thing I have seen in my life.

About load: except for reporting and the memory leaks, it uses a rather small 
amount of resources.

On 02/11/2015 09:37 PM, Maish Saidel-Keesing wrote:
> Is Ceilometer ready for prime time?
>
> I would be interested in hearing from people who have deployed 
> OpenStack clouds with Ceilometer, and their experience. Some of the 
> topics I am looking for feedback on are:
>
> - Database Size
> - MongoDB management, Sharding, replica sets etc.
> - Replication strategies
> - Database backup/restore
> - Overall usability
> - Gripes, pains and problems (things to look out for)
> - Possible replacements for Ceilometer that you have used instead
>
>
> If you are willing to share - I am sure it will be beneficial to the 
> whole community.
>
> Thanks in Advance
>
>
> With best regards,
>
>
> Maish Saidel-Keesing
> Platform Architect
> Cisco
>
>
>
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

