CERN is running Ceilometer at scale, with many thousands of compute nodes. I think their blog [1] goes into some detail about it, but I don't have a direct link to the relevant post.

[1] - http://openstack-in-production.blogspot.com/

___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Bill Jones <[email protected]>
Date: Tuesday, June 14, 2016 at 9:03 AM
To: "openstack-oper." <[email protected]>
Subject: [Openstack-operators] Scaling Ceilometer compute agent?

Has anyone had any experience with scaling ceilometer compute agents? We're starting to see messages like this in logs for some of our compute agents:

WARNING ceilometer.openstack.common.loopingcall [-] task <function interval_task at 0x2092cf8> run outlasted interval by 293.25 sec

This is an indication that the compute agent failed to execute its pipeline processing within the allotted interval (in our case 10 min). The result of this is that fewer instance samples are generated per hour than expected, and this causes billing issues for us due to the way we calculate usage.

It looks like we have three options for addressing this: make the pipeline run faster, increase the interval time, or scale the compute agents. I'm investigating the latter. I think I read in the ceilometer architecture docs that the agents are designed to scale, but I don't see anything in the docs on how to facilitate that. Any pointers would be appreciated.

Thanks,
Bill
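
For a rough sense of how much the warning quoted above costs in samples, here is a minimal sketch. It assumes (this is not confirmed anywhere in the thread) that when a polling run outlasts its interval, the next run starts as soon as the late one finishes, so each cycle takes the interval plus the overrun. The function name `samples_per_hour` is made up for illustration and is not part of ceilometer.

```python
def samples_per_hour(interval_s: float, overrun_s: float = 0.0) -> float:
    """Estimate polling cycles (samples per instance) completed per hour.

    Assumption: an overrunning cycle is followed immediately by the next
    one, so the effective cycle length is interval + overrun.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # A run that finishes early does not shorten the cycle.
    cycle_s = interval_s + max(overrun_s, 0.0)
    return 3600.0 / cycle_s


if __name__ == "__main__":
    expected = samples_per_hour(600)          # 10 min interval, no overrun
    observed = samples_per_hour(600, 293.25)  # overrun taken from the log line
    print(f"expected per hour: {expected:.2f}")
    print(f"estimated per hour with overrun: {observed:.2f}")
```

Under that assumption, a steady overrun of the size in the log line would cut the rate from six samples per instance per hour to roughly four, which would line up with the billing shortfall described.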

_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
