Hi Ryota,

Thanks for your response. Please see my comments below.
Ifat.

> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
>
> Hi,
>
>
> Sorry for my late response...
>
> It seems like a fundamental question whether we should have rich
> function or intelligence in on-the-fly event alarm evaluation. I think
> we can add simple operations (like aggregating alarm) in aodh
> evaluator, and other operations (like deducing with referring some
> external DB) should be done outside of the evaluation process to reduce
> impact on other evaluations. But, if we separate too much, then there
> will be many interactions between two services that makes slow to
> finish sequence of alarm handling.
>
> One approach we can take, is that you configure aodh to pass each row
> event (e.g. each VM downed) wrapped in alarm notification to vitrage,
> then do some operation (e.g. deducing, aggregating) and store resource-
> level alarm without any alarm_actions, so that users can see the alarms
> in horizon view. This may not require alarm evaluation, so we can
> forget the problem I raised (cache refresh interval).

Let me see if I got this right: are you suggesting that we create
on-the-fly alarm definitions with no alarm_actions, for every deduced
alarm that we want to raise? And this will spare us the extra alarm
evaluation in AODH? It does make sense.

My next question is how exactly we should create these resource-level
alarms. Can we create an alarm definition with no rule, no actions, and
initial state set to "alarm"? (I'm not sure it can be done in the
current AODH API.)

Another question is our need to get alarms from other sources, like
Nagios, Zabbix, Ganglia, etc. We thought that Vitrage would query these
alarms from each source directly, and then create alarms in AODH in the
same way as our deduced alarms: for example, create a
nagios_ovs_vswitchd alarm if the Nagios check_ovs_vswitchd test failed.
An alternative could be to integrate Nagios directly with AODH. What do
you think?
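To make the Nagios example concrete, here is a rough sketch of the mapping I have in mind. It is only an illustration: the function name, the record fields, and the rule that any non-OK status counts as a failure are all my assumptions, and the dictionary is not the AODH API request format.

```python
from typing import Optional


def nagios_check_to_alarm(check_name: str, host: str, status: str) -> Optional[dict]:
    """Map a failed Nagios check to a hypothetical resource-level alarm record.

    Returns None when the check passed, so no alarm would be created.
    """
    # Assumption: every status other than "OK" (WARNING, CRITICAL, UNKNOWN)
    # is treated as a failed test.
    if status == "OK":
        return None

    # check_ovs_vswitchd -> nagios_ovs_vswitchd
    prefix = "check_"
    suffix = check_name[len(prefix):] if check_name.startswith(prefix) else check_name

    # Hypothetical record layout, not the AODH request schema.
    return {
        "name": "nagios_" + suffix,
        "resource_id": host,
        "state": "alarm",      # raised immediately, nothing to evaluate
        "alarm_actions": [],   # no actions, the alarm is only for display
        "source": "nagios",
    }


alarm = nagios_check_to_alarm("check_ovs_vswitchd", "compute-1", "CRITICAL")
```

Whether a record like this can be stored without a rule is exactly the open question about the current AODH API.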

> BTW, is it useful to have on-the-fly evaluation of combination alarm
> with event alarms for alarm aggregation or other cases?

I'm not sure I understand. Can you give a detailed example?

> Horizon view is the different topic. Maybe we can reduce the number of
> alarms listed in user view by creating raw alarms in admin space that
> is not visible from end user, or using relevant severity or tag so that
> user can filter out uninterested alarms.

Referring to this blueprint [1], do you have specific concerns regarding
the usability/performance of Horizon view when there are many alarms? I
think that your ideas make sense, and we can implement them if there is
a need.

In addition, in Vitrage we plan to handle alarm aggregation by creating
aggregation rule templates, for example based on the RCA information.
The user will be able to see only the root cause alarms, and then drill
down to all specific alarms. But I doubt this will be done for Mitaka.

[1] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page

Thanks,
Ifat.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev