On 12/08/13 16:52 -0400, Doug Hellmann wrote:
On Mon, Aug 12, 2013 at 4:11 PM, Clint Byrum <[email protected]> wrote:
Excerpts from Doug Hellmann's message of 2013-08-12 12:08:58 -0700:
> On Fri, Aug 9, 2013 at 11:56 AM, Clint Byrum <[email protected]> wrote:
>
> > Excerpts from Sandy Walsh's message of 2013-08-09 06:16:55 -0700:
> > >
> > > On 08/08/2013 11:36 PM, Angus Salkeld wrote:
> > > > On 08/08/13 13:16 -0700, Clint Byrum wrote:
> > > >> Last night while reviewing a feature which would add more events
> > > >> to the event table, it dawned on me that the event table really
> > > >> must be removed.
> > > >
> > > >
> > > >>
> > > >> https://bugs.launchpad.net/heat/+bug/1209492
> > > >>
> > > >> tl;dr: users can write an infinite number of rows to the event
> > > >> table at a fairly alarming rate just by creating and updating a
> > > >> very large stack that has no resources that cost any time or are
> > > >> even billable (like an autoscaling launch configuration).
> > > >>
> > > >> The table has no purge function, so the only way to clear out old
> > > >> events is to delete the stack, or manually remove them directly in
> > > >> the database.
> > > >>
> > > >> We've all been through this before: logging to a database seems
> > > >> great until you actually do it.
> > > >>
> > > >> I have some ideas for how to solve it, but I wanted to get a wider
> > > >> audience:
> > > >>
> > > >> 1) Make the event list a ring buffer. Have rows 0 -
> > > >> $MAX_BUFFER_SIZE in each stack, and simply write each new event
> > > >> to the next open position, wrapping at $MAX_BUFFER_SIZE. Pros:
> > > >> little change to current code, just need an offset column added
> > > >> and code that will properly wrap to 0 at $MAX_BUFFER_SIZE. Cons:
> > > >> can still incur heavy transactional load on the database server.
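[Editor's note: option 1 could be sketched roughly as below. This is an illustrative standalone sketch using sqlite3, not Heat's actual schema; the table names, the `next_offset` column, and `MAX_BUFFER_SIZE` are assumptions standing in for whatever Heat would actually use.]

```python
import sqlite3

MAX_BUFFER_SIZE = 1000  # illustrative cap; a real value would be config-driven

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stack (id TEXT PRIMARY KEY, next_offset INTEGER DEFAULT 0);
    CREATE TABLE event (
        stack_id TEXT,
        position INTEGER,   -- ring slot, 0 .. MAX_BUFFER_SIZE - 1
        body     TEXT,
        PRIMARY KEY (stack_id, position)
    );
""")

def append_event(stack_id, body):
    # Read and bump the stack's monotonically increasing offset, then
    # map it onto a fixed slot so the oldest row is overwritten in place
    # once the buffer wraps.
    (offset,) = conn.execute(
        "SELECT next_offset FROM stack WHERE id = ?", (stack_id,)).fetchone()
    slot = offset % MAX_BUFFER_SIZE
    conn.execute("INSERT OR REPLACE INTO event VALUES (?, ?, ?)",
                 (stack_id, slot, body))
    conn.execute("UPDATE stack SET next_offset = ? WHERE id = ?",
                 (offset + 1, stack_id))
    conn.commit()

conn.execute("INSERT INTO stack (id) VALUES ('A')")
for i in range(MAX_BUFFER_SIZE + 5):
    append_event("A", "event %d" % i)

# Row count stays capped even though more events than that were written.
(count,) = conn.execute("SELECT COUNT(*) FROM event").fetchone()
print(count)  # 1000
```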
> > > >>
> > > >> 1.b) Same, but instead of rows, just maintain a blob and append
> > > >> the rows as a JSON list. Lowers transactional load but would push
> > > >> some load onto the API servers and such to parse these out, and
> > > >> would make pagination challenging. Blobs also can be a drain on
> > > >> DB server performance.
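[Editor's note: the read-modify-write cost that makes 1.b unattractive is easy to see in a minimal sketch; the table name and one-row-per-stack layout here are assumptions for illustration only.]

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stack_events (stack_id TEXT PRIMARY KEY, events TEXT)")

def append_event(stack_id, event):
    # One row per stack holding a JSON list: fewer transactions, but the
    # whole blob must be parsed and re-serialized on every append, which
    # is exactly the load this option pushes onto the API servers.
    row = conn.execute(
        "SELECT events FROM stack_events WHERE stack_id = ?",
        (stack_id,)).fetchone()
    events = json.loads(row[0]) if row else []
    events.append(event)
    conn.execute("INSERT OR REPLACE INTO stack_events VALUES (?, ?)",
                 (stack_id, json.dumps(events)))
    conn.commit()

append_event("A", {"status": "CREATE_IN_PROGRESS"})
append_event("A", {"status": "CREATE_COMPLETE"})
(blob,) = conn.execute(
    "SELECT events FROM stack_events WHERE stack_id = 'A'").fetchone()
```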
> > > >>
> > > >> 2) Write a purge script. Delete old ones. Pros: No code change,
> > > >> just new code to do purging. Cons: same as 1, plus more
> > > >> vulnerability to an aggressive attacker who can fit a lot of data
> > > >> in between purges. Also large-scale deletes can be really painful
> > > >> (see: keystone sql token backend).
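[Editor's note: one common mitigation for the large-delete pain option 2 mentions is to purge in small batches, so no single DELETE holds locks for long. A minimal standalone sketch, with an assumed schema:]

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event "
             "(id INTEGER PRIMARY KEY, created_at REAL, body TEXT)")

def purge_events(older_than, batch_size=500):
    """Delete expired events in small batches so no single DELETE
    statement locks the table for long (the keystone-token-style pain)."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM event WHERE id IN "
            "(SELECT id FROM event WHERE created_at < ? LIMIT ?)",
            (older_than, batch_size))
        conn.commit()
        total += cur.rowcount
        if cur.rowcount < batch_size:
            return total

now = time.time()
conn.executemany(
    "INSERT INTO event (created_at, body) VALUES (?, ?)",
    [(now - 3600, "old")] * 1200 + [(now, "recent")] * 10)
removed = purge_events(older_than=now - 60)
```

Note the window between purge runs is still there; batching only bounds the cost of each run, not how much an aggressive writer can accumulate in between.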
> > > >>
> > > >> 3) Log events to Swift. I can't seem to find information on
> > > >> how/if appending works there. Tons of tiny single-row files is an
> > > >> option, but I want to hear from people with more swift knowledge
> > > >> if that is a viable, performant option. Pros: Scale to the moon.
> > > >> Can charge tenant for usage and let them purge events as needed.
> > > >> Cons: Adds swift as a requirement of Heat.
> > > >>
> > > >> 4) Provide a way for users to receive logs via HTTP POST. Pros:
> > > >> Simple and punts the problem to the users. Cons: users will be
> > > >> SoL if they don't have a place to have logs posted to.
> > > >>
> > > >> 5) Provide a way for users to receive logs via a messaging
> > > >> service like Marconi. Pros/Cons: same as HTTP, but perhaps a
> > > >> little more confusing and ambitious given Marconi's short
> > > >> existence.
> > > >>
> > > >> 6) Provide a pluggable backend for logging. This seems like the
> > > >> way most OpenStack projects solve these issues, which is to let
> > > >> the deployers choose and/or provide their own way to handle a
> > > >> sticky problem. Pros: Simple and flexible for the future. Cons:
> > > >> Would require writing at least one backend provider that does
> > > >> what the previous 5 options suggest.
> > > >>
> > > >> To be clear: Heat cannot really exist without this, as it is the
> > > >> only way to find out what your stack is doing or has done.
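[Editor's note: a pluggable backend along the lines of option 6 might be shaped like the sketch below. The class and method names are illustrative, not Heat's actual interfaces; a real deployment would select the concrete class via configuration.]

```python
import abc

class EventBackend(abc.ABC):
    """Hypothetical interface a deployer implements to choose where
    stack events go (DB ring buffer, Swift, HTTP POST, Marconi, ...)."""

    @abc.abstractmethod
    def record(self, stack_id, event):
        """Persist one event for a stack."""

    @abc.abstractmethod
    def list(self, stack_id, marker=None, limit=None):
        """Return events for a stack, with marker/limit pagination."""

class InMemoryBackend(EventBackend):
    """Trivial reference backend, useful only for tests."""

    def __init__(self):
        self._events = {}

    def record(self, stack_id, event):
        self._events.setdefault(stack_id, []).append(event)

    def list(self, stack_id, marker=None, limit=None):
        events = self._events.get(stack_id, [])
        if marker is not None:
            events = events[marker:]
        return events[:limit]

backend = InMemoryBackend()
backend.record("A", {"resource": "User", "status": "CREATE_IN_PROGRESS"})
backend.record("A", {"resource": "User", "status": "CREATE_COMPLETE"})
```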
> > > >
> > > > btw Clint I have ditched that "Recorder" patch as Ceilometer is
> > > > getting an Alarm History API soon, so we can defer to that for
> > > > that functionality (alarm transitions).
> > > >
> > > > But we still need a better way to record events/logs for the user.
> > > > So I made this blueprint a while ago:
> > > > https://blueprints.launchpad.net/heat/+spec/user-visible-logs
> > > >
> > > > I am becoming more in favor of user options rather than deployer
> > > > options if possible. So provide resources for Marconi, Meniscus
> > > > and whatever...
> > > > Although what is nice about Marconi is you could then hook up
> > > > whatever you want to it.
> > >
> > > Logs are one thing (and Meniscus is a great choice for that), but
> > > events are the very thing CM is designed to handle. Wouldn't it make
> > > sense to push them back in there?
> > >
> >
> > I'm not sure these events make sense in the current Ceilometer (I
> > assume that is "CM" above) context. These events are:
> >
> > ... Creating stack A
> > ... Creating stack A resource A
> > ... Created stack A resource A
> > ... Created stack A
> >
> > Users will want to be able to see all of the events for a stack, and
> > likely we need to be able to paginate through them as well.
> >
> > They are fundamental and low-level enough for Heat that I'm not sure
> > putting them in Ceilometer makes much sense, but maybe I don't
> > understand Ceilometer... or "CM" is something else entirely. :)
> >
>
> CM is indeed ceilometer.
>
> The plan for the event API there is to make it admin-only (at least for
> now). If this is data the user wants to see, that may change the plan for
> the API or may mean storing it in ceilometer isn't a good fit.
>
Visibility into these events is critical to tracking the progress of
any action done to a Heat stack:
+---------------------+----+------------------------+--------------------+----------------------+
| logical_resource_id | id | resource_status_reason | resource_status    | event_time           |
+---------------------+----+------------------------+--------------------+----------------------+
| AccessPolicy        | 24 | state changed          | CREATE_IN_PROGRESS | 2013-08-12T19:45:36Z |
| AccessPolicy        | 25 | state changed          | CREATE_COMPLETE    | 2013-08-12T19:45:36Z |
| User                | 26 | state changed          | CREATE_IN_PROGRESS | 2013-08-12T19:45:36Z |
| Key                 | 28 | state changed          | CREATE_IN_PROGRESS | 2013-08-12T19:45:38Z |
| User                | 27 | state changed          | CREATE_COMPLETE    | 2013-08-12T19:45:38Z |
| Key                 | 29 | state changed          | CREATE_COMPLETE    | 2013-08-12T19:45:39Z |
| notcompute          | 30 | state changed          | CREATE_IN_PROGRESS | 2013-08-12T19:45:40Z |
+---------------------+----+------------------------+--------------------+----------------------+
So unless there is a plan to make this a user-centric service, it does
not seem like a good fit.
> Are these "events" transmitted in the same way as notifications? If
> so, we may already have the data.
>
The Heat engine records them while working on the stack. They have a
fairly narrow, well-defined interface, so it should be fairly easy to
address the storage issue with a backend abstraction.
OK. The term "event" frequently means "notification" for ceilometer, but it
sounds like it's completely different in this case.
Yeah, not really related. But we need to add RPC notifications to Heat
soon so people can bill on a per-stack basis (create/update/delete/exist).
What we are talking about here is really logging, but we want to get it
to the end user (who does not have access to the infrastructure's
syslog).
-Angus
Doug
_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev