On Thu, Sep 3, 2015 at 3:53 AM Zane Bitter <zbit...@redhat.com> wrote:
> On 02/09/15 04:55, Steven Hardy wrote:
> > On Wed, Sep 02, 2015 at 04:33:36PM +1200, Robert Collins wrote:
> >> On 2 September 2015 at 11:53, Angus Salkeld <asalk...@mirantis.com> wrote:
> >>
> >>> 1. limit the number of resource actions in parallel (maybe based on the
> >>> number of cores)
> >>
> >> I'm having trouble mapping that back to 'and heat-engine is running on
> >> 3 separate servers'.
> >
> > I think Angus was responding to my test feedback, which was a different
> > setup, one 4-core laptop running heat-engine with 4 worker processes.
> >
> > In that environment, the level of additional concurrency becomes a problem
> > because all heat workers become so busy that creating a large stack
> > DoSes the Heat services, and in my case also the DB.
> >
> > If we had a configurable option, similar to num_engine_workers, which
> > enabled control of the number of resource actions in parallel, I probably
> > could have controlled that explosion in activity to a more manageable
> > series of tasks, e.g. I'd set num_resource_actions to
> > (num_engine_workers*2) or something.
>
> I think that's actually the opposite of what we need.
>
> The resource actions are just sent to the worker queue to get processed
> whenever. One day we will get to the point where we are overflowing the
> queue, but I guarantee that we are nowhere near that day. If we are
> DoSing ourselves, it can only be because we're pulling *everything* off
> the queue and starting it in separate greenthreads.

worker.py does not use a greenthread per job the way service.py does.
The issue is that if you have actions that are fast, you can hit the DB
hard:

  QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30

It seems like it's not very hard to hit this limit. It comes from simply
loading the resource in the worker:

  "/home/angus/work/heat/heat/engine/worker.py", line 276, in check_resource
  "/home/angus/work/heat/heat/engine/worker.py", line 145, in _load_resource
  "/home/angus/work/heat/heat/engine/resource.py", line 290, in load
    resource_objects.Resource.get_obj(context, resource_id)

> In an ideal world, we might only ever pull one task off that queue at a
> time. Any time the task is sleeping, we would use that time for
> processing stuff off the engine queue (which needs a quick response,
> since it is serving the ReST API). The trouble is that you need a *huge*
> number of heat-engines to handle stuff in parallel. In the
> reductio-ad-absurdum case of a single engine only processing a single
> task at a time, we're back to creating resources serially. So we
> probably want a higher number than 1. (Phase 2 of convergence will make
> tasks much smaller, and may even get us down to the point where we can
> pull only a single task at a time.)
>
> However, the fewer engines you have, the more greenthreads we'll have to
> allow to get some semblance of parallelism. To the extent that more
> cores means more engines (which assumes all running on one box, but
> still), the number of cores is negatively correlated with the number of
> tasks that we want to allow.
>
> Note that all of the greenthreads run in a single CPU thread, so having
> more cores doesn't help us at all with processing more stuff in
> parallel.

Except, as I said above, we are not creating greenthreads in worker.

-A

> cheers,
> Zane.
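P.S. To make the throttle idea concrete, here is a rough, untested sketch
of what capping in-flight resource actions with a bounded eventlet
GreenPool could look like. Everything in it (ThrottledWorker,
num_resource_actions, the stub _load_resource) is made up for
illustration and is not current Heat code:

import eventlet

eventlet.monkey_patch()


def _load_resource(resource_id):
    # Stand-in for worker.py's _load_resource(): in Heat this is the DB
    # round trip (resource_objects.Resource.get_obj) that consumes a
    # QueuePool connection.
    eventlet.sleep(0.1)
    return resource_id


class ThrottledWorker(object):
    def __init__(self, num_resource_actions=8):
        # GreenPool.spawn() blocks the caller once num_resource_actions
        # greenthreads are in flight, so a burst of fast actions applies
        # backpressure instead of hammering the DB.
        self.pool = eventlet.GreenPool(size=num_resource_actions)

    def check_resource(self, resource_id):
        self.pool.spawn(self._check, resource_id)

    def _check(self, resource_id):
        resource = _load_resource(resource_id)
        print('checked resource %s' % resource)


if __name__ == '__main__':
    worker = ThrottledWorker(num_resource_actions=8)
    for rid in range(100):
        worker.check_resource(rid)
    worker.pool.waitall()

The interesting property is the backpressure: because spawn() blocks when
the pool is full, the caller naturally stops pulling new jobs off the
queue, which is roughly what a num_resource_actions option would buy us.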
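And for anyone wondering where the "size 5 overflow 10 ... timeout 30"
numbers in that error come from: they are simply SQLAlchemy's QueuePool
defaults. Purely as illustration (placeholder connection URL; Heat
actually drives these via oslo.db's max_pool_size / max_overflow /
pool_timeout options rather than calling create_engine() directly), they
map onto SQLAlchemy like this:

from sqlalchemy import create_engine

engine = create_engine(
    'mysql+pymysql://heat:secret@localhost/heat',  # placeholder URL
    pool_size=5,      # persistent connections kept open (the "size 5")
    max_overflow=10,  # burst connections on top (the "overflow 10")
    pool_timeout=30,  # seconds to wait for a free connection before
                      # raising the "connection timed out" error above
)

Bumping those limits would hide the symptom, but the number of concurrent
jobs still needs a cap somewhere.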