On 18/06/18 13:39, Jay Pipes wrote:
+openstack-dev since I believe this is an issue with the Heat source code.

On 06/18/2018 11:19 AM, Spyros Trigazis wrote:
Hello list,

I'm quite easily hitting this exception [1] with Heat. The DB server is configured with 1000 max_connections and 1000 max_user_connections, and in the [database] section of heat.conf I have these values set:
max_pool_size = 22
max_overflow = 0
Full config attached.

I ended up with this configuration based on this formula:
num_heat_hosts=4
heat_api_workers=2
heat_api_cfn_workers=2
num_engine_workers=4
max_pool_size=22
max_overflow=0
num_heat_hosts * (max_pool_size + max_overflow) * (heat_api_workers + num_engine_workers + heat_api_cfn_workers) = 704
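
Or, spelling the same arithmetic out as a throwaway Python snippet (nothing Heat-specific, just the values from my config):

    num_heat_hosts = 4
    heat_api_workers = 2
    heat_api_cfn_workers = 2
    num_engine_workers = 4
    max_pool_size = 22
    max_overflow = 0

    workers_per_host = heat_api_workers + heat_api_cfn_workers + num_engine_workers
    conns_per_host = workers_per_host * (max_pool_size + max_overflow)
    print(conns_per_host)                   # 8 * 22 = 176
    print(num_heat_hosts * conns_per_host)  # 4 * 176 = 704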

What I have noticed is that the number of connections I expected from the above formula is not respected. Based on this formula, each node (every node runs heat-api, heat-api-cfn and heat-engine) should use at most 176 connections, but I see them reach even 400 connections.

Has anyone noticed a similar behavior?

Looking through the Heat code, I see that there are many methods in the /heat/db/sqlalchemy/api.py module that use a SQLAlchemy session but never actually call session.close() [1], which means the session will not be released back to the connection pool. That might be the reason why connections keep piling up.
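
To make the pattern concrete, here is a simplified sketch (plain SQLAlchemy against a throwaway sqlite DB; not an excerpt from heat/db/sqlalchemy/api.py):

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import declarative_base, sessionmaker

    Base = declarative_base()

    class Resource(Base):
        __tablename__ = 'resource'
        id = Column(Integer, primary_key=True)
        status = Column(String(16))

    engine = create_engine('sqlite:///example.db')   # stand-in for the real DB
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)

    # The pattern I'm describing: a session is created and used, but close()
    # is never called, so its connection is never checked back into the pool.
    def resource_get_leaky(resource_id):
        session = Session()
        return session.query(Resource).filter_by(id=resource_id).first()

    # What I'd expect: give the connection back as soon as the work is done.
    def resource_get(resource_id):
        session = Session()
        try:
            return session.query(Resource).filter_by(id=resource_id).first()
        finally:
            session.close()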

Thanks for looking at this Jay! Maybe I can try to explain our strategy (such as it is) here and you can tell us what we should be doing instead :)

Essentially we have one session per 'task', that is used for the duration of the task. Back in the day a 'task' was the processing of an entire stack from start to finish, but with our new distributed architecture it's much more granular - either it's just the initial setup of a change to a stack, or it's the processing of a single resource. (This was a major design change, and it's quite possible that the assumptions we made at the beginning - and tbh I don't think we really knew what we were doing then either - are no longer valid.)

So, for example, when Heat sees an RPC request come in to update a resource, it starts a greenthread to handle it, which creates a database session that is stored in the request context. At the beginning of the request we load the data needed and update the status of the resource in the DB to IN_PROGRESS. Then we do whatever we need to do to update the resource (mostly this doesn't involve writing to the DB, but there are exceptions). Then we update the status to COMPLETE/FAILED, do some housekeeping stuff in the DB and send out RPC messages for any other work that needs to be done. IIUC that all uses the same session, although I don't know if it gets opened and closed multiple times in the process, and presumably the same object cache.
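
Very roughly, the shape of it is something like this (heavily simplified, and all of the names are invented for the sketch rather than being Heat's actual ones):

    import eventlet

    pool = eventlet.GreenPool(size=1000)

    class RequestContext(object):
        def __init__(self, session):
            # one session, created here, used for the whole 'task'
            self.session = session

    def handle_resource_update(ctxt, resource_id):
        session = ctxt.session
        print('load %s, set status IN_PROGRESS' % resource_id)   # DB write
        # ... do the actual work on the resource (usually no DB writes) ...
        print('set status COMPLETE/FAILED, housekeeping, follow-up RPC casts')

    def on_rpc_message(resource_id):
        # one greenthread per incoming RPC request
        ctxt = RequestContext(session=None)   # stand-in for a real DB session
        pool.spawn_n(handle_resource_update, ctxt, resource_id)

    on_rpc_message('my-resource')
    pool.waitall()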

Crucially, we *don't* have a way to retry if we're unable to connect to the database in any of those operations. If we can't connect at the beginning that'd be manageable, because we could (but currently don't) just send out a copy of the incoming RPC message to try again later. But once we've changed something about the resource, we *must* record that in the DB or Bad Stuff(TM) will happen.

The way we handled that, as Spyros pointed out, was to adjust the size of the overflow pool to match the size of the greenthread pool. This ensures that every 'task' is able to connect to the DB, because we won't take the message out of the RPC queue until there is a greenthread, and by extension a DB connection, available. This is infinitely preferable to finding out there are no connections available after you've already accepted the message (and oslo_messaging has an annoying 'feature' of acknowledging the message before it has even passed it to the application). It means stuff that we aren't able to handle yet queues up in the message queue, where it belongs, instead of in memory.
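
Concretely, the invariant we rely on is pool_size + max_overflow >= size of the greenthread pool, so that a free greenthread always implies a free DB connection. A toy sketch of that relationship (illustrative numbers and a sqlite stand-in; the pool_size/max_overflow kwargs are what heat.conf's [database] max_pool_size/max_overflow feed into via oslo.db):

    import eventlet
    import sqlalchemy
    from sqlalchemy.pool import QueuePool

    GREENPOOL_SIZE = 22   # illustrative; match it to your executor pool size

    engine = sqlalchemy.create_engine(
        'sqlite:///example.db',        # stand-in for the real DB URL
        poolclass=QueuePool,
        pool_size=GREENPOOL_SIZE,      # [database]/max_pool_size
        max_overflow=0,                # [database]/max_overflow
    )

    workers = eventlet.GreenPool(size=GREENPOOL_SIZE)

    def task(i):
        # with the sizes matched, a free greenthread implies a free connection
        with engine.connect() as conn:
            conn.execute(sqlalchemy.text('SELECT 1'))

    for i in range(100):
        workers.spawn_n(task, i)
    workers.waitall()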

History: https://bugs.launchpad.net/heat/+bug/1491185

Unfortunately you now have to tune the size of the threadpool to trade off making full use of your CPU against not opening too many DB connections. Nobody knows what the 'correct' tradeoff is, and even if we did, Heat can't really tune it automatically by default because at startup it only knows the number of worker processes on the local node; it can't tell how many other nodes are [going to be] running and opening connections to the same database. Plus the number of allowed DB connections becomes the bottleneck to how far you can scale the service out horizontally.

What is the canonical way of handling this kind of situation? Retry any DB operation where we can't get a connection, and close the session after every transaction?
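
For concreteness, this is roughly the kind of thing I'm asking about (a sketch using oslo.db's wrap_db_retry decorator, which retries on oslo.db's connection/deadlock exceptions; the table and function names are made up):

    from oslo_db import api as oslo_db_api
    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import sessionmaker

    engine = create_engine('sqlite:///example.db')   # stand-in for the real DB
    Session = sessionmaker(bind=engine)

    @oslo_db_api.wrap_db_retry(max_retries=5, retry_on_disconnect=True,
                               retry_on_deadlock=True)
    def set_resource_status(resource_id, status):
        session = Session()
        try:
            session.execute(text('UPDATE resource SET status = :s WHERE id = :i'),
                            {'s': status, 'i': resource_id})
            session.commit()
        finally:
            session.close()   # connection goes straight back to the pool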

Not sure if there's any setting in Heat that will fix this problem. Disabling connection pooling will likely not help since connections are not properly being closed and returned to the connection pool to begin with.

Best,
-jay

[1] Heat apparently doesn't use the oslo.db enginefacade transaction context managers either, which would help with this problem since the transaction context manager would take responsibility for calling session.flush()/close() appropriately.

https://github.com/openstack/oslo.db/blob/43af1cf08372006aa46d836ec45482dd4b5b5349/oslo_db/sqlalchemy/enginefacade.py#L626
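
For reference, the usage pattern looks roughly like this (a minimal sketch of the enginefacade API with a made-up function and a stand-in sqlite URL; not Heat code):

    import sqlalchemy
    from oslo_db.sqlalchemy import enginefacade

    # The context has to be "transaction-aware"; oslo.context's RequestContext
    # already is, this local class just keeps the sketch self-contained.
    @enginefacade.transaction_context_provider
    class Context(object):
        pass

    context_manager = enginefacade.transaction_context()
    context_manager.configure(connection='sqlite:///example.db')  # stand-in URL

    @context_manager.writer
    def set_resource_status(context, resource_id, status):
        # context.session exists only for this transaction; flush/commit and
        # returning the connection to the pool happen when the function exits.
        context.session.execute(
            sqlalchemy.text('UPDATE resource SET status = :s WHERE id = :i'),
            {'s': status, 'i': resource_id})

    # usage: set_resource_status(Context(), 'my-resource', 'COMPLETE')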

Oh, I thought we did: https://review.openstack.org/330800

cheers,
Zane.

