Re: Psycopg2 pool clarification

israel Sat, 10 Jun 2017 09:40:51 -0700

On 2017-06-08 19:55, Ian Kelly wrote:

On Thu, Jun 8, 2017 at 10:47 AM, Israel Brewster<[email protected]> wrote:
On Jun 7, 2017, at 10:31 PM, dieter <[email protected]> wrote:
israel <[email protected]> writes:
On 2017-06-06 22:53, dieter wrote:
...
As such, using psycopg2's pool is essentially
worthless for me (plenty of use for it, i'm sure, just not for me/my
use case).
Could you not simply adjust the value for the "min" parameter?
If you want at least "n" open connections, then set "min" to "n".
Well, sure, if I didn't care about wasting resources (which, I guess many people don't). I could set "n" to some magic number that would always give "enough" connections, such that my application never has to open additional connections, then adjust that number every few months as usage changes. In fact, now that I know how the logic of the pool works, that's exactly what I'm doing until I am confident that my caching replacement is solid.
Of course, in order to avoid having to open/close a bunch of connections during the times when it is most critical - that is, when the server is under heavy load - I have to set that number arbitrarily high. Furthermore, that means that much of the time many, if not most, of those connections would be idle. Each connection uses a certain amount of RAM on the server, not to mention using up limited connection slots, so now I've got to think about if my server is sized properly to be able to handle that load not just occasionally, but constantly - when reducing server load by reducing the frequency of connections being opened/closed was the goal in the first place. So all I've done is trade dynamic load for static load - increasing performance at the cost of resources, rather than more intelligently using the available resources. All-in-all, not the best solution, though it does work. Maybe if load was fairly constant it would make more sense though. So like I said *my* use case, which is a number of web apps with varying loads, loads that also vary from day-to-day and hour-to-hour.
On the other hand, a pool that caches connections using the logic I laid out in my original post would avoid the issue. Under heavy load, it could open additional connections as needed - a performance penalty for the first few users over the min threshold, but only the first few, rather than all the users over a certain threshold ("n"). Those connections would then remain available for the duration of the load, so it doesn't need to open/close numerous connections. Then, during periods of lighter load, the unused connections can drop off, freeing up server resources for other uses. A well-written pool could even do something like see that the available connection pool is running low, and open a few more connections in the background, thus completely avoiding the connection overhead on requests while never having more than a few "extra" connections at any given time. Even if you left off the expiration logic, it would still be an improvement, because while unused connections wouldn't drop, the "n" open connections could scale up dynamically until you have "enough" connections, without having to figure out and hard-code that "magic number" of open connections.
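
For illustration only, the caching logic described above might be sketched roughly as follows. This is not the prototype mentioned in this thread and not part of psycopg2: the class name `CachingPool`, the `connect` factory, the `idle_timeout` and `clock` parameters, and the `prune()` method are all assumptions made for this example. The only thing it asks of a connection object is a `close()` method.

```python
import threading
import time
from collections import deque


class CachingPool:
    """Illustrative pool: grows on demand, later drops long-idle connections."""

    def __init__(self, connect, minconn=1, idle_timeout=300.0,
                 clock=time.monotonic):
        self._connect = connect            # hypothetical connection factory
        self._minconn = minconn
        self._idle_timeout = idle_timeout  # seconds a connection may sit unused
        self._clock = clock
        self._lock = threading.Lock()
        self._idle = deque()               # (connection, time it was returned)
        self._in_use = 0
        for _ in range(minconn):
            self._idle.append((connect(), clock()))

    def getconn(self):
        with self._lock:
            self._in_use += 1
            if self._idle:
                # Reuse the most recently returned connection, so the
                # oldest ones age toward expiry at the left of the deque.
                return self._idle.pop()[0]
        try:
            # Nothing cached: pay the connection cost once, outside the lock.
            return self._connect()
        except Exception:
            with self._lock:
                self._in_use -= 1
            raise

    def putconn(self, conn):
        # Keep the connection open and cached instead of closing it.
        with self._lock:
            self._in_use -= 1
            self._idle.append((conn, self._clock()))

    def prune(self):
        """Close connections idle for idle_timeout or longer, keeping minconn open."""
        now = self._clock()
        expired = []
        with self._lock:
            while (self._idle
                   and len(self._idle) + self._in_use > self._minconn
                   and now - self._idle[0][1] >= self._idle_timeout):
                expired.append(self._idle.popleft()[0])
        for conn in expired:
            conn.close()
        return len(expired)
```

In this sketch `prune()` would have to be called periodically (from a timer thread, or opportunistically inside `putconn`), and there is no upper bound on connections and no background pre-opening; those would need to be added for real use.
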
Why wouldn't I want something like that? It's not like it's hard to code - took me about an hour and a half to get to a working prototype yesterday. Still need to write tests and add some polish, but it works. Perhaps, though, the common thought is just "throw more hardware at it and keep a lot of connections open at all times?" Maybe I was raised too conservatively, or the company I work for is too poor.... :-D
Psycopg is first and foremost a database adapter. To quote from the
psycopg2.pool module documentation, "This module offers a few pure
Python classes implementing *simple* connection pooling directly in
the client application" (emphasis added). The advertised list of
features at http://initd.org/psycopg/features/ doesn't even mention
connection pooling. In short, you're getting what you paid for.

It sounds like your needs are beyond what the psycopg2.pool module
provides.

Quite possible. Thus the reason I was looking for clarification on how the module was intended to work - if it doesn't work in the way that I want it to, I need to look elsewhere for a solution. My main reason for posting this thread was that I was expecting it to work one way, but testing showed it working another way, so I was trying to find out if that was intentional or user error. Apparently it's intentional, so there we go - in its current form at least, my needs are beyond what the psycopg2 pool provides. Fair enough.

I suggest looking into a dedicated connection pooler like
PgBouncer. You'll find that it's much more feature-rich and
configurable than psycopg2.pool. It's production-ready, unlike your
prototype. And since it's a proxy, it can take connections from
multiple client apps and tune the pool to your overall load rather
than on an app-by-app basis (and thus risk overloading the backend if
multiple apps unexpectedly peak together).

Very true, and I've looked into that (as well as the related, but more basic, PgPool product), but it seems to me that any external proxy product like these would defeat *my* purpose for using a pool in the first place: avoiding the overhead of making/breaking many connections quickly. That is, all you have really done is gone from connecting to Postgres to connecting to PgBouncer. You are still making and breaking just as many connections. Unless connecting to PgBouncer is significantly cheaper than connecting to Postgres? This may well be the case, but I haven't yet seen anything to support that. Haven't seen anything to refute that either, however :)
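
One way to settle the question of relative connection cost would be to measure it. The helper below is a minimal sketch, not a rigorous benchmark: `mean_connect_seconds` and its `connect` and `attempts` parameters are names invented for this example, and `connect` is assumed to be any zero-argument callable that returns an object with a `close()` method.

```python
import time


def mean_connect_seconds(connect, attempts=20):
    """Average wall-clock seconds to open one connection (close is not timed)."""
    total = 0.0
    for _ in range(attempts):
        start = time.perf_counter()
        conn = connect()
        total += time.perf_counter() - start
        conn.close()
    return total / attempts
```

With psycopg2, one could pass something like `lambda: psycopg2.connect(dsn)` twice, once with a DSN pointing directly at Postgres and once with a DSN pointing at the proxy, and compare the two averages under otherwise identical conditions.
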

Of course, there may be many other features provided by such tools that would make them worthwhile, even for my use case. However, my primary goal in using a pool was avoiding the connection overhead with each request, so if a tool doesn't do that, then it isn't the right tool for me :)


As for why psycopg2.pool is the way it is, maybe most users don't have
your situation of serving multiple apps with loads varying on
different cycles. Most are probably only serving a single app, or if
serving multiple apps then they likely have common user bases with
similar peak times. You can't dynamically adjust the amount of RAM in
your server, so saving resources like RAM at below-peak times only
matters if you're going to do something else with it. In the scenarios
I described there isn't much else to do with it, so I can understand
if saving RAM isn't a priority.

True, but you would still have to deal with the minconn "magic number", unless you just adjusted it so high from the start that even if your load/use grows over time you never have to mess with it. Even in the single app use case, where you don't care about RAM usage (since there is nothing else trying to use the RAM) in order to get maximum benefit from a pool you'd have to keep an eye on your usage and make sure it never (or rarely) exceeds whatever arbitrary value you have set for minconn. Not a big deal, especially if you tune it high to begin with, but it is one more thing.

Honestly, some of that is just personal issues. I have problems with code that is inefficient by design (even if there is nothing to be gained from efficiency), or that makes assumptions about things when it could be dynamic. I have often coded in such a way that a given value can be adjusted dynamically, even when the people giving me the specs say it will never change. More than once that has enabled me to respond to a "feature request" or change order by saying "it already does that". On the other hand, I am probably a poster child for "premature optimization", and often have to stop myself from optimizing code just because I can, when in reality it is not worth my time. By the same token, the idea of wasting resources - even when, as you state, there is nothing else to do with them - just rubs me the wrong way. As such, I readily acknowledge that some of my requests/statements stem from my own personal desires, and not from any actual lack/need in the product.

--
https://mail.python.org/mailman/listinfo/python-list
