Re: Handling database connection pooling outside Java, without DBCP et al?

Christopher Schultz Wed, 24 Nov 2021 09:37:10 -0800

JK,

On 11/24/21 08:03, jkla...@iki.fi wrote:

On Tuesday, Nov 23, 2021 at 4:20 PM, Christopher Schultz 
<ch...@christopherschultz.net (mailto:ch...@christopherschultz.net)> wrote:


ProxySQL is, mostly, a load-balancing and caching product. Sure, it can
provide connection-pooling, but that doesn't mean that you want your
application making 1000 simultaneous requests to the proxy.

Could you expand on this? Connection pooling is a fundamental aspect
of how ProxySQL operates, after all. I'm not familiar with how the
DBCP connection pooling works, but what added value would it bring in
this case? What does DBCP pooling at the application level achieve
that ProxySQL wouldn't?

The pooling doesn't add anything necessarily to your Java application, but NOT using the pool can mean that your application could make a virtually unlimited number of connections to your ProxySQL instance.

The DBCP provides not only "pooling" (that is, re-using of physical database connections) but also an intentionally-limited resource that your application can use. If you use a DBCP pool size of 1, only one thread at a time can make a call to the database. (Well, you could share the connection across threads, but that would be total chaos.) If you use an unlimited pool size, then you will allow the application to make unlimited queries to your ProxySQL instance.
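
To make the "intentionally-limited resource" idea concrete, here is a minimal sketch that models a bounded pool with a plain java.util.concurrent.Semaphore. It is not how DBCP is implemented; the class name BoundedPoolSketch and the parameter names maxTotal and maxWaitMillis are chosen only for illustration. With a size of 1, a second borrower waits and then gives up without any query having been sent.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a size-limited pool. This is NOT the DBCP
// implementation; it only shows the "limited resource" behavior.
class BoundedPoolSketch {
    private final Semaphore permits;

    BoundedPoolSketch(int maxTotal) {
        // One permit per connection the application may hold at once.
        this.permits = new Semaphore(maxTotal, true);
    }

    /** Returns true if a "connection" was borrowed within maxWaitMillis. */
    boolean tryBorrow(long maxWaitMillis) {
        try {
            return permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    /** Returns a previously borrowed "connection" to the pool. */
    void giveBack() {
        permits.release();
    }

    public static void main(String[] args) {
        BoundedPoolSketch pool = new BoundedPoolSketch(1);
        System.out.println(pool.tryBorrow(10)); // the only connection is handed out
        System.out.println(pool.tryBorrow(10)); // times out; nothing reached the database
        pool.giveBack();
        System.out.println(pool.tryBorrow(10)); // the connection is available again
    }
}
```

The point of the model is that the failure happens in the application, at borrow time, before the database or ProxySQL has been asked to do anything.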

Perhaps you want to limit the number of simultaneous connections to the physical database to something reasonable, like 100. So you set that up in ProxySQL so you have a pool size of 100. Any number of clients can connect and query, but you'll only be actively executing at most 100. So there is no reason to allow more than 100 incoming connections to ProxySQL.

If you get a flood of requests to your application and every single one of them requires a database connection to complete, you'll get a huge flood of requests to ProxySQL. All but 100 of those requests will sit there waiting at ProxySQL, *with a query pending execution*. If the application times out waiting for the query to complete, either the user will see an error, or the application will try again. The user will most likely try again if they see an error. In either case, the query has been sent to ProxySQL and it probably doesn't know that the client has given up, so it has to execute the query anyway. But the client is gone, so the query is just a waste. You have just launched a DoS against your database.

If, instead, you configure your DBCP pool size to be 100 / N where N is the number of application servers you have, then the application stalls *waiting for the database connection* and not waiting for a query. When this happens, the error is moved to acquiring the connection and not issuing the query. So your database never has to process unnecessary queries.
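
The 100 / N arithmetic can be sketched as follows. The class and method names are hypothetical, and rounding down is my assumption (it keeps the sum across all application servers at or below the backend limit whenever N does not exceed that limit).

```java
// Hypothetical helper for the "100 / N" sizing rule described above.
class PoolSizing {
    /**
     * Per-server pool size for a given backend limit and number of
     * application servers. Integer division rounds down, so the sum over
     * all servers stays at or below backendLimit when appServers <= backendLimit.
     */
    static int perServerPoolSize(int backendLimit, int appServers) {
        if (backendLimit < 1 || appServers < 1) {
            throw new IllegalArgumentException("both values must be at least 1");
        }
        // Never return 0; with more servers than the limit, the total will
        // exceed backendLimit and the limit itself needs revisiting.
        return Math.max(1, backendLimit / appServers);
    }

    public static void main(String[] args) {
        System.out.println(perServerPoolSize(100, 4)); // 25 connections per server
        System.out.println(perServerPoolSize(100, 3)); // 33 per server, 99 in total
    }
}
```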


It's essentially "failing faster" or, IMO, "failing safer."

In case it's relevant, we're planning on installing ProxySQL on the
app servers themselves, an approach generally recommended by e.g.
Percona.

I don't think it's relevant.

My main fear is that having two separate connection pools would make
performance tuning at the pool level potentially nightmarish. We'd
make a config change at the ProxySQL level, test it, and conclude
that it didn't have a meaningful impact–when it actually might have,
had there not been another, poorly understood (by us, that is) pool
at play. DBCP has the potential to muddy the waters: even if it
doesn't actively work against ProxySQL-level tuning, we can't really
prove that it's not a contributing cause any time one of our ProxySQL
config changes fails to improve performance.

I'm sure there is a use-case for ProxySQL, but I find that SQL workloads fall into a few big categories:

1. Lots of writes. Caches essentially cannot handle this, *especially* when they are distributed (which is what you have when ProxySQL is running on the application servers, separately). ProxySQL is not helpful, here.

2. Few writes, lots of reads. In this case, caching in the application makes more sense to me than constantly requesting the data from the database. Not only do you avoid the actual call to the database, but you also avoid the necessary conversion between the database's data model and the application's data model. ProxySQL is not helpful, here, either.

If I were you, I'd examine the reason behind your decision to use ProxySQL in the first place. Maybe it makes sense to use it, but maybe it makes sense to use what you already have, and not introduce a new component into the mix.

I haven't been able to figure out how to entirely disable DBCP so
that ProxySQL could handle the pooling entirely.

You can't. You'd have to tear out the use of the JNDI-provided
DataSource altogether, which isn't worth your time.


Elsewhere in this thread, Mark Thomas said this in response to Olaf
Kock, referring to his idea of a custom "non-pooling connection pool":

DBCP looks exactly like a database driver to the application. [...]
just configure the database driver directly.

This will work, and you won't get a connection pool at all in your Tomcat/application. But it will behave as though you have a connection-pool with an unlimited number of connections, since every time the application requests a connection, it will get one, opened to your ProxySQL instance. You can limit the number of connections to ProxySQL, I suppose, but then you trade "pool borrow timeout" errors for "connection refused" errors. You also have to construct a new network connection for each "use" of a db connection in the application. Part of the reason to use a connection pool (at any level) is to maintain the open network connection between the application and the database. By disabling the pool available to the application, you are reducing performance in a place where a performance drop is completely unnecessary.
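
As a rough illustration of the "new network connection for each use" cost, the sketch below counts how many stand-in connections get opened for the same number of sequential uses, with and without reuse. Everything here (ConnectionReuseSketch, FakeConnection, the method names) is hypothetical and involves no real driver or network I/O; it only counts opens.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model: counts "physical" opens with and without reuse.
class ConnectionReuseSketch {
    /** Stand-in for a physical connection to ProxySQL or the database. */
    static final class FakeConnection { }

    private final Deque<FakeConnection> idle = new ArrayDeque<>();
    private int opened;

    /** Reuses an idle connection if one exists, otherwise opens a new one. */
    FakeConnection borrowPooled() {
        FakeConnection c = idle.poll(); // null when nothing is idle
        if (c == null) {
            c = new FakeConnection();
            opened++;
        }
        return c;
    }

    /** Keeps the connection open for the next borrower. */
    void giveBack(FakeConnection c) {
        idle.push(c);
    }

    /** Models using the driver directly: every use opens a connection. */
    FakeConnection openDirect() {
        opened++;
        return new FakeConnection();
    }

    static int opensWithPool(int uses) {
        ConnectionReuseSketch s = new ConnectionReuseSketch();
        for (int i = 0; i < uses; i++) {
            s.giveBack(s.borrowPooled());
        }
        return s.opened;
    }

    static int opensWithoutPool(int uses) {
        ConnectionReuseSketch s = new ConnectionReuseSketch();
        for (int i = 0; i < uses; i++) {
            s.openDirect(); // used once, then discarded
        }
        return s.opened;
    }

    public static void main(String[] args) {
        System.out.println(opensWithPool(1000));    // one open, reused for every use
        System.out.println(opensWithoutPool(1000)); // one open per use
    }
}
```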

Now, this probably shouldn't be taken as a recommendation or even a
statement of practical feasibility. How do you view this option,
assuming local ProxySQL installs on the app servers? You can still
assume my complete ignorance about every element between the JNDI
specs and the actual configuration statements in context.xml :-)

Performance tuning is a Dark Art. You are right that if you have two pools in place, you may find that tuning one of them is difficult in the presence of the other. But because you *know* that both exist, you don't have to do that tuning in a vacuum: you can take both pools into account when tuning that particular part of your application. Or, you can decide that you only need one pool (architecturally speaking) at all. And my argument is that the only pool you need is already configured and ready to go.


YMMV

-chris

