JK,

On 11/24/21 08:03, jkla...@iki.fi wrote:
On Tuesday, Nov 23, 2021 at 4:20 PM, Christopher Schultz 
<ch...@christopherschultz.net (mailto:ch...@christopherschultz.net)> wrote:

ProxySQL is, mostly, a load-balancing and caching product. Sure, it can
provide connection-pooling, but that doesn't mean that you want your
application making 1000 simultaneous requests to the proxy.

Could you expand on this? Connection pooling is a fundamental aspect
of how ProxySQL operates, after all. I'm not familiar with how the
DBCP connection pooling works, but what added value would it bring in
this case? What does DBCP pooling at the application level achieve
that ProxySQL wouldn't?
The pooling doesn't add anything necessarily to your Java application, but NOT using the pool can mean that your application could make a virtually unlimited number of connections to your ProxySQL instance.

The DBCP provides not only "pooling" (that is, re-using of physical database connections) but also an intentionally-limited resource that your application can use. If you use a DBCP pool size of 1, only one thread at a time can make a call to the database. (Well, you could share the connection across threads, but that would be total chaos.) If you use an unlimited pool size, then you will allow the application to make unlimited queries to your ProxySQL instance.

Perhaps you want to limit the number of simultaneous connections to the physical database to something reasonable, like 100. So you set that up in ProxySQL so you have a pool size of 100. Any number of clients can connect and query, but you'll only be actively executing at most 100. So there is no reason to allow more than 100 incoming connections to ProxySQL.

If you get a flood of requests to your application and every single one of them requires a database connection to complete, you'll get a huge flood of requests to ProxySQL. All but 100 of those requests will sit there waiting at ProxySQL, *with a query pending execution*. If the application times-out waiting for the query to complete, either the user wil see an error, or the application will try-again. The user will most likely try again if they see an error. In either case, the query has been sent to ProxySQL and it probably doesn't know that the client has given-up, so it has to execute the query anyway. But the client is gone, so the query is just a waste. You have just launched a DOS against your database.

If, instead, you configure your DBCP pool size to be 100 / N where N is the number of application servers you have, then the application stalls *waiting for the database connection* and not waiting for a query. When this happens, the error is moved to acquiring the connection and not issuing the query. So your database never has to process unnecessary queries.

It's essentially "failing faster" or, IMO, "failing safer."

In case it's relevant, we're planning on installing ProxySQL on the app servers themselves, an approach generally recommended by e.g.
Percona.
I don't think it's relevant.

My main fear is that having two separate connection pools would make
performance tuning at the pool level potentially nightmarish. We'd
make a config change at the ProxySQL level, test it, and conclude
that it didn't have a meaningful impact–when it actually might have,
had there not been another, poorly understood (by us, that is) pool
at play. DBCP has the potential to muddy the waters: even if it
doesn't actively work against ProxySQL-level tuning, we can't really
prove that it's not a contributing cause any time one of our ProxySQL
config changes fails to improve performance.
I'm sure there is a use-case for ProxySQL, but I find that SQL workloads fall into a few big categories:

1. Lots of writes. Caches essentially cannot handle this, *especially* when they are distributed (which is what you have when ProxySQL is running on the application servers, separately). ProxySQL is not helpful, here.

2. Few writes, lots of reads. In this case, caching in the application makes more sense to me than constantly requesting the data from the database. Not only do you avoid the actual call to the database, but you also avoid the necessary conversion between the database's data-model and the application's data model. ProxySQL is not helpful, here, either.

If I were you, I'd examine the reason behind your decision to use ProxySQL in the first place. Maybe it makes sense to use it, but maybe it makes sense to use what you already have, and not introduce a new component into the mix.

I haven't been able to figure out how to entirely disable DBCP so
that ProxySQL could handle the pooling entirely.
>
You can't. You'd have to tear-out the use of the JNDI-provided
DataSource altogether which isn't worth your time.

Elsewhere in this thread, Mark Thomas said this in response to Olaf
Kock, referring to his idea of a custom "non-pooling connection pool":

DBCP looks exactly like a database driver to the application. [...]
just configure the database driver directly.

This will work, and you won't get a connection pool at all in your Tomcat/application. But it will behave as though you have a connection-pool with an unlimited number of connections, since every time the application requests a connection, it will get one, opened to your ProxySQL instance. You can limit the number of connections to ProxySQL, I suppose, but then you trade "pool borrow timeout" errors for "connection refused" errors. You also have to construct a new network connection for each "use" of a db connection in the application. Part of the reason to use a connection pool (at any level) is to maintain the open network connection between the application and the database. By disabling the pool available to the application, you are reducing performance in a place where a performance drop is completely unnecessary.

Now, this probably shouldn't be taken as a recommendation or even a statement of practical feasibility. How do you view this option, assuming local ProxySQL installs on the app servers? You can still assume my complete ignorance about every element between the JNDI
specs and the actual configuration statements in context.xml :-)
Performance tuning is a Dark Art. You are right that if you have two pools in-place, you may find that tuning one of them is difficult in the presence of the other. But because you *know* that both exist, you don't have to do that tuning in a vacuum: you can take both pools into account when tuning that particular part of your application. Or, you can decide that you only need one (architecturally speaking) pool at all. And my argument is that the only pool you need is already configured and ready to go.

YMMV

-chris

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to