JK,
On 11/24/21 08:03, jkla...@iki.fi wrote:
On Tuesday, Nov 23, 2021 at 4:20 PM, Christopher Schultz
<ch...@christopherschultz.net (mailto:ch...@christopherschultz.net)> wrote:
ProxySQL is, mostly, a load-balancing and caching product. Sure, it can
provide connection-pooling, but that doesn't mean that you want your
application making 1000 simultaneous requests to the proxy.
Could you expand on this? Connection pooling is a fundamental aspect
of how ProxySQL operates, after all. I'm not familiar with how the
DBCP connection pooling works, but what added value would it bring in
this case? What does DBCP pooling at the application level achieve
that ProxySQL wouldn't?
The pooling doesn't add anything necessarily to your Java application,
but NOT using the pool can mean that your application could make a
virtually unlimited number of connections to your ProxySQL instance.
The DBCP provides not only "pooling" (that is, re-using of physical
database connections) but also an intentionally-limited resource that
your application can use. If you use a DBCP pool size of 1, only one
thread at a time can make a call to the database. (Well, you could share
the connection across threads, but that would be total chaos.) If you
use an unlimited pool size, then you will allow the application to make
unlimited queries to your ProxySQL instance.
Perhaps you want to limit the number of simultaneous connections to the
physical database to something reasonable, like 100. So you set that up
in ProxySQL so you have a pool size of 100. Any number of clients can
connect and query, but you'll only be actively executing at most 100. So
there is no reason to allow more than 100 incoming connections to ProxySQL.
If you get a flood of requests to your application and every single one
of them requires a database connection to complete, you'll get a huge
flood of requests to ProxySQL. All but 100 of those requests will sit
there waiting at ProxySQL, *with a query pending execution*. If the
application times-out waiting for the query to complete, either the user
wil see an error, or the application will try-again. The user will most
likely try again if they see an error. In either case, the query has
been sent to ProxySQL and it probably doesn't know that the client has
given-up, so it has to execute the query anyway. But the client is gone,
so the query is just a waste. You have just launched a DOS against your
database.
If, instead, you configure your DBCP pool size to be 100 / N where N is
the number of application servers you have, then the application stalls
*waiting for the database connection* and not waiting for a query. When
this happens, the error is moved to acquiring the connection and not
issuing the query. So your database never has to process unnecessary
queries.
It's essentially "failing faster" or, IMO, "failing safer."
In case it's relevant, we're planning on installing ProxySQL on the
app servers themselves, an approach generally recommended by e.g.
Percona.
I don't think it's relevant.
My main fear is that having two separate connection pools would make
performance tuning at the pool level potentially nightmarish. We'd
make a config change at the ProxySQL level, test it, and conclude
that it didn't have a meaningful impact–when it actually might have,
had there not been another, poorly understood (by us, that is) pool
at play. DBCP has the potential to muddy the waters: even if it
doesn't actively work against ProxySQL-level tuning, we can't really
prove that it's not a contributing cause any time one of our ProxySQL
config changes fails to improve performance.
I'm sure there is a use-case for ProxySQL, but I find that SQL workloads
fall into a few big categories:
1. Lots of writes. Caches essentially cannot handle this, *especially*
when they are distributed (which is what you have when ProxySQL is
running on the application servers, separately). ProxySQL is not
helpful, here.
2. Few writes, lots of reads. In this case, caching in the application
makes more sense to me than constantly requesting the data from the
database. Not only do you avoid the actual call to the database, but you
also avoid the necessary conversion between the database's data-model
and the application's data model. ProxySQL is not helpful, here, either.
If I were you, I'd examine the reason behind your decision to use
ProxySQL in the first place. Maybe it makes sense to use it, but maybe
it makes sense to use what you already have, and not introduce a new
component into the mix.
I haven't been able to figure out how to entirely disable DBCP so
that ProxySQL could handle the pooling entirely.
>
You can't. You'd have to tear-out the use of the JNDI-provided
DataSource altogether which isn't worth your time.
Elsewhere in this thread, Mark Thomas said this in response to Olaf
Kock, referring to his idea of a custom "non-pooling connection pool":
DBCP looks exactly like a database driver to the application. [...]
just configure the database driver directly.
This will work, and you won't get a connection pool at all in your
Tomcat/application. But it will behave as though you have a
connection-pool with an unlimited number of connections, since every
time the application requests a connection, it will get one, opened to
your ProxySQL instance. You can limit the number of connections to
ProxySQL, I suppose, but then you trade "pool borrow timeout" errors for
"connection refused" errors. You also have to construct a new network
connection for each "use" of a db connection in the application. Part of
the reason to use a connection pool (at any level) is to maintain the
open network connection between the application and the database. By
disabling the pool available to the application, you are reducing
performance in a place where a performance drop is completely unnecessary.
Now, this probably shouldn't be taken as a recommendation or even a
statement of practical feasibility. How do you view this option,
assuming local ProxySQL installs on the app servers? You can still
assume my complete ignorance about every element between the JNDI
specs and the actual configuration statements in context.xml :-)
Performance tuning is a Dark Art. You are right that if you have two
pools in-place, you may find that tuning one of them is difficult in the
presence of the other. But because you *know* that both exist, you don't
have to do that tuning in a vacuum: you can take both pools into account
when tuning that particular part of your application. Or, you can decide
that you only need one (architecturally speaking) pool at all. And my
argument is that the only pool you need is already configured and ready
to go.
YMMV
-chris
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org