*1) I think this is the really burning question in my mind: Why is the
server opening NEW connections when there are dozens of them already open?
Does this mean that (as I suspect) all those connections are abandoned?  If
so, why are they not getting cleaned up, when we have removeAbandoned set
to true and removeAbandonedTimeout set to a RIDICULOUSLY low value of 30
seconds?  (We MUST increase this value.  The plan is one hour.)*
Hello Shawn, about this question, are you sure that none of the webapps
running on those tomcats are connecting to the database without using the
pools configured in the context.xml? Creating other pools or performing
direct connections? That could explain while changing the configuration you
are not able to limit the number of connections neither to mitigate the
problem with removedAbandoned configuration.
HTH
Filippo

On Sat, Mar 24, 2018 at 5:08 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> This message is long.  Lots of details, a fair amount of history.
>
> The primary Tomcat version we've got is 7.0.42. Specifically, it is the
> Tomcat that's included with Liferay 6.2. This is why we haven't attempted
> an upgrade even though the version we're running is five years old -- we
> don't want to rock the boat too hard and risk everything not working.
> Liferay is battle-tested with that version of Tomcat.
>
> Background:
>
> Every now and then, we find that our MySQL master has reached the
> connection limit of 600 connections, and websites stop working, because
> they are trying to open new connections and failing.  When I look at the DB
> server, it shows a HUGE number of connections (300 to 500 out of the 600
> max) that have idle times between one hour and eight hours.  The MySQL
> server-side idle timeout is at the default of 28800 seconds -- eight
> hours.  The idle connections are coming from the webservers.
>
> There are five main webservers, two of which run exclusively Liferay 6.2,
> and three that are running older apps on Tomcat version that's probably
> ANCIENT, as well as Liferay 6.1 with Tomcat 7.0.23.  I can't seem to figure
> out what version the really ancient one is.  There are no version numbers
> that I can see in the files in the program directory.  We also have a
> staging server, a dev server, and a few ancillary systems.  The staging and
> dev servers mainly use a dev/staging database server, but do need to
> connect to the main DB server for some things.
>
> When we run out of connections, each of these five webservers has nearly
> 100 (and sometimes MORE than 100) open connections to the database server.
> And the majority of them have been idle for a LONG time.
>
> A number of JNDI database resources are configured in conf/context.xml.
> So we have connection pooling.  But even though there are TONS of
> connections already established from Tomcat, and completely idle from the
> DB server's perspective, the application is STILL trying to open a NEW
> connection when somebody visits a page.  This is the precise issue that
> connection pooling SHOULD be preventing.
>
> I've discussed this with the commons-user mailing list, from a DBCP
> perspective.  I started there because there is some different code that I
> wrote, which doesn't run in Tomcat, and uses DBCP2 natively.  I'm a lot
> more familiar with my code than the code running in Tomcat.
>
> Based on my discussions with commons-user, I really think that the reason
> that the DB pools are trying to create more connections even though there's
> already a lot of them open is because the connections are actually
> abandoned.  I suspect they were never closed by the application, so the
> pool still has them as active, and thinks it can't use them.  Now that the
> discussion directly involves configuring pools using Tomcat's own DBCP
> implementation, they're not really able to help me any further.
>
> If we are dealing with abandoned connections as I suspect, then I need to
> ask why abandoned connection removal isn't actually working, and how to
> configure it so that it DOES work.
>
> We have figured out how to log the number of idle and active connections
> on the datasource, but it's going to take some time to get that logging
> into the production servers, so I don't have definitive proof that the
> connections are actually active.
>
> Here's a redacted configuration that we have in place for one of the DB
> connection pools on the production servers:
>
>         <Resource name="jdbc/REDACTED" auth="Container"
> factory="org.apache.tomcat.dbcp.dbcp.BasicDataSourceFactory"
> driverClassName="com.mysql.jdbc.Driver" type="javax.sql.DataSource"
> maxActive="60" maxIdle="10" maxWait="30000" removeAbandoned="true"
> removeAbandonedTimeout="30" username="REDACTED" password="REDACTED"
> testOnBorrow="true" validationQuery="select 1" url="jdbc:mysql://
> REDACTED.REDACTED.com:3306/REDACTED?autoReconnect=true&amp;zeroDateTimeB
> ehavior=round" />
>
> This is the Tomcat documentation I'm using as a reference:
>
> https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html
>
> Looking at Tomcat 7.0 jdbc documentation, the first thing that I notice is
> that the factory we have configured is not even mentioned in the
> documentation.
>
> One person (who replied privately and not to commons-user) said that BOTH
> factory and type attribute values mentioned in the Tomcat 7.0 docs are
> completely wrong!  And another (who did reply via the list) said that the
> factory mentioned in the documentation is "the alternative pool" and that I
> should be using the one documented here instead.  But this page is REALLY
> lean on details of how to actually set up the resource:
>
> https://tomcat.apache.org/tomcat-7.0-doc/jndi-datasource-
> examples-howto.html#Database_Connection_Pool_(DBCP)_Configurations
>
> =====================================
>
> So now, with the sordid past covered, I have three main questions:
>
> 1) I think this is the really burning question in my mind: Why is the
> server opening NEW connections when there are dozens of them already open?
> Does this mean that (as I suspect) all those connections are abandoned?  If
> so, why are they not getting cleaned up, when we have removeAbandoned set
> to true and removeAbandonedTimeout set to a RIDICULOUSLY low value of 30
> seconds?  (We MUST increase this value.  The plan is one hour.)
>
> 2, background) We tried to reconfigure all the pools on our staging server
> to a new config.  They currently have a config much like what I shared
> above, but with smaller pool size numbers. Here's an example of what we
> tried to put in place:
>
>         <Resource name="jdbc/REDACTED"
>                         auth="Container"
> factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
> driverClassName="com.mysql.jdbc.Driver"
>                         type="javax.sql.DataSource"
>                         maxActive="10"
>                         maxIdle="2"
>                         minIdle="1"
>                         maxAge="7200000"
>                         removeAbandoned="true"
>                         removeAbandonedTimeout="3600"
>                         logAbandoned="true"
>                         username="REDACTED"
>                         password="REDACTED"
>                         testOnBorrow="true"
>                         testOnConnect="true"
>                         testOnReturn="true"
>                         testWhileIdle="true"
>                         validationQuery="SELECT 1"
>                         validationQueryTimeout="5"
>                         validationInterval="5000"
>                         minEvictableIdleTimeMillis="900000"
>                         allowMultiQueries="false"
> url="jdbc:mysql://REDACTED.REDACTED.com:3306/REDACTED?autoRe
> connect=true&amp;zeroDateTimeBehavior=round"
>                         />
>
> When we tried to start the service with a config like this for all of the
> pools (there are half a dozen of them), it failed to correctly start,
> getting too many connection errors.
>
> The size of the pools on this staging server is a LOT smaller than what we
> have configured in production.  The DB server it connects to is a lot less
> beefy, and has a limit of 400 connections.  I am told by the developer that
> did the restart that when this service started up, it had made 140
> connections to the dev/staging DB server before failing.  Which shouldn't
> be possible, with half a dozen pool configs that all have a maxActive set
> to 10.
>
> 2, question) Have I got something wrong in the new config? If I have, and
> there are recommendations for what I should change, please tell me WHY each
> change is recommended, and if possible, point me at relevant
> documentation.  If you can, tell me how I may have misread the Tomcat
> documentation, which I used extensively in coming up with the new config.
>
> Side note/question:  One of the replies I got on the DBCP list referred me
> to this Javadoc:
>
> https://commons.apache.org/proper/commons-dbcp/api-1.4/org/
> apache/commons/dbcp/BasicDataSource.html#getRemoveAbandoned()
>
> Which basically says that the 1.4 version of the standard DBCP library
> won't even try to eliminate abandoned connections on a pool of size 60
> until there are more than 57 connections open on the one pool.  This sounds
> like a REALLY bad idea.  The Tomcat jdbc documentation lists
> "abandonWhenPercentageFull" as an attribute, and the description there
> hints that Tomcat's jdbc implementation doesn't suffer from the same
> problem.  But with the unusual factory we have configured, I don't know if
> our configuration is even properly honored.  Does the tomcat pool have that
> same issue?
>
> Because we have multiple pools defined, and some of those pools have a
> maxActive number that's pretty large, I'm betting that we run out of
> connections long before that one pool has opened 57 of them.
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>

Reply via email to