Hello!

We have received an issue from one of our customers: after some time in
production they end up with an empty connection pool. After a while we managed
to reproduce the problem in our test environment with a smaller number of pool
connections.

So in general the problem is this: if the connection pool is under heavy load
(meaning that most of the time all the connections are taken, i.e. active), it
starts to lose database connections (we call it "bleeding"): the connections
are lost from Tomcat's perspective, but they are very much alive when we check
them on the database side.

We are using Apache Tomcat 7.0.47 running on Java 1.6. The database is SQL
Server 2008 R2, and we use Hibernate as the persistence provider. Our pool is
configured in server.xml in the GlobalNamingResources section (referenced from
context.xml):

<Resource
    name="jdbc/ChiDS"
    auth="Container"
    type="javax.sql.DataSource"
    factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
    driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver"
    url="jdbc:sqlserver://*****;databaseName=chiTestDB"
    username="*****"
    password="*****"
    initialSize="2"
    maxActive="5"
    maxIdle="4"
    minIdle="2"
    maxWait="15000"
    timeBetweenEvictionRunsMillis="30000"
    testOnBorrow="true"
    testOnReturn="false"
    testOnIdle="false"
    validationQuery="SELECT 1"
    validationInterval="30000"
    removeAbandoned="true"
    removeAbandonedTimeout="60"
    useEquals="false"/>
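
For completeness: Hibernate gets its connections from this DataSource through
DatasourceConnectionProvider (visible in the stack traces below), and the
application looks the resource up via JNDI. The snippet below is only a minimal
sketch of that lookup, not our actual code; the java:comp/env binding is an
assumption based on the usual ResourceLink in context.xml:

    import java.sql.Connection;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    public class DataSourceLookupSketch {
        public static Connection borrow() throws Exception {
            // Assumed binding; depends on the ResourceLink defined in context.xml.
            InitialContext ctx = new InitialContext();
            DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/ChiDS");
            return ds.getConnection(); // caller must close() it to return it to the pool
        }
    }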

I will explain in detail what happens. For better understanding I am also
providing a graph of idle and active connections taken from the Java VisualVM
MBeans viewer: http://img11.imageshack.us/img11/6888/fm4j.jpg
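
The graph is simply the pool's Active and Idle counters sampled over time from
the MBean that tomcat-jdbc registers. For reference, roughly the same numbers
can be read programmatically as sketched below; the ObjectName pattern and the
attribute names are assumptions based on what the VisualVM MBeans tree shows,
so they may need adjusting:

    import java.lang.management.ManagementFactory;
    import java.util.Set;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class PoolCounterSampler {
        public static void dumpPoolCounters() throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Assumed pattern: match any registered connection pool MBean.
            Set<ObjectName> names =
                    server.queryNames(new ObjectName("*:type=ConnectionPool,*"), null);
            for (ObjectName name : names) {
                Object active = server.getAttribute(name, "Active"); // assumed attribute name
                Object idle = server.getAttribute(name, "Idle");     // assumed attribute name
                System.out.println(name + " active=" + active + " idle=" + idle);
            }
        }
    }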

So we start the server and connect clients that generate a small database
load. After that we tell half of our clients to start generating operations
that result in a medium database load (I think around 300 transactions per
second). We can see from the graph that the connection pool behaves normally
(sometimes all 5 connections are taken, sometimes not; if we ran a test like
this all day, everything would be OK). After 7 minutes we increase the load by
100% (the other half of the clients start doing operations). The load is now
more than 5 pool connections can handle: pretty much all the time all 5
connections are active, so we soon expect the server application logic to
start receiving exceptions that a JDBC connection is not available (this
should happen whenever the application does not get a pooled connection within
15 seconds, our maxWait). Indeed this happens, and the following exception is
thrown:

Caused by: org.hibernate.exception.GenericJDBCException: Cannot open connection
    at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:140)
    at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:128)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:52)
    at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:449)
    at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:167)
    at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:142)
    at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:85)
    at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1463)
    at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:60)
    ... 18 more
Caused by: java.sql.SQLException: Pool wait interrupted.
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:655)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:188)
    at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:128)
    at org.hibernate.connection.DatasourceConnectionProvider.getConnection(DatasourceConnectionProvider.java:92)
    at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:446)
    ... 23 more
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1024)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1303)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:253)
    at org.apache.tomcat.jdbc.pool.FairBlockingQueue.poll(FairBlockingQueue.java:151)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:650)

But here something goes wrong: the active connection count drops by 1 and
never goes back up, even though the database load stays the same. Judging from
the graph, from this moment on the pool is handing only 4 connections to the
application. Pretty soon the same exception is thrown again, reducing the
active count by 1 once more. Then, in quick succession, the same exception is
thrown 3 more times and the pool is left with no active or idle connections.
After this, another type of exception starts to show up in the log file:

Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
    at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1235)
    at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1168)
    at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1245)
    at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:63)
    at com.hsl.chi.persistence.dao.jpa.DAO.beginEntityTransaction(DAO.java:1505)
    at com.hsl.chi.persistence.dao.jpa.DAO.beginFirstSubOperation(DAO.java:992)
    at com.hsl.chi.persistence.dao.jpa.SequenceDAO.internalResetSequence(SequenceDAO.java:568)
    ... 9 more
Caused by: org.hibernate.exception.GenericJDBCException: Cannot open connection
    at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:140)
    at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:128)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:52)
    at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:449)
    at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:167)
    at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:142)
    at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:85)
    at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1463)
    at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:60)
    ... 12 more
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [[ProtocolHandler]-[pool-1-thread-113]] Timeout: Pool empty. Unable to fetch a connection in 15 seconds, none available[size:5; busy:0; idle:0; lastwait:15000].
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:674)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:188)
    at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:128)
    at org.hibernate.connection.DatasourceConnectionProvider.getConnection(DatasourceConnectionProvider.java:92)
    at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:446)

So from Tomcat's point of view there are no JDBC connections left in the pool.
From this point on, no database operation can succeed: even if the load is
reduced to a minimum, every database operation from the application fails.
Then we check the database side: SQL Server creates an internal process for
each JDBC connection, and we can see all 5 connections very much alive, doing
nothing. Since we are using Hibernate, we were quickly able to run the same
test against a PostgreSQL database, and the identical scenario happens:
PostgreSQL shows 5 idle processes while the pool is empty. The only thing we
can do at this point is restart Tomcat.
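
We have not yet boiled this down to a standalone test case, but the stack
traces all point at the same thing: a thread interrupted while it waits inside
ConnectionPool.borrowConnection(). The sketch below is our understanding of
that scenario in isolation (a pool of size 1 and one waiter that gets
interrupted right around the time the connection is returned). The
PoolProperties setters are the tomcat-jdbc API as we know it; the class
itself, the timing and the placeholder connection settings are hypothetical,
and we have not verified that this exact sequence reproduces the leak:

    import java.sql.Connection;
    import org.apache.tomcat.jdbc.pool.DataSource;
    import org.apache.tomcat.jdbc.pool.PoolProperties;

    public class InterruptedBorrowSketch {
        public static void main(String[] args) throws Exception {
            PoolProperties p = new PoolProperties();
            p.setUrl("jdbc:sqlserver://localhost;databaseName=chiTestDB"); // placeholder
            p.setDriverClassName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
            p.setUsername("user");       // placeholder
            p.setPassword("password");   // placeholder
            p.setInitialSize(1);
            p.setMaxActive(1);
            p.setMaxWait(15000);
            p.setFairQueue(true);        // FairBlockingQueue appears in our stack traces

            final DataSource ds = new DataSource(p);

            Connection busy = ds.getConnection(); // take the only connection

            Thread waiter = new Thread(new Runnable() {
                public void run() {
                    try {
                        // Blocks in borrowConnection() waiting for the single connection.
                        Connection c = ds.getConnection();
                        c.close();
                    } catch (Exception e) {
                        e.printStackTrace(); // "Pool wait interrupted." in our logs
                    }
                }
            });
            waiter.start();

            Thread.sleep(1000);  // let the waiter block on the pool
            busy.close();        // return the connection...
            waiter.interrupt();  // ...while (or just before) the waiter is woken up

            Thread.sleep(1000);
            System.out.println("size=" + ds.getPool().getSize()
                    + " active=" + ds.getPool().getActive()
                    + " idle=" + ds.getPool().getIdle());
        }
    }

If the counters printed at the end show a connection that is neither active
nor idle (while the database still sees its session), that would match what we
observe on the graph.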

We also tried other configurations (Tomcat 7.0.32, Hibernate updated to the
latest version, PostgreSQL instead of SQL Server) and different connection
pool properties, but without any success - we always end up with an empty
pool. Tomcat pool properties that we tried:

logAbandoned="true" - did not log anything (I assume a stack trace would be
written to the console or log file if a connection were not closed by the
application?), so I guess Hibernate is closing connections properly (we always
call the entity manager's close method when we are finished - roughly the
pattern shown in the sketch after this list).

jdbcInterceptors="StatementFinalizer;SlowQueryReportJmx(threshold=10000)" -
the StatementFinalizer did not help

testOnReturn="true", testOnIdle="true", propagateInterruptState="true" -
did not help either
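
For reference, this is roughly the pattern our DAO layer follows (heavily
simplified, class and method names are ours, not real code): the entity
manager is always closed in a finally block, and because the root cause above
is an InterruptedException we also tried restoring the interrupt flag on the
calling thread ourselves, which, as we understand it, is what
propagateInterruptState="true" is meant to do inside the pool:

    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.PersistenceException;

    public class DaoSketch {
        private final EntityManagerFactory emf;

        public DaoSketch(EntityManagerFactory emf) {
            this.emf = emf;
        }

        public void doWork() {
            EntityManager em = emf.createEntityManager();
            try {
                em.getTransaction().begin();
                // ... entity operations ...
                em.getTransaction().commit();
            } catch (PersistenceException e) {
                if (em.getTransaction().isActive()) {
                    em.getTransaction().rollback();
                }
                if (hasInterruptedCause(e)) {
                    // Restore the flag so code further up the stack can see the interrupt.
                    Thread.currentThread().interrupt();
                }
                throw e;
            } finally {
                em.close(); // always close, so the connection is returned to the pool
            }
        }

        private static boolean hasInterruptedCause(Throwable t) {
            for (Throwable c = t; c != null; c = c.getCause()) {
                if (c instanceof InterruptedException) {
                    return true;
                }
            }
            return false;
        }
    }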


So the last option was to change the connection pool: we used the C3P0
connection pool with properties similar in meaning to the Tomcat ones. We
repeated the above test case and it works correctly. We left it under full
load for half an hour, during which we get a lot of exceptions saying the
application did not receive a connection, but the pool itself keeps working
fine (all connections are active almost all the time). The exception being
thrown is:

com.mchange.v2.resourcepool.BasicResourcePool@437ff0 -- an attempt to checkout
a resource was interrupted, and the pool is still live: some other thread must
have either interrupted the Thread attempting checkout!
java.lang.InterruptedException

Then we reduce the load and the exceptions soon go away, since 5 connections
can handle it. Here is the C3P0 configuration from server.xml:

<Resource
    name="jdbc/ChiDS"
    auth="Container"
    type="com.mchange.v2.c3p0.ComboPooledDataSource"
    dataSourceName="jdbc/ChiDS"
    factory="org.apache.naming.factory.BeanFactory"
    driverClass="com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbcUrl="jdbc:sqlserver://*****;databaseName=chiDB"
    user="*****"
    password="*****"
    initialPoolSize="2"
    minPoolSize="2"
    maxPoolSize="5"
    preferredTestQuery="SELECT 1"
    testConnectionOnCheckin="true"
    idleConnectionTestPeriod="30"
    numHelperThreads="5"/>

For now our customer has raised maxActive to a bigger number, but I guess this
will not be enough once they start increasing the number of clients (our
application generates a large number of transactions), and they would have to
switch connection pools. Are there any additional configuration settings I can
try on the Tomcat connection pool, or does this look like a bug?

regards, Marko
