First of all, thank you this detailed analysis of the situation.
Based on the bug report I suspected the driver, but it now seems clear
it was (at least part of) the problem.

re: Why was it necessary for the abandoned connection thread to act?
I'm not sure.. I looked into the JDBC Session manager class and it
appeared to be simply reading/deserializing a session from the
database when the deadlock occurred.  I would expect that to be a very
fast query -- this is a production system but under very light load.

I looked into the pool configuration and 'removed abandoned timeout'
was 60 seconds.  That timeout seems a bit short, but if it takes > 60
sec to load a session from the database then there is a serious
performance issue.  I've never observed delays like that in the
application.

My guess at this point is perhaps the db (Amazon RDS/MySql) was
unavailable/very slow/down in some way, causing this connection to be
held beyond the timeout.  I don't have great visibility into
performance or outage information for RDS, tho I may reach out to DBAs
and try to get it.

Thanks again for your help.

Regards,
Colin

On Mon, Mar 18, 2013 at 6:39 AM, Mark Thomas <ma...@apache.org> wrote:
> On 15/03/2013 21:36, Colin Ingarfield wrote:
>
> Short version:
> Your upgrade to the latest Connector/J will have fixed this particular
> problem.
>
>
> Long version:
>
> <snip/>
>
>>>>> Found one Java-level deadlock: =============================
>
> <snip>
>
> Thread 1:
>
>>>>> Here are the stack traces: Thread 12820: (state = BLOCKED) -
>>>>> com.mysql.jdbc.ConnectionImpl.getCharacterSetMetadata() @bci=0,
>>>>> line=2851 (Compiled frame) -
>>>>> com.mysql.jdbc.Field.getStringFromBytes(int, int) @bci=37,
>>>>> line=717 (Compiled frame) - com.mysql.jdbc.Field.getName() @bci=17,
>>>>> line=631 (Interpreted frame) -
>>>>> com.mysql.jdbc.ResultSetImpl.buildIndexMapping() @bci=78, line=752
>>>>> (Compiled frame) -
>>>>> com.mysql.jdbc.ResultSetImpl.findColumn(java.lang.String) @bci=12,
>>>>> line=1110 (Interpreted frame) -
>>>>> com.mysql.jdbc.ResultSetImpl.getString(java.lang.String) @bci=3,
>>>>> line=5609 (Interpreted frame) -
>>>>> org.eclipse.jetty.server.session.JDBCSessionManager$1.run()
>>>>> @bci=111, line=844 (Interpreted frame) -
>
> <snip/>
>
> Thread 2:
>
>>>>> Thread 890: (state = BLOCKED) -
>>>>> com.mysql.jdbc.ResultSetImpl.realClose(boolean) @bci=0, line=7195
>>>>> (Interpreted frame) - com.mysql.jdbc.ResultSetImpl.close() @bci=2,
>>>>> line=909 (Interpreted frame) -
>>>>> com.mysql.jdbc.StatementImpl.realClose(boolean, boolean) @bci=126,
>>>>> line=2478 (Interpreted frame) -
>>>>> com.mysql.jdbc.PreparedStatement.realClose(boolean, boolean)
>>>>> @bci=71, line=3098 (Interpreted frame) -
>>>>> com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements() @bci=90,
>>>>> line=1628 (Interpreted frame) -
>>>>> com.mysql.jdbc.ConnectionImpl.realClose(boolean, boolean, boolean,
>>>>> java.lang.Throwable) @bci=176, line=4388 (Interpreted frame) -
>>>>> com.mysql.jdbc.ConnectionImpl.close() @bci=32, line=1601
>>>>> (Interpreted frame) -
>>>>> org.apache.tomcat.jdbc.pool.PooledConnection.disconnect(boolean)
>>>>> @bci=47, line=330 (Interpreted frame) -
>
> <snip/>
>
>>>>> Once I dug up these stack traces I started to wonder if the mysql
>>>>> driver was the problem (or contributing to the problem.)  I was
>>>>> using Connector/J version 5.1.19 when the deadlock occurred.  I
>>>>> found this bug:  http://bugs.mysql.com/bug.php?id=61247 which
>>>>> sounds a lot like what appears to have happened.  I'm interested in
>>>>> your thoughts on this.
>
> The problem is related to two threads accessing the same connection.
> Given the scenario - the pool spots a potentially abandoned connection
> and tries to close it - I don't view this as unreasonable behaviour by
> the pool since there are no other options available.
>
> The problem is that Thread 1 (above) results in this call order:
> ResultSetImpl.findColumn(java.lang.String)  - Syncs on ResultSetImpl
> ...
> ConnectionImpl.getCharacterSetMetadata()    - Syncs on ConnectionImpl
>
> while Thread 2 results in this call order:
> ConnectionImpl.close()            - Syncs on ConnectionImpl
> ...
> ResultSetImpl.realClose(boolean)  - Syncs on ResultSetImpl
>
> Two threads obtaining the same locks in a different order will result in
> a deadlock.
>
> I'd view that as a bug in the JDBC driver since - to my mind - "thread
> safe" includes "doesn't deadlock". However, I can see a counter argument
> that goes along the lines of:
> - The deadlock is triggered when the cleaner thread tries to close an
> active connection
> - The cleaner thread is closing abandoned connections
> - The only cause of an abandoned connection is buggy client code
> - Therefore the real root cause is buggy client code
>
>
>>>>> In the meantime I have upgraded to latest Connector/J which
>>>>> includes a fix for this bug.  I was running the old driver for
>>>>> months before this deadlock, though, so it will be difficult to
>>>>> know if it fixes the issue or not.
>
> A review of the Connector/J code indicates that this particular deadlock
> has been fixed (all the syncs above now sync on a new connectionMutex
> object). This a) fixes this particular deadlock and b) ensures that
> external code that syncs on a Connection/ResultSet/whatever can't
> interfere and trigger a deadlock.
>
> <snip/>
>
>> My database access code is built on Spring 3's JDBC APIs, so I do not
>> do any explicit connection management in my code.  Its Spring's job to
>> get connections and release them, and I assume its JDBC APIs is
>> well-tested.  So I don't think my code is the problem here.
>
> I agree but see the counter argument above. If the code accessing the
> database is perfect it shouldn't need to worry about abandoned connections.
>
>> The Jetty session manager is storing HTTP session data in the
>> database, and it does acquire connections directly from the pool.  I
>> haven't dug through the Jetty source to see if they are doing anything
>> questionable, tho I did email their user list about this issue.
>
> I don't see anything that jumps out at me but there is the "Why does it
> need to managed abandoned connections?" question.
>
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to