In an attempt to get access, I ended up killing a postgres process and the whole thing *recovered from hang* state. Now don't have more data points to debug.
I feel the trigger is the connection in <*idle in transaction>* state. On examining the application side(Java) stacktrace, I found that other end is trying to make connection. at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:143) at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:112) at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:71) at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:272) at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:269) at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:106) at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64) at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:123) at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:28) at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:20) at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30) at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:22) at org.postgresql.Driver.makeConnection(Driver.java:391) at org.postgresql.Driver.connect(Driver.java:265) On Thu, Feb 7, 2013 at 4:52 PM, Scott Marlowe <scott.marl...@gmail.com>wrote: > On Wed, Feb 6, 2013 at 11:55 PM, Anoop K <anoo...@gmail.com> wrote: > > We are hitting a situation where REINDEX is resulting in postgresql to > go to > > dead lock state for ever. On debugging the issue we found that > > 3 connections are going in to some dead lock state. > > > > idle in transaction > > REINDEX waiting > > SELECT waiting > > > > All these connections are made in the same minute. Once in deadlock > state we > > are not able to make new connections to db.(So not able to view pg_locks > > also). New connections appears as 'startup waiting' in ps output. > Initially > > we suspected <idle in transaction> is the result of not closing a > > connection. But it seems it got stuck after creating a connection and is > not > > able to proceed. > > This may or may not be a deadlock. Unless you've got a circle it's > not a deadlock, it's just a "cascading lock overloading your > connection limit" failure. You can get these with slony and vacuums > and ddl. Say I want to run a DDL script. Someone is running vacuum > (could be autovac process). I run slony execute to run ddl and it > waits with hard table locks, and all the updates stall behind that. > Your db then runs out of connections. What we need to know is what > that idle in transaction is just sitting there waiting to do, which is > usually a combination of db state and application state. > > As a short term fix you can set some reasonable statement level > timeout on the reindex's connection, user or database. If no reindex > ever takes more than a minute and you give it 5 minutes and check the > logs for it you can see how often it fails (once every month or once > every minute you check for a while etc then you could set that user's > connect. If that user is the superuser things become problematic. >