I have a production system running PostgreSQL 9.1.2.

The system receives messages, puts them in a queue, and then several parallel 
modules, each in its own thread, read from that queue, perform two inserts, 
and release the message to the next queue for non-database-related processing.

Today, after a customer complained about delays, I noticed odd behavior in the 
system: about 2 percent of the messages were inserted into the tables more 
than an hour after they entered the system.

The queue never holds more than 27,000 messages at a time, and together the 
parallel modules process about 5,000 or 6,000 messages per minute. So the 
delay for a single message should never be more than a few minutes. Even if 
one module gets stuck, another will grab the next message from the queue. I 
believe the only way for a message to be stuck for so long would be for it to 
be grabbed by a module, and then for the database write to be somehow delayed 
for an hour, although it's merely a simple insert performed with a prepared 
statement.
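To confirm whether the hour is spent waiting in the queue or in the insert itself, it may help to stamp each message on arrival and time the two stages separately. A minimal sketch of the idea (the class, names, and 500 ms threshold are all my own assumptions, and the actual prepared-statement inserts are elided):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class LatencyProbe {
    // Hypothetical wrapper: records when the message entered the system.
    static final class Timed<T> {
        final T payload;
        final long enqueuedNanos = System.nanoTime();
        Timed(T payload) { this.payload = payload; }
        long queueWaitMillis() {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - enqueuedNanos);
        }
    }

    // Assumed threshold above which a message is logged as delayed.
    static final long SLOW_MILLIS = 500;

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Timed<String>> queue = new LinkedBlockingQueue<>();
        queue.put(new Timed<>("msg-1"));

        Timed<String> msg = queue.take();
        long queueWait = msg.queueWaitMillis();

        // Time the database write separately from the queue wait, so the
        // logs show which stage is responsible for a delayed message.
        long start = System.nanoTime();
        // ... the real worker would run its two prepared-statement inserts here ...
        long insertMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        if (queueWait > SLOW_MILLIS || insertMillis > SLOW_MILLIS) {
            System.out.println(msg.payload + " queueWait=" + queueWait
                    + "ms insert=" + insertMillis + "ms");
        } else {
            System.out.println(msg.payload + " ok");
        }
    }
}
```

If the stuck 2 percent show huge insert times rather than huge queue waits, that would point squarely at the database side.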

The database in production is very busy with millions of writes per hour. Could 
there be a situation in which a particular connection gets "starved" while 
other connections are able to run queries without noticeable delay?
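If a write really is stalling on a lock, PostgreSQL can be told to log it server-side. A minimal postgresql.conf sketch (the thresholds here are my own assumptions, tune to taste):

```
log_lock_waits = on                # log when a session waits on a lock longer
                                   # than deadlock_timeout
deadlock_timeout = 1s              # doubles as the lock-wait logging threshold
log_min_duration_statement = 1000  # log any statement taking longer than 1000 ms
```

With these set, an insert that sat on a lock for an hour should leave a clear trail in the server log.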

How can I track such locks down? Does anybody have any ideas other than 
starvation? The system lets me view statistics on how many messages were 
processed by each module and the average latency. None of the four running 
modules shows a high average latency or a low number of messages processed, 
so I don't think the issue is related to any particular thread in my (Java) 
system being slow or locked out by the others.
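For catching a blocked session in the act, a query over pg_locks joined to pg_stat_activity can show who is waiting and who holds the conflicting lock. A sketch, using the pre-9.2 column names (procpid, current_query) that 9.1 has:

```sql
SELECT w.pid               AS waiting_pid,
       aw.current_query    AS waiting_query,
       b.pid               AS blocking_pid,
       ab.current_query    AS blocking_query
FROM   pg_locks w
JOIN   pg_locks b
  ON   w.locktype = b.locktype
 AND   w.database      IS NOT DISTINCT FROM b.database
 AND   w.relation      IS NOT DISTINCT FROM b.relation
 AND   w.transactionid IS NOT DISTINCT FROM b.transactionid
 AND   w.pid <> b.pid
JOIN   pg_stat_activity aw ON aw.procpid = w.pid
JOIN   pg_stat_activity ab ON ab.procpid = b.pid
WHERE  NOT w.granted
  AND  b.granted;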

TIA,
Herouth



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
