On Mon, Feb 17, 2014 at 8:45 AM, Herouth Maoz <hero...@unicell.co.il> wrote:

> I have a production system using Postgresql 9.1.2.
>
> The system basically receives messages, puts them in a queue, and then
> several parallel modules, each in its own thread, read from that queue, and
> perform two inserts, then release the message to the next queue for
> non-database-related processing.
>
> Today, after we received complaints from a customer about delays, I
> noticed odd behavior in the system. About 2 percent of the messages were
> inserted into the tables more than an hour after they got into the system.
>

How do you know that?


>
> The queue never has more than 27,000 messages at the same time, and all
> together, the parallel modules process about 5000 or 6000 messages per
> minute. So basically, the delay for a single message should never be more
> than a few minutes. Even if one module gets stuck, another will grab the
> next message from the queue. I believe the only way for a message to be
> stuck for so long would be for it to be grabbed by a module, and then for
> the database write to be somehow delayed for a hour, although it's merely a
> simple insert performed with a prepared statement.
>
> The database in production is very busy with millions of writes per hour.
> Could there be a situation in which a particular connection gets "starved"
> while other connections are able to run queries without noticeable delay?
>

If there is a delay like that, it would almost certainly be due to database
locks that show up in pg_locks.

http://www.postgresql.org/docs/current/static/view-pg-locks.html
http://wiki.postgresql.org/wiki/Lock_Monitoring
http://wiki.postgresql.org/wiki/Lock_dependency_information

But, I doubt that that is your problem.


>
> How can I truck such locks down? Does anybody have any ideas other than
> starvation? The system lets me view statistics of how many messages were
> processed in each modules and the average latency. None of the four modules
> running has long average latency or low number of messages processes, so I
> don't think the issue is related to any particular thread in my (Java)
> system being slow or locked away by the others.
>

If the insert into PostgreSQL was freezing, wouldn't that time get
reflected in your latency monitoring?

It sounds to me like your application has a bug in its queue, where it
forgets about items on the queue for a while.

Cheers,

Jeff

Reply via email to