On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska <a...@cybertec.at> wrote: > During my experiments with parallel workers I sometimes saw the "master" and > worker process blocked. The master uses shm queue to send data to the worker, > both sides nowait==false. I concluded that the following happened: > > The worker process set itself as a receiver on the queue after > shm_mq_wait_internal() has completed its first check of "ptr", so this > function left sender's procLatch in reset state. But before the procLatch was > reset, the receiver still managed to read some data and set sender's procLatch > to signal the reading, and eventually called its (receiver's) WaitLatch(). > > So sender has effectively missed the receiver's notification and called > WaitLatch() too (if the receiver already waits on its latch, it does not help > for sender to call shm_mq_notify_receiver(): receiver won't do anything > because there's no new data in the queue). > > Below is my patch proposal.
Another good catch. However, I would prefer to fix this without introducing a "continue" as I think that will make the control flow clearer. Therefore, I propose the attached variant of your idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c index e765cea..8e10822 100644 --- a/src/backend/storage/ipc/shm_mq.c +++ b/src/backend/storage/ipc/shm_mq.c @@ -777,7 +777,7 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data, return SHM_MQ_DETACHED; } - if (available == 0) + if (available == 0 && !mqh->mqh_counterparty_attached) { shm_mq_result res; @@ -805,6 +805,13 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data, mqh->mqh_counterparty_attached = true; } + /* + * The receiver may have read some data after attaching, so we + * must not wait without rechecking the queue state. + */ + } + else if (available == 0) + { /* Let the receiver know that we need them to read some data. */ res = shm_mq_notify_receiver(mq); if (res != SHM_MQ_SUCCESS)
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers