On Tue, Jan 23, 2018 at 1:45 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Mon, Jan 22, 2018 at 3:52 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> The difference is that nodeGather.c doesn't have any logic like the
>> one you have in _bt_leader_heapscan where the patch waits for each
>> worker to increment nparticipantsdone. For Gather node, we do such a
>> thing (wait for all workers to finish) by calling
>> WaitForParallelWorkersToFinish which will have the capability after
>> Robert's patch to detect if any worker is exited abnormally (fork
>> failure or failed before attaching to the error queue).
>
> FWIW, I don't think that that's really much of a difference.
>
> ExecParallelFinish() calls WaitForParallelWorkersToFinish(), which is
> similar to how _bt_end_parallel() calls
> WaitForParallelWorkersToFinish() in the patch. The
> _bt_leader_heapscan() condition variable wait for workers that you
> refer to is quite a bit like how gather_readnext() behaves. It
> generally checks to make sure that all tuple queues are done.
> gather_readnext() can wait for developments using WaitLatch(), to make
> sure every tuple queue is visited, with all output reliably consumed.
>

The difference is that gather_readnext() uses the tuple queue
mechanism, which can detect that the workers have stopped or exited,
whereas _bt_leader_heapscan() has no such capability, so I think it
will loop forever if a worker exits abnormally.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
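
The concern in the reply above can be illustrated with a minimal,
self-contained simulation. This is not PostgreSQL code: the names
WorkerState, simulate_tick(), any_worker_died() and leader_wait() are
hypothetical, the ndone counter merely stands in for
nparticipantsdone, and the max_ticks bound exists only so that the
"loops forever" case terminates and can be observed as WAIT_STUCK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker states, for this simulation only. */
typedef enum { WORKER_RUNNING, WORKER_DONE, WORKER_DIED } WorkerState;

typedef enum { WAIT_OK, WAIT_WORKER_LOST, WAIT_STUCK } WaitResult;

/*
 * One simulated wakeup: every running worker finishes and bumps the
 * shared counter (standing in for nparticipantsdone). A worker that
 * died never increments it.
 */
void simulate_tick(WorkerState *workers, size_t n, size_t *ndone)
{
    for (size_t i = 0; i < n; i++) {
        if (workers[i] == WORKER_RUNNING) {
            workers[i] = WORKER_DONE;
            (*ndone)++;
        }
    }
}

/* Returns true if any worker exited without reporting completion. */
bool any_worker_died(const WorkerState *workers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (workers[i] == WORKER_DIED)
            return true;
    }
    return false;
}

/*
 * Leader-side wait. With check_liveness == false the leader looks only
 * at the counter, so a dead worker leaves it waiting until max_ticks
 * runs out (modelling an endless loop). With check_liveness == true it
 * notices the lost worker and bails out instead.
 */
WaitResult leader_wait(WorkerState *workers, size_t n,
                       bool check_liveness, int max_ticks)
{
    size_t ndone = 0;

    for (int t = 0; t < max_ticks; t++) {
        if (ndone == n)
            return WAIT_OK;
        if (check_liveness && any_worker_died(workers, n))
            return WAIT_WORKER_LOST;
        simulate_tick(workers, n, &ndone);
        assert(ndone <= n);
    }
    return ndone == n ? WAIT_OK : WAIT_STUCK;
}
```

With three workers of which one has died, leader_wait() without the
liveness check exhausts its tick budget and reports WAIT_STUCK, while
the same call with the check enabled reports WAIT_WORKER_LOST on the
first iteration.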