On Wed, Dec 13, 2017 at 12:31 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Dec 11, 2017 at 10:41 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> In particular, if the worker crashed (say somebody forcefully killed
>> the worker), then I think it will lead to a restart of the server.  So
>> not sure, adding anything about that in the comment will make much
>> sense.
>
> The problem is that Gather sees nworkers_launched > 0 and assumes that
> it is therefore entitled not to perform a local scan.
>
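(For context, the decision being discussed looks roughly like the
following on the leader side.  This is a simplified sketch rather than
the exact nodeGather.c code; field names are approximate and the
single_copy handling is abbreviated.)

    /*
     * Simplified sketch: after launching workers, the leader records how
     * many workers were successfully registered (nworkers_launched) and
     * then decides whether it must also scan the relation itself.  When
     * workers were reportedly launched and the leader is not
     * participating (or the plan is single-copy, as with
     * force_parallel_mode), the local scan is skipped -- even though a
     * "launched" worker may never actually start.
     */
    LaunchParallelWorkers(pcxt);
    node->nworkers_launched = pcxt->nworkers_launched;

    node->need_to_scan_locally =
        (node->nworkers_launched == 0) ||
        (!gather->single_copy && parallel_leader_participation);

With force_parallel_mode on or parallel leader participation off, the
second branch never helps, which is why a worker that silently fails to
start shows up as missing rows rather than as an error.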
Yeah, I think that assumption is not right.  I think that, ideally,
before making that assumption it should wait for the workers to start
up.

> I'm pretty sure
> that there could also be cases where this patch makes us miss an error
> reported by a worker; suppose the worker error is reported and the
> worker exits after WaitForParallelWorkersToFinish does
> CHECK_FOR_INTERRUPTS() but before we reach GetBackgroundWorkerPid().
> The code will simply conclude that the worker is dead and stop
> waiting, never mind the fact that there's a pending error in the
> queue.

I have tried to reproduce this situation by adding an error case in
the worker (just before the worker returns success (the 'X' message))
and by setting a breakpoint in WaitForParallelWorkersToFinish.  What I
noticed is that the worker is not allowed to exit until it receives an
ack from the master (basically, the worker waits at SendProcSignal
after sending the message), and the error is reported properly.

> This is exactly why I made parallel workers send a Terminate
> message to the leader instead of just disappearing -- the leader needs
> to see that message to know that it's reached the end of what the
> worker is going to send.
>
> So, I think we need to regard fork failure and related failure modes
> as ERROR conditions.  It's not equivalent to failing to register the
> background worker.  When we fail to register the background worker,
> the parallel.c machinery knows at LaunchParallelWorkers() time that
> the worker will never show up and lets the calling code know by
> setting nworkers_launched, and the calling code can then choose to
> proceed with that number of workers or ERROR out as it chooses.  But
> once we decide on the value of nworkers_launched to report to callers,
> it's really not OK for workers to just silently not ever start, as the
> test case above shows.  We could make it the job of code that uses the
> ParallelContext machinery to cope with that situation, but I think
> that's a bad plan, because every user would have to worry about this
> obscure case.  Right now, the only in-core use of the ParallelContext
> machinery is Gather/Gather Merge, but there are pending patches for
> parallel index scan and parallel VACUUM, so we'll just end up adding
> this handling to more and more places.  And for what?  If you've got
> max_connections and/or max_parallel_workers set so high that your
> system can't fork(), you have a big problem, and forcing things that
> would have run using parallel query to run serially instead of
> erroring out is not going to fix it.  I think that deciding we're
> going to handle fork() failure by reporting an ERROR is just fine.

So, if I understand correctly, this means that it will return an error
if there is a problem in even one worker (exited or not started), even
though the other workers could easily have finished the work.  Won't
such an error lead to situations where, after running a query for an
hour, the user sees a failure just because one of the many workers did
not start (or failed in some similar way), even though the query could
have completed without it?  If so, that doesn't sound like sensible
behavior.

> So, here's a patch.  This patch keeps track of whether we've ever
> received a message from each worker via the error queue.  If there are
> no remaining workers which have neither exited cleanly nor sent us at
> least one message, then it calls GetBackgroundWorkerPid() for each
> one.  If any worker reports that it is BGWH_STOPPED, then it
> double-checks whether the worker has attached to the error queue.  If
> it did, then it assumes that we'll detect clean shutdown or an error
> on the next iteration and ignores the problem.  If not, then the
> worker died without ever attaching the error queue, so it throws an
> ERROR.  I tested that this causes the test case above to throw a
> proper ERROR when fork_process() returns -1.  Please review...
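(Schematically, the extra check described above seems to amount to
something like the following inside the wait loop of
WaitForParallelWorkersToFinish.  This is only a rough sketch
reconstructed from the description, not the patch itself; the
anyone_alive bookkeeping is abbreviated, and the shm_mq_get_queue()
call assumes a helper for getting at the queue behind the error-queue
handle.)

    /*
     * Rough sketch: if no worker is known to be alive (none has sent us
     * a message and not yet shut down cleanly), ask the postmaster
     * about each remaining worker.  A worker that is already
     * BGWH_STOPPED but never attached to its error queue must have died
     * before initializing, so report an ERROR instead of waiting
     * forever.
     */
    if (!anyone_alive)
    {
        for (i = 0; i < pcxt->nworkers_launched; ++i)
        {
            pid_t       pid;
            shm_mq     *mq;

            /* Worker already shut down cleanly; nothing to check. */
            if (pcxt->worker[i].error_mqh == NULL)
                continue;

            /* Still running (or not yet started)?  Keep waiting. */
            if (GetBackgroundWorkerPid(pcxt->worker[i].bgwhandle,
                                       &pid) != BGWH_STOPPED)
                continue;

            /*
             * The worker stopped.  If it attached to the error queue, we
             * will see its shutdown message or error on a later
             * iteration; if it never attached, it must have failed to
             * start, so error out.
             */
            mq = shm_mq_get_queue(pcxt->worker[i].error_mqh);
            if (shm_mq_get_sender(mq) == NULL)
                ereport(ERROR,
                        (errcode(ERRCODE_INTERNAL_ERROR),
                         errmsg("parallel worker failed to initialize")));
        }
    }

The key point seems to be that attachment to the error queue is used as
the signal that the worker got far enough that its exit, clean or
otherwise, will be visible through the normal message path.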
This also doesn't appear to be completely safe.  If we add proc_exit(1)
after attaching to the error queue (say, after pq_set_parallel_master)
in the worker, then it will lead to a *hang*, as anyone_alive will
still be false and, since it will find that the sender is set for the
error queue, it won't return any failure.  Now, I think even if we
check the worker status (which will be stopped) and break after the new
error condition, it won't work, as it will still return zero rows in
the case you reported above.

I think if, before assuming that we need not scan locally, we add a
check to see whether the workers have actually started, then my patch
should work fine and we don't need any additional checks.  Also, it
won't create any performance issue, as we will perform such a check
only when force_parallel_mode is not off or parallel leader
participation is off, which I think are not common cases.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com