On Fri, Oct 23, 2015 at 10:33 AM, Robert Haas <robertmh...@gmail.com> wrote: > > + /* > + * We can't finish transaction commit or abort until all of the > + * workers are dead. This means, in particular, that > we can't respond > + * to interrupts at this stage. > + */ > + HOLD_INTERRUPTS(); > + status = > WaitForBackgroundWorkerShutdown(pcxt->worker[i].bgwhandle); > + RESUME_INTERRUPTS(); > > These comments are correct when this code is called from > DestroyParallelContext(), but they're flat wrong when called from > ReinitializeParallelDSM(). I suggest moving the comment back to > DestroyParallelContext and following it with this: > > HOLD_INTERRUPTS(); > WaitForParallelWorkersToDie(); > RESUME_INTERRUPTS(); > > Then ditch the HOLD/RESUME interrupts in WaitForParallelWorkersToDie() itself. >
Changed as per suggestion. > This hunk is a problem: > > case 'X': /* Terminate, > indicating clean exit */ > { > - pfree(pcxt->worker[i].bgwhandle); > pfree(pcxt->worker[i].error_mqh); > - pcxt->worker[i].bgwhandle = NULL; > pcxt->worker[i].error_mqh = NULL; > break; > } > > If you do that on receipt of the 'X' message, then > DestroyParallelContext() might SIGTERM a worker that has supposedly > exited cleanly. That seems bad. I think maybe the solution is to > make DestroyParallelContext() terminate the worker only if > pcxt->worker[i].error_mqh != NULL. Changed as per suggestion. > So make error_mqh == NULL mean a > clean loss of a worker: either we couldn't register it, or it exited > cleanly. And bgwhandle == NULL would mean it's actually gone. > I think even if error_mqh is NULL, it not guarnteed that the worker has exited, it ensures that clean worker shutdown is either in-progress or done. > It makes sense to have ExecShutdownGather and > ExecShutdownGatherWorkers, but couldn't the former call the latter > instead of duplicating the code? > makes sense, so changed accordingly. > I think ReInitialize should be capitalized as Reinitialize throughout. > Changed as per suggestion. > ExecParallelReInitializeTupleQueues is almost a cut-and-paste > duplicate of ExecParallelSetupTupleQueues. Please refactor this to > avoid duplication - e.g. change > ExecParallelSetupTupleQueues(ParallelContext *pcxt) to take a second > argument bool reinit. ExecParallelReInitializeTupleQueues can just do > ExecParallelSetupTupleQueues(pxct, true). > Changed as per suggestion. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
parallel_seqscan_partialseqscan_v23.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers