On Sat, Jul 28, 2018 at 2:14 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> We have done verification that the approach works and fixes the bug in
> all known cases. Do you see any problem with this approach?
Regarding the change to execParallel.c, I think if you're going to move that code you should add a comment explaining the justification for the new positioning, like:

/*
 * Prepare to track buffer usage during query execution.
 *
 * We do this after starting up the executor to match what happens in the
 * leader, which also doesn't count buffer accesses that occur during
 * executor startup.
 */

If you just move the code without adding a comment explaining why you moved it, the next person will wonder why they shouldn't just move it back when they hit some other problem. I also agree with Andres that this could be committed separately.

Regarding the change to execProcnode.c, "atleast" should be "at least". But more broadly, I don't think the comment is very clear about the justification for what we're doing, and Andres is right that duplicating the comment isn't helpful. I've attempted to improve this in the attached version.

Regarding Andres's comments about the changes to nodeLimit.c, I think he's right, but as you say, this isn't the first place to have this problem. The issue is that someone might be using cursor operations to fetch forward through the plan one tuple at a time, and then, after triggering the shutdown, start fetching backward. That doesn't matter so long as shutdown callbacks are only used for shutting down things related to parallel query, because you can't fetch backward from a Gather node or anything under it. But if we want to make broader use of shutdown callbacks -- and I think that would be a good idea -- then it's a problem. There's already a latent bug here, because custom scans and foreign scans are allowed to have shutdown callbacks (though postgres_fdw and file_fdw don't). Fixing that seems like a separate issue, and I think it would probably be OK to proceed with the change as you have it for now, but we ought to do something about it.
I'm not sure there's a problem with rewinding, as Andres suggests: if the node is entirely rescanned, I believe it's allowed to regenerate the output, and Gather (Merge) can do that by firing up a new set of workers. But scanning backwards is a problem. I'm not exactly sure what the best way of handling that is, but one thing I think might work is to save ExecutePlan's execute_once flag in the EState and then make the call in nodeLimit.c and the one in ExecutePlan itself conditional on that flag. If we know that the plan is only going to be executed once, then there can never be any backward fetches, and it's fine to shut down as soon as we finish going forward.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
fix-gather-stats-rmh.patch
Description: Binary data