On Sat, Jul 28, 2018 at 2:14 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> We have done verification that the approach works and fixes the bug in
> all known cases. Do you see any problem with this approach?
Regarding the change to execParallel.c, I think if you're going to move that code you should add a comment explaining the justification for the new positioning, like:

/*
 * Prepare to track buffer usage during query execution.
 *
 * We do this after starting up the executor to match what happens in the
 * leader, which also doesn't count buffer accesses that occur during
 * executor startup.
 */

If you just move the code without adding a comment explaining why you moved it, the next person will wonder why they shouldn't just move it back when they hit some other problem. I also agree with Andres that this could be committed separately.

Regarding the change to execProcnode.c, "atleast" should be "at least". But more broadly, I don't think the comment is very clear about the justification for what we're doing, and Andres is right that duplicating the comment isn't helpful. I've attempted to improve this in the attached version.

Regarding Andres's comments about the changes to nodeLimit.c, I think he's right, but as you say, this isn't the first place to have this problem. The issue is that someone might be using cursor operations to fetch forward through the plan one tuple at a time, and then, after triggering the shutdown, start fetching backward. That doesn't matter so long as shutdown callbacks are only used for shutting down things related to parallel query, because you can't fetch backward from a Gather node or anything under it. But if we want to make broader use of shutdown callbacks -- and I think that would be a good idea -- then it's a problem. There's already a latent bug here, because custom scans and foreign scans are allowed to have shutdown callbacks (though postgres_fdw and file_fdw don't). Fixing that seems like a separate issue, and I think it would probably be OK to proceed with the change as you have it for now, but we ought to do something about it.
I'm not sure there's a problem with rewinding, as Andres suggests: if the node is entirely rescanned, I believe it's allowed to regenerate the output, and Gather (Merge) can do that by firing up a new set of workers. But scanning backwards is a problem. I'm not exactly sure what the best way of handling that is, but one thing I think might work is to save ExecutePlan's execute_once flag in the EState and then make the call in nodeLimit.c and the one in ExecutePlan itself conditional on that flag. If we know that the plan is only going to be executed once, then there can never be any backward fetches, and it's fine to shut down as soon as we finish going forward.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
fix-gather-stats-rmh.patch
Description: Binary data