On Fri, Dec 4, 2015 at 3:07 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> Do you think it will be useful to display in a similar way if worker
> is not able to execute plan (like before it starts execution, the other
> workers have already finished the work)?

Maybe, but it would clutter the output a good deal. I think we should
instead have the Gather node itself display the number of workers that
it actually managed to launch, and then the user can infer that any
execution nodes that don't mention those workers were not executed by
them.
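To sketch what I have in mind (the labels and numbers here are invented,
not taken from any actual patch), the EXPLAIN ANALYZE output might end up
looking something like this:

    Gather  (actual time=... rows=... loops=1)
      Workers Launched: 3
      ->  Parallel Seq Scan on foo  (actual time=... rows=... loops=4)

If a node under the Gather only reports per-worker detail for two of
those three workers, you can conclude that the third one never got
around to executing it.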
> Other than that parallel-explain-v2.patch looks good.

OK, I'm going to go ahead and commit that part.

> I think the problem is at Gather node, the number of buffers (read + hit)
> are greater than the number of pages in relation. The reason why it
> is doing so is that in Workers (ParallelQueryMain()), it starts the buffer
> usage accumulation before ExecutorStart() whereas in master backend
> it always calculate it for ExecutorRun() phase, on changing it to accumulate
> for ExecutorRun() phase above problem is fixed. Attached patch fixes the
> problem.

Why is it a bad thing to capture the cost of doing ExecutorStart() in
the worker? I can see there's an argument that changing this would be
more consistent, but I'm not totally convinced. The overhead of
ExecutorStart() in the leader isn't attributable to any specific worker,
but the overhead of ExecutorStart() in the worker can fairly be blamed
on Gather, I think. I'm not dead set against this change, but I'm not
sold on it either. (There's a quick schematic of the two placements at
the end of this mail.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
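P.S. For anyone following along, here is roughly the placement
difference being discussed in the worker's ParallelQueryMain(). This is
a schematic from memory, not the actual patch, and the function names
and arguments may not match the tree exactly:

    /* As things stand: the worker starts accumulating before executor
     * startup, so its ExecutorStart() work shows up in the totals. */
    InstrStartParallelQuery();
    ExecutorStart(queryDesc, eflags);
    ExecutorRun(queryDesc, ForwardScanDirection, 0L);

    /* With the proposed change: only the ExecutorRun() phase is counted,
     * which matches what the leader accumulates for its own execution. */
    ExecutorStart(queryDesc, eflags);
    InstrStartParallelQuery();
    ExecutorRun(queryDesc, ForwardScanDirection, 0L);

    /* Either way, the worker hands its totals back through shared memory
     * once it is done. */
    InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber]);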