On Mon, Dec 7, 2020 at 5:25 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Mon, Dec 7, 2020 at 4:20 PM Bharath Rupireddy > <bharath.rupireddyforpostg...@gmail.com> wrote: > > > > On Mon, Dec 7, 2020 at 4:04 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > > > What is the need of checking query_level when 'isForCTAS' is set only > > > when Gather is a top-node? > > > > > > > isForCTAS is getting set before pg_plan_query() which is being used in > > cost_gather(). We will not have a Gather node by then and hence will > > not pass queryDesc to IsParallelInsertInCTASAllowed(into, NULL) while > > setting isForCTAS to true. > > > > IsParallelInsertInCTASAllowed() seems to be returning false if > queryDesc is NULL, so won't isForCTAS be always set to false? I think > I am missing something here. >
My bad. I utterly missed this, sorry for the confusion. My intention to have IsParallelInsertInCTASAllowed() is for two purposes. 1. when called before planning without queryDesc, it should return true if IS_CTAS(into) is true and is not a temporary table. 2. when called after planning with a non-null queryDesc, along with 1) checks, it should also perform the Gather State checks and return accordingly. I have corrected it in v9 patch. Please have a look. > > > > The isForCTAS will be true because [create table as], the > > > query_level is always 1 because there is no subquery. > > > So even if gather is not the top node, parallel cost will still be > > > ignored. > > > > > > Is that works as expected ? > > > > > > > I don't think that is expected and is not the case without this patch. > > The cost shouldn't be changed for existing cases where the write is > > not pushed to workers. > > > > Thanks for pointing that out. Yes it should not change for the cases > where parallel inserts will not be picked later. > > Any better suggestions on how to make the planner consider that the > CTAS might choose parallel inserts later at the same time avoiding the > above issue in case it doesn't? > I'm not quite sure how to address this. Can we not allow the planner to consider that the select is for CTAS and check only after the planning is done for the Gather node and other checks? This is simple to do, but we might miss some parallel plans for the SELECTs because the planner would have already considered the tuple transfer cost from workers to Gather wrongly because of which that parallel plan would have become costlier compared to non parallel plans. IMO, we can do this since it also keeps the existing behaviour of the planner i.e. when the planner is planning for SELECTs it doesn't know that it is doing it for CTAS. Thoughts? With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
v9-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch
Description: Binary data