On Wed, Dec 30, 2020 at 10:49 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > On Wed, 30 Dec 2020 at 10:47 AM, Bharath Rupireddy > <bharath.rupireddyforpostg...@gmail.com> wrote: >> >> On Wed, Dec 30, 2020 at 10:32 AM Dilip Kumar <dilipbal...@gmail.com> wrote: >> > I have completed reviewing 0001, I don't have more comments, just one >> > question. Soon I will review the remaining patches. >> >> Thanks. >> >> > + /* If parallel inserts are to be allowed, set a few extra >> > information. */ >> > + if (myState->is_parallel) >> > + { >> > + myState->object_id = intoRelationAddr.objectId; >> > + >> > + /* >> > + * We don't need to skip contacting FSM while inserting tuples for >> > + * parallel mode, while extending the relations, workers instead >> > of >> > + * blocking on a page while another worker is inserting, can >> > check the >> > + * FSM for another page that can accommodate the tuples. This >> > results >> > + * in major benefit for parallel inserts. >> > + */ >> > + myState->ti_options = 0; >> > >> > Is there any performance data for this or just theoretical analysis? >> >> I have seen that we don't get much performance with the skip fsm >> option, though I don't have the data to back it up. I'm planning to >> run performance tests after the patches 0001, 0002 and 0003 get >> reviewed. I will capture the data at that time. Hope that's fine. > > > Yeah that’s fine >
Some comments in 0002 1. +/* + * Information sent to the planner from CTAS to account for the cost + * calculations in cost_gather. We need to do this because, no tuples will be + * received by the Gather node if the workers insert the tuples in parallel. + */ +typedef enum CTASParallelInsertOpt +{ + CTAS_PARALLEL_INS_UNDEF = 0, /* undefined */ + CTAS_PARALLEL_INS_SELECT = 1 << 0, /* turn on this before planning */ + /* + * Turn on this while planning for upper Gather path to ignore parallel + * tuple cost in cost_gather. + */ + CTAS_PARALLEL_INS_TUP_COST_CAN_IGN = 1 << 1, + /* Turn on this after the cost is ignored. */ + CTAS_PARALLEL_INS_TUP_COST_IGNORED = 1 << 2 +} CTASParallelInsertOpt; I don't like the naming of these flags. Especially no need to define CTAS_PARALLEL_INS_UNDEF, we can directl use 0 for that purpose instead of giving some weird name. So I suggest first, just get rid of CTAS_PARALLEL_INS_UNDEF. 2. + /* + * Turn on a flag to ignore parallel tuple cost by the Gather path in + * cost_gather if the SELECT is for CTAS and we are generating an upper + * level Gather path. + */ + bool ignore = ignore_parallel_tuple_cost(root); + generate_useful_gather_paths(root, rel, false); + /* + * Reset the ignore flag, in case we turned it on but + * generate_useful_gather_paths returned without reaching cost_gather. + * If we reached cost_gather, we would have been reset it there. + */ + if (ignore && (root->parse->CTASParallelInsInfo & + CTAS_PARALLEL_INS_TUP_COST_CAN_IGN)) + { + root->parse->CTASParallelInsInfo &= + ~CTAS_PARALLEL_INS_TUP_COST_CAN_IGN; + } I think th way we are using these cost ignoring flag, doesn't look clean. I mean first, CTAS_PARALLEL_INS_SELECT is set if it is coming from CTAS and then ignore_parallel_tuple_cost will set the CTAS_PARALLEL_INS_TUP_COST_CAN_IGN if it satisfies certain condition which is fine. Now, internally cost gather will add CTAS_PARALLEL_INS_TUP_COST_IGNORED and remove CTAS_PARALLEL_INS_TUP_COST_CAN_IGN and if CTAS_PARALLEL_INS_TUP_COST_CAN_IGN is not removed then we will remove it outside. Why do we need to remove CTAS_PARALLEL_INS_TUP_COST_CAN_IGN flag at all? 3. + if (tuple_cost_flags && gstate->ps.ps_ProjInfo) + Assert(!(*tuple_cost_flags & CTAS_PARALLEL_INS_TUP_COST_IGNORED)); Instead of adding Assert inside an IF statement, you can convert whole statement as an assert. Lets not add unnecessary if in the release mode. 4. + if ((root->parse->CTASParallelInsInfo & CTAS_PARALLEL_INS_SELECT) && + (root->parse->CTASParallelInsInfo & + CTAS_PARALLEL_INS_TUP_COST_CAN_IGN)) + { + ignore_tuple_cost = true; + root->parse->CTASParallelInsInfo &= + ~CTAS_PARALLEL_INS_TUP_COST_CAN_IGN; + root->parse->CTASParallelInsInfo |= CTAS_PARALLEL_INS_TUP_COST_IGNORED; + } + + if (!ignore_tuple_cost) + run_cost += parallel_tuple_cost * path->path.rows; Changes this to (if, else) as shown below, because if it goes to the IF part then ignore_tuple_cost will always be true so no need to have an extra if check. if ((root->parse->CTASParallelInsInfo & CTAS_PARALLEL_INS_SELECT) && (root->parse->CTASParallelInsInfo & CTAS_PARALLEL_INS_TUP_COST_CAN_IGN)) { ignore_tuple_cost = true; root->parse->CTASParallelInsInfo &= ~CTAS_PARALLEL_INS_TUP_COST_CAN_IGN; root->parse->CTASParallelInsInfo |= CTAS_PARALLEL_INS_TUP_COST_IGNORED; } else run_cost += parallel_tuple_cost * path->path.rows; -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com