On Fri, Dec 25, 2020 at 10:04 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Fri, Dec 25, 2020 at 9:54 AM Bharath Rupireddy
> <bharath.rupireddyforpostg...@gmail.com> wrote:
> >
> > On Fri, Dec 25, 2020 at 7:12 AM vignesh C <vignes...@gmail.com> wrote:
> > > On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila <amit.kapil...@gmail.com>
> > > wrote:
> > > >
> > > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C <vignes...@gmail.com> wrote:
> > > > >
> > > > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy
> > > > > <bharath.rupireddyforpostg...@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> > > > > > Attaching v14 patch set that has above changes. Please consider this
> > > > > > for further review.
> > > > > >
> > > > >
> > > > > Few comments:
> > > > > In the below case, should create be above Gather?
> > > > > postgres=# explain create table t7 as select * from t6;
> > > > > QUERY PLAN
> > > > > -------------------------------------------------------------------
> > > > > Gather (cost=0.00..9.17 rows=0 width=4)
> > > > >   Workers Planned: 2
> > > > >   -> Create t7
> > > > >     -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
> > > > > (4 rows)
> > > > >
> > > > > Can we change it to something like:
> > > > > -------------------------------------------------------------------
> > > > > Create t7
> > > > >   -> Gather (cost=0.00..9.17 rows=0 width=4)
> > > > >     Workers Planned: 2
> > > > >     -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
> > > > > (4 rows)
> > > > >
> > > >
> > > > I think it is better to have it in a way as in the current patch
> > > > because that reflects that we are performing insert/create below
> > > > Gather which is the purpose of this patch. I think this is similar to
> > > > what the Parallel Insert patch [1] has for a similar plan.
> > > >
> > > >
> > > > [1] - https://commitfest.postgresql.org/31/2844/
> > > >
> > >
> > > Also another thing that I felt was that actually the Gather nodes will
> > > actually do the insert operation, the Create table will be done earlier
> > > itself. Should we change Create table to Insert table something like
> > > below:
> > > QUERY PLAN
> > > -------------------------------------------------------------------
> > > Gather (cost=0.00..9.17 rows=0 width=4)
> > >   Workers Planned: 2
> > >   -> Insert table2 (instead of Create table2)
> > >     -> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4)
> >
> > IMO, showing Insert under Gather makes sense if the query is INSERT
> > INTO SELECT as it's in the other patch [1]. Since here it is a CTAS
> > query, so having Create under Gather looks fine to me. This way we can
> > also distinguish the EXPLAINs of parallel inserts in INSERT INTO
> > SELECT and CTAS.
> >
>
> Right, IIRC, we have done the way it is in the patch for convenience
> and to move forward with it and come back to it later once all other
> parts of the patch are good.
>
> > And also, some might wonder that Create under Gather means that each
> > parallel worker is creating the table, it's actually not the creation
> > of the table that's parallelized but it's insertion. If required, we
> > can clarify it in CTAS docs with a sample EXPLAIN. I have not yet
> > added docs related to allowing parallel inserts in CTAS. Shall I add a
> > para saying when parallel inserts can be picked and how the sample
> > EXPLAIN looks? Thoughts?
> >
>
> Yeah, I don't see any problem with it, and maybe we can move Explain
> related code to a separate patch. The reason is we don't display DDL
> part without parallelism and this might need a separate discussion.
>
This makes sense to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com