On 14.10.20 11:16, Bharath Rupireddy wrote:
On Tue, Oct 6, 2020 at 10:58 AM Amit Kapila <amit.kapil...@gmail.com> wrote:

Yes, we do a bunch of catalog changes related to the newly created table.
Both the txn id and the command id are assigned while those catalog
changes are being made. But right after the table is created in the
leader, the command id is incremented (CommandCounterIncrement() is
called from create_ctas_internal()) whereas the txn id remains the
same. The new command id is then obtained with GetCurrentCommandId(true)
in intorel_startup, after which parallel mode is entered. The txn id and
command id are serialized into the parallel DSM and are then available
to all parallel workers. This is discussed in [1].
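
To make that a bit more concrete, here is a rough sketch of the ordering
in the leader (the calls are existing ones; the placement shown here is
only illustrative, not the literal patch):

/* intorel_startup() -> create_ctas_internal(): create the target table,
 * then make it visible to this transaction. */
CommandCounterIncrement();

/* Back in intorel_startup(): take the command id the inserts will use,
 * marking it as used. */
myState->output_cid = GetCurrentCommandId(true);

/* The executor then enters parallel mode; when ExecGather() builds the
 * parallel DSM via ExecInitParallelPlan(), InitializeParallelDSM()
 * serializes the transaction state -- including the current txn id and
 * command id -- so every worker sees the same values. */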

A few changes I have to make in the parallel worker code: set
currentCommandIdUsed = true, maybe via a common API
SetCurrentCommandIdUsedForWorker() proposed in [1], and remove the
extra command id sharing from the leader to the workers.
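
For reference, a minimal sketch of that helper, assuming it sits in
access/transam/xact.c where currentCommandIdUsed is visible (the name is
from [1]; the body is just what I have in mind, not a final definition):

void
SetCurrentCommandIdUsedForWorker(void)
{
    /* Only a parallel worker should need to force this flag. */
    Assert(IsParallelWorker());

    currentCommandIdUsed = true;
}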

I will add a few comments in the upcoming patch related to the above info.


Yes, that would be good.


Added comments.


But how does that work for SELECT INTO? Are you prohibiting
that? ...


In the case of SELECT INTO, a new table gets created as well; I'm not
prohibiting parallel inserts there, and I don't think we need to.


So, in this case also, do we ensure that the table is created before we
launch the workers? If so, I think you can explain that in the comments,
along with what you need to do to ensure it.


For SELECT INTO, the table gets created by the leader in
create_ctas_internal(); then ExecInitParallelPlan() gets called, which
launches the workers, and then the leader (if asked to do so) and the
workers insert the rows. So, we don't need to do any extra work to
ensure the table gets created before the workers start inserting
tuples.
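
To spell the ordering out (a simplified call sequence, not the literal
code):

/*
 * ExecCreateTableAs()
 *   -> ExecutorStart() / ExecutorRun()
 *        -> intorel_startup() -> create_ctas_internal()   table created, leader only
 *        -> ExecutePlan() -> ExecGather()
 *             -> ExecInitParallelPlan() + LaunchParallelWorkers()
 *             -> leader (if allowed) and workers insert the tuples
 *
 * Since the workers are launched only from ExecGather(), the table
 * always exists before any worker tries to insert into it.
 */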


While skimming through the patch, a small thing I noticed:
+ /*
+ * SELECT part of the CTAS is parallelizable, so we can make
+ * each parallel worker insert the tuples produced by its
+ * execution into the target table.
+ */
+ if (!is_matview &&
+ IsA(plan->planTree, Gather))
+ ((DR_intorel *) dest)->is_parallel = true;
+

I am not sure at this stage if this is the best way to make CTAS
parallel, but if so, then you can probably expand the comment a bit to
say why you consider only the Gather node (and that too only when it is
the top-most node) and not another parallel node like GatherMerge.


If somebody expects the order of the tuples coming from a GatherMerge
node in the SELECT part of CTAS or SELECT INTO to be preserved while
inserting, that may no longer hold once parallel insertion is allowed,
i.e. the order in which the tuples are inserted may vary. I'm not quite
sure whether anyone uses ORDER BY in the SELECT part of CTAS or
SELECT INTO in a real-world use case. Thoughts?


Right, for now, I think you can simply remove that check from the code
instead of just commenting it out. We will see if there is a better
check/Assert we can add there.


Done.

I also worked on some of the open points I listed earlier in my mail.


3. Need to restrict parallel inserts if CTAS tries to create temp/global
tables, as the workers will not have access to those tables.


Done.
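
The check is essentially of the following shape (illustrative only; the
exact condition and its location in the patch may differ):

/*
 * Parallel workers cannot access the leader's temporary tables, so fall
 * back to serial insertion for those.
 */
if (into->rel->relpersistence == RELPERSISTENCE_TEMP)
    ((DR_intorel *) dest)->is_parallel = false;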


Need to analyze whether to allow parallelism if CTAS is used with
prepared statements or with the WITH NO DATA option.


For prepared statements, parallelism is not picked, and so neither is
parallel insertion.
For the CTAS ... WITH NO DATA case, the SELECT part is not even planned,
so parallelism is not picked there either.
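
That is, the WITH NO DATA case (guarded by into->skipData) takes a path
that only creates the relation and never reaches the planner, roughly:

if (into->skipData)
{
    /* just create the relation; no rewriting, planning or execution,
     * so the parallel-insert code is never reached */
}
else
{
    /* plan and run the SELECT; this is where parallelism can kick in */
}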


4. Need to stop unnecessary parallel shared state, such as the tuple
queue, from being created and shared with the workers.


Done.
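
The shape of that change, inside ExecInitParallelPlan(), is roughly the
following (the flag name here is made up; the real patch may arrange it
differently):

/*
 * When the workers insert the tuples themselves, there is no need to
 * set up tuple queues to ship rows back to the leader.
 */
if (parallel_insert_in_progress)
    pei->tqueue = NULL;
else
    pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);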

I'm listing the things that are still pending.

1. How to represent the parallel insert for CTAS in EXPLAIN plans?
EXPLAIN for CTAS shows the plan only for the SELECT part. How about
adding some textual info alongside the Gather node? I'm not quite sure
on this point; any suggestions are welcome.
2. Addition of new test cases: testing with more scenarios and
different data sets, sizes, tablespaces, and SELECT INTO, plus analysis
of the 2 mismatches in the write_parallel.sql regression test.

Attaching v2 patch, thoughts and comments are welcome.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


Hi,

Really looking forward to this ending up in postgres as I think it's a very nice improvement.

Whilst reviewing your patch I was wondering: is there a reason you did not introduce batch inserts in the DestReceiver for the CTAS? For me this makes a huge difference in ingest speed, as otherwise the inserts do not scale so well once lock contention starts to become a big problem. If you like, I can make a patch to introduce this on top?
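
To illustrate, here is a minimal sketch of a batched receive callback,
assuming a small slot buffer is added to DR_intorel (the buffered_slots
and nbuffered fields and the batch size are made up for the example;
table_multi_insert() is the existing table AM call). The leftover tuples
would of course have to be flushed in the shutdown callback as well:

#define CTAS_BATCH_SIZE 1000    /* arbitrary, would need tuning */

static bool
intorel_receive_batched(TupleTableSlot *slot, DestReceiver *self)
{
    DR_intorel *myState = (DR_intorel *) self;

    /* Buffer a copy of the incoming tuple in a slot owned by the receiver. */
    ExecCopySlot(myState->buffered_slots[myState->nbuffered], slot);
    myState->nbuffered++;

    /* Flush the whole batch with a single multi-insert call. */
    if (myState->nbuffered == CTAS_BATCH_SIZE)
    {
        table_multi_insert(myState->rel,
                           myState->buffered_slots,
                           myState->nbuffered,
                           myState->output_cid,
                           myState->ti_options,
                           myState->bistate);
        myState->nbuffered = 0;
    }

    return true;
}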

Kind regards,
Luc
Swarm64

