Re: Libpq support to connect to standby server as priority

2020-12-01 Thread Greg Nancarrow
On Thu, Nov 26, 2020 at 11:07 AM Greg Nancarrow  wrote:
>
> On Thu, Nov 26, 2020 at 3:43 AM Tom Lane  wrote:
> >
> > Thanks for looking!  Pushed.
> >
> > At this point the cfbot is going to start complaining that the last-posted
> > patch in this thread no longer applies.  Unless you have a new patch set
> > nearly ready to post, I think we should close the CF entry as RWF, and
> > then you can open a new one when you're ready.
> >
>
> Actually, the cfbot shouldn't be complaining, as my last-posted patch
> still seems to apply cleanly on the latest code (with your pushed
> patch), and all tests pass.
> Hopefully it's OK to let it roll over to the next CF with the WOA status.
> I am actively working on processing the feedback and updating the
> patch, and hope to post an update next week sometime.
>

Posting an updated set of patches.

Regards,
Greg Nancarrow
Fujitsu Australia


v20-0001-in_hot_standby-and-transaction_read_only-reportable-GUCs.patch
Description: Binary data


v20-0002-Enhance-libpq-target_session_attrs.patch
Description: Binary data


Re: On login trigger: take three

2020-12-03 Thread Greg Nancarrow
On Tue, Sep 15, 2020 at 2:12 AM Pavel Stehule  wrote:
>
>>
>> It is always possible to log in by disabling startup triggers using the
>> disable_session_start_trigger GUC:
>>
>> psql "dbname=postgres options='-c disable_session_start_trigger=true'"
>
>
> Sure, I know. But this behavior can be a very unpleasant surprise, and my
> question is whether it can be fixed. Requiring custom libpq connection options
> can be a show-stopper for people who use pgAdmin.
>

Hi,

I thought that in the case of pgAdmin (assuming you can connect as
superuser to a database, say the default "postgres" maintenance
database, that doesn't have an EVENT TRIGGER defined for the
session_start event), you could issue "ALTER SYSTEM SET
disable_session_start_trigger TO true;" and then reload the
configuration?

Anyway, I am wondering if this patch is still being actively developed/improved?

Regarding the last-posted patch, I'd like to give some feedback. I
found that the documentation part wouldn't build because of errors in
the SGML tags. There are some grammatical errors too, some minor
inconsistencies with the current documentation, and some descriptions
that could be improved. I think that a colon separator should be added
to the NOTICE message for superusers, so it's clear exactly where the
text of the underlying error message starts. Also, I think that
"client_connection" is perhaps a better and more intuitive event name
than "session_start" or the suggested "user_backend_start".
I've therefore attached an updated patch with these suggested minor
improvements; please take a look and see what you think (and please
compare it with the original patch).

Regards,
Greg Nancarrow
Fujitsu Australia


on_connect_event_trigger_WITH_SUGGESTED_UPDATES.patch
Description: Binary data


Re: On login trigger: take three

2020-12-07 Thread Greg Nancarrow
On Fri, Dec 4, 2020 at 9:05 PM Konstantin Knizhnik
 wrote:
>
> As far as I understand, Pavel's concern was about the case when a superuser
> defines a wrong login trigger which prevents login to the system for
> all users, including himself. Right now the solution to this problem is to
> include "options='-c disable_session_start_trigger=true'" in the connection
> string.
> I do not know if that can be done with pgAdmin.

As an event trigger is tied to a particular database, and a GUC is
global to the cluster, as long as there is one database in the cluster
for which an event trigger for the "client_connection" event is NOT
defined (say the default "postgres" maintenance database), the
superuser can always connect to that database, issue "ALTER SYSTEM SET
disable_client_connection_trigger TO true" and reload the
configuration. I tested this with pgAdmin4 and it worked fine for me,
allowing login to a database for which login was previously prevented
by a badly-defined logon trigger.

Pavel, is this an acceptable solution or do you still see problems
with this approach?


Regards,
Greg Nancarrow
Fujitsu Australia




Re: On login trigger: take three

2020-12-09 Thread Greg Nancarrow
On Tue, Dec 8, 2020 at 3:26 PM Pavel Stehule  wrote:
>
>
> There are maybe two generic questions:
>
> 1. Maybe we can introduce a more generic GUC for all event triggers, like
> disable_event_triggers? This GUC could be checked only by the database owner or
> superuser. It would be an alternative to ALTER TABLE ... DISABLE TRIGGER ALL, and
> protection against the necessity of restarting in single-user mode to repair a
> broken event trigger. I think a more generic solution is better than a special
> disable_client_connection_trigger GUC.
>
> 2. I have no objection to client_connection. It is probably better for
> the mentioned purpose - the possibility to block connections to a database. It
> could be interesting - and I am not sure how much work it would be - to introduce
> a second event, session_start. This event would be fired after connecting - so an
> exception there wouldn't block the connection - and would also be fired after a
> new statement "DISCARD SESSION", which would be run automatically after
> DISCARD ALL.  This feature need not be implemented in the first step, but it
> can be a plan for supporting pooled connections.
>

I've created a separate patch to address question (1), rather than
include it in the main patch, which I've adjusted accordingly. I'll
leave question (2) until another time, as you suggest.
See the attached patches.

Regards,
Greg Nancarrow
Fujitsu Australia


v1-0001-Add-new-config-parameter-disable_event_triggers.patch
Description: Binary data


v1-0002-Add-new-client_connection-event-supporting-a-logon-trigger.patch
Description: Binary data


Re: Parallel INSERT (INTO ... SELECT ...)

2020-12-09 Thread Greg Nancarrow
On Thu, Dec 10, 2020 at 3:50 PM vignesh C  wrote:
> Few comments:
> +   Node   *index_expr;
> +
> +   if (index_expr_item == NULL)  /* shouldn't happen */
> +   elog(ERROR, "too few entries in indexprs list");
> +
> +   index_expr = (Node *) lfirst(index_expr_item);
>
> We can change this elog to the below, to maintain consistency:
> if (index_expr_item == NULL)  /* shouldn't happen */
> {
>   context->max_hazard = PROPARALLEL_UNSAFE;
>   return context->max_hazard;
> }
>

Thanks. I think you pointed out something similar to this before, but
somehow I must have missed updating this as well (I was just following
the existing error handling for this case in the Postgres code).
I'll update it as you suggest in the next version of the patch I post.

> static HeapTuple
> heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
> CommandId cid, int options)
> {
> /*
> * To allow parallel inserts, we need to ensure that they are safe to be
> * performed in workers. We have the infrastructure to allow parallel
> * inserts in general except for the cases where inserts generate a new
> * CommandId (eg. inserts into a table having a foreign key column).
> */
> I felt we could remove the above comment, or maybe rephrase it.
>

That is Amit's comment, and I'm reluctant to change it because it is
still applicable even after application of this patch.
Amit has previously suggested that I add an Assert here, to match the
comment (to replace the original Parallel-worker error-check that I
removed), so I am looking into that.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2020-12-09 Thread Greg Nancarrow
On Thu, Dec 10, 2020 at 1:23 PM tsunakawa.ta...@fujitsu.com
 wrote:
>
> From: Greg Nancarrow 
> > Firstly, in order to perform parallel-safety checks in the case of 
> > partitions, the
> > patch currently recursively locks/unlocks
> > (AccessShareLock) each partition during such checks (as each partition may
> > itself be a partitioned table). Is there a better way of performing the
> > parallel-safety checks and reducing the locking requirements?
>
> First of all, as you demonstrated the planning time and execution time of
> parallel insert, I think the increased planning time is negligible when
> parallel insert is intentionally used for loading large amounts of data.
> However, it's a problem if the overhead is imposed on OLTP transactions.
> Does the overhead occur with the default values of
> max_parallel_workers_per_gather = 2 and max_parallel_workers = 8?
>
> To avoid this heavy checking during planning, I'm wondering if we can have an
> attribute in pg_class, something like relhasindexes and relhastriggers.  The
> concerning point is that we would have to maintain the accuracy of the value
> when dropping ancillary objects around the table/partition.
>

Having the information in another table that needs to be accessed would
likely also incur locking requirements.
Here the issue is specifically with partitions, because otherwise if
the target relation is not a partitioned table, it will already be
locked prior to planning as part of the parse/re-write phase (and you
will notice that the initial lock-mode, used by the parallel-safety
checking code for opening the table, is NoLock).

>
> > Secondly, I found that when running "make check-world", the
> > "partition-concurrent-attach" test fails, because it is expecting a 
> > partition
> > constraint to be violated on insert, while an "alter table attach partition 
> > ..." is
> > concurrently being executed in another transaction. Because of the partition
> > locking done by the patch's parallel-safety checking code, the insert 
> > blocks on
> > the exclusive lock held by the "alter table" in the other transaction until 
> > the
> > transaction ends, so the insert ends up successfully completing (and thus 
> > fails
> > the test) when the other transaction ends. To overcome this test failure, 
> > the
> > patch code was updated to instead perform a conditional lock on the 
> > partition,
> > and on failure (i.e. because of an exclusive lock held somewhere else), just
> > assume it's parallel-unsafe because the parallel-safety can't be determined
> > without blocking on the lock. This is not ideal, but I'm not sure of what 
> > other
> > approach could be used and I am somewhat reluctant to change that test. If
> > anybody is familiar with the "partition-concurrent-attach" test, any ideas 
> > or
> > insights would be appreciated.
>
> That test looks sane.  I think what we should do is to disable parallel 
> operation during that test.  It looks like some of other existing test cases 
> disable parallel query by setting max_parallel_workers_per_gather to 0.  It's 
> not strange that some tests fail with some configuration.  autovacuum is 
> disabled in many places of the regression test.
>
> Rather, I don't think we should introduce the trick to use 
> ConditionalLockAcquire().  Otherwise, the insert would be executed in a 
> serial fashion without the user knowing it -- "What?  The insert suddenly 
> slowed down multiple times today, and it didn't finish within the planned 
> maintenance window.  What's wrong?"
>
>

I think that's probably the best idea, to disable parallel operation
during that test.
However, that doesn't change the fact that, after removal of that
"trick", the partition locking used by the parallel-safety checking
code will block if a concurrent transaction has exclusively locked
that partition (as in this test case), so there is no guarantee that a
parallel insert will execute faster than serial execution (since such
locks tend to be held until the end of the transaction).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2020-12-10 Thread Greg Nancarrow
On Thu, Dec 10, 2020 at 5:25 PM Dilip Kumar  wrote:
>
>
>  /*
> + * PrepareParallelMode
> + *
> + * Prepare for entering parallel mode, based on command-type.
> + */
> +void
> +PrepareParallelMode(CmdType commandType)
> +{
> + Assert(!IsInParallelMode() || force_parallel_mode != FORCE_PARALLEL_OFF);
> +
> + if (IsModifySupportedInParallelMode(commandType))
> + {
> + /*
> + * Prepare for entering parallel mode by assigning a
> + * FullTransactionId, to be included in the transaction state that is
> + * serialized in the parallel DSM.
> + */
> + (void) GetCurrentTransactionId();
> + }
> +}
>
> Why do we need to serialize the transaction ID for 0001?  I mean, in
> 0001 we are just allowing the SELECT to be executed in parallel, so why
> would we need the transaction ID for that?  I agree that we would need
> this once we try to perform the Insert also from the workers in the
> remaining patches.
>

There's a very good reason. It's related to the parallel-mode checks
for Insert, and how the XID is lazily acquired when required.
When allowing the SELECT to be executed in parallel, we're in
parallel mode while the leader interleaves Inserts with retrieval of
tuple data from the workers.
You will notice that heap_insert() calls GetCurrentTransactionId() as
the very first thing it does. If the FullTransactionId is not valid,
AssignTransactionId() is then called, which executes this code:

/*
 * Workers synchronize transaction state at the beginning of each parallel
 * operation, so we can't account for new XIDs at this point.
 */
if (IsInParallelMode() || IsParallelWorker())
elog(ERROR, "cannot assign XIDs during a parallel operation");

So that code (currently) has no way of knowing whether an XID is being
(lazily) assigned at the beginning of, or somewhere in the middle of,
a parallel operation.
This is the reason why PrepareParallelMode() calls
GetCurrentTransactionId() up-front: to ensure a FullTransactionId is
assigned prior to entering parallel mode (so that no XID assignment is
attempted later).

If you remove the GetCurrentTransactionId() call from
PrepareParallelMode() and run "make installcheck-world" with
"force_parallel_mode=regress" in effect, many tests will fail with:
ERROR:  cannot assign XIDs during a parallel operation

Regards,
Greg Nancarrow
Fujitsu Australia




Re: libpq debug log

2020-12-15 Thread Greg Nancarrow
On Tue, Oct 27, 2020 at 3:23 AM Alvaro Herrera  wrote:
>
> I've been hacking at this patch again.  There were a few things I wasn't
> too clear about, so I reordered the code and renamed the routines to try
> to make it easier to follow.
>

Hi,

Hopefully Iwata-san will return to looking at this soon.

I have tried the latest patch a little (using "psql" as my client),
and have a few small comments so far:

- In pqTraceInit(), in the (admittedly rare) case that the fe_msg
malloc() fails, I'd NULL out be_msg too after free(), rather than
leave it dangling (because if pqTraceInit() were ever invoked again,
as the comment says it could be, previously-freed memory would end up
being accessed ...)

conn->fe_msg = malloc(MAX_FRONTEND_MSGS * sizeof(pqFrontendMessage));
if (conn->fe_msg == NULL)
{
free(conn->be_msg);
conn->be_msg = NULL;
return false;
}

- >3. COPY ... (FORMAT BINARY) emits "invalid protocol" ... not good.

That seemed to happen for me only if COPYing binary format to stdout.

UnknownCommand :::Invalid Protocol

- >5. Error messages are still printing the terminating zero byte.  I
  >suggest that it should be suppressed.

Perhaps there's a more elegant way of doing it, but I got rid of the
logging of the zero byte using the following change to
pqLogMsgByte1(), though there still seems to be a trailing-space
issue:

case LOG_CONTENTS:
-   fprintf(conn->Pfdebug, "%c ", v);
+   if (v != '\0')
+   fprintf(conn->Pfdebug, "%c ", v);
    pqTraceMaybeBreakLine(sizeof(v), conn);
break;

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-11 Thread Greg Nancarrow
On Fri, Feb 12, 2021 at 2:33 PM Zhihong Yu  wrote:
>
> For v17-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch :
>
> +   /* Assume original queries have hasModifyingCTE set correctly */
> +   if (parsetree->hasModifyingCTE)
> +   hasModifyingCTE = true;
>
> Since hasModifyingCTE is false by the time the above is run, it can be 
> simplified as:
> hasModifyingCTE = parsetree->hasModifyingCTE
>

Actually, we should just return parsetree->hasModifyingCTE at this
point, because if it's false we shouldn't need to continue the search
(as we're assuming it has been set correctly in the QSRC_ORIGINAL case).

> +   if (!hasSubQuery)
> +   return false;
> +
> +   return true;
>
> The above can be simplified as:
> return hasSubQuery;
>

Yes, absolutely right, silly miss on that one!
Thanks.

This was only ever meant to be a temporary fix for this bug that
affects this patch.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-11 Thread Greg Nancarrow
On Fri, Feb 12, 2021 at 3:21 PM Zhihong Yu  wrote:
>
> Greg:
> bq. we should just return parsetree->hasModifyingCTE at this point,
>
> Maybe you can clarify a bit.
> The if (parsetree->hasModifyingCTE) check is followed by if 
> (!hasModifyingCTE).
> When parsetree->hasModifyingCTE is false, !hasModifyingCTE would be true, 
> resulting in the execution of the if (!hasModifyingCTE) block.
>
> In your reply, did you mean that the if (!hasModifyingCTE) block is no longer
> needed? (I guess not)
>

Sorry for not making it clear. What I meant was that instead of:

if (parsetree->querySource == QSRC_ORIGINAL)
{
  /* Assume original queries have hasModifyingCTE set correctly */
  if (parsetree->hasModifyingCTE)
hasModifyingCTE = true;
}

I thought I should be able to use the following (if the setting for
QSRC_ORIGINAL can really be trusted):

if (parsetree->querySource == QSRC_ORIGINAL)
{
  /* Assume original queries have hasModifyingCTE set correctly */
  return parsetree->hasModifyingCTE;
}

(and then the "if (!hasModifyingCTE)" test on the code following
immediately below it can be removed)


BUT - after testing that change, the problem test case (in the "with"
tests) STILL fails.
I then checked whether hasModifyingCTE is always false in the
QSRC_ORIGINAL case (by adding an Assert), and it always is.
So actually, there is no point in having the "if
(parsetree->querySource == QSRC_ORIGINAL)" code block - even the
so-called "original" query doesn't maintain the setting correctly
(even though the actual original query sent into the query rewriter
does).
And then the "if (!hasModifyingCTE)" test in the code immediately
below it can also be removed.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-12 Thread Greg Nancarrow
On Fri, Feb 12, 2021 at 5:30 PM tsunakawa.ta...@fujitsu.com
 wrote:
>
> > If I disable parallel_leader_participation:
> >
> > For max_parallel_workers_per_gather = 4, it still shows performance
> > degradation.
> >
> > For max_parallel_workers_per_gather = 2, the performance degradation does
> > not happen in most cases.
> > There is sometimes some noise (performance degradation), but most
> > results (about 80%) are good.
>
> Thank you.  The results indicate that it depends on the degree of parallelism
> whether the gain from parallelism outweighs the loss from parallel insert
> operations, at least in the bitmap scan case.
>

That seems to be the pattern for this particular query, but I think
we'd need to test a variety to determine if that's always the case.

> But can we conclude that this is limited to bitmap scan?  Even if that's the 
> case, the planner does not have information about insert operation to choose 
> other plans like serial execution or parallel sequential scan.  Should we 
> encourage the user in the manual to tune parameters and find the fastest plan?
>
>

It's all based on path costs, so we need to analyze and compare the
costing calculations done in this particular case against other cases,
and the values of the various parameters (costsize.c).
It's not difficult to determine whether a parallel ModifyTablePath has
a BitmapHeapPath subpath - perhaps total_cost needs to be adjusted
(increased) for this case - and that will influence the planner to
choose a cheaper path. I was able to easily test the effect of doing
this in the debugger: by increasing total_cost in cost_modifytable()
for the parallel bitmap heap scan case, the planner then chose a
serial Insert + bitmap heap scan, because it then had a cheaper cost.
Of course, we need to better understand the problem and the observed
patterns in order to get a better feel for how those costs should be
adjusted.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-14 Thread Greg Nancarrow
On Sat, Feb 13, 2021 at 12:17 AM Amit Langote  wrote:
>
> On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow  wrote:
> > On Thu, Feb 11, 2021 at 5:33 PM Greg Nancarrow  wrote:
> > > On Tue, Feb 9, 2021 at 1:04 AM Amit Langote  
> > > wrote:
> > > >
> > > > * I think that the concerns raised by Tsunakawa-san in:
> > > >
> > > > https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com
> > > >
> > > > regarding how this interacts with plancache.c deserve a look.
> > > > Specifically, a plan that uses parallel insert may fail to be
> > > > invalidated when partitions are altered directly (that is without
> > > > altering their root parent).  That would be because we are not adding
> > > > partition OIDs to PlannerGlobal.invalItems despite making a plan
> > > > that's based on checking their properties.  See this (tested with all
> > > > patches applied!):
> > > >
> > >
> > > Does any current Postgres code add partition OIDs to
> > > PlannerGlobal.invalItems for a similar reason?
>
> Currently, the planner opens partitions only for SELECT queries and
> also adds them to the query's range table.  And because they are added
> to the range table, their OIDs do get added to
> PlannerGlobal.relationOids (not invalItems, sorry!) by way of
> CompleteCachedPlan() calling extract_query_dependencies(), which looks
> at Query.rtable to decide which tables/partitions to add.
>
> > > I would have thought that, for example,  partitions with a default
> > > column expression, using a function that is changed from SAFE to
> > > UNSAFE, would suffer the same plancache issue (for current parallel
> > > SELECT functionality) as we're talking about here - but so far I
> > > haven't seen any code handling this.
>
> AFAIK, default column expressions don't affect plans for SELECT
> queries.  OTOH, consider a check constraint expression as an example.
> The planner may use one to exclude a partition from the plan with its
> constraint exclusion algorithm (separate from "partition pruning").
> If the check constraint is dropped, any cached plans that used it will
> be invalidated.
>

Sorry, I got that wrong, default column expressions are relevant to
INSERT, not SELECT.

> >
> > Actually, I tried adding the following in the loop that checks the
> > parallel-safety of each partition and it seemed to work:
> >
> > glob->relationOids =
> > lappend_oid(glob->relationOids, pdesc->oids[i]);
> >
> > Can you confirm, is that what you were referring to?
>
> Right.  I had mistakenly mentioned PlannerGlobal.invalItems, sorry.
>
> Although it gets the job done, I'm not sure if manipulating
> relationOids from max_parallel_hazard() or its subroutines is okay,
> but I will let the committer decide that.  As I mentioned above, the
> person who designed this decided for some reason that it is
> extract_query_dependencies()'s job to populate
> PlannerGlobal.relationOids/invalItems.
>

Yes, it doesn't really seem right doing it within max_parallel_hazard().
I tried doing it in extract_query_dependencies() instead - see
attached patch - and it seems to work, but I'm not sure if there might
be any unintended side-effects.

Regards,
Greg Nancarrow
Fujitsu Australia


setrefs.patch
Description: Binary data


Re: Libpq support to connect to standby server as priority

2021-02-15 Thread Greg Nancarrow
On Fri, Feb 12, 2021 at 2:42 PM vignesh C  wrote:
> > Thanks, just one minor thing I missed in doc/src/sgml/libpq.sgml.
> >
> > +The support of read-write transactions is determined by the value of the
> > +default_transaction_read_only and
> > +in_hot_standby configuration parameters, that is
> > +reported by the server (if supported) upon successful connection.  If
> >
> > should be:
> >
> > +The support of read-write transactions is determined by the values of the
> > +default_transaction_read_only and
> > +in_hot_standby configuration parameters, that are
> > +reported by the server (if supported) upon successful connection.  If
> >
> > (i.e. "value" -> "values" and "is" -> "are")
>
> Thanks for the comments, this is handled in the v23 patch attached.
> Thoughts?
>

I've marked this as "Ready for Committer".

(and also added you to the author list)


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-16 Thread Greg Nancarrow
=


test=# set max_parallel_workers_per_gather=4;
SET
test=# explain analyze verbose insert into testscan_index select a from x where a<8 or (a%2=0 and a>19990);
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=4193.29..1255440.94 rows=74267 width=0) (actual time=560.460..562.386 rows=0 loops=1)
   Workers Planned: 4
   Workers Launched: 4
   ->  Insert on public.testscan_index  (cost=3193.29..1247014.24 rows=0 width=0) (actual time=553.434..553.435 rows=0 loops=5)
         Worker 0:  actual time=548.751..548.752 rows=0 loops=1
         Worker 1:  actual time=552.008..552.009 rows=0 loops=1
         Worker 2:  actual time=553.094..553.095 rows=0 loops=1
         Worker 3:  actual time=553.389..553.390 rows=0 loops=1
         ->  Parallel Bitmap Heap Scan on public.x  (cost=3193.29..1247014.24 rows=18567 width=8) (actual time=13.759..34.487 rows=26000 loops=5)
               Output: x.a, NULL::integer
               Recheck Cond: ((x.a < 8) OR (x.a > 19990))
               Filter: ((x.a < 8) OR (((x.a % 2) = 0) AND (x.a > 19990)))
               Rows Removed by Filter: 1
               Heap Blocks: exact=183
               Worker 0:  actual time=8.698..29.924 rows=26173 loops=1
               Worker 1:  actual time=12.865..33.889 rows=27421 loops=1
               Worker 2:  actual time=13.088..32.823 rows=24591 loops=1
               Worker 3:  actual time=14.075..36.349 rows=26571 loops=1
               ->  BitmapOr  (cost=3193.29..3193.29 rows=170535 width=0) (actual time=19.356..19.357 rows=0 loops=1)
                     ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1365.94 rows=73783 width=0) (actual time=10.330..10.330 rows=7 loops=1)
                           Index Cond: (x.a < 8)
                     ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1790.21 rows=96752 width=0) (actual time=9.024..9.024 rows=10 loops=1)
                           Index Cond: (x.a > 19990)
 Planning Time: 0.219 ms
 Execution Time: 562.442 ms
(25 rows)



test=# set max_parallel_workers_per_gather=0;
SET
test=# truncate testscan_index;
TRUNCATE TABLE
test=# explain analyze verbose insert into testscan_index select a from x where a<8 or (a%2=0 and a>19990);
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Insert on public.testscan_index  (cost=3193.29..3625636.35 rows=0 width=0) (actual time=607.619..607.621 rows=0 loops=1)
   ->  Bitmap Heap Scan on public.x  (cost=3193.29..3625636.35 rows=74267 width=8) (actual time=21.001..96.283 rows=12 loops=1)
         Output: x.a, NULL::integer
         Recheck Cond: ((x.a < 8) OR (x.a > 19990))
         Filter: ((x.a < 8) OR (((x.a % 2) = 0) AND (x.a > 19990)))
         Rows Removed by Filter: 5
         Heap Blocks: exact=975
         ->  BitmapOr  (cost=3193.29..3193.29 rows=170535 width=0) (actual time=20.690..20.691 rows=0 loops=1)
               ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1365.94 rows=73783 width=0) (actual time=9.097..9.097 rows=7 loops=1)
                     Index Cond: (x.a < 8)
               ->  Bitmap Index Scan on x_a_idx  (cost=0.00..1790.21 rows=96752 width=0) (actual time=11.591..11.591 rows=10 loops=1)
                     Index Cond: (x.a > 19990)
 Planning Time: 0.205 ms
 Execution Time: 607.734 ms
(14 rows)


Even when I changed the queries to return more rows from the scan, to
the point where the planner chose not to use a parallel INSERT with
bitmap heap scan (in favour of a parallel seq scan), and I then forced
it to by disabling seqscan, I found that it was still at least as fast
as a serial INSERT with bitmap heap scan.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-16 Thread Greg Nancarrow
On Wed, Feb 17, 2021 at 12:19 AM Amit Langote  wrote:
>
> On Mon, Feb 15, 2021 at 4:39 PM Greg Nancarrow  wrote:
> > On Sat, Feb 13, 2021 at 12:17 AM Amit Langote  
> > wrote:
> > > On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow  
> > > wrote:
> > > > Actually, I tried adding the following in the loop that checks the
> > > > parallel-safety of each partition and it seemed to work:
> > > >
> > > > glob->relationOids =
> > > > lappend_oid(glob->relationOids, pdesc->oids[i]);
> > > >
> > > > Can you confirm, is that what you were referring to?
> > >
> > > Right.  I had mistakenly mentioned PlannerGlobal.invalItems, sorry.
> > >
> > > Although it gets the job done, I'm not sure if manipulating
> > > relationOids from max_parallel_hazard() or its subroutines is okay,
> > > but I will let the committer decide that.  As I mentioned above, the
> > > person who designed this decided for some reason that it is
> > > extract_query_dependencies()'s job to populate
> > > PlannerGlobal.relationOids/invalItems.
> >
> > Yes, it doesn't really seem right doing it within max_parallel_hazard().
> > I tried doing it in extract_query_dependencies() instead - see
> > attached patch - and it seems to work, but I'm not sure if there might
> > be any unintended side-effects.
>
> One issue I see with the patch is that it fails to consider
> multi-level partitioning, because it's looking up partitions only in
> the target table's PartitionDesc and no other.
>
> @@ -3060,8 +3066,36 @@ extract_query_dependencies_walker(Node *node,
> PlannerInfo *context)
> RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
>
> if (rte->rtekind == RTE_RELATION)
> -   context->glob->relationOids =
> -   lappend_oid(context->glob->relationOids, rte->relid);
> +   {
> +   PlannerGlobal   *glob;
> +
> +   glob = context->glob;
> +   glob->relationOids =
> +   lappend_oid(glob->relationOids, rte->relid);
> +   if (query->commandType == CMD_INSERT &&
> +   rte->relkind == RELKIND_PARTITIONED_TABLE)
>
> The RTE whose relkind is being checked here may not be the INSERT
> target relation's RTE, even though that's perhaps always true today.
> So, I suggest to pull the new block out of the loop over rtable and
> perform its deeds on the result RTE explicitly fetched using
> rt_fetch(), preferably using a separate recursive function.  I'm
> thinking something like the attached revised version.
>
>

Thanks. Yes, I'd forgotten about the fact a partition may itself be
partitioned, so it needs to be recursive (like in the parallel-safety
checks).
Your revised version seems OK, though I do have a concern:
Is the use of "table_close(rel, NoLock)" intentional? That will keep
the lock (lockmode) until end-of-transaction.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-17 Thread Greg Nancarrow
On Thu, Feb 18, 2021 at 12:34 AM Amit Langote  wrote:
>
> > Your revised version seems OK, though I do have a concern:
> > Is the use of "table_close(rel, NoLock)" intentional? That will keep
> > the lock (lockmode) until end-of-transaction.
>
> I think we always keep any locks on relations that are involved in a
> plan until end-of-transaction.  What if a partition is changed in an
> unsafe manner between being considered safe for parallel insertion and
> actually performing the parallel insert?
>
> BTW, I just noticed that exctract_query_dependencies() runs on a
> rewritten, but not-yet-planned query tree, that is, I didn't know that
> extract_query_dependencies() only populates the CachedPlanSource's
> relationOids and not CachedPlan's.  The former is only for tracking
> the dependencies of an unplanned Query, so partitions should never be
> added to it.  Instead, they should be added to
> PlannedStmt.relationOids (note PlannedStmt belongs to CachedPlan),
> which is kind of what your earlier patch did.  Needless to say,
> PlanCacheRelCallback checks both CachedPlanSource.relationOids and
> PlannedStmt.relationOids, so if it receives a message about a
> partition, its OID is matched from the latter.
>
> All that is to say that we should move our code to add partition OIDs
> as plan dependencies to somewhere in set_plan_references(), which
> otherwise populates PlannedStmt.relationOids.  I updated the patch to
> do that.

OK, understood. Thanks for the detailed explanation.

> It also occurred to me that we can avoid pointless adding of
> partitions if the final plan won't use parallelism.  For that, the
> patch adds checking glob->parallelModeNeeded, which seems to do the
> trick though I don't know if that's the correct way of doing that.
>

I'm not sure it's pointless adding partitions even in the case of NOT
using parallelism, because we may be relying on the result of
parallel-safety checks on partitions in both cases.
For example, insert_parallel.sql currently includes a test (that you
originally provided in a previous post) that checks a non-parallel
plan is generated after a parallel-unsafe trigger is created on a
partition involved in the INSERT.
If I further add to that test by then dropping that trigger and then
again using EXPLAIN to see what plan is generated, then I'd expect a
parallel-plan to be generated, but with the setrefs-v3.patch it still
generates a non-parallel plan. So I think the "&&
glob->parallelModeNeeded" part of the test needs to be removed.
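For reference, a minimal sketch of the kind of test extension being
described might look like the following (object names are hypothetical,
not the actual ones from insert_parallel.sql, and this assumes the
parallel-INSERT patch is applied):

```sql
-- Illustrative sketch only; object names are hypothetical.
create table ptab (a int) partition by range (a);
create table ptab1 partition of ptab for values from (0) to (100);
create function tfunc () returns trigger language plpgsql parallel unsafe
    as $$ begin return new; end; $$;
create trigger ptab1_tr before insert on ptab1
    for each row execute function tfunc();

-- Expect a non-parallel plan, due to the parallel-unsafe trigger:
explain (costs off) insert into ptab select i from generate_series(0, 99) i;

drop trigger ptab1_tr on ptab1;

-- Now a parallel plan should again be possible (the point being made above):
explain (costs off) insert into ptab select i from generate_series(0, 99) i;
```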


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-18 Thread Greg Nancarrow
On Thu, Feb 18, 2021 at 4:35 PM Amit Langote  wrote:
>
> Looking at this again, I am a bit concerned about going over the whole
> partition tree *twice* when making a parallel plan for insert into
> partitioned tables.  Maybe we should do what you did in your first
> attempt a slightly differently -- add partition OIDs during the
> max_parallel_hazard() initiated scan of the partition tree as you did.
> Instead of adding them directly to PlannerGlobal.relationOids, add to,
> say, PlannerInfo.targetPartitionOids and have set_plan_references() do
> list_concat(glob->relationOids, list_copy(root->targetPartitionOids)
> in the same place as setrefs-v3 does
> add_target_partition_oids_recurse().  Thoughts?
>

Agreed, that might be a better approach, and that way we're also only
recording the partition OIDs that the parallel-safety checks are
relying on.
I'll give it a go and see if I can detect any issues with this method.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-21 Thread Greg Nancarrow
On Fri, Feb 19, 2021 at 9:38 PM Amit Kapila  wrote:
>
> On Thu, Feb 18, 2021 at 11:05 AM Amit Langote  wrote:
> >
> > > > It also occurred to me that we can avoid pointless adding of
> > > > partitions if the final plan won't use parallelism.  For that, the
> > > > patch adds checking glob->parallelModeNeeded, which seems to do the
> > > > trick though I don't know if that's the correct way of doing that.
> > > >
> > >
> > > I'm not sure if's pointless adding partitions even in the case of NOT
> > > using parallelism, because we may be relying on the result of
> > > parallel-safety checks on partitions in both cases.
> > > For example, insert_parallel.sql currently includes a test (that you
> > > originally provided in a previous post) that checks a non-parallel
> > > plan is generated after a parallel-unsafe trigger is created on a
> > > partition involved in the INSERT.
> > > If I further add to that test by then dropping that trigger and then
> > > again using EXPLAIN to see what plan is generated, then I'd expect a
> > > parallel-plan to be generated, but with the setrefs-v3.patch it still
> > > generates a non-parallel plan. So I think the "&&
> > > glob->parallelModeNeeded" part of test needs to be removed.
> >
> > Ah, okay, I didn't retest my case after making that change.
> >
>
> Greg has point here but I feel something on previous lines (having a
> test of glob->parallelModeNeeded) is better. We only want to
> invalidate the plan if the prepared plan is unsafe to execute next
> time. It is quite possible that there are unsafe triggers on different
> partitions and only one of them is dropped, so next time planning will
> again yield to the same non-parallel plan. If we agree with that I
> think it is better to add this dependency in set_plan_refs (along with
> Gather node handling).
>

I think we should try to be consistent with current plan-cache
functionality, rather than possibly inventing new rules for
partitions.
I'm finding that with current Postgres functionality (master branch),
if there are two parallel-unsafe triggers defined on a normal table
and one is dropped, it results in replanning and it yields the same
(non-parallel) plan. This would seem to go against what you are
suggesting.
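A minimal sketch of the experiment described above (object names are
hypothetical):

```sql
-- Illustrative sketch only; object names are hypothetical.
create table t (a int);
create function tf () returns trigger language plpgsql parallel unsafe
    as $$ begin return new; end; $$;
create trigger t_tr1 before insert on t for each row execute function tf();
create trigger t_tr2 before insert on t for each row execute function tf();

set plan_cache_mode to force_generic_plan;
prepare q as insert into t select 1;
explain execute q;      -- non-parallel plan (two parallel-unsafe triggers)

drop trigger t_tr1 on t;
explain execute q;      -- replanned, but yields the same non-parallel plan
```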

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-22 Thread Greg Nancarrow
On Mon, Feb 22, 2021 at 6:25 PM houzj.f...@fujitsu.com
 wrote:
>
> Hi
>
> (I may be wrong here)
> I noticed that the patch does not have a check for the partial index
> (index predicate).
> It seems parallel index build will check it like the following:
> --
> /*
>  * Determine if it's safe to proceed.
>  *
>  * Currently, parallel workers can't access the leader's temporary tables.
>  * Furthermore, any index predicate or index expressions must be parallel
>  * safe.
>  */
> if (heap->rd_rel->relpersistence == RELPERSISTENCE_TEMP ||
>     !is_parallel_safe(root, (Node *) RelationGetIndexExpressions(index)) ||
>     !is_parallel_safe(root, (Node *) RelationGetIndexPredicate(index)))
> --
>
> Should we do parallel safety check for it ?
>

Thanks, it looks like you're right, it is missing (and there's no test for it).
I can add a fix to the index-checking code, something like:

+if (!found_max_hazard)
+{
+ii_Predicate = RelationGetIndexPredicate(index_rel);
+if (ii_Predicate != NIL)
+{
+if (max_parallel_hazard_walker((Node *)ii_Predicate, context))
+{
+found_max_hazard = true;
+}
+}
+    }

Also will need a bit of renaming of that function.
I'll include this in the next patch update.
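To illustrate the case that the missing check covers (object names are
hypothetical): a partial index whose predicate calls a parallel-unsafe
function should cause INSERTs into its table to be assessed as
parallel-unsafe:

```sql
-- Illustrative sketch only; object names are hypothetical.
create table it (a int, b int);
-- Index predicates must be immutable; a function can be immutable yet
-- still parallel unsafe.
create function unsafe_pred (int) returns boolean
    language plpgsql immutable parallel unsafe
    as $$ begin return $1 > 0; end; $$;
create index it_partial_idx on it (b) where unsafe_pred(a);

-- With the additional check above, an INSERT into "it" should get a
-- non-parallel plan because of the parallel-unsafe index predicate.
explain (costs off) insert into it select i, i from generate_series(1, 100) i;
```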

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-22 Thread Greg Nancarrow
On Tue, Feb 23, 2021 at 12:33 AM Amit Kapila  wrote:
>
> On Mon, Feb 22, 2021 at 8:41 AM Greg Nancarrow  wrote:
> >
> > On Fri, Feb 19, 2021 at 9:38 PM Amit Kapila  wrote:
> > >
> > > On Thu, Feb 18, 2021 at 11:05 AM Amit Langote  
> > > wrote:
> > > >
> > > > > > It also occurred to me that we can avoid pointless adding of
> > > > > > partitions if the final plan won't use parallelism.  For that, the
> > > > > > patch adds checking glob->parallelModeNeeded, which seems to do the
> > > > > > trick though I don't know if that's the correct way of doing that.
> > > > > >
> > > > >
> > > > > I'm not sure if's pointless adding partitions even in the case of NOT
> > > > > using parallelism, because we may be relying on the result of
> > > > > parallel-safety checks on partitions in both cases.
> > > > > For example, insert_parallel.sql currently includes a test (that you
> > > > > originally provided in a previous post) that checks a non-parallel
> > > > > plan is generated after a parallel-unsafe trigger is created on a
> > > > > partition involved in the INSERT.
> > > > > If I further add to that test by then dropping that trigger and then
> > > > > again using EXPLAIN to see what plan is generated, then I'd expect a
> > > > > parallel-plan to be generated, but with the setrefs-v3.patch it still
> > > > > generates a non-parallel plan. So I think the "&&
> > > > > glob->parallelModeNeeded" part of test needs to be removed.
> > > >
> > > > Ah, okay, I didn't retest my case after making that change.
> > > >
> > >
> > > Greg has point here but I feel something on previous lines (having a
> > > test of glob->parallelModeNeeded) is better. We only want to
> > > invalidate the plan if the prepared plan is unsafe to execute next
> > > time. It is quite possible that there are unsafe triggers on different
> > > partitions and only one of them is dropped, so next time planning will
> > > again yield to the same non-parallel plan. If we agree with that I
> > > think it is better to add this dependency in set_plan_refs (along with
> > > Gather node handling).
> > >
> >
> > I think we should try to be consistent with current plan-cache
> > functionality, rather than possibly inventing new rules for
> > partitions.
> > I'm finding that with current Postgres functionality (master branch),
> > if there are two parallel-unsafe triggers defined on a normal table
> > and one is dropped, it results in replanning and it yields the same
> > (non-parallel) plan.
> >
>
> Does such a plan have partitions access in it? Can you share the plan?
>

Er, no (it's just a regular table), but that was exactly my point:
aren't you suggesting functionality for partitions that doesn't seem
to work the same way for non-partitions?

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-23 Thread Greg Nancarrow
On Tue, Feb 23, 2021 at 2:30 PM Amit Kapila  wrote:
>
> On Tue, Feb 23, 2021 at 6:37 AM Greg Nancarrow  wrote:
> >
> > On Tue, Feb 23, 2021 at 12:33 AM Amit Kapila  
> > wrote:
> > >
> > > On Mon, Feb 22, 2021 at 8:41 AM Greg Nancarrow  
> > > wrote:
> > > >
> > > > On Fri, Feb 19, 2021 at 9:38 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Thu, Feb 18, 2021 at 11:05 AM Amit Langote 
> > > > >  wrote:
> > > > > >
> > > > > > > > It also occurred to me that we can avoid pointless adding of
> > > > > > > > partitions if the final plan won't use parallelism.  For that, 
> > > > > > > > the
> > > > > > > > patch adds checking glob->parallelModeNeeded, which seems to do 
> > > > > > > > the
> > > > > > > > trick though I don't know if that's the correct way of doing 
> > > > > > > > that.
> > > > > > > >
> > > > > > >
> > > > > > > I'm not sure if's pointless adding partitions even in the case of 
> > > > > > > NOT
> > > > > > > using parallelism, because we may be relying on the result of
> > > > > > > parallel-safety checks on partitions in both cases.
> > > > > > > For example, insert_parallel.sql currently includes a test (that 
> > > > > > > you
> > > > > > > originally provided in a previous post) that checks a non-parallel
> > > > > > > plan is generated after a parallel-unsafe trigger is created on a
> > > > > > > partition involved in the INSERT.
> > > > > > > If I further add to that test by then dropping that trigger and 
> > > > > > > then
> > > > > > > again using EXPLAIN to see what plan is generated, then I'd 
> > > > > > > expect a
> > > > > > > parallel-plan to be generated, but with the setrefs-v3.patch it 
> > > > > > > still
> > > > > > > generates a non-parallel plan. So I think the "&&
> > > > > > > glob->parallelModeNeeded" part of test needs to be removed.
> > > > > >
> > > > > > Ah, okay, I didn't retest my case after making that change.
> > > > > >
> > > > >
> > > > > Greg has point here but I feel something on previous lines (having a
> > > > > test of glob->parallelModeNeeded) is better. We only want to
> > > > > invalidate the plan if the prepared plan is unsafe to execute next
> > > > > time. It is quite possible that there are unsafe triggers on different
> > > > > partitions and only one of them is dropped, so next time planning will
> > > > > again yield to the same non-parallel plan. If we agree with that I
> > > > > think it is better to add this dependency in set_plan_refs (along with
> > > > > Gather node handling).
> > > > >
> > > >
> > > > I think we should try to be consistent with current plan-cache
> > > > functionality, rather than possibly inventing new rules for
> > > > partitions.
> > > > I'm finding that with current Postgres functionality (master branch),
> > > > if there are two parallel-unsafe triggers defined on a normal table
> > > > and one is dropped, it results in replanning and it yields the same
> > > > (non-parallel) plan.
> > > >
> > >
> > > Does such a plan have partitions access in it? Can you share the plan?
> > >
> >
> > Er, no (it's just a regular table), but that was exactly my point:
> > aren't you suggesting functionality for partitions that doesn't seem
> > to work the same way for non-partitions?
> >
>
> I don't think so. The non-parallel plan for Insert doesn't directly
> depend on partitions so we don't need to invalidate those.
>

But the non-parallel plan was chosen (instead of a parallel plan)
because of parallel-safety checks on the partitions, which found
attributes of the partitions that weren't parallel-safe.
So it's not so clear to me that the dependency doesn't exist - the
non-parallel plan does in fact depend on the state of the partitions.
I know you're suggesting to reduce plan-cache invalidation by only
recording a dependency in the parallel-plan case, but I've yet to see
that in the existing code, and in fact it seems to be inconsistent
with the behaviour I've tested so far (one example given prior, but
will look for more).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-23 Thread Greg Nancarrow
On Tue, Feb 23, 2021 at 10:53 PM Amit Kapila  wrote:
>
> On Tue, Feb 23, 2021 at 4:47 PM Greg Nancarrow  wrote:
> >
> > On Tue, Feb 23, 2021 at 2:30 PM Amit Kapila  wrote:
> > >
> > > On Tue, Feb 23, 2021 at 6:37 AM Greg Nancarrow  
> > > wrote:
> > > >
> > > > On Tue, Feb 23, 2021 at 12:33 AM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Mon, Feb 22, 2021 at 8:41 AM Greg Nancarrow  
> > > > > wrote:
> > > > > >
> > > > > > On Fri, Feb 19, 2021 at 9:38 PM Amit Kapila 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Thu, Feb 18, 2021 at 11:05 AM Amit Langote 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > > > It also occurred to me that we can avoid pointless adding of
> > > > > > > > > > partitions if the final plan won't use parallelism.  For 
> > > > > > > > > > that, the
> > > > > > > > > > patch adds checking glob->parallelModeNeeded, which seems 
> > > > > > > > > > to do the
> > > > > > > > > > trick though I don't know if that's the correct way of 
> > > > > > > > > > doing that.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I'm not sure if's pointless adding partitions even in the 
> > > > > > > > > case of NOT
> > > > > > > > > using parallelism, because we may be relying on the result of
> > > > > > > > > parallel-safety checks on partitions in both cases.
> > > > > > > > > For example, insert_parallel.sql currently includes a test 
> > > > > > > > > (that you
> > > > > > > > > originally provided in a previous post) that checks a 
> > > > > > > > > non-parallel
> > > > > > > > > plan is generated after a parallel-unsafe trigger is created 
> > > > > > > > > on a
> > > > > > > > > partition involved in the INSERT.
> > > > > > > > > If I further add to that test by then dropping that trigger 
> > > > > > > > > and then
> > > > > > > > > again using EXPLAIN to see what plan is generated, then I'd 
> > > > > > > > > expect a
> > > > > > > > > parallel-plan to be generated, but with the setrefs-v3.patch 
> > > > > > > > > it still
> > > > > > > > > generates a non-parallel plan. So I think the "&&
> > > > > > > > > glob->parallelModeNeeded" part of test needs to be removed.
> > > > > > > >
> > > > > > > > Ah, okay, I didn't retest my case after making that change.
> > > > > > > >
> > > > > > >
> > > > > > > Greg has point here but I feel something on previous lines 
> > > > > > > (having a
> > > > > > > test of glob->parallelModeNeeded) is better. We only want to
> > > > > > > invalidate the plan if the prepared plan is unsafe to execute next
> > > > > > > time. It is quite possible that there are unsafe triggers on 
> > > > > > > different
> > > > > > > partitions and only one of them is dropped, so next time planning 
> > > > > > > will
> > > > > > > again yield to the same non-parallel plan. If we agree with that I
> > > > > > > think it is better to add this dependency in set_plan_refs (along 
> > > > > > > with
> > > > > > > Gather node handling).
> > > > > > >
> > > > > >
> > > > > > I think we should try to be consistent with current plan-cache
> > > > > > functionality, rather than possibly inventing new rules for
> > > > > > partitions.
> > > > > > I'm finding that with current Postgres functionality (master 
> > > > > > branch),
> > > > > > if there are two parallel-unsafe triggers defined on a normal table
> > > > > > and one is dropped, it results in replanning and it yields the same
> > > > > > (non-parallel) plan.

Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-24 Thread Greg Nancarrow
On Wed, Feb 24, 2021 at 3:12 PM Amit Kapila  wrote:
>
> On Wed, Feb 24, 2021 at 8:41 AM Greg Nancarrow  wrote:
> >
> > On Tue, Feb 23, 2021 at 10:53 PM Amit Kapila  
> > wrote:
> > >
> > > > But the non-parallel plan was chosen (instead of a parallel plan)
> > > > because of parallel-safety checks on the partitions, which found
> > > > attributes of the partitions which weren't parallel-safe.
> > > > So it's not so clear to me that the dependency doesn't exist - the
> > > > non-parallel plan does in fact depend on the state of the partitions.
> > > >
> > >
> > > Hmm, I think that is not what we can consider as a dependency.
> > >
> >
> > Then if it's not a dependency, then we shouldn't have to check the
> > attributes of the partitions for parallel-safety, to determine whether
> > we must use a non-parallel plan (or can use a parallel plan).
> > Except, of course, we do have to ...
> >
>
> I don't think the plan-dependency and checking for parallel-safety are
> directly related.
>

That is certainly not my understanding. Why do you think that they are
not directly related?
This whole issue came about because Amit L pointed out that there is a
need to add partition OIDs as plan-dependencies BECAUSE the checking
for parallel-safety and plan-dependency are related - since now, for
Parallel INSERT, we're executing extra parallel-safety checks that
check partition properties, so the resultant plan is dependent on the
partitions and their properties.

Amit L originally explained this as follows:

"I think that the concerns raised by Tsunakawa-san in:
https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com
regarding how this interacts with plancache.c deserve a look.
Specifically, a plan that uses parallel insert may fail to be
invalidated when partitions are altered directly (that is without
altering their root parent).  That would be because we are not adding
partition OIDs to PlannerGlobal.invalItems despite making a plan
that's based on checking their properties."

Tsunakawa-san: "Is the cached query plan invalidated when some
alteration is done to change the parallel safety, such as adding a
trigger with a parallel-unsafe trigger action?"


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-24 Thread Greg Nancarrow
On Wed, Feb 24, 2021 at 8:39 PM Amit Kapila  wrote:
>
> On Fri, Feb 19, 2021 at 6:56 AM Greg Nancarrow  wrote:
> >
> > Posting a new version of the patches, with the following updates:
> >
>
> I am not happy with the below code changes, I think we need a better
> way to deal with this.
>
> @@ -313,19 +314,35 @@ standard_planner(Query *parse, const char
> *query_string, int cursorOptions,
>   glob->transientPlan = false;
>   glob->dependsOnRole = false;
>
> + if (IsModifySupportedInParallelMode(parse->commandType) &&
> + !parse->hasModifyingCTE)
> + {
> + /*
> + * FIXME
> + * There is a known bug in the query rewriter: re-written queries with
> + * a modifying CTE may not have the "hasModifyingCTE" flag set. When
> + * that bug is fixed, this temporary fix must be removed.
> + *
> + * Note that here we've made a fix for this problem only for a
> + * supported-in-parallel-mode table-modification statement (i.e.
> + * INSERT), but this bug exists for SELECT too.
> + */
> + parse->hasModifyingCTE = query_has_modifying_cte(parse);
> + }
> +
>
> I understand that this is an existing bug but I am not happy with this
> workaround. I feel it is better to check for modifyingCTE in
> max_parallel_hazard_walker. See attached, this is atop
> v18-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.
>

Thanks, I'll try it.
I did, however, notice a few concerns with your suggested alternative fix:
- It is not restricted to INSERT (as the current fix is).
- It does not set parse->hasModifyingCTE (as the current fix does), so the
return value (PlannedStmt) from standard_planner() won't have
hasModifyingCTE set correctly in the cases where the rewriter doesn't
set it correctly (and I'm not sure what strange side effects that
might have).
- Although the invocation of max_parallel_hazard_walker() on a RTE
subquery will "work" in finally locating your fix's added
"CommonTableExpr" parallel-safety disabling block for commandType !=
CMD_SELECT, it looks like it potentially results in checking and
walking over a lot of other stuff within the subquery not related to
CTEs. The current fix does a more specific and efficient search for a
modifying CTE.
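For context, the "modifying CTE" being searched for is a WITH clause
containing a data-modifying statement, for example (hypothetical table
names):

```sql
-- hasModifyingCTE is meant to flag a statement like this to the planner,
-- since the CTE performs a write and so the plan must not be parallel.
with moved as (
    delete from src_table returning a
)
insert into dst_table select a from moved;
```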

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-24 Thread Greg Nancarrow
On Wed, Feb 24, 2021 at 10:38 PM Amit Kapila  wrote:
> >
> > Thanks, I'll try it.
> > I did, however, notice a few concerns with your suggested alternative fix:
> > - It is not restricted to INSERT (as current fix is).
> >
>
> So what? The Select also has a similar problem.
>

Yes, but you're potentially now adding overhead to every
SELECT/UPDATE/DELETE with a subquery, by the added recursive checking
and walking done by the new call to max_parallel_hazard_walker() and
the code block looking for a modifying CTE.
And anyway I'm not sure it's really right putting in a fix for SELECT
with a modifying CTE, into a patch that adds parallel INSERT
functionality - in any case you'd need to really spell this out in
code comments, as this is at best a temporary fix that would need to
be removed whenever the query rewriter is fixed to set hasModifyingCTE
correctly.

> > - It does not set parse->hasModifyingCTE (as current fix does), so the
> > return value (PlannedStmt) from standard_planner() won't have
> > hasModifyingCTE set correctly in the cases where the rewriter doesn't
> > set it correctly (and I'm not sure what strange side effects that
> > might have).
>
> Here end goal is not to set hasModifyingCTE but do let me know if you
> see any problem or impact.
>

parse->hasModifyingCTE is not just used in the shortcut-test for
parallel-safety, its value is subsequently copied into the PlannedStmt
returned by standard_planner.
It's inconsistent to leave hasModifyingCTE FALSE when the fix has
found a modifying CTE.
Even if no existing tests detect an issue with this, PlannedStmt is
left with a bad hasModifyingCTE value in this case, so there is the
potential for something to go wrong.

> > - Although the invocation of max_parallel_hazard_walker() on a RTE
> > subquery will "work" in finally locating your fix's added
> > "CommonTableExpr" parallel-safety disabling block for commandType !=
> > CMD_SELECT, it looks like it potentially results in checking and
> > walking over a lot of other stuff within the subquery not related to
> > CTEs. The current fix does a more specific and efficient search for a
> > modifying CTE.
> >
>
> I find the current fix proposed by you quite ad-hoc and don't think we
> can go that way.
>

At least my current fix is very specific, efficient and clear in its
purpose, and suitably documented, so it is very clear when and how it
is to be removed, when the issue is fixed in the query rewriter.
Another concern with the alternative fix is that it always searches
for a modifying CTE, even when parse->hasModifyingCTE is true after
the query rewriter processing.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-24 Thread Greg Nancarrow
On Thu, Feb 25, 2021 at 12:19 AM Amit Kapila  wrote:
>
> On Wed, Feb 24, 2021 at 6:21 PM Greg Nancarrow  wrote:
> >
> > On Wed, Feb 24, 2021 at 10:38 PM Amit Kapila  
> > wrote:
> > > >
> > > > Thanks, I'll try it.
> > > > I did, however, notice a few concerns with your suggested alternative 
> > > > fix:
> > > > - It is not restricted to INSERT (as current fix is).
> > > >
> > >
> > > So what? The Select also has a similar problem.
> > >
> >
> > Yes, but you're potentially now adding overhead to every
> > SELECT/UPDATE/DELETE with a subquery, by the added recursive checking
> > and walking done by the new call to max_parallel_hazard_walker() and
> > the code block looking for a modifying CTE.
> >
>
> Can you please share an example where it has added an overhead?
>
> > And anyway I'm not sure it's really right putting in a fix for SELECT
> > with a modifying CTE, into a patch that adds parallel INSERT
> > functionality - in any case you'd need to really spell this out in
> > code comments, as this is at best a temporary fix that would need to
> > be removed whenever the query rewriter is fixed to set hasModifyingCTE
> > correctly.
> >
> > > > - It does not set parse->hasModifyingCTE (as current fix does), so the
> > > > return value (PlannedStmt) from standard_planner() won't have
> > > > hasModifyingCTE set correctly in the cases where the rewriter doesn't
> > > > set it correctly (and I'm not sure what strange side effects that
> > > > might have).
> > >
> > > Here end goal is not to set hasModifyingCTE but do let me know if you
> > > see any problem or impact.
> > >
> >
> > parse->hasModifyingCTE is not just used in the shortcut-test for
> > parallel-safety, its value is subsequently copied into the PlannedStmt
> > returned by standard_planner.
> > It's inconsistent to leave hasModifyingCTE FALSE when by the fix it
> > has found a modifying CTE.
> > Even if no existing tests detect an issue with this, PlannedStmt is
> > left with a bad hasModifyingCTE value in this case, so there is the
> > potential for something to go wrong.
> >
> > > > - Although the invocation of max_parallel_hazard_walker() on a RTE
> > > > subquery will "work" in finally locating your fix's added
> > > > "CommonTableExpr" parallel-safety disabling block for commandType !=
> > > > CMD_SELECT, it looks like it potentially results in checking and
> > > > walking over a lot of other stuff within the subquery not related to
> > > > CTEs. The current fix does a more specific and efficient search for a
> > > > modifying CTE.
> > > >
> > >
> > > I find the current fix proposed by you quite ad-hoc and don't think we
> > > can go that way.
> > >
> >
> > At least my current fix is very specific, efficient and clear in its
> > purpose, and suitably documented, so it is very clear when and how it
> > is to be removed, when the issue is fixed in the query rewriter.
> > Another concern with the alternative fix is that it always searches
> > for a modifying CTE, even when parse->hasModifyingCTE is true after
> > the query rewriter processing.
> >
>
> There is a check in standard_planner such that if
> parse->hasModifyingCTE is true then we won't try checking
> parallel-safety.
>

OK, I retract that last concern, parallel-safety checks are skipped
when parse->hasModifyingCTE is true.
Examples of overhead will need to wait until tomorrow (and would need
testing), but it seems fairly clear that max_parallel_hazard_walker()
first checks for parallel-unsafe functions in the node, then does
numerous node-type checks before getting to CommonTableExpr - exactly
how much extra work would depend on the SQL.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-25 Thread Greg Nancarrow
On Fri, Feb 26, 2021 at 4:07 PM Amit Langote  wrote:
>
> Hi Greg,
>
> Replying to an earlier email in the thread because I think I found a
> problem relevant to the topic that was brought up.
>
> On Wed, Feb 17, 2021 at 10:34 PM Amit Langote  wrote:
> > On Wed, Feb 17, 2021 at 10:44 AM Greg Nancarrow  wrote:
> > > Is the use of "table_close(rel, NoLock)" intentional? That will keep
> > > the lock (lockmode) until end-of-transaction.
> >
> > I think we always keep any locks on relations that are involved in a
> > plan until end-of-transaction.  What if a partition is changed in an
> > unsafe manner between being considered safe for parallel insertion and
> > actually performing the parallel insert?
>
> I realized that there is a race condition that will allow a concurrent
> backend to invalidate a parallel plan (for insert into a partitioned
> table) at a point in time when it's too late for plancache.c to detect
> it.  It has to do with how plancache.c locks the relations affected by
> a cached query and its cached plan.  Specifically,
> AcquireExecutorLocks(), which locks the relations that need to be
> locked before the plan could be considered safe to execute, does not
> notice the partitions that would have been checked for parallel safety
> when the plan was made.  Given that AcquireExecutorLocks() only loops
> over PlannedStmt.rtable and locks exactly the relations found there,
> partitions are not locked.  This means that a concurrent backend can,
> for example, add an unsafe trigger to a partition before a parallel
> worker inserts into it only to fail when it does.
>
> Steps to reproduce (tried with v19 set of patches):
>
> drop table if exists rp, foo;
> create table rp (a int) partition by range (a);
> create table rp1 partition of rp for values from (minvalue) to (0);
> create table rp2 partition of rp for values from (0) to (maxvalue);
> create table foo (a) as select generate_series(1, 100);
> set plan_cache_mode to force_generic_plan;
> set enable_parallel_dml to on;
> deallocate all;
> prepare q as insert into rp select * from foo where a%2 = 0;
> explain analyze execute q;
>
> At this point, attach a debugger to this backend and set a breakpoint
> in AcquireExecutorLocks() and execute q again:
>
> -- will execute the cached plan
> explain analyze execute q;
>
> Breakpoint will be hit.  Continue till return from the function and
> stop. Start another backend and execute this:
>
> -- will go through, because no lock still taken on the partition
> create or replace function make_table () returns trigger language
> plpgsql as $$ begin create table bar(); return null; end; $$ parallel
> unsafe;
> create trigger ai_rp2 after insert on rp2 for each row execute
> function make_table();
>
> Back to the debugger, hit continue to let the plan execute.  You
> should see this error:
>
> ERROR:  cannot start commands during a parallel operation
> CONTEXT:  SQL statement "create table bar()"
> PL/pgSQL function make_table() line 1 at SQL statement parallel worker
>
> The attached patch fixes this, although I am starting to have second
> thoughts about how we're tracking partitions in this patch.  Wondering
> if we should bite the bullet and add partitions into the main range
> table instead of tracking them separately in partitionOids, which
> might result in a cleaner patch overall.
>

Thanks Amit,

I was able to reproduce the problem using your instructions (though I
found I had to run that explain an extra time, in order to hit the
breakpoint).
Also, I can confirm that the problem doesn't occur after application
of your patch.
I'll leave it to your better judgement as to what to do next, if you
feel the current tracking method is not sufficient.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Libpq support to connect to standby server as priority

2021-03-02 Thread Greg Nancarrow
On Wed, Mar 3, 2021 at 2:49 PM vignesh C  wrote:
>
>
>
> On Wed, Mar 3, 2021 at 7:37 AM Tom Lane  wrote:
> >
> > Greg Nancarrow  writes:
> > > I've marked this as "Ready for Committer".
> >
> > I've pushed this after whacking it around a fair amount.  A lot of
> > that was cosmetic, but one thing that wasn't is that I got rid of the
> > proposed "which_primary_host" variable.  I thought the logic around
> > that was way too messy and probably buggy.  Even if it worked exactly
> > as intended, I'm dubious that the design intention was good.  I think
> > it makes more sense just to go through the whole server list again
> > without the restriction to standby servers.  In particular, that will
> > give saner results if the servers' status isn't holding still.
> >
>
> Buildfarm machine crake and conchuela have failed after this commit.
> I had checked the failures, crake is failing because of:
> Mar 02 21:22:56 ./src/test/recovery/t/001_stream_rep.pl: Variable
> declared in conditional statement at line 88, column 2.  Declare variables
> outside of the condition.  ([Variables::ProhibitConditionalDeclarations]
> Severity: 5)
> I have analyzed and posted a patch at [1] for this. That might fix this
> problem.
>
> Conchuela is failing because of:
> ok 17 - connect to node standby_1 if mode "standby" and standby_1,primary
> listed
> ack Broken pipe: write( 13, 'SHOW port;' ) at
> /usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
> ### Stopping node "primary" using mode immediate
> # Running: pg_ctl -D
> /home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_001_stream_rep_primary_data/pgdata
> -m immediate stop
> waiting for server to shut down... done
>
> I could not find the exact reason for this failure, I'm checking further
> on why it is failing.
> Thoughts?
>
> [1] -
> https://www.postgresql.org/message-id/CALDaNm3L%3DROeb%3D4rKf0XMN0CqrEnn6T%3D-44m4fsDAhcw-%40mail.gmail.com
> OUCVA
>


At least the first problem seems likely to be caused by:

https://www.effectiveperlprogramming.com/2019/07/no-more-false-postfix-lexical-declarations-in-v5-30/

The buildfarm machine crake is using Perl 5.30.3.

Regards,
Greg Nancarrow
Fujitsu Australia


Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-02 Thread Greg Nancarrow
On Tue, Mar 2, 2021 at 11:19 PM Amit Kapila  wrote:
>
> On Mon, Mar 1, 2021 at 9:08 AM Greg Nancarrow  wrote:
> >
> > Posting an updated set of patches that includes Amit Langote's patch
> > to the partition tracking scheme...
> >
>
> Few comments:
> ==
> 1.
> "Prior to entering parallel-mode for execution of INSERT with parallel SELECT,
> a TransactionId is acquired and assigned to the current transaction state 
> which
> is then serialized in the parallel DSM for the parallel workers to use."
>
> This point in the commit message seems a bit misleading to me because
> IIUC we need to use transaction id only in the master backend for the
> purpose of this patch. And, we are doing this at an early stage
> because we don't allow to allocate it in parallel-mode. I think here
> we should mention in some way that this has a disadvantage that if the
> underneath select doesn't return any row then this transaction id
> allocation would not be of use. However, that shouldn't happen in
> practice in many cases. But, if I am missing something and this is
> required by parallel workers then we can ignore what I said.
>

I'll remove the part that says "which is then serialized ...", as this
may imply that the patch is doing this (which of course it is not,
Postgres always serializes the transaction state in the parallel DSM
for the parallel workers to use).
The acquiring of the TransactionId up-front, prior to entering
parallel-mode, is absolutely required because heap_insert() lazily
tries to get the current transaction-id and attempts to assign one if
the current transaction state hasn't got one, and this assignment is
not allowed in parallel-mode ("ERROR:  cannot assign XIDs during a
parallel operation"), and this mode is required for use of parallel
SELECT.
I'll add the part about the disadvantage of the transaction-id not
being used if the underlying SELECT doesn't return any rows.


> 2.
> /*
> + * Prepare for entering parallel mode by assigning a
> + * FullTransactionId, to be included in the transaction state that is
> + * serialized in the parallel DSM.
> + */
> + (void) GetCurrentTransactionId();
> + }
>
> Similar to the previous comment this also seems to indicate that we
> require TransactionId for workers. If that is not the case then this
> comment also needs to be modified.
>

I'll update to comment to remove the part about the serialization (as
this always happens, not a function of the patch) and mention it is
needed to avoid attempting to assign a FullTransactionId in
parallel-mode.

> 3.
>  static int common_prefix_cmp(const void *a, const void *b);
>
> -
>  
> /*
>
> Spurious line removal.
>

I will reinstate that empty line.

> 4.
>  * Assess whether it's feasible to use parallel mode for this query. We
>   * can't do this in a standalone backend, or if the command will try to
> - * modify any data, or if this is a cursor operation, or if GUCs are set
> - * to values that don't permit parallelism, or if parallel-unsafe
> - * functions are present in the query tree.
> + * modify any data using a CTE, or if this is a cursor operation, or if
> + * GUCs are set to values that don't permit parallelism, or if
> + * parallel-unsafe functions are present in the query tree.
>
> This comment change is not required because this is quite similar to
> what we do for CTAS. Your further comment changes in this context are
> sufficient.

An INSERT modifies data, so according to the original comment, then
it's not feasible to use parallel mode, because the command tries to
modify data ("We can't do this in a standalone backend, or if the
command will try to modify any data ...").
Except now we need to use parallel-mode for "INSERT with parallel
SELECT", and INSERT is a command that modifies data.
So isn't the comment change actually needed?


>
> 5.
> + (IsModifySupportedInParallelMode(parse->commandType) &&
> + is_parallel_possible_for_modify(parse))) &&
>
> I think it would be better if we move the check
> IsModifySupportedInParallelMode inside
> is_parallel_possible_for_modify.

The reason it is done outside of is_parallel_possible_for_modify() is
to avoid the function overhead of calling
is_parallel_possible_for_modify() for SELECT statements, only to
return without doing anything. Note also that
IsModifySupportedInParallelMode() is an inline function.

>Also, it might be better to name this
> function as is_parallel_allowed_for_modify.

I do tend to think that in this case "possible" is better than "allowed".
Only the "parallel_dml" ...

Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-03 Thread Greg Nancarrow
On Wed, Mar 3, 2021 at 10:45 PM Dilip Kumar  wrote:
>
> On Mon, Mar 1, 2021 at 9:08 AM Greg Nancarrow  wrote:
> >
> > Posting an updated set of patches that includes Amit Langote's patch
> > to the partition tracking scheme...
> > (the alternative of adding partitions to the range table needs further
> > investigation)
>
> I was reviewing your latest patch and I have a few comments.
>
> In patch 0001
> 1.
> +static bool
> +target_rel_max_parallel_hazard_recurse(Relation rel,
> +   CmdType command_type,
> +   max_parallel_hazard_context *context)
> +{
> + TupleDesc tupdesc;
> + int attnum;
> +
> + /* Currently only CMD_INSERT is supported */
> + Assert(command_type == CMD_INSERT);
> …….
> + /*
> + * Column default expressions and check constraints are only applicable to
> + * INSERT and UPDATE, but since only parallel INSERT is currently supported,
> + * only command_type==CMD_INSERT is checked here.
> + */
> + if (command_type == CMD_INSERT)
>
> If we have an assert at the beginning of the function, then why do we
> want to put the if check here?
>

Asserts are normally only enabled in a debug-build, so for a
release-build that Assert has no effect.
The Assert is being used as a sanity-check that the function is only
currently getting called for INSERT, because that's all it currently
supports.
Further code below specifically checks for INSERT because the block
contains code that is specific to INSERT (and actually DELETE too, but
that is not currently supported - code is commented accordingly).
In future, this function will support DELETE and UPDATE, and the
Assert serves to alert the developer ASAP that it needs updating to
support those.


> 2.
> In patch 0004,  We are still charging the parallel_tuple_cost for each
> tuple, are we planning to do something about this?  I mean after this
> patch tuple will not be transferred through the tuple queue, so we
> should not add that cost.
>

I believe that for Parallel INSERT, cost_modifytable() will set
path->path.rows to 0 (unless there is a RETURNING list), so, for
example, in cost_gather(), it will not add to the run_cost as
"run_cost += parallel_tuple_cost * path->path.rows;"


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-03 Thread Greg Nancarrow
On Thu, Mar 4, 2021 at 1:07 AM Dilip Kumar  wrote:
>
> On Wed, Mar 3, 2021 at 5:50 PM Greg Nancarrow  wrote:
> >
> > Asserts are normally only enabled in a debug-build, so for a
> > release-build that Assert has no effect.
> > The Assert is being used as a sanity-check that the function is only
> > currently getting called for INSERT, because that's all it currently
> > supports.
>
> I agree that assert is only for debug build, but once we add and
> assert that means we are sure that it should only be called for insert
> and if it is called for anything else then it is a programming error
> from the caller's side.  So after the assert, adding if check for the
> same condition doesn't look like a good idea.  That means we think
> that the code can hit assert in the debug mode so we need an extra
> protection in the release mode.
>

The if-check isn't there for "extra protection".
It's to help with future changes; inside that if-block is code only
applicable to INSERT (and to UPDATE - sorry, before I said DELETE), as
the code-comment indicates, whereas the rest of the function is
generic to all command types. I don't see any problem with having this
if-block here, to help in this way, when other command types are
added.


>
> >
> > > 2.
> > > In patch 0004,  We are still charging the parallel_tuple_cost for each
> > > tuple, are we planning to do something about this?  I mean after this
> > > patch tuple will not be transferred through the tuple queue, so we
> > > should not add that cost.
> > >
> >
> > I believe that for Parallel INSERT, cost_modifytable() will set
> > path->path.rows to 0 (unless there is a RETURNING list), so, for
> > example, in cost_gather(), it will not add to the run_cost as
> > "run_cost += parallel_tuple_cost * path->path.rows;"
> >
>
> But the cost_modifytable is setting the number of rows to 0 in
> ModifyTablePath whereas the cost_gather will multiply the rows from
> the GatherPath.  I can not see the rows from GatherPath is ever set to
> 0.
>

OK, I see the problem now.
It works the way I described, but currently there's a problem with
where it's getting the rows for the GatherPath, so there's a
disconnect.
When generating the GatherPaths, it's currently always taking the
rel's (possibly estimated) row-count, rather than using the rows from
the cheapest_partial_path (the subpath: ModifyTablePath). See
generate_gather_paths().
So when generate_useful_gather_paths() is called from the planner, for
the added partial paths for Parallel INSERT, it should be passing
"true" for the "override_rows" parameter, not "false".

So I think that in the 0004 patch, the if-block:

+   if (parallel_modify_partial_path_added)
+   {
+   final_rel->rows = current_rel->rows;    /* ??? why hasn't this been
+                                            * set above somewhere */
+   generate_useful_gather_paths(root, final_rel, false);
+   }
+

can be reduced to:

+   if (parallel_modify_partial_path_added)
+   generate_useful_gather_paths(root, final_rel, true);
+

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-04 Thread Greg Nancarrow
On Thu, Mar 4, 2021 at 10:07 PM Amit Kapila  wrote:
>
> On Fri, Feb 26, 2021 at 10:37 AM Amit Langote  wrote:
> >
> > I realized that there is a race condition that will allow a concurrent
> > backend to invalidate a parallel plan (for insert into a partitioned
> > table) at a point in time when it's too late for plancache.c to detect
> > it.  It has to do with how plancache.c locks the relations affected by
> > a cached query and its cached plan.  Specifically,
> > AcquireExecutorLocks(), which locks the relations that need to be
> > locked before the plan could be considered safe to execute, does not
> > notice the partitions that would have been checked for parallel safety
> > when the plan was made.  Given that AcquireExecutorLocks() only loops
> > over PlannedStmt.rtable and locks exactly the relations found there,
> > partitions are not locked.  This means that a concurrent backend can,
> > for example, add an unsafe trigger to a partition before a parallel
> > worker inserts into it only to fail when it does.
> >
> > Steps to reproduce (tried with v19 set of patches):
> >
> > drop table if exists rp, foo;
> > create table rp (a int) partition by range (a);
> > create table rp1 partition of rp for values from (minvalue) to (0);
> > create table rp2 partition of rp for values from (0) to (maxvalue);
> > create table foo (a) as select generate_series(1, 100);
> > set plan_cache_mode to force_generic_plan;
> > set enable_parallel_dml to on;
> > deallocate all;
> > prepare q as insert into rp select * from foo where a%2 = 0;
> > explain analyze execute q;
> >
> > At this point, attach a debugger to this backend and set a breakpoint
> > in AcquireExecutorLocks() and execute q again:
> >
> > -- will execute the cached plan
> > explain analyze execute q;
> >
> > Breakpoint will be hit.  Continue till return from the function and
> > stop. Start another backend and execute this:
> >
> > -- will go through, because no lock still taken on the partition
> > create or replace function make_table () returns trigger language
> > plpgsql as $$ begin create table bar(); return null; end; $$ parallel
> > unsafe;
> > create trigger ai_rp2 after insert on rp2 for each row execute
> > function make_table();
> >
> > Back to the debugger, hit continue to let the plan execute.  You
> > should see this error:
> >
> > ERROR:  cannot start commands during a parallel operation
> > CONTEXT:  SQL statement "create table bar()"
> > PL/pgSQL function make_table() line 1 at SQL statement parallel worker
> >
> > The attached patch fixes this,
> >
>
> One thing I am a bit worried about this fix is that after this for
> parallel-mode, we will maintain partitionOids at two places, one in
> plannedstmt->relationOids and the other in plannedstmt->partitionOids.
> I guess you have to do that because, in AcquireExecutorLocks, we can't
> find which relationOids are corresponding to partitionOids, am I
> right? If so, why not just maintain them in plannedstmt->partitionOids
> and then make PlanCacheRelCallback consider it?

Maybe Amit Langote can kindly comment on this, as it would involve
updates to his prior partition-related fixes.

>Also, in
> standard_planner, we should add these partitionOids only for
> parallel-mode.
>

It is doing that in v20 patch (what makes you think it isn't?).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-04 Thread Greg Nancarrow
On Thu, Mar 4, 2021 at 11:05 PM Amit Kapila  wrote:
>
> On Thu, Mar 4, 2021 at 5:24 PM Greg Nancarrow  wrote:
> >
> > On Thu, Mar 4, 2021 at 10:07 PM Amit Kapila  wrote:
> > >
> >
> > >Also, in
> > > standard_planner, we should add these partitionOids only for
> > > parallel-mode.
> > >
> >
> > It is doing that in v20 patch (what makes you think it isn't?).
> >
>
> The below code snippet:
> + /* For AcquireExecutorLocks(). */
> + if (IsModifySupportedInParallelMode(parse->commandType))
> + result->partitionOids = glob->partitionOids;
>
> I understand that you have a check for the parallel mode in
> AcquireExecutorLocks but why can't we have it before adding that to
> planned statement
>

OK, I think I'm on the same wavelength now (sorry, I didn't realise you
were talking about PlannedStmt).

What I believe you're suggesting is that in the planner, where
partitionOids are "added" to the returned PlannedStmt, they should
only be added if glob->parallelModeNeeded is true:

i.e.

/* For AcquireExecutorLocks(). */
if (glob->partitionOids != NIL && glob->parallelModeNeeded)
result->partitionOids = glob->partitionOids;

(seems reasonable to me, as then it will match the condition for which
glob->partitionOids are added to glob->relationOids)

then in plancache.c the check on parallelModeNeeded can be removed:

/* Lock partitions ahead of modifying them in parallel mode. */
if (rti == resultRelation &&
plannedstmt->partitionOids != NIL)
    AcquireExecutorLocksOnPartitions(plannedstmt->partitionOids,
 rte->rellockmode, acquire);

Let me know if this matches what you were thinking.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-07 Thread Greg Nancarrow
On Sat, Mar 6, 2021 at 9:19 PM Amit Kapila  wrote:
>
> On Fri, Mar 5, 2021 at 6:34 PM Greg Nancarrow  wrote:
> >
> > For the time being at least, I am posting an updated set of patches,
> > as I found that the additional parallel-safety checks on DOMAIN check
> > constraints to be somewhat inefficient and could also be better
> > integrated into max_parallel_hazard(). I also updated the basic tests
> > with a test case for this.
> >
>
> Thanks, your changes look good to me. I went ahead and changed the
> patch to track the partitionOids once rather than in two separate
> lists and use that list in PlanCacheRelCallback to invalidate the
> plans. I made few other changes:
> a. don't need to retain the lock on indexes as we can't change index 
> expressions
> b. added/updated comments at few places in the code.
> c. made a few other cosmetic changes
> d. ran pgindent
> e. slightly modify the commit message.
> f. combine the code and test case patch
>
> For now, I have left 0005 and 0006 patches, we can come back to those
> once we are done with the first set of patches. The first patch looks
> good to me and I think we can commit it and then bikeshed about
> GUC/reloption patch.
>

I've checked and tested the changes, and the resultant patches look OK
to me, so I'm happy for you to commit the first patch.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-08 Thread Greg Nancarrow
On Mon, Mar 8, 2021 at 6:25 PM Amit Langote  wrote:
>
> A couple of things that look odd in v24-0001 (sorry if there were like
> that from the beginning):
>
> +static bool
> +target_rel_max_parallel_hazard(max_parallel_hazard_context *context)
> +{
> +   boolmax_hazard_found;
> +
> +   RelationtargetRel;
> +
> +   /*
> +* The target table is already locked by the caller (this is done in the
> +* parse/analyze phase), and remains locked until end-of-transaction.
> +*/
> +   targetRel = table_open(context->target_rte->relid,
> +  context->target_rte->rellockmode);
>
> The comment seems to imply that the lock need not be retaken here, but
> the code does.  Maybe it's fine to pass NoLock here, but use
> rellockmode when locking partitions, because they would not have been
> locked by the parser/analyzer.

Actually, it was originally NoLock, but I think in your suggested
changes (v15_delta.diff) to better integrate the extra parallel-safety
checks into max_parallel_hazard(), you changed "NoLock" to
"context->targetRTE->rellockmode".
Having said that, since the table is already locked, I think that
using "context->target_rte->rellockmode" in this case just results in
the lock reference count being incremented, so I doubt it really makes
any difference, since it's locked until end-of-transaction.
I'll revert it back to NoLock.

>Which brings me to:
>
> +static bool
> +target_rel_partitions_max_parallel_hazard(Relation rel,
> + max_parallel_hazard_context 
> *context)
> +{
> ...
> +   for (i = 0; i < pdesc->nparts; i++)
> +   {
> +   boolmax_hazard_found;
> +   Relationpart_rel;
> +
> +   /*
> +* The partition needs to be locked, and remain locked until
> +* end-of-transaction to ensure its parallel-safety state is not
> +* hereafter altered.
> +*/
> +   part_rel = table_open(pdesc->oids[i], AccessShareLock);
>
> Not that I can prove there to be any actual hazard, but we tend to
> avoid taking locks with different strengths in the same query as would
> occur with this piece of code.  We're locking the partition with
> AccessShareLock here, but the executor will lock the partition with
> the stronger RowExclusiveLock mode before trying to insert into it.
> Better to use the stronger mode to begin with or just use the target
> partitioned table's RTE's rellockmode which would be RowExclusiveLock.
> You can see that that's what AcquireExecutorLocksOnPartitions() will
> do in the generic plan case.
>

OK, I see what you mean, best to use the target partitioned table's
RTE's rellockmode in this case then.
I'll update the patch accordingly.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-11 Thread Greg Nancarrow
On Wed, Mar 10, 2021 at 8:18 PM Amit Kapila  wrote:
>
> Now, coming back to Hou-San's patch to introduce a GUC and reloption
> for this feature, I think both of those make sense to me because when
> the feature is enabled via GUC, one might want to disable it for
> partitioned tables? Do we agree on that part or someone thinks
> otherwise?
>

Having both makes sense to me.

> The other points to bikeshed could be:
> 1. The name of GUC and reloption. The two proposals at hand are
> enable_parallel_dml and enable_parallel_insert. I would prefer the
> second (enable_parallel_insert) because updates/deletes might not have
> a similar overhead.
>

I guess to have the finer granularity we'd have to go with
enable_parallel_insert, which then would mean possibly having to later
add enable_parallel_update, should parallel update have similar
potential overhead in the parallel-safety checks (which to me, looks
like it could, and parallel delete may not ...)

It's a shame there is no "set" type for GUC options.
e.g.
enable_parallel_dml='insert,update'
Maybe that's going too far.

> 2. Should we keep the default value of GUC to on or off? It is
> currently off. I am fine keeping it off for this release and we can
> always turn it on in the later releases if required. Having said that,
> I see the value in keeping it on because in many cases Insert ...
> Select will be used for large data and there we will see a benefit of
> parallelism and users facing trouble (who have a very large number of
> partitions with less data to query) can still disable the parallel
> insert for that particular table. Also, the other benefit of keeping
> it on till at least the beta period is that this functionality will
> get tested and if we found reports of regression then we can turn it
> off for this release as well.
>

I'd agree to keeping it on by default (and it can be selectively
turned off for a particular table using the reloption, if needed).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-11 Thread Greg Nancarrow
On Fri, Mar 12, 2021 at 5:00 AM Tom Lane  wrote:
>
> The buildfarm has revealed that this patch doesn't work under
> CLOBBER_CACHE_ALWAYS:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-03-10%2021%3A09%3A32
>
> I initially thought that that was a problem with c3ffe34863, but after
> reproducing it here I get this stack trace:
>
> #0  target_rel_trigger_max_parallel_hazard (context=0x7fff9cbfc220,
> trigdesc=0x7fdd7dfa9718) at clauses.c:971
> #1  target_rel_max_parallel_hazard_recurse (command_type=CMD_INSERT,
> context=0x7fff9cbfc220, rel=0x7fdd7df819e0) at clauses.c:929
> #2  target_rel_max_parallel_hazard_recurse (rel=0x7fdd7df819e0,
> command_type=<optimized out>, context=0x7fff9cbfc220) at clauses.c:893
> #3  0x0075d26e in target_rel_max_parallel_hazard (
> context=0x7fff9cbfc220) at clauses.c:884
> #4  max_parallel_hazard_walker (node=node@entry=0x146e590,
> context=context@entry=0x7fff9cbfc220) at clauses.c:831
> #5  0x00700812 in range_table_entry_walker (rte=0x146e590,
> walker=0x75cf00 <max_parallel_hazard_walker>, context=0x7fff9cbfc220,
> flags=16) at nodeFuncs.c:2467
> #6  0x00700927 in range_table_walker (rtable=0x11fdd70,
> walker=walker@entry=0x75cf00 <max_parallel_hazard_walker>,
> context=context@entry=0x7fff9cbfc220, flags=16) at nodeFuncs.c:2446
> #7  0x00700ada in query_tree_walker (flags=<optimized out>,
> context=0x7fff9cbfc220, walker=0x75cf00 <max_parallel_hazard_walker>,
> query=0x11fdc58) at nodeFuncs.c:2423
> #8  query_tree_walker (query=query@entry=0x700927 <range_table_walker>,
> walker=walker@entry=0x75cf00 <max_parallel_hazard_walker>,
> context=context@entry=0x11fdc58, flags=<optimized out>) at
> nodeFuncs.c:2336
> #9  0x0075d138 in max_parallel_hazard_walker (
> node=node@entry=0x11fdc58, context=0x11fdc58,
> context@entry=0x7fff9cbfc220)
> at clauses.c:853
> #10 0x0075dc98 in max_parallel_hazard (parse=parse@entry=0x11fdc58,
> glob=glob@entry=0x11fdb40) at clauses.c:585
> #11 0x0074cd22 in standard_planner (parse=0x11fdc58,
> query_string=<optimized out>, cursorOptions=256,
> boundParams=<optimized out>) at planner.c:345
> #12 0x00814947 in pg_plan_query (querytree=0x11fdc58,
> query_string=0x11fc740 "insert into fk_notpartitioned_pk (a, b)\n  select 
> 2048, x from generate_series(1,10) x;", cursorOptions=256, boundParams=0x0)
> at postgres.c:809
> #13 0x00814a43 in pg_plan_queries (querytrees=0x14725d0,
> query_string=query_string@entry=0x11fc740 "insert into 
> fk_notpartitioned_pk (a, b)\n  select 2048, x from generate_series(1,10) x;",
> cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
> at postgres.c:900
> #14 0x00814d35 in exec_simple_query (
> query_string=0x11fc740 "insert into fk_notpartitioned_pk (a, b)\n  select 
> 2048, x from generate_series(1,10) x;") at postgres.c:1092
>
> The problem is that target_rel_trigger_max_parallel_hazard and its caller
> think they can use a relcache TriggerDesc field across other cache
> accesses, which they can't because the relcache doesn't guarantee that
> that won't move.
>
> One approach would be to add logic to RelationClearRelation similar to
> what it does for tupdescs, rules, etc, to avoid moving them when their
> contents haven't changed.  But given that we've not needed that for the
> past several decades, I'm disinclined to add the overhead.  I think this
> code ought to be adjusted to not make its own copy of the trigdesc
> pointer, but instead fetch it out of the relcache struct each time it is
> accessed.  There's no real reason why
> target_rel_trigger_max_parallel_hazard shouldn't be passed the (stable)
> Relation pointer instead of just the trigdesc pointer.
>
> BTW, having special logic for FK triggers in
> target_rel_trigger_max_parallel_hazard seems quite loony to me.
> Why isn't that handled by setting appropriate proparallel values
> for those trigger functions?
>

Thanks Tom for your investigation, detailed analysis and suggested code fixes.
Will work on getting these issues corrected ASAP (and, er, removing
the looniness ...).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-03-11 Thread Greg Nancarrow
On Fri, Mar 12, 2021 at 5:00 AM Tom Lane  wrote:
>
>
> The problem is that target_rel_trigger_max_parallel_hazard and its caller
> think they can use a relcache TriggerDesc field across other cache
> accesses, which they can't because the relcache doesn't guarantee that
> that won't move.
>
> One approach would be to add logic to RelationClearRelation similar to
> what it does for tupdescs, rules, etc, to avoid moving them when their
> contents haven't changed.  But given that we've not needed that for the
> past several decades, I'm disinclined to add the overhead.  I think this
> code ought to be adjusted to not make its own copy of the trigdesc
> pointer, but instead fetch it out of the relcache struct each time it is
> accessed.  There's no real reason why
> target_rel_trigger_max_parallel_hazard shouldn't be passed the (stable)
> Relation pointer instead of just the trigdesc pointer.
>

I have attached a patch to fix the issue, based on your suggestion
(tested with CLOBBER_CACHE_ALWAYS defined).

> BTW, having special logic for FK triggers in
> target_rel_trigger_max_parallel_hazard seems quite loony to me.
> Why isn't that handled by setting appropriate proparallel values
> for those trigger functions?
>

... and also attached a patch to update the code for this issue.

(2nd patch relies on application of the 1st patch)

Thanks again for pointing out these problems.

Regards,
Greg Nancarrow
Fujitsu Australia


v1-0001-Fix-TriggerDesc-relcache-bug-introduced-by-recent-commit.patch
Description: Binary data


v1-0001-Better-implement-FK-trigger-parallel-safety-checking.patch
Description: Binary data


Re: Parallel INSERT (INTO ... SELECT ...)

2020-10-26 Thread Greg Nancarrow
On Fri, Oct 16, 2020 at 9:26 PM Amit Kapila  wrote:
>
>
> Cool, let me try to explain my thoughts a bit more. The idea is first
> (in standard_planner) we check if there is any 'parallel_unsafe'
> function/expression (via max_parallel_hazard) in the query tree. If we
> don't find anything 'parallel_unsafe' then we mark parallelModeOk. At
> this stage, the patch is checking whether there is any
> 'parallel_unsafe' or 'parallel_restricted' expression/function in the
> target relation and if there is none then we mark parallelModeOK as
> true. So, if there is anything 'parallel_restricted' then we will mark
> parallelModeOK as false which doesn't seem right to me.
>
> Then later in the planner during set_rel_consider_parallel, we
> determine if a particular relation can be scanned from within a
> worker, then we consider that relation for parallelism. Here, we
> determine if certain things are parallel-restricted then we don't
> consider this for parallelism. Then we create partial paths for the
> relations that are considered for parallelism. I think we don't need
> to change anything for the current patch in these later stages because
> we anyway are not considering Insert to be pushed into workers.
> However, in the second patch where we are thinking to push Inserts in
> workers, we might need to do something to filter parallel-restricted
> cases during this stage of the planner.
>

Posting an updated Parallel INSERT patch which (mostly) addresses
previously-identified issues and suggestions.

More work needs to be done in order to support parallel UPDATE and
DELETE (even after application of Thomas Munro's combo-cid
parallel-support patch), but it is getting closer.

Regards,
Greg Nancarrow
Fujitsu Australia


v5-0001-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch
Description: Binary data


Re: Parallel INSERT (INTO ... SELECT ...)

2020-10-29 Thread Greg Nancarrow
> ...but it is good to verify that
> once.
>

I can't imagine how you could check parallel-safety properly without
checking all of the partitions.
We don't know which partition that data will get inserted into until
runtime (e.g. range/list partitioning).
Each partition can have its own column default expressions,
check-constraints, triggers etc. (which may or may not be
parallel-safe) and a partition may itself be a partitioned table.


> 4. Have you checked the overhead of this on the planner for different
> kinds of statements like inserts into tables having 100 or 500
> partitions? Similarly, it is good to check the overhead of domain
> related checks added in the patch.
>

Checking that now and will post results soon.

> 5. Can we have a separate patch for parallel-selects for Insert? It
> will make review easier.
>

See attached patches.


Regards,
Greg Nancarrow
Fujitsu Australia


v6-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch
Description: Binary data


v6-0002-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch
Description: Binary data


Re: Parallel INSERT (INTO ... SELECT ...)

2020-11-01 Thread Greg Nancarrow
On Fri, Oct 30, 2020 at 5:00 PM Amit Kapila  wrote:
>
> > So then to avoid
> > errors from when parallel-mode is forced on and such unsafe INSERTs
> > are run, the only real choice is to only allow parallelModeNeeded to
> > be true for SELECT only (not INSERT), and this is kind of cheating and
> > also not picking up cases where parallel-safe INSERT is run but
> > invokes parallel-mode-unsafe features.
> > My conclusion, at least for the moment, is to leave the check where it is.
> >
>
> Okay, then can we integrate the functionality of
> MaxParallelHazardForModify in max_parallel_hazard? Calling it
> separately looks bit awkward.
>

Looking into that.

> >
> > > Few other comments on latest patch:
> > > ===
> > > 1.
> > > MaxRelParallelHazardForModify()
> > > {
> > > ..
> > > + if (commandType == CMD_INSERT || commandType == CMD_UPDATE)
> > > + {
> > > + /*
> > > ..
> > >
> > > Why to check CMD_UPDATE here?
> > >
> >
> > That was a bit of forward-thinking, for when/if UPDATE/DELETE is
> > supported in parallel-mode.
> > Column default expressions and check-constraints are only applicable
> > to INSERT and UPDATE.
> > Note however that currently this function can only ever be called with
> > commandType == CMD_INSERT.
> >
>
> I feel then for other command types there should be an Assert rather
> than try to handle something which is not yet implemented nor it is
> clear what all is required for that. It confuses the reader, at least
> it confused me. Probably we can write a comment but I don't think we
> should have any check for Update at this stage of work.
>

OK, for now I'll restrict the checks to INSERT, but I'll add comments
to assist with potential future UPDATE support.

> > > 2.
> > > +void PrepareParallelModeForModify(CmdType commandType, bool
> > > isParallelModifyLeader)
> > > +{
> > > + Assert(!IsInParallelMode());
> > > +
> > > + if (isParallelModifyLeader)
> > > + (void)GetCurrentCommandId(true);
> > > +
> > > + (void)GetCurrentFullTransactionId();
> > >
> > > Here, we should use GetCurrentTransactionId() similar to heap_insert
> > > or other heap operations. I am not sure why you have used
> > > GetCurrentFullTransactionId?
> > >
> >
> > GetCurrentTransactionId() and GetCurrentFullTransactionId() actually
> > have the same functionality, just a different return value (which is
> > not being used here).
> >
>
> Sure but lets use what is required.
>
> > But anyway I've changed it to use GetCurrentTransactionId().
> >
>
> But comments in ExecutePlan and PrepareParallelModeForModify still
> refer to FullTransactionId.
>

I believe those comments are technically correct.
GetCurrentTransactionId() calls AssignTransactionId() to do all the
work - and the comment for that function says "Assigns a new permanent
FullTransactionId to the given TransactionState".

> >
> >
> > > 4. Have you checked the overhead of this on the planner for different
> > > kinds of statements like inserts into tables having 100 or 500
> > > partitions? Similarly, it is good to check the overhead of domain
> > > related checks added in the patch.
> > >
> >
> > Checking that now and will post results soon.
> >
>

I am seeing a fair bit of overhead in the planning for the INSERT
parallel-safety checks (mind you, compared to the overall performance
gain, it's not too bad).
Some representative timings for a parallel INSERT of a million rows
into 100, 250 and 500 partitions are shown below.

(1) Without patch

# Partitions    Planning Time (ms)    Execution Time (ms)
100             1.014                 4176.435
250             0.404                 3842.414
500             0.529                 4440.633


(2) With Parallel INSERT patch

# Partitions    Planning Time (ms)    Execution Time (ms)
100             11.420                2131.148
250             23.269                3472.259
500             36.531                3238.868

I'm looking into how this can be improved by better integration into
the current code, and addressing locking concerns that you've
previously mentioned.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2020-11-03 Thread Greg Nancarrow
 * If the trigger type is RI_TRIGGER_FK, this indicates a FK 
> exists in
> +* the relation, and this would result in creation of new 
> CommandIds
> +* on insert/update/delete and this isn't supported in a 
> parallel
> +* worker (but is safe in the parallel leader).
> +*/
> +   trigtype = RI_FKey_trigger_type(trigger->tgfoid);
> +   if (trigtype == RI_TRIGGER_FK)
> +   {
> +   context->max_hazard = PROPARALLEL_RESTRICTED;
> +   /*
> +* As we're looking for the max parallel hazard, we 
> don't break
> +* here; examine any further triggers ...
> +*/
> +   }
>

max_parallel_hazard_test won't return true for PROPARALLEL_RESTRICTED.
max_parallel_hazard_test only returns true when
"context.max_interesting" is found, and that is set to
PROPARALLEL_UNSAFE in max_parallel_hazard_for_modify().

> -> Should we switch to non-parallel mode in this case, instead of throwing 
> error?
> +   val = SysCacheGetAttr(CONSTROID, tup,
> +   Anum_pg_constraint_conbin, 
> &isnull);
> +   if (isnull)
> +   elog(ERROR, "null conbin for constraint %u", 
> con->oid);
> +   conbin = TextDatumGetCString(val);
>

I didn't invent that error check, it's found in several other places
in the Postgres code (that error should only ever occur if the
database has been corrupted or intentionally invalidated).
Having said that, I agree that perhaps it's best to switch to
non-parallel mode in this case, but this wouldn't stop it erroring out
when the plan is actually run.

> -> We could include a few tests for this in regression.
>

Looking at adding relevant test cases.

> -> We might need some documentation update like in 
> parallel-query.html/parallel-plans.html, etc
>

Looking at doc updates.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2020-11-13 Thread Greg Nancarrow
On Wed, Nov 4, 2020 at 2:18 PM Amit Kapila  wrote:
>
> Or you might want to consider moving the check related to
> IsModifySupportedInParallelMode() inside
> PrepareParallelModeForModify(). That way the code might look a bit
> cleaner.
>

Posting an updated Parallel SELECT for "INSERT INTO ... SELECT ..."
patch which addresses previously-identified issues and suggestions,
and adds some tests and doc updates.
I won't post an updated Parallel INSERT patch just yet (which just
builds on the 1st patch), because there's at least a couple of issues
in this 1st patch which need to be discussed first.

Firstly, in order to perform parallel-safety checks in the case of
partitions, the patch currently recursively locks/unlocks
(AccessShareLock) each partition during such checks (as each partition
may itself be a partitioned table). Is there a better way of
performing the parallel-safety checks and reducing the locking
requirements?

Secondly, I found that when running "make check-world", the
"partition-concurrent-attach" test fails, because it is expecting a
partition constraint to be violated on insert, while an "alter table
attach partition ..." is concurrently being executed in another
transaction. Because of the partition locking done by the patch's
parallel-safety checking code, the insert blocks on the exclusive lock
held by the "alter table" in the other transaction until the
transaction ends, so the insert ends up successfully completing (and
thus fails the test) when the other transaction ends. To overcome this
test failure, the patch code was updated to instead perform a
conditional lock on the partition, and on failure (i.e. because of an
exclusive lock held somewhere else), just assume it's parallel-unsafe
because the parallel-safety can't be determined without blocking on
the lock. This is not ideal, but I'm not sure of what other approach
could be used and I am somewhat reluctant to change that test. If
anybody is familiar with the "partition-concurrent-attach" test, any
ideas or insights would be appreciated.

Regards,
Greg Nancarrow
Fujitsu Australia


v7-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch
Description: Binary data


v7-0002-Parallel-SELECT-for-INSERT-INTO-.-SELECT-tests-and-doc.patch
Description: Binary data


Crash in virtual file descriptor FDDEBUG code

2020-11-16 Thread Greg Nancarrow
Hi Hackers,

While investigating a possible file-descriptor issue, I enabled the
FDDEBUG code in src/backend/storage/file/fd.c, only to find that it
resulted in a crash as soon as it was invoked.
It turns out the crash was in some logging code that was being passed
a NULL fileName.
I've attached a small patch that corrects the issue.

Regards,
Greg Nancarrow
Fujitsu Australia


v1-0001-Fix-crash-in-virtual-file-descriptor-FDDEBUG-code.patch
Description: Binary data


Re: Parallel plans and "union all" subquery

2020-11-22 Thread Greg Nancarrow
On Sun, Nov 22, 2020 at 11:51 PM Phil Florent  wrote:
>
>
> Hi,
>
>
> I have a question about parallel plans. I also posted it on the general list 
> but perhaps it's a question for hackers. Here is my test case :
>
>
> explain
> select count(*)
> from (select
> n1
> from drop_me
> union all
> values(1)) ua;
>
>
> QUERY PLAN
> 
> Aggregate (cost=2934739.24..2934739.25 rows=1 width=8)
> -> Append (cost=0.00..2059737.83 rows=7113 width=32)
> -> Seq Scan on drop_me (cost=0.00..1009736.12 rows=7112 width=6)
> -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=32)
> -> Result (cost=0.00..0.01 rows=1 width=4)
> JIT:
> Functions: 4
> Options: Inlining true, Optimization true, Expressions true, Deforming true
>
>
> No parallel plan, 2s6
>
>
> I read the documentation but I don't get the reason of the "noparallel" seq 
> scan of drop_me in the last case ?
>

Without debugging this, it looks to me that the UNION type resolution
isn't working as well as it possibly could in this case, for the
generation of a parallel plan. I found that with a minor tweak to your
SQL, either for the table creation or query, it will produce a
parallel plan.

Noting that currently you're creating the drop_me table with a
"numeric" column, you can either:

(1) Change the table creation

FROM:
create unlogged table drop_me as select generate_series(1,7e7) n1;
TO:
create unlogged table drop_me as select generate_series(1,7e7)::int n1;


OR


(2) Change the query

FROM:
explain
select count(*)
from (select
n1
from drop_me
union all
values(1)) ua;

TO:

explain
select count(*)
from (select
n1
from drop_me
union all
values(1::numeric)) ua;


QUERY PLAN

 Finalize Aggregate  (cost=821152.71..821152.72 rows=1 width=8)
   ->  Gather  (cost=821152.50..821152.71 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=820152.50..820152.51 rows=1 width=8)
               ->  Parallel Append  (cost=0.00..747235.71 rows=29166714 width=0)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     ->  Parallel Seq Scan on drop_me  (cost=0.00..601402.13 rows=29166713 width=0)
(7 rows)


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path

2020-11-23 Thread Greg Nancarrow
On Tue, Nov 24, 2020 at 10:06 AM David Rowley  wrote:
>
> On Tue, 24 Nov 2020 at 09:36, David Rowley  wrote:
> > Well, that makes it look pretty good.  If we can get 10-15% on some
> > machines without making things slower on any other machines, then that
> > seems like a good win to me.
>
> Pushed.
>
> Thank you both for reviewing this.
>
> David
>
>

Hmmm, unfortunately this seems to break my build ...

make[1]: Entering directory `/space2/pg13/postgres/src'
make -C common all
make[2]: Entering directory `/space2/pg13/postgres/src/common'
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -O0 -DFRONTEND -I.
-I../../src/common -I../../src/include  -D_GNU_SOURCE  -DVAL_CC="\"gcc
-std=gnu99\"" -DVAL_CPPFLAGS="\"-D_GNU_SOURCE\"" -DVAL_CFLAGS="\"-Wall
-Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement
-Werror=vla -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv
-fexcess-precision=standard -g -O0\"" -DVAL_CFLAGS_SL="\"-fPIC\""
-DVAL_LDFLAGS="\"-Wl,--as-needed
-Wl,-rpath,'/usr/local/pg14/lib',--enable-new-dtags\""
-DVAL_LDFLAGS_EX="\"\"" -DVAL_LDFLAGS_SL="\"\""
-DVAL_LIBS="\"-lpgcommon -lpgport -lpthread -lz -lreadline -lrt -ldl
-lm \""  -c -o archive.o archive.c
In file included from ../../src/include/postgres_fe.h:25:0,
 from archive.c:19:
../../src/include/c.h:198:49: error: missing binary operator before token "("
 #if defined(__has_attribute) && __has_attribute (cold)
 ^
../../src/include/c.h:204:49: error: missing binary operator before token "("
 #if defined(__has_attribute) && __has_attribute (hot)
 ^
make[2]: *** [archive.o] Error 1
make[2]: Leaving directory `/space2/pg13/postgres/src/common'
make[1]: *** [all-common-recurse] Error 2
make[1]: Leaving directory `/space2/pg13/postgres/src'
make: *** [world-src-recurse] Error 2

[gregn@localhost postgres]$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


I think your commit needs to be fixed based on the following documentation:

https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005fattribute.html#g_t_005f_005fhas_005fattribute

"The first ‘#if’ test succeeds only when the operator is supported by
the version of GCC (or another compiler) being used. Only when that
test succeeds is it valid to use __has_attribute as a preprocessor
operator. As a result, combining the two tests into a single
expression as shown below would only be valid with a compiler that
supports the operator but not with others that don’t. "

(Thanks to my colleague Peter Smith for finding the doc explanation)

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel plans and "union all" subquery

2020-11-23 Thread Greg Nancarrow
On Tue, Nov 24, 2020 at 2:34 AM Luc Vlaming  wrote:
>
> Hi,
>
> For this problem there is a patch I created, which is registered under
> https://commitfest.postgresql.org/30/2787/ that should fix this without
> any workarounds. Maybe someone can take a look at it?
>

I tried your patch with the latest PG source code (24/11), but
unfortunately a non-parallel plan was still produced in this case.

test=# explain
select count(*)
from (select
n1
from drop_me
union all
values(1)) ua;
   QUERY PLAN

 Aggregate  (cost=1889383.54..1889383.55 rows=1 width=8)
   ->  Append  (cost=0.00..1362834.03 rows=42123961 width=32)
         ->  Seq Scan on drop_me  (cost=0.00..730974.60 rows=42123960 width=32)
         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=32)
               ->  Result  (cost=0.00..0.01 rows=1 width=4)
(5 rows)


That's not to say your patch doesn't have merit - but maybe just not a
fix for this particular case.

As before, if the SQL is tweaked to align the types for the UNION, you
get a parallel plan:

test=# explain
select count(*)
from (select
n1
from drop_me
union all
values(1::numeric)) ua;
 QUERY PLAN

 Finalize Aggregate  (cost=821152.71..821152.72 rows=1 width=8)
   ->  Gather  (cost=821152.50..821152.71 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=820152.50..820152.51 rows=1 width=8)
               ->  Parallel Append  (cost=0.00..747235.71 rows=29166714 width=0)
                     ->  Result  (cost=0.00..0.01 rows=1 width=0)
                     ->  Parallel Seq Scan on drop_me  (cost=0.00..601402.13 rows=29166713 width=0)
(7 rows)


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Libpq support to connect to standby server as priority

2020-11-24 Thread Greg Nancarrow
On Wed, Nov 25, 2020 at 12:07 PM Tom Lane  wrote:
>
>
> Here's a v2 that does it like that.
>

Looks OK to me.



Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel plans and "union all" subquery

2020-11-25 Thread Greg Nancarrow
On Wed, Nov 25, 2020 at 6:43 PM Luc Vlaming  wrote:
>
>
> You're completely right, sorry for my error. I was too quick on assuming
> my patch would work for this specific case too; I should have tested
> that before replying. It looked very similar but turns out to not work
> because of the upper rel not being considered parallel.
>
> I would like to extend my patch to support this, or create a second
> patch. This would however be significantly more involved because it
> would require that we (always?) consider two paths whenever we process a
> subquery: the best parallel plan and the best serial plan. Before I
> emback on such a journey I would like some input on whether this would
> be a very bad idea. Thoughts?
>

Hi,

I must admit, your intended approach isn't what immediately came to mind
when I saw this issue. Have you analyzed and debugged this to know exactly
what is going on?
I haven't had time to debug this and see exactly where the code paths
diverge for the use of "values(1)" versus "values(1::numeric)" in this
case, but that would be one of the first steps.

What I wondered (and I may well be wrong) was why the documented type
resolution algorithm (
https://www.postgresql.org/docs/13/typeconv-union-case.html) doesn't seem
to be working quite right here, at least to the point of creating the
same/similar parse tree as when I change "values(1)" to
"values(1::numeric)" or even just "values(1.)". So shouldn't the use
of "values(1)" in this case (a constant, convertible to numeric - the
preferred type) result in the same (parallel) plan as when
"values(1::numeric)" is used? Perhaps this isn't happening because the code
is treating these as generalised expressions when their types aren't the
same, and this then affects parsing/planning?
My natural thought was that there seems to be a minor issue in the code,
which should be reasonably easy to fix, at least for this fairly simple
case.

However, I claim no expertise in the area of parser/analyzer/planner, I
only know certain areas of that code, but enough to appreciate it is
complex and intricate, and easily broken.
Perhaps one of the major contributors to this area of the code, who
probably know this code very well, like maybe Tom Lane or Robert Haas (to
name two) might like to comment on whether what we're looking at is indeed
a bug/deficiency and worth fixing, and whether Luc is correct in his
expressed approach on what would be required to fix it?

Regards,
Greg Nancarrow
Fujitsu Australia


Re: Libpq support to connect to standby server as priority

2020-11-25 Thread Greg Nancarrow
On Thu, Nov 26, 2020 at 3:43 AM Tom Lane  wrote:
>
> Thanks for looking!  Pushed.
>
> At this point the cfbot is going to start complaining that the last-posted
> patch in this thread no longer applies.  Unless you have a new patch set
> nearly ready to post, I think we should close the CF entry as RWF, and
> then you can open a new one when you're ready.
>

Actually, the cfbot shouldn't be complaining, as my last-posted patch
still seems to apply cleanly on the latest code (with your pushed
patch), and all tests pass.
Hopefully it's OK to let it roll over to the next CF with the WOA status.
I am actively working on processing the feedback and updating the
patch, and hope to post an update next week sometime.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel plans and "union all" subquery

2020-11-26 Thread Greg Nancarrow
On Thu, Nov 26, 2020 at 6:11 PM Luc Vlaming  wrote:
>
> If interesting I can make a draft of what this would look like if this
> makes it easier to discuss?
>

Sure, that would help clarify it.

I did debug this a bit, but it seems my gut feeling was wrong, even
though it knows a type coercion is required and can be done, the
parse/analyze code doesn't actually modify the nodes in place "for
fear of changing the semantics", so when the types don't exactly match
it's all left up to the planner, but for this parse tree it fails to
produce a parallel plan.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-17 Thread Greg Nancarrow
On Fri, Jan 8, 2021 at 8:25 PM vignesh C  wrote:
>

> For one of the test you can validate that the same transaction has
> been used by all the parallel workers, you could use something like
> below to validate:
> SELECT COUNT(*) FROM (SELECT DISTINCT cmin,xmin FROM  para_insert_p1) as dt;
>

> Few includes are not required:
>  #include "executor/nodeGather.h"
>
> This include is not required in nodeModifyTable.c
>

>
> The includes indexing.h, rewriteHandler.h & lmgr.h is not required in 
> clauses.c
>

> exeption should be exception
>
> resonable should be reasonable
>

Thanks Vignesh,
I'll be sure to make those updates in the next version of the patches.
Regards,
Greg




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-17 Thread Greg Nancarrow
On Fri, Jan 15, 2021 at 7:39 PM Amit Kapila  wrote:
>
> Here is an additional review of
> v11-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT. There are
> quite a few comments raised on the V11-0001* patch. I suggest first
> post a revised version of V11-0001* patch addressing those comments
> and then you can separately post a revised version of
> v11-0003-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.
>

1)

>Here, it seems we are accessing the relation descriptor without any
>lock on the table which is dangerous considering the table can be
>modified in a parallel session. Is there a reason why you think this
>is safe? Did you check anywhere else such a coding pattern?

Yes, there's a very good reason and I certainly have checked for the
same coding pattern elsewhere, and not just randomly decided that
locking can be ignored.
The table has ALREADY been locked (by the caller) during the
parse/analyze phase.
(This is not the case for a partition, in which case the patch code
uses AccessShareLock, as you will see).
And BTW, with asserts enabled, an attempt to table_open() with NoLock
when you haven't already locked the table will fire an assert - see
following code in relation_open():

/*
 * If we didn't get the lock ourselves, assert that caller holds one,
 * except in bootstrap mode where no locks are used.
 */
Assert(lockmode != NoLock ||
   IsBootstrapProcessingMode() ||
   CheckRelationLockedByMe(r, AccessShareLock, true));

2)

>+ /*
>+ * If there are any index expressions, check that they are parallel-mode
>+ * safe.
>+ */
>+ max_hazard = index_expr_max_parallel_hazard_for_modify(rel, context);
>+ if (max_parallel_hazard_test(max_hazard, context))
>+ {
>+ table_close(rel, lockmode);
>+ return context->max_hazard;
>+ }

>Here and at all other similar places, the call to
>max_parallel_hazard_test seems redundant because
>index_expr_max_parallel_hazard_for_modify would have already done
>that. Why can't we just return true/false from
>index_expr_max_parallel_hazard_for_modify? The context would have been
>already updated for max_hazard.

Yes, you're right, it's redundant to call max_parallel_hazard_test(_)
again. The max_hazard can always be retrieved from the context if
needed (rather than explicitly returned), so I'll change this function
(and any similar cases) to just return true if the max_hazard of
interest has been reached.

3)

>I don't like this way of checking parallel_hazard for modify. This not
>only duplicates some code in max_parallel_hazard_for_modify from
>max_parallel_hazard but also appears quite awkward. Can we move
>max_parallel_hazard_for_modify inside max_parallel_hazard? Basically,
>after calling max_parallel_hazard_walker, we can check for modify
>statement and call the new function.

Agree, I'll move it, as you suggest.

4)

>domain_max_parallel_hazard_for_modify()
>{
>..
>+ if (isnull)
>+ {
>+ /*
>+ * This shouldn't ever happen, but if it does, log a WARNING
>+ * and return UNSAFE, rather than erroring out.
>+ */
>+ elog(WARNING, "null conbin for constraint %u", con->oid);
>+ context->max_hazard = PROPARALLEL_UNSAFE;
>+ break;
>+ }
>..
>}
>index_expr_max_parallel_hazard_for_modify()
>{
>..
>+ if (index_expr_item == NULL) /* shouldn't happen */
>+ {
>+ index_close(index_rel, lockmode);
>+ context->max_hazard = PROPARALLEL_UNSAFE;
>+ return context->max_hazard;
>+ }
>..
>}

>It is not clear why the above two are shouldn't happen cases and if so
>why we want to treat them as unsafe. Ideally, there should be an
>Assert if these can't happen but it is difficult to decide without
>knowing why you have considered them unsafe?

The checks being done here for "should never happen" cases are THE
SAME as other parts of the Postgres code.
For example, search Postgres code for "null conbin" and you'll find 6
other places in the Postgres code which actually ERROR out if conbin
(binary representation of the constraint) in a pg_constraint tuple is
found to be null.
The cases that you point out in the patch used to also error out in
the same way, but Vignesh suggested changing them to just return
parallel-unsafe instead of erroring-out, which I agree with. Such
cases could surely only ever happen if the DB had been corrupted, so
they're extremely rare IMHO and most likely to have caused an ERROR
elsewhere before ever reaching here...
I can add some Asserts to the current code, to better alert for such
cases, for when asserts are enabled, but otherwise I think that the
current code is OK (cleaning up other Postgres code is beyond the
scope of the task here).


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-18 Thread Greg Nancarrow
On Mon, Jan 18, 2021 at 8:10 PM Amit Kapila  wrote:
>
> > 1)
> >
> > >Here, it seems we are accessing the relation descriptor without any
> > >lock on the table which is dangerous considering the table can be
> > >modified in a parallel session. Is there a reason why you think this
> > >is safe? Did you check anywhere else such a coding pattern?
> >
> > Yes, there's a very good reason and I certainly have checked for the
> > same coding pattern elsewhere, and not just randomly decided that
> > locking can be ignored.
> > The table has ALREADY been locked (by the caller) during the
> > parse/analyze phase.
> >
>
> Fair enough. I suggest adding a comment saying the same so that the
> reader doesn't get confused about the same.
>

OK, I'll add a comment.

> > (This is not the case for a partition, in which case the patch code
> > uses AccessShareLock, as you will see).
>
> Okay, but I see you release the lock on partition rel immediately
> after checking parallel-safety. What if a user added some
> parallel-unsafe constraint (via Alter Table) after that check?
>
> >

I'm not sure. But there would be a similar concern for current
Parallel SELECT functionality, right?
My recollection is that ALTER TABLE obtains an exclusive lock on the
table which it retains until the end of the transaction, so that will
result in blocking at certain points, during parallel-checks and
execution, but there may still be a window.

> > 4)
> >
> > >domain_max_parallel_hazard_for_modify()
> > >{
> > >..
> > >+ if (isnull)
> > >+ {
> > >+ /*
> > >+ * This shouldn't ever happen, but if it does, log a WARNING
> > >+ * and return UNSAFE, rather than erroring out.
> > >+ */
> > >+ elog(WARNING, "null conbin for constraint %u", con->oid);
> > >+ context->max_hazard = PROPARALLEL_UNSAFE;
> > >+ break;
> > >+ }
> > >..
> > >}
> > >index_expr_max_parallel_hazard_for_modify()
> > >{
> > >..
> > >+ if (index_expr_item == NULL) /* shouldn't happen */
> > >+ {
> > >+ index_close(index_rel, lockmode);
> > >+ context->max_hazard = PROPARALLEL_UNSAFE;
> > >+ return context->max_hazard;
> > >+ }
> > >..
> > >}
> >
> > >It is not clear why the above two are shouldn't happen cases and if so
> > >why we want to treat them as unsafe. Ideally, there should be an
> > >Assert if these can't happen but it is difficult to decide without
> > >knowing why you have considered them unsafe?
> >
> > The checks being done here for "should never happen" cases are THE
> > SAME as other parts of the Postgres code.
> > For example, search Postgres code for "null conbin" and you'll find 6
> > other places in the Postgres code which actually ERROR out if conbin
> > (binary representation of the constraint) in a pg_constraint tuple is
> > found to be null.
> > The cases that you point out in the patch used to also error out in
> > the same way, but Vignesh suggested changing them to just return
> > parallel-unsafe instead of erroring-out, which I agree with.
> >
>
> You have not raised a WARNING for the second case.

The same checks in current Postgres code also don't raise a WARNING
for that case, so I'm just being consistent with existing Postgres
code (which itself isn't consistent for those two cases).

>But in the first
> place what is the reasoning for making this different from other parts
> of code? If we don't have a solid reason then I suggest keeping these
> checks and errors the same as in other parts of the code.
>

The checks are the same as done in existing Postgres source - but
instead of failing with an ERROR (i.e. whole backend dies), in the
middle of parallel-safety-checking, it has been changed to regard the
operation as parallel-unsafe, so that it will try to execute in
non-parallel mode, where it will most likely fail too when those
corrupted attributes are accessed - but it will fail in the way that
it currently does in Postgres, should that very rare condition ever
happen. This was suggested by Vignesh, and I agree with him. So in
effect, it's just allowing it to use the existing error paths in the
code, rather than introducing new ERROR points.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-18 Thread Greg Nancarrow
On Tue, Jan 19, 2021 at 2:03 PM Amit Kapila  wrote:
>
> > >
> > > You have not raised a WARNING for the second case.
> >
> > The same checks in current Postgres code also don't raise a WARNING
> > for that case, so I'm just being consistent with existing Postgres
> > code (which itself isn't consistent for those two cases).
> >
>
> Search for the string "too few entries in indexprs list" and you will
> find a lot of places in code raising ERROR for the same condition.
>

Yes, but raising an ERROR stops processing (not just logs an error
message). Raising a WARNING logs a warning message and continues
processing. It's a big difference.
So, for the added parallel-safety-checking code, Vignesh suggested (and
I agreed) that, for these rare and highly unlikely conditions, it would
be best not to just copy the error-handling code verbatim from other
cases in the Postgres code (as I had originally done) and stop
processing dead with an error, but to instead return PARALLEL_UNSAFE.
Processing then continues as it would for current non-parallel
processing, which would most likely error out anyway along the current
error-handling checks and paths when those bad attributes/fields are
referenced.
I will add some Asserts() and don't mind adding a WARNING message for
the 2nd case.
If you really feel strongly about this, I can just restore the
original code, which will stop dead with an ERROR in the middle of
parallel-safety checking should one of these rare conditions ever
occur.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-18 Thread Greg Nancarrow
On Tue, Jan 19, 2021 at 1:37 PM tsunakawa.ta...@fujitsu.com
 wrote:
>
> > The table has ALREADY been locked (by the caller) during the
> > parse/analyze phase.
>
> Isn't there any case where planning is done but parse analysis is not done 
> immediately before?  e.g.
>
> * Alteration of objects invalidates cached query plans, and the next 
> execution of the plan rebuilds it.  (But it seems that parse analysis is done 
> in this case in plancache.c.)
>
> * Execute a prepared statement with a different parameter value, which builds 
> a new custom plan or a generic plan.
>

I don't know, but since NoLock is used in other parts of the planner,
I'd expect those to fail if such cases existed.

> Is the cached query plan invalidated when some alteration is done to change 
> the parallel safety, such as adding a trigger with a parallel-unsafe trigger 
> action?
>

Needs to be tested, but I'd expect the cached plan to get invalidated
in this case - surely the same potential issue exists in Postgres for
the current Parallel SELECT functionality - for example, for a column
with a default value that is an expression (which could be altered
from being parallel-safe to parallel-unsafe).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-19 Thread Greg Nancarrow
On Fri, Jan 8, 2021 at 8:25 PM vignesh C  wrote:
>
>
> Few includes are not required:
>  #include "executor/nodeGather.h"
> +#include "executor/nodeModifyTable.h"
>  #include "executor/nodeSubplan.h"
>  #include "executor/tqueue.h"
>  #include "miscadmin.h"
> @@ -60,6 +61,7 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
> GatherState *gatherstate;
> Plan   *outerNode;
> TupleDesc   tupDesc;
> +   Index   varno;
>
> This include is not required in nodeModifyTable.c
>

I think you meant nodeGather.c (not nodeModifyTable.c).
However, the include file (executor/nodeModifyTable.h) is actually
required here, otherwise there are build warnings.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Greg Nancarrow
On Fri, Jan 8, 2021 at 8:16 PM Amit Kapila  wrote:
>
>
> - if (pcxt->nworkers_launched > 0)
> + if (pcxt->nworkers_launched > 0 && !(isParallelModifyLeader &&
> !isParallelModifyWithReturning))
>   {
>
> I think this check could be simplified to if (pcxt->nworkers_launched
> > 0 && isParallelModifyWithReturning) or something like that.
>

Not quite. The existing check is correct, because it needs to account
for existing Parallel SELECT functionality (not just new Parallel
INSERT).
But I will re-write the test as an equivalent expression, so it's
hopefully more readable (taking into account Antonin's suggested
variable-name changes):

if (pcxt->nworkers_launched > 0 && (!isModify || isModifyWithReturning))
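As a quick sanity check (not PostgreSQL code, just a brute-force truth
table), the rewritten condition is equivalent to the original
!(isParallelModifyLeader && !isParallelModifyWithReturning) form by De
Morgan's law:

```c
#include <stdbool.h>

/* original form of the second half of the condition */
static bool
launch_with_original(bool isModify, bool withReturning)
{
	return !(isModify && !withReturning);
}

/* rewritten form: !(A && !B)  ==  (!A || B) */
static bool
launch_with_rewritten(bool isModify, bool withReturning)
{
	return !isModify || withReturning;
}
```

Evaluating both functions for all four combinations of the two flags
shows they always agree.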


> >
> > @@ -252,6 +252,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
> > PlannerGlobal *glob = root->glob;
> > int rtoffset = list_length(glob->finalrtable);
> > ListCell   *lc;
> > +   Plan   *finalPlan;
> >
> > /*
> >  * Add all the query's RTEs to the flattened rangetable.  The live 
> > ones
> > @@ -302,7 +303,17 @@ set_plan_references(PlannerInfo *root, Plan *plan)
> > }
> >
> > /* Now fix the Plan tree */
> > -   return set_plan_refs(root, plan, rtoffset);
> > +   finalPlan = set_plan_refs(root, plan, rtoffset);
> > +   if (finalPlan != NULL && IsA(finalPlan, Gather))
> > +   {
> > +   Plan   *subplan = outerPlan(finalPlan);
> > +
> > +   if (IsA(subplan, ModifyTable) && castNode(ModifyTable, 
> > subplan)->returningLists != NULL)
> > +   {
> > +   finalPlan->targetlist = 
> > copyObject(subplan->targetlist);
> > +   }
> > +   }
> > +   return finalPlan;
> >  }
> >
> > I'm not sure if the problem of missing targetlist should be handled here 
> > (BTW,
> > NIL is the constant for an empty list, not NULL). Obviously this is a
> > consequence of the fact that the ModifyTable node has no regular targetlist.
> >
>
> I think it is better to add comments along with this change. In this
> form, this looks quite hacky to me.
>

The targetlist on the ModifyTable node has been set up correctly, but
it hasn't been propagated to the Gather node.
Of course, I have previously tried to elegantly fix this, but struck
various problems, using different approaches.
Perhaps this code could just be moved into set_plan_refs().
For the moment, I'll just keep the current code, but I'll add a FIXME
comment for this.
I'll investigate Antonin's suggestions as a lower-priority side-task.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Greg Nancarrow
On Thu, Jan 21, 2021 at 1:28 PM Zhihong Yu  wrote:
>
> Hi,
> For v12-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch :
>
> is found from the additional parallel-safety checks, or from the existing
> parallel-safety checks for SELECT that it currently performs.
>
> existing and 'it currently performs' are redundant. You can omit 'that it 
> currently performs'.
>

OK, but this is very minor.

> Minor. For index_expr_max_parallel_hazard_for_modify(),
>
> +   if (keycol == 0)
> +   {
> +   /* Found an index expression */
>
> You can check if keycol != 0, continue with the loop. This would save some 
> indent.

Yes I know, but I don't really see any issue with indent (I'm using
4-space tabs).

>
> +   if (index_expr_item == NULL)/* shouldn't happen */
> +   {
> +   elog(WARNING, "too few entries in indexprs list");
>
> I think the warning should be an error since there is assertion ahead of the 
> if statement.
>

Assertions normally only take effect in DEBUG builds, so the Assert
would have no effect in a production (release) build.
Besides, as I explained in my reply to previous feedback, logging a
WARNING (rather than an ERROR) is intentional, because I want
processing to continue (not stop) if ever this very rare condition
were to occur - so that the issue can be dealt with by the current
non-parallel processing (rather than stopping dead in the middle of
the parallel-safety-checking code). For a DEBUG build, the Assert is
handy to immediately alert us to the issue (which could really only be
caused by database corruption, not a bug in the code).
Note that Vignesh originally suggested changing it from
"elog(ERROR,...)" to "elog(WARNING,...)", and I agree with him.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Greg Nancarrow
On Thu, Jan 21, 2021 at 1:50 PM Zhihong Yu  wrote:
>
> For v12-0003-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch:
>
> +   boolisParallelModifyLeader = IsA(planstate, GatherState) && 
> IsA(outerPlanState(planstate), ModifyTableState);
>
> Please wrap long line.
>

OK.
I thought I ran pgindent fairly recently, but maybe it chose not to
wrap that line.


> +   uint64 *processed_count_space;
>
> If I read the code correctly, it seems it can be dropped (use 
> pei->processed_count directly).
>

You're right. I'll change it in the next version.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Greg Nancarrow
On Thu, Jan 21, 2021 at 12:47 PM Hou, Zhijie  wrote:
>
> >
> Hi
>
> It seems there are some previous comments[1][2] not addressed in current 
> patch.
> Just to make sure it's not missed.
>
> [1]
> https://www.postgresql.org/message-id/77e1c06ffb2240838e5fc94ec8dcb7d3%40G08CNEXMBPEKD05.g08.fujitsu.local
>
> [2]
> https://www.postgresql.org/message-id/CAA4eK1LMmz58ej5BgVLJ8VsUGd%3D%2BKcaA8X%3DkStORhxpfpODOxg%40mail.gmail.com
>

Thanks for alerting me to those, somehow I completely missed them,
sorry about that.
I'll be sure to investigate and address them in the next version of
the patch I post.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Greg Nancarrow
On Wed, Dec 23, 2020 at 1:45 PM Amit Kapila  wrote:
>
> On Wed, Dec 23, 2020 at 7:52 AM Hou, Zhijie  wrote:
> >
> > Hi
> >
> > > > I may be wrong, and if I miss sth in previous mails, please give me some
> > > hints.
> > > > IMO, serial insertion with underlying parallel SELECT can be
> > > > considered for foreign table or temporary table, as the insertions only
> > > happened in the leader process.
> > > >
> > >
> > > I don't think we support parallel scan for temporary tables. Can you 
> > > please
> > > try once both of these operations without Insert being involved? If you
> > > are able to produce a parallel plan without Insert then we can see why it
> > > is not supported with Insert.
> >
> > Sorry, may be I did not express it clearly, I actually means the case when 
> > insert's target(not in select part) table is temporary.
> > And you are right that parallel select is not enabled when temporary table 
> > is in select part.
> >
>
> I think Select can be parallel for this case and we should support this case.
>

So I think we're saying that if the target table is a foreign table or
temporary table, it can be regarded as PARALLEL_RESTRICTED, right?

i.e. code-wise:

/*
-* We can't support table modification in parallel-mode if
it's a foreign
-* table/partition (no FDW API for supporting parallel access) or a
+* We can't support table modification in a parallel worker if it's a
+* foreign table/partition (no FDW API for supporting parallel
access) or a
 * temporary table.
 */
if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE ||
RelationUsesLocalBuffers(rel))
{
-   table_close(rel, lockmode);
-   context->max_hazard = PROPARALLEL_UNSAFE;
-   return true;
+   if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
+   {
+   table_close(rel, lockmode);
+   return true;
+   }
}


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-20 Thread Greg Nancarrow
On Mon, Dec 21, 2020 at 1:50 PM Hou, Zhijie  wrote:
>
> Hi
>
> +
> +   index_oid_list = RelationGetIndexList(rel);
> ...
>
> As memtioned in the comments of RelationGetIndexList:
> * we return a copy of the list palloc'd in the caller's context.  The caller
> * may list_free() the returned list after scanning it.
>
> Shall we list_free(index_oid_list) at the end of function ?
> Just to avoid potential memory leak.
>

I think that's a good idea, so I'll make that update in the next
version of the patch.
I do notice, however, that there seems to be quite a few places in the
Postgres code where RelationGetIndexList() is being called without a
corresponding list_free() being called - obviously instead relying on
it being deallocated when the caller's memory-context is destroyed.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Greg Nancarrow
On Fri, Jan 22, 2021 at 12:08 PM Hou, Zhijie  wrote:
>
> >
> > I think that's a good idea, so I'll make that update in the next version
> > of the patch.
> > I do notice, however, that there seems to be quite a few places in the 
> > Postgres
> > code where RelationGetIndexList() is being called without a corresponding
> > list_free() being called - obviously instead relying on it being deallocated
> > when the caller's memory-context is destroyed.
> >
>
> Yes, it will be deallocated when the caller's memory-context is destroyed.
>
> Currently, parallel safety-check check each partition.
> I am just a little worried about if there are lots of partition here, it may 
> cause high memory use.
>
> And there is another place like this:
>
> 1.
> +   conbin = TextDatumGetCString(val);
> +   check_expr = stringToNode(conbin);
>
> It seems we can free the cobin when not used(for the same reason above).
> What do you think ?
>
>

Yes, I think you're right, we should pfree conbin after converting to
Node, to minimize memory usage.
Again, it's interesting that the existing Postgres code, when looping
through all of the constraints, doesn't do this.
Hmmm. I'm wondering if there is a performance reason behind this -
avoiding multiple calls to pfree() and just relying on it to be
deallocated in one hit, when the memory context is destroyed.
Anyway, perhaps the concern about many partitions, and the recursive
nature of these checks, overrides that - given the possible high
memory usage that you point out.
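The trade-off under discussion - per-iteration pfree() versus letting
allocations accumulate until the memory context is destroyed - looks
like this in a plain-C sketch (a simplified model only;
parse_check_expr() is a made-up stand-in for stringToNode()):

```c
#include <stdlib.h>
#include <string.h>

/* made-up stand-in for stringToNode(): "parse" text into a node */
static size_t *
parse_check_expr(const char *conbin)
{
	size_t	   *node = malloc(sizeof(size_t));

	*node = strlen(conbin);
	return node;
}

/*
 * Check each partition's constraint, freeing the intermediate text
 * form (conbin) as soon as it has been parsed, so peak memory stays
 * bounded even when there are very many partitions.
 */
static size_t
check_all_partitions(int nparts)
{
	size_t		total = 0;
	int			i;

	for (i = 0; i < nparts; i++)
	{
		char	   *conbin = malloc(8);
		size_t	   *expr;

		strcpy(conbin, "a > 0");	/* per-partition copy of the text form */
		expr = parse_check_expr(conbin);
		free(conbin);			/* no longer needed once parsed */
		total += *expr;
		free(expr);				/* done with this partition's expression */
	}
	return total;
}
```

With the frees inside the loop, only one partition's intermediate data
is live at a time, at the cost of extra free() calls per iteration.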

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Greg Nancarrow
On Thu, Jan 21, 2021 at 7:30 PM Amit Kapila  wrote:
>
> > i.e. code-wise:
> >
> > /*
> > -* We can't support table modification in parallel-mode if
> > it's a foreign
> > -* table/partition (no FDW API for supporting parallel access) or a
> > +* We can't support table modification in a parallel worker if it's 
> > a
> > +* foreign table/partition (no FDW API for supporting parallel
> > access) or a
> >  * temporary table.
> >  */
> > if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE ||
> > RelationUsesLocalBuffers(rel))
> > {
> > -   table_close(rel, lockmode);
> > -   context->max_hazard = PROPARALLEL_UNSAFE;
> > -   return true;
> > +   if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED, 
> > context))
> > +   {
> > +   table_close(rel, lockmode);
> > +   return true;
> > +   }
> > }
> >
>
> Yeah, these changes look correct to me.
>

Unfortunately, this change results in a single test failure in the
"with" tests when "force_parallel_mode=regress" is in effect.

I have reproduced the problem, by extracting relevant SQL from those
tests, as follows:

CREATE TEMP TABLE bug6051 AS
  select i from generate_series(1,3) as i;
SELECT * FROM bug6051;
CREATE TEMP TABLE bug6051_2 (i int);
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
 INSERT INTO bug6051_2
 SELECT NEW.i;
WITH t1 AS ( DELETE FROM bug6051 RETURNING * )
INSERT INTO bug6051 SELECT * FROM t1;
ERROR:  cannot delete tuples during a parallel operation

Note that prior to the patch, all INSERTs were regarded as
PARALLEL_UNSAFE, so this problem obviously didn't occur.
I believe this INSERT should be regarded as PARALLEL_UNSAFE, because
it contains a modifying CTE.
However, for some reason, the INSERT is not regarded as having a
modifying CTE, so instead of finding it PARALLEL_UNSAFE, it falls into
the parallel-safety-checks and is found to be PARALLEL_RESTRICTED:

The relevant code in standard_planner() is:

if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
IsUnderPostmaster &&
(parse->commandType == CMD_SELECT ||
 IsModifySupportedInParallelMode(parse->commandType)) &&
!parse->hasModifyingCTE &&
max_parallel_workers_per_gather > 0 &&
!IsParallelWorker())
{
/* all the cheap tests pass, so scan the query tree */
glob->maxParallelHazard = max_parallel_hazard(parse);
glob->parallelModeOK = (glob->maxParallelHazard != PROPARALLEL_UNSAFE);
}
else
{
/* skip the query tree scan, just assume it's unsafe */
glob->maxParallelHazard = PROPARALLEL_UNSAFE;
glob->parallelModeOK = false;
}

When I debugged this (transformWithClause()), the WITH clause was
found to contain a modifying CTE and for the INSERT
query->hasModifyingCTE was set true.
But somehow in the re-writer code, this got lost.
Bug?
Ideas?

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Greg Nancarrow
On Fri, Jan 22, 2021 at 1:16 PM Hou, Zhijie  wrote:
>
> Hi
>
> I took a look at v12-0001 patch, here are some comments:
>
> 1.
> +   /*
> +* Setup the context used in finding the max parallel-mode hazard.
> +*/
> +   Assert(initial_max_parallel_hazard == 0 ||
> +  initial_max_parallel_hazard == PROPARALLEL_SAFE ||
> +  initial_max_parallel_hazard == PROPARALLEL_RESTRICTED);
> +   context.max_hazard = initial_max_parallel_hazard == 0 ?
> +   PROPARALLEL_SAFE : initial_max_parallel_hazard;
>
> I am not quiet sure when will " max_parallel_hazard == 0"
> Does it means the case max_parallel_hazard_context not initialized ?
>

That function doesn't accept a "max_parallel_hazard_context". It
accepts an initial "max_parallel_hazard" value (char).
The "0" value is just a way of specifying "use the default"
(PROPARALLEL_SAFE). It is not currently used, since we always just
pass the "context.max_parallel_hazard" value resulting from the
previous parallel-safety checks for SELECT (and don't otherwise call
that function anywhere else).
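A minimal sketch of the "0 means use the default" convention (the
character codes match PostgreSQL's proparallel values, but the
function itself is just an illustration):

```c
#include <assert.h>

#define PROPARALLEL_SAFE       's'
#define PROPARALLEL_RESTRICTED 'r'

/* 0 means "no initial hazard supplied - start from the default" */
static char
resolve_initial_hazard(char initial_max_parallel_hazard)
{
	assert(initial_max_parallel_hazard == 0 ||
		   initial_max_parallel_hazard == PROPARALLEL_SAFE ||
		   initial_max_parallel_hazard == PROPARALLEL_RESTRICTED);

	return (initial_max_parallel_hazard == 0) ?
		PROPARALLEL_SAFE : initial_max_parallel_hazard;
}
```

So a caller that has no prior hazard information passes 0, and
checking starts from PROPARALLEL_SAFE.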

>
> 2.
> Some tiny code style suggestions
>
> +   if (con->contype == CONSTRAINT_CHECK)
> +   {
> +   char   *conbin;
> +   Datum   val;
> +   boolisnull;
> +   Expr   *check_expr;
>
> How about :
>
> if (con->contype != CONSTRAINT_CHECK)
> continue;
>
> 3.
> +   if (keycol == 0)
> +   {
> +   /* Found an index expression */
> +
> +   Node   *index_expr;
>
> Like 2, how about:
>
> If (keycol != 0)
> Continue;
>

This is really a programmer style preference (plenty of discussions on
the internet about it), but it can be argued that use of "continue"
here is not quite as clear as the explicit "if" condition, especially
in this very simple one-condition case.
I'm inclined to leave it as is.

>
> 4.
> +   ListCell   *index_expr_item = 
> list_head(index_info->ii_Expressions);
> ...
> +   index_expr = (Node *) 
> lfirst(index_expr_item);
> +   index_expr = (Node *) 
> expression_planner((Expr *) index_expr);
>
> It seems BuildIndexInfo has already called eval_const_expressions for 
> ii_Expressions,
> Like the flow: BuildIndexInfo--> RelationGetIndexExpressions--> 
> eval_const_expressions
>
> So, IMO, we do not need to call expression_planner for the expr again.
>
>
> And there seems another solution for this:
>
> In the patch, We only use the { ii_Expressions , ii_NumIndexAttrs , 
> ii_IndexAttrNumbers } from the IndexInfo,
> which seems can get from "Relation-> rd_index".
>
> Based on above, May be we do not need to call BuildIndexInfo to build the 
> IndexInfo.
> It can avoid some unnecessary cost.
> And in this solution we do not need to remove expression_planner.
>

OK, maybe this is a good idea, but I do recall trying to minimize this
kind of processing before, but there were cases which broke it.
Have you actually tried your idea and run all regression tests and
verified that they passed?
In any case, I'll look into it.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Greg Nancarrow
On Fri, Jan 22, 2021 at 1:16 PM Hou, Zhijie  wrote:
>
>
> 4.
> +   ListCell   *index_expr_item = 
> list_head(index_info->ii_Expressions);
> ...
> +   index_expr = (Node *) 
> lfirst(index_expr_item);
> +   index_expr = (Node *) 
> expression_planner((Expr *) index_expr);
>
> It seems BuildIndexInfo has already called eval_const_expressions for 
> ii_Expressions,
> Like the flow: BuildIndexInfo--> RelationGetIndexExpressions--> 
> eval_const_expressions
>
> So, IMO, we do not need to call expression_planner for the expr again.
>

Thanks. You are right. I debugged it, and found that BuildIndexInfo-->
RelationGetIndexExpressions executes the same expression evaluation
code as expression_planner().
So I'll remove the redundant call to expression_planner() here.

>
> And there seems another solution for this:
>
> In the patch, We only use the { ii_Expressions , ii_NumIndexAttrs , 
> ii_IndexAttrNumbers } from the IndexInfo,
> which seems can get from "Relation-> rd_index".
>
> Based on above, May be we do not need to call BuildIndexInfo to build the 
> IndexInfo.
> It can avoid some unnecessary cost.
> And in this solution we do not need to remove expression_planner.
>

Hmmm, when I debugged my simple test case, I found rel->rd_index was
NULL, so it seems that the call to BuildIndexInfo is needed.
(have I understood your suggestion correctly?)

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-21 Thread Greg Nancarrow
On Fri, Jan 22, 2021 at 4:49 PM Amit Kapila  wrote:
>
> >
> > Unfortunately, this change results in a single test failure in the
> > "with" tests when "force_parallel_mode=regress" is in effect.
> >
> > I have reproduced the problem, by extracting relevant SQL from those
> > tests, as follows:
> >
> > CREATE TEMP TABLE bug6051 AS
> >   select i from generate_series(1,3) as i;
> > SELECT * FROM bug6051;
> > CREATE TEMP TABLE bug6051_2 (i int);
> > CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
> >  INSERT INTO bug6051_2
> >  SELECT NEW.i;
> > WITH t1 AS ( DELETE FROM bug6051 RETURNING * )
> > INSERT INTO bug6051 SELECT * FROM t1;
> > ERROR:  cannot delete tuples during a parallel operation
> >
> > Note that prior to the patch, all INSERTs were regarded as
> > PARALLEL_UNSAFE, so this problem obviously didn't occur.
> > I believe this INSERT should be regarded as PARALLEL_UNSAFE, because
> > it contains a modifying CTE.
> > However, for some reason, the INSERT is not regarded as having a
> > modifying CTE, so instead of finding it PARALLEL_UNSAFE, it falls into
> > the parallel-safety-checks and is found to be PARALLEL_RESTRICTED:
> >
> > The relevant code in standard_planner() is:
> >
> > if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
> > IsUnderPostmaster &&
> > (parse->commandType == CMD_SELECT ||
> >  IsModifySupportedInParallelMode(parse->commandType)) &&
> > !parse->hasModifyingCTE &&
> > max_parallel_workers_per_gather > 0 &&
> > !IsParallelWorker())
> > {
> > /* all the cheap tests pass, so scan the query tree */
> > glob->maxParallelHazard = max_parallel_hazard(parse);
> > glob->parallelModeOK = (glob->maxParallelHazard != 
> > PROPARALLEL_UNSAFE);
> > }
> > else
> > {
> > /* skip the query tree scan, just assume it's unsafe */
> > glob->maxParallelHazard = PROPARALLEL_UNSAFE;
> > glob->parallelModeOK = false;
> > }
> >
> > When I debugged this (transformWithClause()), the WITH clause was
> > found to contain a modifying CTE and for the INSERT
> > query->hasModifyingCTE was set true.
> > But somehow in the re-writer code, this got lost.
> > Bug?
> > Ideas?
> >
>
> How it behaves when the table in the above test is a non-temp table
> with your patch? If it leads to the same error then we can at least
> conclude that this is a generic problem and nothing specific to temp
> tables.
>

Oh, I don't believe that this has anything to do with TEMP tables -
it's just that when I relaxed the parallel-safety level on TEMP
tables, it exposed the CTE issue in this test case because it just
happens to use a TEMP table.
Having said that, when I changed that test code to not use a TEMP
table, an Assert fired in the planner code and caused the backend to
abort.
It looks like I need to update the following Assert in the planner
code (unchanged by the current patch) in order to test further - but
note that this Assert only fired because the commandType was
CMD_DELETE, which SHOULD have been excluded by the "hasModifyingCTE"
test on the parent INSERT; as I said, that flag is strangely NOT
getting set.

/*
 * Generate partial paths for final_rel, too, if outer query levels might
 * be able to make use of them.
 */
if (final_rel->consider_parallel && root->query_level > 1 &&
!limit_needed(parse))
{
Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
foreach(lc, current_rel->partial_pathlist)
{
Path   *partial_path = (Path *) lfirst(lc);

add_partial_path(final_rel, partial_path);
}
}

Once the Assert above is changed to not test the commandType, the same
error occurs as before.

This appears to possibly be some kind of bug in which hasModifyingCTE
is not getting set, at least in this particular INSERT case, but in
the current Postgres code it doesn't matter because all INSERTs are
considered parallel-unsafe, so won't be executed in parallel-mode
anyway.
I notice that if I execute "SELECT * from t1" instead of "INSERT INTO
bug6051 SELECT * from t1", then "hasModifyingCTE" is getting set to
true for the query, and thus is always considered parallel-unsafe.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-22 Thread Greg Nancarrow
On Fri, Jan 22, 2021 at 6:21 PM Hou, Zhijie  wrote:
> Hi greg,
>
> Thanks for debugging this.
>
> May be I missed something. I am not sure about the case when rel->rd_index 
> was NULL.
> Because, In function BuildIndexInfo, it seems does not have NULL-check for 
> index->rd_index.
> Like the following:
> 
> BuildIndexInfo(Relation index)
> {
> IndexInfo  *ii;
> Form_pg_index indexStruct = index->rd_index;
> int i;
> int numAtts;
>
> /* check the number of keys, and copy attr numbers into the IndexInfo 
> */
> numAtts = indexStruct->indnatts;
> 
>
> And the patch do not have NULL-check for index->rd_index too.
> So I thought we can assume index->rd_index is not null, but it seems I may 
> missed something ?
>
> Can you please share the test case with me ?
>
> I use the following code to replace the call of BuildIndexInfo.
> And the installcheck passed.
>
> Example:
> +Form_pg_index indexStruct = index_rel->rd_index;
> +List *ii_Expressions = 
> RelationGetIndexExpressions(index_rel);
> +int ii_NumIndexAttrs = indexStruct->indnatts;
> +AttrNumber  ii_IndexAttrNumbers[INDEX_MAX_KEYS];
>
> +for (int i = 0; i < ii_NumIndexAttrs; i++)
> +ii_IndexAttrNumbers[i] = 
> indexStruct->indkey.values[i];

Sorry, I was looking at rel->rd_index, not index_rel->rd_index, my fault.
Your code looks OK. I've taken it and reduced some of the lines and
got rid of the C99-only intermingled variable declarations (see
https://www.postgresql.org/docs/13/source-conventions.html).
The changes are below.
The regression tests all pass, so should be OK (my test case was taken
from insert_parallel regression tests).
Thanks for your help.

-Oid index_oid = lfirst_oid(lc);
-Relationindex_rel;
-IndexInfo  *index_info;
+Relationindex_rel;
+Form_pg_index   indexStruct;
+List*ii_Expressions;
+Oid index_oid = lfirst_oid(lc);

 index_rel = index_open(index_oid, lockmode);

-index_info = BuildIndexInfo(index_rel);
+indexStruct = index_rel->rd_index;
+ii_Expressions = RelationGetIndexExpressions(index_rel);

-if (index_info->ii_Expressions != NIL)
+if (ii_Expressions != NIL)
 {
 int i;
-ListCell   *index_expr_item =
list_head(index_info->ii_Expressions);
+ListCell*index_expr_item = list_head(ii_Expressions);

-for (i = 0; i < index_info->ii_NumIndexAttrs; i++)
+for (i = 0; i < indexStruct->indnatts; i++)
 {
-int keycol = index_info->ii_IndexAttrNumbers[i];
+int keycol = indexStruct->indkey.values[i];

 if (keycol == 0)
 {
@@ -912,7 +914,7 @@ index_expr_max_parallel_hazard_for_modify(Relation rel,
 return true;
 }

-index_expr_item =
lnext(index_info->ii_Expressions, index_expr_item);
+index_expr_item = lnext(ii_Expressions, index_expr_item);
 }
 }


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-22 Thread Greg Nancarrow
> -   PrepareParallelMode(estate->es_plannedstmt->commandType);
> +   boolisParallelModifyLeader = IsA(planstate, 
> GatherState) && IsA(outerPlanState(planstate), ModifyTableState);
> +
> +   PrepareParallelMode(estate->es_plannedstmt->commandType, 
> isParallelModifyLeader);
> EnterParallelMode();
> }
>
> @@ -1021,12 +1039,25 @@ IsInParallelMode(void)
>   * Prepare for entering parallel mode, based on command-type.
>   */
>  void
> -PrepareParallelMode(CmdType commandType)
> +PrepareParallelMode(CmdType commandType, bool isParallelModifyLeader)
>  {
> if (IsModifySupportedInParallelMode(commandType))
> {
> Assert(!IsInParallelMode());
>
> +   if (isParallelModifyLeader)
> +   {
> +   /*
> +* Set currentCommandIdUsed to true, to ensure that 
> the current
> +* CommandId (which will be used by the parallel 
> workers) won't
> +* change during this parallel operation, as starting 
> new
> +* commands in parallel-mode is not currently 
> supported.
> +* See related comments in GetCurrentCommandId and
> +* CommandCounterIncrement.
> +*/
> +   (void) GetCurrentCommandId(true);
> +   }
>
> I think we can eliminate the second argument of PrepareParallelMode() and the 
> new code in ExecutePlan().  PrepareParallelMode() can use !IsParallelWorker() 
> in the if condition, because the caller is either a would-be parallel leader 
> or a parallel worker.

You could, but I'm not sure it would make the code easier to read,
especially for those who don't know that !IsParallelWorker() means
it's a parallel leader.

>
> BTW, why do we want to add PrepareParallelMode() separately from 
> EnterParallelMode()?  Someone who will read other call sites of 
> EnterParallelMode() (index build, VACUUM) may be worried that 
> PrepareParallelMode() call is missing there.  Can we just add an argument to 
> EnterParallelMode()?  Other call sites can use CMD_UNKNOWN or CMD_UTILITY, if 
> we want to use CMD_XX.
>

I really can't see a problem. PrepareParallelMode() is only needed
prior to execution of a parallel plan, so it's not needed for "other
call sites" using EnterParallelMode().
Perhaps the name can be changed to disassociate it from generic
EnterParallelMode() usage. So far, I've only thought of long names
like: PrepareParallelModePlanExec().
Ideas?

Regards,
Greg Nancarrow
Fujitsu Australia





Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-24 Thread Greg Nancarrow
A call made in a parallel worker to GetCurrentCommandId()
will result in an Assert being fired because (prior to the patch)
currentCommandIdUsed is forbidden to be set in a parallel worker, and
calling GetCurrentCommandId(true) (to signify the intent to use the
returned CommandId to mark inserted/updated/deleted tuples) will result in
currentCommandIdUsed being set to true.
So it is clear that this cannot remain the same, in order to support
Parallel INSERT by workers.
So for each worker, the patch sets "currentCommandIdUsed" to true at the
start of the parallel operation (using SetCurrentCommandIdUsedForWorker())
and the Assert condition in GetCurrentCommandId() is tweaked to fire the
Assert if GetCurrentCommandId(true) is called in a parallel worker when
currentCommandIdUsed is false.
To me, this makes perfect sense.


>If this kind of change is really necessary, it seems more natural to pass
currentCommandIdUsed together with currentCommandId through
SerializeTransactionState() and StartParallelWorkerTransaction(), instead
of the above changes.

No, I don't agree with that. That approach doesn't sound right to me at all.
All the patch really changes is WHERE "currentCommandIdUsed" can be set for
a parallel worker - now it is only allowed to be set to true at the start
of the parallel operation for each worker, and the Assert (which is just a
sanity check) is updated to ensure that for workers, it can only be set
true at that time. That's all it does. It's completely consistent with the
old comment that said "We could relax this restriction when
currentCommandIdUsed was already true at the start of the parallel
operation" - that's what we are now doing with the patch.
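The mechanism can be modelled in a few lines of standalone C (a
deliberately simplified model of the xact.c behaviour, not the real
implementation): the leader "pins" the CommandId before the parallel
operation, after which a worker's GetCurrentCommandId(true) no longer
trips the sanity check.

```c
#include <assert.h>
#include <stdbool.h>

/* simplified model of the transaction-state globals in xact.c */
static unsigned currentCommandId = 1;
static bool currentCommandIdUsed = false;
static bool isParallelWorker = false;

static unsigned
GetCurrentCommandId(bool used)
{
	if (used)
	{
		/*
		 * A worker may only mark the id "used" if it was already used
		 * (pinned) before the parallel operation started.
		 */
		assert(!isParallelWorker || currentCommandIdUsed);
		currentCommandIdUsed = true;
	}
	return currentCommandId;
}

/* leader pins the id, then a "worker" can safely mark it used again */
static bool
simulate_leader_then_worker(void)
{
	unsigned	id = GetCurrentCommandId(true); /* leader pins the id */

	isParallelWorker = true;	/* now pretend we're a worker */
	return GetCurrentCommandId(true) == id; /* reuse doesn't trip assert */
}
```

Without the leader's initial GetCurrentCommandId(true), the worker's
call would fail the sanity check - which is exactly the restriction
the patch's tweaked Assert enforces.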


> As an aside, SetCurrentCommandIdUsed() in the comment should be
SetCurrentCommandIdUsedForWorker().
>

Thanks, I'll fix that in the comments.


>
> (8)
> +   /*
> +* If the trigger type is RI_TRIGGER_FK, this indicates a
FK exists in
> +* the relation, and this would result in creation of new
CommandIds
> +* on insert/update/delete and this isn't supported in a
parallel
> +* worker (but is safe in the parallel leader).
> +*/
> +   trigtype = RI_FKey_trigger_type(trigger->tgfoid);
> +   if (trigtype == RI_TRIGGER_FK)
> +   {
> +   if
(max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
> +   return true;
> +   }
>
> Here, RI_TRIGGER_FK should instead be RI_TRIGGER_PK, because
RI_TRIGGER_FK triggers do not generate command IDs.  See RI_FKey_check()
which is called in RI_TRIGGER_FK case.  In there, ri_PerformCheck() is
called with the detectNewRows argument set to false, which causes
CommandCounterIncrement() to not be called.
>

Hmmm, I'm not sure that you have read and interpreted the patch code
correctly.
The existence of a RI_TRIGGER_FK trigger indicates the table has a foreign
key, and an insert into such a table will generate a new commandId (so we
must avoid that, as we don't currently have the technology to support
sharing of new command IDs across the participants in the parallel
operation). This is what the code comment says; it does not say that such a
trigger generates a new command ID.

See Amit's updated comment here:
https://github.com/postgres/postgres/commit/0d32511eca5aec205cb6b609638ea67129ef6665

In addition, the 2nd patch has an explicit test case for this (testing
insert into a table that has a FK).

If you have a test case that breaks the existing patch, please let me know.


Regards,
Greg Nancarrow
Fujitsu Australia


Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-24 Thread Greg Nancarrow
On Mon, Jan 25, 2021 at 2:22 PM Hou, Zhijie 
wrote:

> Hi,
>
> After doing some tests to cover the code paths in patch 0001,
> I have some suggestions for the 0002 testcases.
>
>
> (1)
> +   /* Check parallel-safety of any expressions in the
> partition key */
> +   if (get_partition_col_attnum(pkey, i) == 0)
> +   {
> +   Node   *check_expr = (Node *)
> lfirst(partexprs_item);
> +
> +   if (max_parallel_hazard_walker(check_expr,
> context))
> +   {
> +   table_close(rel, lockmode);
> +   return true;
> +   }
>
> The testcases do not seem to cover the above code (i.e. when the table has a
> parallel-unsafe expression in the partition key).
>
> Personally, I use the following sql to cover this:
> -
> create table partkey_unsafe_key_expr_t (a int4, b name) partition by range
> ((fullname_parallel_unsafe('',a::varchar)));
> explain (costs off) insert into partkey_unsafe_key_expr_t select unique1,
> stringu1 from tenk1;
> -
>
>
Thanks. It looks like that test case was accidentally missed (since the
comment said to test the index expressions, but it actually tested the
support functions).
I'll update the test code (and comments) accordingly, using your
suggestion.


>
> (2)
> I noticed that most of the testcases test all the cases (parallel
> safe/unsafe/restricted).
> But the index-expression tests do not seem to cover the parallel-restricted case.
> How about adding a testcase like:
> -
> create or replace function fullname_parallel_restricted(f text, l text)
> returns text as $$
> begin
> return f || l;
> end;
> $$ language plpgsql immutable parallel restricted;
>
> create table names4(index int, first_name text, last_name text);
> create index names4_fullname_idx on names4
> (fullname_parallel_restricted(first_name, last_name));
>
> --
> -- Test INSERT with parallel-restricted index expression
> -- (should create a parallel plan)
> --
> explain (costs off) insert into names4 select * from names;
> -
>
>
Thanks, looks like that test case is missing, I'll add it as you suggest.


> (3)
> +   /* Recursively check each partition ... */
> +   pdesc = RelationGetPartitionDesc(rel);
> +   for (i = 0; i < pdesc->nparts; i++)
> +   {
> +   if
> (rel_max_parallel_hazard_for_modify(pdesc->oids[i],
> +
>  command_type,
> +
>  context,
> +
>  AccessShareLock))
> +   {
> +   table_close(rel, lockmode);
> +   return true;
> +   }
> +   }
>
> It seems we do not have a testcase that tests a parallel-unsafe
> expression (or similar) in a partition.
> How about adding one testcase to test a parallel-unsafe partition?
>
>
>
OK, I'll have to create a more complex table to test those other potential
parallel-safety issues of partitions (beyond what was tested before the
recursive call, such as support functions and expressions in index keys), but
since it's a recursive call, invoking code that's already been tested, I
wouldn't anticipate any problems.


Thanks,

Greg Nancarrow
Fujitsu Australia


Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-24 Thread Greg Nancarrow
On Mon, Jan 25, 2021 at 4:37 PM tsunakawa.ta...@fujitsu.com
 wrote:
>
>
> > > (4)
> > You could, but I'm not sure it would make the code easier to read,
> > especially for those who don't know !isParallelWorker() means it's a
> > parallel leader.
> ...
> > I really can't see a problem. PrepareParallelMode() is only needed
> > prior to execution of a parallel plan, so it's not needed for "other
> > call sites" using EnterParallelMode().
> My frank first impressions were (and are):
>
> * Why do we have to call a separate function for preparation when the 
> actual entering follows immediately?  We can do the necessary preparation in the 
> entering function.
>
> * Those who read the parallel index build and parallel VACUUM code for the 
> first time might be startled at the missing PrepareParallelMode() call: "Oh, 
> EnterParallelMode() is called without preparation, unlike at the other site I saw 
> the other day.  Isn't this a bug?"
>
>
> > Perhaps the name can be changed to disassociate it from generic
> > EnterParallelMode() usage. So far, I've only thought of long names
> > like: PrepareParallelModePlanExec().
> > Ideas?
>
> What PrepareParallelMode() handles is the XID and command ID, which are 
> managed by access/transam/ module and are not executor-specific.  It's 
> natural (or at least not unnatural) that EnterParallelMode() prepares them, 
> because EnterParallelMode() is part of access/transam/.
>
>

EnterParallelMode() is part of a generic interface for execution of a
parallel operation, and is called in several different places to enter
parallel mode prior to execution of different parallel operations. At
the moment, EnterParallelMode() essentially just sets a flag to prohibit
certain unsafe operations during the parallel operation. If I moved
PrepareParallelMode() into EnterParallelMode(), I would need to pass in
contextual information to distinguish the caller, and possibly extra
information needed by that caller - and then change the function call,
comments, and so on for each call site.
I think doing this would just complicate things. The other callers of
EnterParallelMode() currently do their own "pre" parallel-mode code
themselves, specific to whatever parallel operation they are performing
- but nobody has thought it necessary to hook that code into
EnterParallelMode().
I think the "PrepareParallelMode()" name can just be changed to
something specific to plan execution, so nobody gets confused with a
name like "PrepareParallelMode()", which as you point out sounds
generic to all callers of EnterParallelMode().


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-25 Thread Greg Nancarrow
On Fri, Jan 22, 2021 at 7:52 PM tsunakawa.ta...@fujitsu.com
 wrote:
>
>
> (1)
> -* (Note that we do allow CREATE TABLE AS, SELECT INTO, and CREATE
> -* MATERIALIZED VIEW to use parallel plans, but as of now, only the 
> leader
> -* backend writes into a completely new table.  In the future, we can
> -* extend it to allow workers to write into the table.  However, to 
> allow
> -* parallel updates and deletes, we have to solve other problems,
> -* especially around combo CIDs.)
> +* (Note that we do allow CREATE TABLE AS, INSERT INTO...SELECT, 
> SELECT
> +* INTO, and CREATE MATERIALIZED VIEW to use parallel plans. However, 
> as
> +* of now, only the leader backend writes into a completely new 
> table. In
>
> This can read "In INSERT INTO...SELECT case, like other existing cases, only 
> the leader backend writes into a completely new table."  The reality is that 
> workers as well as the leader can write into an empty or non-empty table in 
> parallel, isn't it?
>
>

Sorry, I've just realized that this is in reference to the 1st patch
(v12-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch),
which implements parallel SELECT for INSERT.
In that case, data is SELECTed in parallel by the workers, but only
INSERTed by the parallel leader.
So the patch comment is, in fact, correct.
In the 3rd patch
(v12-0003-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch),
which implements parallel INSERT, the wording for this comment is
again altered, to reflect the fact that parallel workers also write
into the table.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-25 Thread Greg Nancarrow
On Mon, Jan 25, 2021 at 10:40 PM Hou, Zhijie  wrote:
>
> Hi,
>
> When reading the code of rel_max_parallel_hazard_for_modify in 0001,
>
> I noticed that there are many places that call table_close().
> Personally, it's a little confusing to me.
>
> Do you think it's better to do the table_open/close outside of 
> rel_max_parallel_hazard_for_modify ?
>
> Like:
>
> static bool rel_max_parallel_hazard_for_modify(Relation rel,
>CmdType command_type,
>max_parallel_hazard_context 
> *context);
> ...
> Relation relation = table_open(rte->relid, NoLock);
> (void) rel_max_parallel_hazard_for_modify(relation, 
> parse->commandType, &context);
> table_close(relation, NoLock);
>
>
> And we don't seem to need the lockmode param with the above definition.
>
>

Yeah, the repeated cleanup at the point of return is a bit ugly.
It could be solved by changing the function to do cleanup at a common
return point, but I agree with you that in this case it could simply
be done outside the function.
Thanks, I'll make that change.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-27 Thread Greg Nancarrow
On Wed, Jan 27, 2021 at 2:13 PM Hou, Zhijie  wrote:
>
> Hi,
>
> When testing the patch with the following kind of sql.
>
> ---
> Insert into part_table select 1;
> Insert into part_table select generate_series(1,1,1);
> Insert into part_table select * from testfunc();
> ---
>
> we usually use these SQLs to initialize a table, or for testing purposes.
>
> Personally, I think we do not need to do the parallel safety-check for these 
> cases,
> because there is no chance for the select part to be considered for parallelism.
>
> I thought we aim to not check the safety unless parallelism is possible,
> so I was wondering whether it is possible to avoid the check in these cases.
>
> I did a quick check of the code; an immature idea is to check whether there is 
> an RTE_RELATION in the query.
> If not, we skip the safety-check.
>
> I am not sure whether it is worth doing that - any thoughts?
>

Yes, I think it's worth it. It's surprising that there aren't really
any optimizations for these with just the current Postgres parallel
SELECT functionality (as there's currently no way to divide the work
for these amongst the workers, even if the function/expression is
parallel-safe).
For the additional parallel-safety checks for INSERT, currently we
check that RTE_SUBQUERY is in the range-table. So I think we can
additionally check that RTE_RELATION is in the subquery range-table
(otherwise treat it as unsafe).
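
To sketch the intent with a couple of statements (assuming illustrative tables
"target" and "source"), queries whose SELECT part reads no relation would skip
the additional INSERT parallel-safety checks:

```sql
-- No RTE_RELATION anywhere in the subquery: there is nothing for a
-- parallel SELECT to scan, so the extra safety checks could be skipped.
insert into target select 1;
insert into target select generate_series(1, 1000);

-- The subquery reads a relation, so the safety checks are still required.
insert into target select * from source;
```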

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Determine parallel-safety of partition relations for Inserts

2021-01-31 Thread Greg Nancarrow
On Fri, Jan 29, 2021 at 5:44 PM Hou, Zhijie  wrote:
>
>
> Attatching v1 patches, introducing options which let user manually control 
> whether to use parallel dml.
>
> About the patch:
> 1. add a new guc option: enable_parallel_dml (boolean)
> 2. add a new tableoption: parallel_dml (boolean)
>
>The default of both is off(false).
>
> Users can set enable_parallel_dml in postgresql.conf or per session to enable 
> parallel DML.
> If users want specific tables to use parallel insert, they can set the 
> table's parallel_dml option to on.
>
> Some attention:
> (1)
> Currently, if the GUC option enable_parallel_dml is set to on but the table's 
> parallel_dml option is off,
> the planner still does not consider parallelism for DML.
>
> This way, if users want to use parallel DML, they have to set 
> enable_parallel_dml=on and set parallel_dml = on.
> If someone dislikes this, I think we can also let the parallel_dml table 
> option default to on; with this, users only need
> to set enable_parallel_dml=on
>
> (2)
> For the parallel_dml option:
> if the target table is partitioned, and its parallel_dml option is set to on, 
> the planner will ignore
> the option values of its children.
> This is because we cannot divide the DML plan across separate tables, so we 
> just check the target table itself.
>
>
> Thoughts and comments are welcome.
>

Personally, I think a table's "parallel_dml" option should be ON by default.
It's annoying to have to separately enable it for each and every table
being used, when I think the need to turn it selectively OFF should be
fairly rare.
And I'm not sure that "parallel_dml" is the best name for the table
option, because it sort of implies parallel DML WILL be used - but
that isn't true; it depends on other factors too.
So I think (to be consistent with other table option naming) it would
have to be something like "parallel_dml_enabled".
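
For example (assuming the reloption syntax follows the usual pattern, like
parallel_workers - this is just a sketch of the proposed usage, not the final
interface):

```sql
-- Enable parallel DML for the session (proposed GUC).
set enable_parallel_dml = on;

-- Proposed per-table option, named consistently with other boolean
-- reloptions; here turning it selectively OFF for one table.
create table audit_log (id int, payload text)
  with (parallel_dml_enabled = false);
```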
I'm still looking at the patches, but have so far noticed that there
are some issues in the documentation updates (grammatical issues and
consistency issues with current documentation), and also some of the
explanations are not clear. I guess I can address these separately.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-01 Thread Greg Nancarrow
On Mon, Feb 1, 2021 at 8:19 PM Hou, Zhijie  wrote:
>
>
> When developing the reloption patch, I noticed some issues in the patch.
>
> 1).
> > - Reduce Insert parallel-safety checks required for some SQL, by noting
> > that the subquery must operate on a relation (check for RTE_RELATION in
> > subquery range-table)
>
> +   foreach(lcSub, rte->subquery->rtable)
> +   {
> +   rteSub = lfirst_node(RangeTblEntry, lcSub);
> +   if (rteSub->rtekind == RTE_RELATION)
> +   {
> +   hasSubQueryOnRelation = true;
> +   break;
> +   }
> +   }
> It seems we cannot just search for RTE_RELATION in rtable,
> because RTE_RELATION may exist in other places, such as:
>
> ---
> --** explain insert into target select (select * from test);
> Subplan's subplan
>
> --** with cte as (select * from test) insert into target select * from cte;
> In query's ctelist.
> ---
>
> Maybe we should use a walker function [1] to
> search the subquery and ctelist.
>

Yes, the current checks are too simple; as you point out, there seem
to be more complex cases that they don't pick up. Unfortunately,
expanding the testing for them does detract from the original
intention of this code (which was to avoid extra parallel-safety-check
processing on code that can't be run in parallel). I guess the
relation walker function should additionally check for SELECT queries
only (commandType == CMD_SELECT), and exclude SELECT FOR UPDATE/SHARE
(rowMarks != NIL) too. I'll need to look further into it, but will
certainly update the code for the next version of the patch.

> 2).
>
> +--
> +-- Test INSERT into temporary table with underlying query.
> +-- (should not use a parallel plan)
> +--
>
> Maybe the comment here needs some change, since
> we currently support parallel plans for temp tables.
>

Thanks, it should say something like "should create the plan with
INSERT + parallel SELECT".

> 3)
> Do you think we can add a testcase for foreign tables,
> to test parallel query with serial insert on a foreign table?
>

I have been intending to do that, but as a lower-priority task.

>
> [1]
> static bool
> relation_walker(Node *node)
> {
> if (node == NULL)
> return false;
>
> else if (IsA(node, RangeTblEntry))
> {
> RangeTblEntry *rte = (RangeTblEntry *) node;
> if (rte->rtekind == RTE_RELATION)
> return true;
>
> return false;
> }
>
> else if (IsA(node, Query))
> {
> Query  *query = (Query *) node;
>
> /* Recurse into subselects */
> return query_tree_walker(query, relation_walker,
>  NULL, 
> QTW_EXAMINE_RTES_BEFORE);
> }
>
>     /* Recurse to check arguments */
> return expression_tree_walker(node,
>   
> relation_walker,
>   NULL);
> }
>

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Determine parallel-safety of partition relations for Inserts

2021-02-01 Thread Greg Nancarrow
On Mon, Feb 1, 2021 at 4:02 PM Hou, Zhijie  wrote:
>
> Attaching a v2 patch which addresses the comments above.
>
> Some further refactoring:
>
> Introducing a new function, is_parallel_possible_for_modify(), which decides 
> whether to do the safety check.
>
> IMO, it seems more readable to extract all the checks that we can do before 
> the safety-check and put them
> in the new function.
>
> Please consider it for further review.
>

I've updated your v2 patches and altered some comments and
documentation changes (but made no code changes) - please compare
against your v2 patches, and see whether you agree with the changes to
the wording.
In the documentation, you will also notice that in your V2 patch, it
says that the "parallel_dml_enabled" table option defaults to false.
As it actually defaults to true, I changed that in the documentation
too.

Regards,
Greg Nancarrow
Fujitsu Australia


v3_0004-reloption-parallel_dml-test-and-doc.patch
Description: Binary data


v3_0001-guc-option-enable_parallel_dml-src.patch
Description: Binary data


v3_0002-guc-option-enable_parallel_dml-doc-and-test.patch
Description: Binary data


v3_0003-reloption-parallel_dml-src.patch
Description: Binary data


Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-02 Thread Greg Nancarrow
On Mon, Feb 1, 2021 at 8:19 PM Hou, Zhijie  wrote:
>
> Hi,
>
> When developing the reloption patch, I noticed some issues in the patch.
>
> 1).
> > - Reduce Insert parallel-safety checks required for some SQL, by noting
> > that the subquery must operate on a relation (check for RTE_RELATION in
> > subquery range-table)
>
> +   foreach(lcSub, rte->subquery->rtable)
> +   {
> +   rteSub = lfirst_node(RangeTblEntry, lcSub);
> +   if (rteSub->rtekind == RTE_RELATION)
> +   {
> +   hasSubQueryOnRelation = true;
> +   break;
> +   }
> +   }
> It seems we cannot just search for RTE_RELATION in rtable,
> because RTE_RELATION may exist in other places, such as:
>
> ---
> --** explain insert into target select (select * from test);
> Subplan's subplan
>
> --** with cte as (select * from test) insert into target select * from cte;
> In query's ctelist.
> ---
>
> Maybe we should use a walker function [1] to
> search the subquery and ctelist.
>
>
>
> [1]
> static bool
> relation_walker(Node *node)
> {
> if (node == NULL)
> return false;
>
> else if (IsA(node, RangeTblEntry))
> {
> RangeTblEntry *rte = (RangeTblEntry *) node;
> if (rte->rtekind == RTE_RELATION)
> return true;
>
> return false;
> }
>
> else if (IsA(node, Query))
> {
> Query  *query = (Query *) node;
>
> /* Recurse into subselects */
> return query_tree_walker(query, relation_walker,
>  NULL, 
> QTW_EXAMINE_RTES_BEFORE);
> }
>
> /* Recurse to check arguments */
> return expression_tree_walker(node,
>   
> relation_walker,
>   NULL);
> }
>

I've had a further look at this, and this walker function is doing a
lot of work recursing the parse tree, and I'm not sure that it
reliably retrieves the information that we're looking for, for all
cases of different SQL queries. Unless it can be made much more
efficient and specific to our needs, I think we should not try to do
this optimization, because there's too much overhead. Also, keep in
mind that for the current parallel SELECT functionality in Postgres, I
don't see any similar optimization being attempted (and such
optimization should be attempted at the SELECT level). So I don't
think we should be attempting such optimization in this patch (though
it could be attempted in a separate patch, related purely to the
current parallel SELECT functionality).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Greg Nancarrow
On Thu, Feb 4, 2021 at 11:56 AM tsunakawa.ta...@fujitsu.com
 wrote:
>
> >
> > So, the results indicate that after the patch we touch more buffers
> > during planning which I think is because of accessing the partition
> > information, and during execution, the patch touches fewer buffers for
> > the same reason. But why this can reduce the time with patch? I think
> > this needs some investigation.
>
> I guess another factor other than shared buffers is relcache and catcache.  
> The patched version loads those cached entries for all partitions of the 
> insert target table during the parallel-safety check in planning, while the 
> unpatched version has to gradually build those cache entries during 
> execution.  How can we confirm its effect?
>

I believe that we can confirm its effect by invalidating relcache and
catcache, in both the patched and unpatched versions, just after the
parallel-safety checks are performed in the planner, and then running
tests and comparing the performance.

So that's exactly what I did (adding a call to
InvalidateSystemCaches() just after the parallel-safety checks in the
planner).
I found that the unpatched version then always performed better than
the patched version for tests inserting 1000 records into a table with
100, 200, 500 and 1000 partitions.
Looking at the breakdown of the timing for each Insert, the Planning
Time was always significantly more for the patched version (expected,
because it does extra checks), but the Execution Time was very similar
for both the patched and unpatched versions.
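
Concretely, the experiment was along these lines (an illustrative fragment
only, not a self-contained program; the exact call site in the planner is
approximate):

```c
/* In the planner, immediately after the parallel-safety checks: */
glob->parallelModeOK = ...;   /* outcome of the (extra) safety checks */

/*
 * Experiment only: discard all relcache/catcache entries here, so that
 * both the patched and unpatched versions must rebuild them during
 * execution.  This isolates the cache-warming effect of the patched
 * version's per-partition checks at plan time.
 */
InvalidateSystemCaches();
```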

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Greg Nancarrow
On Fri, Feb 5, 2021 at 2:58 PM Hou, Zhijie  wrote:
>
> Hi,
>
> I took a look into the hasModifyingCTE bugfix recently,
> and found a possible bug case without the parallel insert patch.
>
> -
> drop table if exists test_data1;
> create table test_data1(a int, b int) ;
> insert into test_data1 select generate_series(1,1000), 
> generate_series(1,1000);
> set force_parallel_mode=on;
>
> CREATE TEMP TABLE bug6051 AS
>   select i from generate_series(1,3) as i;
>
> SELECT * FROM bug6051;
> CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD select a as i from 
> test_data1;
>
> WITH t1 AS ( DELETE FROM bug6051 RETURNING * ) INSERT INTO bug6051 SELECT * 
> FROM t1;
>
> ***
> ***ERROR:  cannot assign XIDs during a parallel operation
> ***
> -
>
> I debugged it, and it did have a modifying CTE in the parsetree after rewrite.
> I think that if we can properly set hasModifyingCTE, we can avoid this error 
> by not considering parallelism for this.
>

Thanks. You've identified that the bug exists for SELECT too. I've
verified that the issue is fixed by the bugfix included in the
Parallel INSERT patch.
Are you able to review my bugfix?
Since the problem exists for SELECT in the current Postgres code, I'd
like to pull that bugfix out and provide it as a separate fix.
My concern is that there may well be a better way to fix the issue -
for example, during the re-writing, rather than after the query has
been re-written.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Greg Nancarrow
On Fri, Feb 5, 2021 at 4:25 PM Hou, Zhijie  wrote:
>
> > >
> > > That is very close to what I was going to suggest, which is this:
> > >
> > > diff --git a/src/backend/rewrite/rewriteHandler.c
> > > b/src/backend/rewrite/rewriteHandler.c
> > > index 0672f497c6..3c4417af98 100644
> > > --- a/src/backend/rewrite/rewriteHandler.c
> > > +++ b/src/backend/rewrite/rewriteHandler.c
> > > @@ -631,6 +631,8 @@ rewriteRuleAction(Query *parsetree,
> > > checkExprHasSubLink((Node *)
> > > rule_action->returningList);
> > > }
> > >
> > > +   rule_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
> > > +
> > > return rule_action;
> > >  }
> >
> >
> >   if (parsetree->cteList != NIL && sub_action->commandType !=
> > CMD_UTILITY)
> >   {
> >   ...
> >   sub_action->cteList = list_concat(sub_action->cteList,
> >   }
> >
> > Is it possible when sub_action is CMD_UTILITY ?
> > In this case the CTE will be copied to the new one; should we set the
> > flag in this case ?
>
> Sorry, a typo in my previous message.
> In this case the CTE will not be copied to the new one; should we still set the 
> flag in this case?
>

No, strictly speaking, we probably shouldn't, because the CTE wasn't
copied in that case.
Also, I know the bitwise OR "works" in this case, but I think some
will frown on use of that for a bool.
IMHO better to use:

   if (parsetree->hasModifyingCTE)
   rule_action->hasModifyingCTE = true;

So patch might be something like:

diff --git a/src/backend/rewrite/rewriteHandler.c
b/src/backend/rewrite/rewriteHandler.c
index 0672f497c6..a989e02925 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -557,6 +557,8 @@ rewriteRuleAction(Query *parsetree,
 /* OK, it's safe to combine the CTE lists */
 sub_action->cteList = list_concat(sub_action->cteList,
   copyObject(parsetree->cteList));
+if (parsetree->hasModifyingCTE)
+sub_action->hasModifyingCTE = true;
 }

 /*
@@ -594,6 +596,9 @@ rewriteRuleAction(Query *parsetree,
 *sub_action_ptr = sub_action;
 else
 rule_action = sub_action;
+
+if (parsetree->hasModifyingCTE)
+sub_action->hasModifyingCTE = true;
 }

 /*

I'll do some further checks, because the rewriting is recursive and
tricky, so don't want to miss any cases ...

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-04 Thread Greg Nancarrow
On Fri, Feb 5, 2021 at 5:21 PM Amit Langote  wrote:
>
>
> BTW, the original query's cteList is copied into sub_action query but
> not into rule_action for reasons I haven't looked very closely into,
> even though we'd like to ultimately set the latter's hasModifyingCTE
> to reflect the original query's, right?  So we should do the following
> at some point before returning:
>
> if (sub_action->hasModifyingCTE)
> rule_action->hasModifyingCTE = true;
>

Actually, rule_action will usually point to sub_action (in which case,
no need to copy to rule_action), except if the rule action is an
INSERT...SELECT, which seems to be handled by some "kludge" according
to the following comment (and KLUDGE ALERT comment in the function
that is called):

/*
 * Adjust rule action and qual to offset its varnos, so that we can merge
 * its rtable with the main parsetree's rtable.
 *
 * If the rule action is an INSERT...SELECT, the OLD/NEW rtable entries
 * will be in the SELECT part, and we have to modify that rather than the
 * top-level INSERT (kluge!).
 */
sub_action = getInsertSelectQuery(rule_action, &sub_action_ptr);

So in that case (sub_action_ptr != NULL), within rule_action there is
a pointer to sub_action (RTE for the subquery), so whenever sub_action
is re-created, this pointer needs to be fixed-up.
It looks like I might need to copy hasModifyingCTE back to rule_action
in this case - but not 100% sure on it yet - still checking that. All
tests run so far pass without doing that though.
This is one reason for my original approach (though I admit, it was
not optimal) because at least it was reliable and detected the
modifyingCTE after all the rewriting and kludgy code had finished.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-05 Thread Greg Nancarrow
On Fri, Feb 5, 2021 at 8:07 PM Amit Langote  wrote:
>
>
> > This is one reason for my original approach (though I admit, it was
> > not optimal) because at least it was reliable and detected the
> > modifyingCTE after all the rewriting and kludgy code had finished.
>
> Yeah it's hard to go through all of this highly recursive legacy code
> to be sure that hasModifyingCTE is consistent with reality in *all*
> cases, but let's try to do it.  No other has* flags are set
> after-the-fact, so I wouldn't bet on a committer letting this one
> through.
>

I have debugged the code a bit more now, and the following patch seems
to correctly fix the issue, at least for the known test cases.
(i.e. SELECT case, shared by houzj, and the INSERT...SELECT case, as
in the "with" regression tests, for which I originally detected the
issue)

diff --git a/src/backend/rewrite/rewriteHandler.c
b/src/backend/rewrite/rewriteHandler.c
index 0672f497c6..8f695b32ec 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -557,6 +557,12 @@ rewriteRuleAction(Query *parsetree,
 /* OK, it's safe to combine the CTE lists */
 sub_action->cteList = list_concat(sub_action->cteList,
   copyObject(parsetree->cteList));
+if (parsetree->hasModifyingCTE)
+{
+sub_action->hasModifyingCTE = true;
+if (sub_action_ptr)
+    rule_action->hasModifyingCTE = true;
+}
 }

 /*

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-05 Thread Greg Nancarrow
On Fri, Feb 5, 2021 at 11:12 PM Amit Langote  wrote:
>
>
> That seems good enough as far as I am concerned.   Although either an
> Assert as follows or a comment why the if (sub_action_ptr) is needed
> seems warranted.
>
> if (sub_action_ptr)
> rule_action->hasModifyingCTE = true;
> else
> Assert(sub_action == rule_action);
>
> Does the Assert seem overly confident?
>

No, the Assert is exactly right, and I'll add a comment too.
See below.
I'll post the patch separately, if you can't see any further issues.


diff --git a/src/backend/rewrite/rewriteHandler.c
b/src/backend/rewrite/rewriteHandler.c
index 0672f497c6..05b80bd347 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -557,6 +557,21 @@ rewriteRuleAction(Query *parsetree,
/* OK, it's safe to combine the CTE lists */
sub_action->cteList = list_concat(sub_action->cteList,
  copyObject(parsetree->cteList));
+
+   /*
+* If the hasModifyingCTE flag is set in the source parsetree from
+* which the CTE list is copied, the flag needs to be set in the
+* sub_action and, if applicable, in the rule_action (INSERT...SELECT
+* case).
+*/
+   if (parsetree->hasModifyingCTE)
+   {
+   sub_action->hasModifyingCTE = true;
+   if (sub_action_ptr)
+   rule_action->hasModifyingCTE = true;
+   else
+   Assert(sub_action == rule_action);
+   }
}

/*


Regards,
Greg Nancarrow
Fujitsu Australia




Bug in query rewriter - hasModifyingCTE not getting set

2021-02-06 Thread Greg Nancarrow
Hi Hackers,

I found a bug in the query rewriter: if a query that has a modifying
CTE is re-written, the hasModifyingCTE flag does not get set in the
re-written query. This bug can allow the query to execute in
parallel mode, which then results in an error.

I originally found the problem using INSERT (which doesn't actually
affect the current Postgres code, as it doesn't support INSERT in
parallel mode) but a colleague of mine (Hou, Zhijie) managed to
reproduce it using SELECT as well (see example below), and helped to
minimize the patch size.

I've attached the patch with the suggested fix (reviewed by Amit Langote).


The following reproduces the issue (adapted from a test case in the
"with" regression tests):

drop table if exists test_data1;
create table test_data1(a int, b int) ;
insert into test_data1 select generate_series(1,1000), generate_series(1,1000);
set force_parallel_mode=on;
CREATE TEMP TABLE bug6051 AS
select i from generate_series(1,3) as i;
SELECT * FROM bug6051;
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD select a as
i from test_data1;
WITH t1 AS ( DELETE FROM bug6051 RETURNING * ) INSERT INTO bug6051
SELECT * FROM t1;

produces the error:

ERROR:  cannot assign XIDs during a parallel operation


Regards,
Greg Nancarrow
Fujitsu Australia


v1-0001-Fix-bug-in-the-query-rewriter-hasModifyingCTE-not-set.patch
Description: Binary data


Re: Bug in query rewriter - hasModifyingCTE not getting set

2021-02-07 Thread Greg Nancarrow
On Sun, Feb 7, 2021 at 10:03 AM Tom Lane  wrote:
>
> Greg Nancarrow  writes:
> > I found a bug in the query rewriter. If a query that has a modifying
> > CTE is re-written, the hasModifyingCTE flag is not getting set in the
> > re-written query.
>
> Ugh.
>
> > I've attached the patch with the suggested fix (reviewed by Amit Langote).
>
> I think either the bit about rule_action is unnecessary, or most of
> the code immediately above this is wrong, because it's only updating
> flags in sub_action.  Why do you think it's necessary to change
> rule_action in addition to sub_action?
>

I believe that the bit about rule_action IS necessary, as it's needed
for the case of INSERT...SELECT, so that hasModifyingCTE is set on the
rewritten INSERT (see comment above the call to
getInsertSelectQuery(), and the "KLUDGE ALERT" comment within that
function).

In the current Postgres code, it doesn't let INSERT run in
parallel-mode (only SELECT), but in the debugger you can clearly see
that for an INSERT with a subquery that uses a modifying CTE, the
hasModifyingCTE flag is not getting set on the rewritten INSERT query
by the query rewriter. As I've been working on parallel INSERT, I
found the issue first for INSERT (one test failure in the "with" tests
when force_parallel_mode=regress).

Here's some silly SQL (very similar to an existing test case in the
"with" tests) to reproduce the issue for INSERT (as I said, it won't
give an error like the SELECT case, since INSERT is currently not allowed
in parallel mode anyway, but the issue can be seen in the debugger):

set force_parallel_mode=on;
CREATE TABLE bug6051 AS
  select i from generate_series(1,3) as i;
SELECT * FROM bug6051;
CREATE TABLE bug6051_2 (i int);
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
 INSERT INTO bug6051_2
 SELECT NEW.i;
WITH t1 AS ( DELETE FROM bug6051 RETURNING * )
INSERT INTO bug6051 SELECT * FROM t1;


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-07 Thread Greg Nancarrow
On Mon, Feb 1, 2021 at 7:20 PM Tang, Haiying  wrote:
>
> Hi Greg,
>
> Recently, I have been evaluating the performance of this patch (1/28 V13).
> Here I found a regression test case: parallel insert with bitmap heap scan.
> When the target table has a primary key or an index, the patched performance
> shows a 7%-19% decline compared to unpatched.
>
> Could you please have a look about this?
>
> I tried max_parallel_workers_per_gather=2/4/8, and I didn't tune other
> parameters (like GUCs or other parallel-enforcing parameters).
>
> 1. max_parallel_workers_per_gather=2(default)
> target_table      patched   master    %reg
> ------------------------------------------
> without_PK_index   83.683   142.183   -41%
> with_PK           382.824   321.101    19%
> with_index        372.682   324.246    15%
>
> 2. max_parallel_workers_per_gather=4
> target_table      patched   master    %reg
> ------------------------------------------
> without_PK_index   73.189   141.879   -48%
> with_PK           362.104   329.759    10%
> with_index        372.237   333.718    12%
>
> 3. max_parallel_workers_per_gather=8 (also set max_parallel_workers=16, 
> max_worker_processes = 16)
> target_table      patched   master    %reg
> ------------------------------------------
> without_PK_index   75.072   146.100   -49%
> with_PK           365.312   324.339    13%
> with_index        362.636   338.366     7%
>
> Attached test_bitmap.sql which includes my test data and sql if you want to 
> have a look.
>

Hi,

Did it actually use a parallel plan in your testing?
When I ran these tests with the Parallel INSERT patch applied, it did
not naturally choose a parallel plan for any of these cases.
So we can hardly blame parallel insert with bitmap heap scan for the
worse performance when, based on the costings, the planner doesn't
actually choose a parallel plan in this case.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-08 Thread Greg Nancarrow
On Mon, Feb 8, 2021 at 6:00 PM Hou, Zhijie  wrote:
>
> > Posting an updated set of patches.
>
> A minor comment about doc.
>
> +  
> +Where the above target table features are determined to be, at worst,
> +parallel-restricted, rather than parallel-unsafe, at least a parallel 
> table
> +scan may be used in the query plan for the INSERT
> +statement. For more information about Parallel Safety, see
> +.
> +  
>
> It seems it does not mention that if the target table is a foreign/temp
> table, a parallel table scan may be used.
>
> So how about:
>
> +  
> +Where the target table is a foreign/temporary table or the above target 
> table features
> +are determined to be, at worst, parallel-restricted, rather than 
> parallel-unsafe,
> +at least a parallel table scan may be used in the query plan for the
> +INSERT statement. For more information about Parallel 
> Safety,
> +see .
> +  
>

Thanks. You're right, I should probably update the docs to clarify
those two cases.
(I had removed them from the list of parallel-unsafe things, but not
pointed out that a parallel table scan could still be used in these
cases).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Libpq support to connect to standby server as priority

2021-02-08 Thread Greg Nancarrow
On Mon, Feb 8, 2021 at 8:17 PM vignesh C  wrote:
>
> >
> > I think what we want to do is mark default_transaction_read_only as
> > GUC_REPORT, instead.  That will give a reliable report of what the
> > state of its GUC is, and you can combine it with is_hot_standby
> > to decide whether the session should be considered read-only.
> > If you don't get those two GUC values during connection, then you
> > can fall back on "SHOW transaction_read_only".
> >
>
> I have made a patch for the above with the changes suggested and
> rebased it with the head code.
> Attached v21 patch which has the changes for the same.
> Thoughts?
>

I'm still looking at the patch code, but I noticed that the
documentation update describing how support of read-write transactions
is determined isn't quite right and it isn't clear how the parameters
work.
I'd suggest something like the following (you'd need to fix the line
lengths and line-wrapping appropriately) - please check it for
correctness:

   
The support of read-write transactions is determined by the value of
the default_transaction_read_only and
in_hot_standby configuration parameters, which, if
supported, are reported by the server upon successful connection. If
the value of either of these parameters is on, it
means the server doesn't support read-write transactions. If either or
both of these parameters are not reported, then the support of
read-write transactions is determined by an explicit query, by sending
SHOW transaction_read_only after successful
connection; if it returns on, it means the server
doesn't support read-write transactions. The standby mode state is
determined by either the value of the in_hot_standby
configuration parameter, which is reported by the server (if
supported) upon successful connection, or is otherwise explicitly
queried by sending SELECT pg_is_in_recovery() after
successful connection; if it returns t, it means the
server is in hot standby mode.
   
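For reference, the decision procedure described above can be sketched
as follows. This is an illustrative model only, not the actual libpq
code; the struct, function name, and the use of -1 for "not reported"
are all made up for the sketch:

```c
#include <stdbool.h>

/* Illustrative model only; not the actual libpq implementation.
 * A value of -1 means the server did not report that GUC
 * (e.g. a pre-14 server). */
typedef struct
{
    int default_transaction_read_only;  /* 1 = on, 0 = off, -1 = not reported */
    int in_hot_standby;                 /* 1 = on, 0 = off, -1 = not reported */
} ReportedGUCs;

/*
 * Decide, from the reported GUCs alone, whether the session supports
 * read-write transactions.  Sets *need_query when a fallback
 * "SHOW transaction_read_only" round-trip is still required.
 */
static bool
session_is_read_write(const ReportedGUCs *r, bool *need_query)
{
    if (r->default_transaction_read_only == -1 || r->in_hot_standby == -1)
    {
        *need_query = true;     /* must fall back to an explicit SHOW */
        return false;           /* unknown until that query completes */
    }
    *need_query = false;
    /* if either parameter is "on", read-write transactions aren't supported */
    return !(r->default_transaction_read_only == 1 || r->in_hot_standby == 1);
}
```

The point of the sketch is that the extra SHOW round-trip is only
needed when connecting to a server that doesn't report both GUCs.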


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-08 Thread Greg Nancarrow
On Tue, Feb 9, 2021 at 1:04 AM Amit Langote  wrote:
>
> Greg, all
>
> Thanks a lot for your work on this.
>
> On Mon, Feb 8, 2021 at 3:53 PM Greg Nancarrow  wrote:
> > Posting an updated set of patches.
>
> I've been looking at these patches, initially with an intention to
> review mainly any partitioning-related concerns, but have some general
> thoughts as well concerning mostly the patches 0001 and 0002.
>
> * I've seen review comments on this thread where I think it's been
> suggested that whatever max_parallel_hazard_for_modify() does had
> better have been integrated into max_parallel_hazard() such that
> there's no particular need for that function to exist.  For example,
> the following:
>
> +   /*
> +* UPDATE is not currently supported in parallel-mode, so prohibit
> +* INSERT...ON CONFLICT...DO UPDATE...
> +* In order to support update, even if only in the leader, some
> +* further work would need to be done. A mechanism would be needed
> +* for sharing combo-cids between leader and workers during
> +* parallel-mode, since for example, the leader might generate a
> +* combo-cid and it needs to be propagated to the workers.
> +*/
> +   if (parse->onConflict != NULL && parse->onConflict->action ==
> ONCONFLICT_UPDATE)
> +   return PROPARALLEL_UNSAFE;
>
> could be placed in the following block in max_parallel_hazard():
>
> /*
>  * When we're first invoked on a completely unplanned tree, we must
>  * recurse into subqueries so to as to locate parallel-unsafe constructs
>  * anywhere in the tree.
>  */
> else if (IsA(node, Query))
> {
> Query  *query = (Query *) node;
>
> /* SELECT FOR UPDATE/SHARE must be treated as unsafe */
> if (query->rowMarks != NULL)
> {
> context->max_hazard = PROPARALLEL_UNSAFE;
> return true;
> }
>
> Furthermore, the following:
>
> +   rte = rt_fetch(parse->resultRelation, parse->rtable);
> +
> +   /*
> +* The target table is already locked by the caller (this is done in the
> +* parse/analyze phase).
> +*/
> +   rel = table_open(rte->relid, NoLock);
> +   (void) rel_max_parallel_hazard_for_modify(rel, parse->commandType,
> &context);
> +   table_close(rel, NoLock);
>
> can itself be wrapped in a function that's called from
> max_parallel_hazard() by adding a new block for RangeTblEntry nodes
> and passing QTW_EXAMINE_RTES_BEFORE to query_tree_walker().
>

Thanks, I think those suggestions look good to me.

> That brings me to to this part of the hunk:
>
> +   /*
> +* If there is no underlying SELECT, a parallel table-modification
> +* operation is not possible (nor desirable).
> +*/
> +   hasSubQuery = false;
> +   foreach(lc, parse->rtable)
> +   {
> +   rte = lfirst_node(RangeTblEntry, lc);
> +   if (rte->rtekind == RTE_SUBQUERY)
> +   {
> +   hasSubQuery = true;
> +   break;
> +   }
> +   }
> +   if (!hasSubQuery)
> +   return PROPARALLEL_UNSAFE;
>
> The justification for this given in:
>
> https://www.postgresql.org/message-id/CAJcOf-dF9ohqub_D805k57Y_AuDLeAQfvtaax9SpwjTSEVdiXg%40mail.gmail.com
>
> seems to be that the failure of a test case in
> partition-concurrent-attach isolation suite is prevented if finding no
> subquery RTEs in the query is flagged as parallel unsafe, which in
> turn stops max_parallel_hazard_modify() from locking partitions for
> safety checks in such cases.  But it feels unprincipled to have this
> code to work around a specific test case that's failing.  I'd rather
> edit the failing test case to disable parallel execution as
> Tsunakawa-san suggested.
>

The code was not changed because of the test case (though it was
fortunate that the test case worked after the change).
The code check that you have identified above ensures that the INSERT
has an underlying SELECT, because the planner won't (and shouldn't
anyway) generate a parallel plan for INSERT...VALUES, so there is no
point doing any parallel-safety checks in this case.
It just so happens that the problem test case uses INSERT...VALUES -
and it shouldn't have triggered the parallel-safety checks for
parallel INSERT for this case anyway, because INSERT...VALUES can't
(and shouldn't) be parallelized.
So I will need to keep that check in the code somewhere, to avoid
overhead of parallel-safety checks in the case of INSERT...VALUES.

> * Regarding function names:
>
> +static bool trigger_max_parallel_hazard_for_modify(TriggerDesc *trigdesc,
> +
> max_parallel_hazard_context *context)

Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-08 Thread Greg Nancarrow
On Mon, Feb 8, 2021 at 8:13 PM Hou, Zhijie  wrote:
>
> > > Did it actually use a parallel plan in your testing?
> > > When I ran these tests with the Parallel INSERT patch applied, it did
> > > not naturally choose a parallel plan for any of these cases.
> >
> > Yes, these cases pick parallel plan naturally on my test environment.
> >
> > postgres=# explain verbose insert into testscan
> >            select a from x where a<8 or (a%2=0 and a>19990);
> >                                   QUERY PLAN
> > ---------------------------------------------------------------------------
> >  Gather  (cost=4346.89..1281204.64 rows=81372 width=0)
> >    Workers Planned: 4
> >    ->  Insert on public.testscan  (cost=3346.89..1272067.44 rows=0 width=0)
> >          ->  Parallel Bitmap Heap Scan on public.x1  (cost=3346.89..1272067.44 rows=20343 width=8)
> >                Output: x1.a, NULL::integer
> >                Recheck Cond: ((x1.a < 8) OR (x1.a > 19990))
> >                Filter: ((x1.a < 8) OR (((x1.a % 2) = 0) AND (x1.a > 19990)))
> >                ->  BitmapOr  (cost=3346.89..3346.89 rows=178808 width=0)
> >                      ->  Bitmap Index Scan on x1_a_idx  (cost=0.00..1495.19 rows=80883 width=0)
> >                            Index Cond: (x1.a < 8)
> >                      ->  Bitmap Index Scan on x1_a_idx  (cost=0.00..1811.01 rows=97925 width=0)
> >                            Index Cond: (x1.a > 19990)
> >
> > PSA is my postgresql.conf file, maybe you can have a look. Besides, I didn't
> > do any parameters tuning in my test session.
>
> I reproduced this on my machine.
>
> I think we'd better do "analyze" before the insert, which makes this
> easier to reproduce.
> Like:
>
> -
> analyze;
> explain analyze verbose insert into testscan
>   select a from x where a<8 or (a%2=0 and a>19990);
> -

OK then.
Can you check whether, if just the underlying SELECTs are run (without
the INSERT), there is any performance degradation compared to a
non-parallel scan?

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Libpq support to connect to standby server as priority

2021-02-09 Thread Greg Nancarrow
On Mon, Feb 8, 2021 at 8:17 PM vignesh C  wrote:
>
> >
> > I think what we want to do is mark default_transaction_read_only as
> > GUC_REPORT, instead.  That will give a reliable report of what the
> > state of its GUC is, and you can combine it with is_hot_standby
> > to decide whether the session should be considered read-only.
> > If you don't get those two GUC values during connection, then you
> > can fall back on "SHOW transaction_read_only".
> >
>
> I have made a patch for the above with the changes suggested and
> rebased it with the head code.
> Attached v21 patch which has the changes for the same.
> Thoughts?

Further to my other doc change feedback, I can only spot the following
minor things (otherwise the changes that you have made seem OK to me).

1) doc/src/sgml/protocol.sgml

   default_transaction_read_only  and
   in_hot_standby were not reported by releases before
   14.)

should be:

   default_transaction_read_only  and
   in_hot_standby were not reported by releases before
   14.0)

2) doc/src/sgml/high-availability.sgml

   
During hot standby, the parameter in_hot_standby and
default_transaction_read_only are always true and may
not be changed.

should be:

   
During hot standby, the parameters in_hot_standby and
transaction_read_only are always true and may
not be changed.


[I believe that there are only checks on attempts to change
"transaction_read_only" when in hot standby, not
"default_transaction_read_only"; see check_transaction_read_only()]
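To make that concrete, here is a tiny model of the check-hook
behaviour as I understand it. This is illustrative only; the real hook
in src/backend/commands/variable.c does more (subtransaction state
checks, error reporting, etc.), and the function name below is made up:

```c
#include <stdbool.h>

/* Simplified model of the server-side check hook for the
 * transaction_read_only GUC (cf. check_transaction_read_only() in
 * src/backend/commands/variable.c); an illustration, not the real code. */
static bool
check_transaction_read_only_model(bool new_value, bool in_recovery)
{
    /* Switching to read-write is rejected while in hot standby... */
    if (!new_value && in_recovery)
        return false;
    /* ...but setting it to "on", or any change on a primary, is allowed. */
    return true;
}
```

This is why only "transaction_read_only" is pinned to true during hot
standby, while "default_transaction_read_only" has no such guard.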


3) src/interfaces/libpq/fe-connect.c

In rejectCheckedReadOrWriteConnection() and
rejectCheckedStandbyConnection(), now that host and port info are
emitted separately and are not included in each error message string
(as parameters in a format string), I think those functions should use
appendPQExpBufferStr() instead of appendPQExpBuffer(), as it's more
efficient if there is just a single string argument.
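To illustrate the point with a toy model (these are stand-ins written
for this sketch, not the real PQExpBuffer internals):

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for the PQExpBuffer append functions, purely to show
 * why the ...Str variant is preferable for a fixed message: it is a
 * plain string append, while the format variant must push the message
 * through vsnprintf's format parser even when there are no arguments
 * (and would misinterpret any literal '%' in the message). */
typedef struct
{
    char   data[256];
    size_t len;
} ToyBuf;

static void
toy_append_str(ToyBuf *b, const char *s)
{
    size_t n = strlen(s);

    memcpy(b->data + b->len, s, n + 1); /* copy including the NUL */
    b->len += n;
}

static void
toy_append_fmt(ToyBuf *b, const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    b->len += vsnprintf(b->data + b->len, sizeof(b->data) - b->len, fmt, args);
    va_end(args);
}
```

Both produce the same bytes for a plain message; the Str form just
skips the needless format-parsing pass.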


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-09 Thread Greg Nancarrow
On Wed, Jan 6, 2021 at 7:39 PM Antonin Houska  wrote:
>
>
> @@ -252,6 +252,7 @@ set_plan_references(PlannerInfo *root, Plan *plan)
> PlannerGlobal *glob = root->glob;
> int rtoffset = list_length(glob->finalrtable);
> ListCell   *lc;
> +   Plan   *finalPlan;
>
> /*
>  * Add all the query's RTEs to the flattened rangetable.  The live 
> ones
> @@ -302,7 +303,17 @@ set_plan_references(PlannerInfo *root, Plan *plan)
> }
>
> /* Now fix the Plan tree */
> -   return set_plan_refs(root, plan, rtoffset);
> +   finalPlan = set_plan_refs(root, plan, rtoffset);
> +   if (finalPlan != NULL && IsA(finalPlan, Gather))
> +   {
> +   Plan   *subplan = outerPlan(finalPlan);
> +
> +   if (IsA(subplan, ModifyTable) && castNode(ModifyTable, 
> subplan)->returningLists != NULL)
> +   {
> +   finalPlan->targetlist = 
> copyObject(subplan->targetlist);
> +   }
> +   }
> +   return finalPlan;
>  }
>
> I'm not sure if the problem of missing targetlist should be handled here (BTW,
> NIL is the constant for an empty list, not NULL). Obviously this is a
> consequence of the fact that the ModifyTable node has no regular targetlist.
>
> Actually I don't quite understand why (in the current master branch) the
> targetlist initialized in set_plan_refs()
>
> /*
>  * Set up the visible plan targetlist as being the same as
>  * the first RETURNING list. This is for the use of
>  * EXPLAIN; the executor won't pay any attention to the
>  * targetlist.  We postpone this step until here so that
>  * we don't have to do set_returning_clause_references()
>  * twice on identical targetlists.
>  */
> splan->plan.targetlist = copyObject(linitial(newRL));
>
> is not used. Instead, ExecInitModifyTable() picks the first returning list
> again:
>
> /*
>  * Initialize result tuple slot and assign its rowtype using the first
>  * RETURNING list.  We assume the rest will look the same.
>  */
> mtstate->ps.plan->targetlist = (List *) 
> linitial(node->returningLists);
>
> So if you set the targetlist in create_modifytable_plan() (according to
> best_path->returningLists), or even in create_modifytable_path(), and ensure
> that it gets propagated to the Gather node (generate_gather_pahs currently
> uses rel->reltarget), then you should no longer need to tweak
> setrefs.c. Moreover, ExecInitModifyTable() would no longer need to set the
> targetlist. However I don't guarantee that this is the best approach - some
> planner expert should speak up.
>


I've had a bit closer look at this particular issue.
I can see what you mean about the ModifyTable targetlist (that is set
in set_plan_refs()) getting overwritten by ExecInitModifyTable() -
which also contradicts the comment in set_plan_refs() that claims the
targetlist being set is used by EXPLAIN (which it is not). So the
current Postgres master branch does seem to be broken in that respect.

I did try your suggestion (and also remove my tweak), but I found that
in the T_Gather case of set_plan_refs() it does some processing (see
set_upper_references()) of the current Gather targetlist and subplan's
targetlist (and will then overwrite the Gather targetlist after that),
but in doing that processing it produces the error:
ERROR:  variable not found in subplan target list
I think that one of the fundamental problems is that, up to now,
ModifyTable has always been the top node in a (non-parallel) plan, but
now for Parallel INSERT we have a Gather node with ModifyTable in its
subplan. So the expected order of processing and node configuration
has changed.
For the moment (until perhaps a planner expert DOES speak up) I've
parked my temporary "fix" at the bottom of set_plan_refs(), simply
pointing the Gather node's targetlist to subplan's ModifyTable
targetlist.

if (nodeTag(plan) == T_Gather)
{
Plan   *subplan = plan->lefttree;

if (IsA(subplan, ModifyTable) &&
castNode(ModifyTable, subplan)->returningLists != NIL)
{
plan->targetlist = subplan->targetlist;
}
}

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-09 Thread Greg Nancarrow
On Wed, Feb 10, 2021 at 2:39 PM Amit Langote  wrote:
>
> > The code check that you have identified above ensures that the INSERT
> > has an underlying SELECT, because the planner won't (and shouldn't
> > anyway) generate a parallel plan for INSERT...VALUES, so there is no
> > point doing any parallel-safety checks in this case.
> > It just so happens that the problem test case uses INSERT...VALUES -
> > and it shouldn't have triggered the parallel-safety checks for
> > parallel INSERT for this case anyway, because INSERT...VALUES can't
> > (and shouldn't) be parallelized.
>
> AFAICS, max_parallel_hazard() path never bails from doing further
> safety checks based on anything other than finding a query component
> whose hazard level crosses context->max_interesting.

It's parallel UNSAFE because it's not safe or even possible to have a
parallel plan for this.
(as UNSAFE is the max hazard level, no point in referencing
context->max_interesting).
And there are existing cases of bailing out and not doing further
safety checks (even your v15_delta.diff patch placed code - for
bailing out on "ON CONFLICT ... DO UPDATE" - underneath one such
existing case in max_parallel_hazard_walker()):

else if (IsA(node, Query))
{
Query  *query = (Query *) node;

/* SELECT FOR UPDATE/SHARE must be treated as unsafe */
if (query->rowMarks != NULL)
{
context->max_hazard = PROPARALLEL_UNSAFE;
return true;
}
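As an aside, the hazard levels behave as a simple maximum: they only
ever increase as the walker proceeds, and UNSAFE is the top, so once it
is reached there is nothing further to learn. A minimal sketch, using
the pg_proc.proparallel character codes ('s' safe, 'r' restricted,
'u' unsafe); the function names here are made up:

```c
/* Rank the proparallel codes: 's' (safe) < 'r' (restricted) < 'u' (unsafe). */
static int
hazard_rank(char h)
{
    return (h == 's') ? 0 : ((h == 'r') ? 1 : 2);
}

/* Combine two hazards; the result is the more severe of the two. */
static char
max_hazard(char current, char proposed)
{
    return (hazard_rank(proposed) > hazard_rank(current)) ? proposed : current;
}
```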


>You're trying to
> add something that bails based on second-guessing that a parallel path
> would not be chosen, which I find somewhat objectionable.
>
> If the main goal of bailing out is to avoid doing the potentially
> expensive modification safety check on the target relation, maybe we
> should try to somehow make the check less expensive.  I remember
> reading somewhere in the thread about caching the result of this check
> in relcache, but haven't closely studied the feasibility of doing so.
>

There's no "second-guessing" involved here.
There is no underlying way of dividing up the VALUES data of
"INSERT...VALUES" amongst the parallel workers, even if the planner
was updated to produce a parallel-plan for the "INSERT...VALUES" case
(apart from the fact that spawning off parallel workers to insert that
data would almost always result in worse performance than a
non-parallel plan...)
The division of work for parallel workers is part of the table AM
(scan) implementation, which is not invoked for "INSERT...VALUES".

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-10 Thread Greg Nancarrow
be used for the specified
> + * table-modification statement.
> + * It's not possible in the following cases:
> + *
> + *  1) enable_parallel_dml is off
> + *  2) UPDATE or DELETE command
> + *  3) INSERT...ON CONFLICT...DO UPDATE
> + *  4) INSERT without SELECT on a relation
> + *  5) the reloption parallel_dml_enabled is not set for the target table
> + *
> + * (Note: we don't do in-depth parallel-safety checks here, we do only the
> + * cheaper tests that can quickly exclude obvious cases for which
> + * parallelism isn't supported, to avoid having to do further parallel-safety
> + * checks for these)
>   */
> +bool
> +is_parallel_possible_for_modify(Query *parse)
>
>
> And I put the function at earlier place like the following:
>
>
> if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
> IsUnderPostmaster &&
> (parse->commandType == CMD_SELECT ||
> -   (enable_parallel_dml &&
> -   IsModifySupportedInParallelMode(parse->commandType))) &&
> +   is_parallel_possible_for_modify(parse)) &&
> !parse->hasModifyingCTE &&
> max_parallel_workers_per_gather > 0 &&
> !IsParallelWorker())
>
>
> If this looks good, maybe we can merge this change.
>

If I've understood correctly, you're suggesting to merge the
"is_parallel_possible_for_modify()" code from your parallel_dml patch
into the main parallel INSERT patch, right?
If so, I think that's a good idea, as it will help simplify both
patches (and then, if need be, we can still discuss where best to
place certain checks ...).
I'll post an update soon that includes that change (then the
parallel_dml patch will need to be rebased accordingly).

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-10 Thread Greg Nancarrow
On Tue, Feb 9, 2021 at 1:04 AM Amit Langote  wrote:
>
> * I think that the concerns raised by Tsunakawa-san in:
>
> https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com
>
> regarding how this interacts with plancache.c deserve a look.
> Specifically, a plan that uses parallel insert may fail to be
> invalidated when partitions are altered directly (that is without
> altering their root parent).  That would be because we are not adding
> partition OIDs to PlannerGlobal.invalItems despite making a plan
> that's based on checking their properties.  See this (tested with all
> patches applied!):
>

Does any current Postgres code add partition OIDs to
PlannerGlobal.invalItems for a similar reason?
I would have thought that, for example,  partitions with a default
column expression, using a function that is changed from SAFE to
UNSAFE, would suffer the same plancache issue (for current parallel
SELECT functionality) as we're talking about here - but so far I
haven't seen any code handling this.

(Currently invalItems seems to support PROCID and TYPEOID; relation
OIDs seem to be handled through a different mechanism).
Can you elaborate on what you believe is required here, so that the
partition OID dependency is registered and the altered partition
results in the plan being invalidated?
Thanks in advance for any help you can provide here.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-10 Thread Greg Nancarrow
On Thu, Feb 11, 2021 at 5:33 PM Greg Nancarrow  wrote:
>
> On Tue, Feb 9, 2021 at 1:04 AM Amit Langote  wrote:
> >
> > * I think that the concerns raised by Tsunakawa-san in:
> >
> > https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com
> >
> > regarding how this interacts with plancache.c deserve a look.
> > Specifically, a plan that uses parallel insert may fail to be
> > invalidated when partitions are altered directly (that is without
> > altering their root parent).  That would be because we are not adding
> > partition OIDs to PlannerGlobal.invalItems despite making a plan
> > that's based on checking their properties.  See this (tested with all
> > patches applied!):
> >
>
> Does any current Postgres code add partition OIDs to
> PlannerGlobal.invalItems for a similar reason?
> I would have thought that, for example,  partitions with a default
> column expression, using a function that is changed from SAFE to
> UNSAFE, would suffer the same plancache issue (for current parallel
> SELECT functionality) as we're talking about here - but so far I
> haven't seen any code handling this.
>
> (Currently invalItems seems to support PROCID and TYPEOID; relation
> OIDs seem to be handled through a different mechanism).
> Can you elaborate on what you believe is required here, so that the
> partition OID dependency is registered and the altered partition
> results in the plan being invalidated?
> Thanks in advance for any help you can provide here.
>

Actually, I tried adding the following in the loop that checks the
parallel-safety of each partition and it seemed to work:

glob->relationOids =
lappend_oid(glob->relationOids, pdesc->oids[i]);

Can you confirm, is that what you were referring to?
(note that I've already updated the code to use
CreatePartitionDirectory() and PartitionDirectoryLookup())

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Libpq support to connect to standby server as priority

2021-02-11 Thread Greg Nancarrow
On Wed, Feb 10, 2021 at 5:09 PM vignesh C  wrote:
>
> Modified.
> These comments are handled in v22 patch posted in my earlier mail.
>

Thanks, just one minor thing I missed in doc/src/sgml/libpq.sgml.

+The support of read-write transactions is determined by the
value of the
+default_transaction_read_only and
+in_hot_standby configuration parameters, that is
+reported by the server (if supported) upon successful connection.  If


should be:

+The support of read-write transactions is determined by the
values of the
+default_transaction_read_only and
+in_hot_standby configuration parameters, that are
+reported by the server (if supported) upon successful connection.  If


(i.e. "value" -> "values" and "is" -> "are")


Regards,
Greg Nancarrow
Fujitsu Australia



