Re: Allow parallel plan for referential integrity checks?

Juan José Santamaría Flecha Wed, 16 Aug 2023 02:18:54 -0700

On Thu, Aug 10, 2023 at 5:06 PM Juan José Santamaría Flecha <
juanjo.santama...@gmail.com> wrote:

> On Tue, Jul 4, 2023 at 9:45 AM Daniel Gustafsson <dan...@yesql.se> wrote:
>
>> As there is no new patch submitted I will go ahead and do that, please
>> feel
>> free to resubmit when there is renewed interest in working on this.
>>
>
> Recently I restored a database from a directory format backup and having
> this feature would have been quite useful. So, I would like to resume the
> work on this patch, unless OP or someone else is already on it.
>

Hearing no one advising me otherwise I'll pick up the work on the patch.
First, I'll try to address all previous comments.

On Mon, Feb 14, 2022 at 3:34 PM Robert Haas <robertmh...@gmail.com> wrote:

> On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel <frederic.yh...@dalibo.com>
> wrote:
> > I noticed that referential integrity checks aren't currently
> > parallelized. Is it on purpose?
>
> It's not 100% clear to me that it is safe. But on the other hand, it's
> also not 100% clear to me that it is unsafe.
>
> Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
> confident that nothing bad would happen, and this wasn't one of those
> places. It's something of a nested-query environment -- your criterion
> #6. How do we know that the surrounding query isn't already parallel?
> Perhaps because it must be DML, but I think it must be more important
> to support parallel DML than to support this.
>

Currently I don't see a scenario where the validation might run inside
another parallel query, but I could be missing something. Suggestions on
some edge cases to test are more than welcome.

We already have a case where parallel operations can occur when adding a
table constraint, and that's the index creation for a primary key. Take for
example:

postgres=# SET max_parallel_maintenance_workers TO 4;
postgres=# SET parallel_setup_cost TO 0;
postgres=# SET parallel_tuple_cost TO 0;
postgres=# SET parallel_leader_participation TO 0;
postgres=# SET min_parallel_table_scan_size TO 0;
postgres=# SET client_min_messages TO debug1;
postgres=# CREATE TABLE parallel_pk_table (a int) WITH (autovacuum_enabled
= off);
postgres=# ALTER TABLE parallel_pk_table ADD PRIMARY KEY (a);
DEBUG:  ALTER TABLE / ADD PRIMARY KEY will create implicit index
"parallel_pk_table_pkey" for table "parallel_pk_table"
DEBUG:  building index "parallel_pk_table_pkey" on table
"parallel_pk_table" with request for 1 parallel workers
...
DEBUG:  index "parallel_pk_table_pkey" can safely use deduplication
DEBUG:  verifying table "parallel_pk_table"

This should be better documented. Maybe "parallel index build" [1] could
have its own section, so it can be cited from other sections?

> I'm not sure what the right thing to do here is, but I think it would
> be good if your patch included a test case.
>

I have included a basic test for parallel ADD PRIMARY KEY and ADD FOREIGN
KEY.

On Sat, Mar 19, 2022 at 1:57 AM Imseih (AWS), Sami <sims...@amazon.com>
wrote:

> It would be valuable to add logging to ensure that the ActiveSnapshot and
> TransactionSnapshot
> is the same for the leader and the workers. This logging could be tested
> in the TAP test.
>

That machinery is used in any parallel query, I'm not sure this patch
should be responsible for carrying that requirement.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to
> max_parallel_maintenance_workers.
>
> Currently the work_mem is set to maintenance_work_mem. This will also
> require
> a doc change to call out.
>

Done.

PFA the patch.

[1] https://www.postgresql.org/docs/current/sql-createindex.html

Regards,

Juan José Santamaría Flecha

v2-0001-Allow-parallel-plan-for-referential-integrity-checks.patch
Description: Binary data

Re: Allow parallel plan for referential integrity checks?

Reply via email to