On Mon, Jun 21, 2021 at 4:40 PM houzj.f...@fujitsu.com <houzj.f...@fujitsu.com> wrote: > > To be honest, I didn't find a cheap way to invalidate partitioned table's > parallel safety automatically. >
I also don't see the feasibility for doing parallelism checks for partitioned tables both because it is expensive due to traversing/locking all the partitions and then the invalidations are difficult to handle due to deadlock hazards as discussed above. Let me try to summarize the discussion so far and see if we can have any better ideas than what we have discussed so far or we want to go with one of the ideas discussed till now. I think we have broadly discussed two approaches (a) to automatically decide whether parallelism can be enabled for inserts, (b) provide an option to the user to specify whether inserts can be parallelized on a relation. For the first approach (a), we have evaluated both the partitioned and non-partitioned relation cases. For non-partitioned relations, we can compute the parallel-safety of relation during the planning and save it in the relation cache entry. This is normally safe because we have a lock on the relation and any change to the relation should raise an invalidation which will lead to re-computation of parallel-safety information for a relation. Now, there are cases where the parallel-safety of some trigger function or a function used in index expression can be changed by the user which won't register an invalidation for a relation. To handle such cases, we can register a new kind of invalidation only when a function's parallel-safety information is changed. And every backend in the same database then needs to re-evaluate the parallel-safety of every relation for which it has cached a value. For partitioned relations, the similar idea won't work because of multiple reasons (a) We need to traverse and lock all the partitions to compute the parallel-safety of the root relation which could be very expensive; (b) Whenever we invalidate a particular partition, we need to invalidate its parent hierarchy as well. We can't traverse the parent hierarchy without taking locks on the parent table which can lead to deadlock. The alternative could be that for partitioned relations we can rely on the user-specified information about parallel-safety (like the approach-b mentioned in the previous paragraph). We can additionally check the parallel safety of partitions when we are trying to insert into a particular partition and error out if we detect any parallel-unsafe clause and we are in parallel-mode. So, in this case, we won't be completely relying on the users. Users can either change the parallel safe option of the table or remove/change the parallel-unsafe clause after an error. For the second approach (b), we can provide an option to the user to specify whether inserts (or other dml's) can be parallelized for a relation. One of the ideas is to provide some options like below to the user: CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE }; ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE }; This property is recorded in pg_class's relparallel column as 'u', 'r', or 's', just like pg_proc's proparallel. The default is UNSAFE. Additionally, provide a function pg_get_parallel_safety(oid) using which users can determine whether it is safe to enable parallelism. Surely, after the user has checked with that function, one can add some unsafe constraints to the table by altering the table but it will still be an aid to enable parallelism on a relation. The first approach (a) has an appeal because it would allow to automatically parallelize inserts in many cases but might have some overhead in some cases due to processing of relcache entries after the parallel-safety of the relation is changed. The second approach (b) has an appeal because of its consistent behavior for partitioned and non-partitioned relations. Among the above options, I would personally prefer (b) mainly because of the consistent handling for partition and non-partition table cases but I am fine with approach (a) as well if that is what other people feel is better. Thoughts? -- With Regards, Amit Kapila.