Hi Nitin, While testing further I got a crash with partition wise join enabled for multi-col list partitions. please find test case & stack-trace below.
SET enable_partitionwise_join TO on; CREATE TABLE plt1 (c varchar, d varchar) PARTITION BY LIST(c,d); CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN (('0001','0001'),('0002','0002'),(NULL,NULL)); CREATE TABLE plt1_p2 PARTITION OF plt1 FOR VALUES IN (('0004','0004'),('0005','0005'),('0006','0006')); INSERT INTO plt1 SELECT to_char(i % 11, 'FM0000'), to_char(i % 11, 'FM0000') FROM generate_series(0, 500) i WHERE i % 11 NOT IN (0,10,3,7,8,9); INSERT INTO plt1 SELECT NULL,NULL FROM generate_series(0, 500) i WHERE i % 11 IN (3); ANALYSE plt1; CREATE TABLE plt2 (c varchar, d varchar) PARTITION BY LIST(c,d); CREATE TABLE plt2_p1 PARTITION OF plt2 FOR VALUES IN (('0001','0001'),('0002','0002')); CREATE TABLE plt2_p2 PARTITION OF plt2 FOR VALUES IN (('0004','0004'),('0005','0005'),('0006','0006')); CREATE TABLE plt2_p3 PARTITION OF plt2 DEFAULT; INSERT INTO plt2 SELECT to_char(i % 11, 'FM0000'), to_char(i % 11, 'FM0000') FROM generate_series(0, 500) i WHERE i % 11 NOT IN (0,10,3); INSERT INTO plt2 SELECT NULL,NULL FROM generate_series(0, 500) i WHERE i % 11 IN (3); ANALYSE plt2; EXPLAIN (COSTS OFF) SELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d); postgres=# EXPLAIN (COSTS OFF) postgres-# SELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !?> \q [edb@localhost bin]$ gdb -q -c data/core.66926 postgres Reading symbols from /home/edb/WORK/pg_src/PG_TEMP/postgresql/inst/bin/postgres...done. [New LWP 66926] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `postgres: edb postgres [local] EXPLAIN '. Program terminated with signal 11, Segmentation fault. #0 0x000000000082be39 in is_dummy_rel (rel=0x40) at joinrels.c:1221 1221 if (rel->pathlist == NIL) (gdb) bt #0 0x000000000082be39 in is_dummy_rel (rel=0x40) at joinrels.c:1221 #1 0x000000000089341c in is_dummy_partition (rel=0x2f86e88, part_index=2) at partbounds.c:1959 #2 0x0000000000891d38 in merge_list_bounds (partnatts=2, partsupfunc=0x2f70058, partcollation=0x2fd3c98, outer_rel=0x2f86e88, inner_rel=0x2fd4368, jointype=JOIN_LEFT, outer_parts=0x7ffea91f8cc0, inner_parts=0x7ffea91f8cb8) at partbounds.c:1325 #3 0x0000000000891991 in partition_bounds_merge (partnatts=2, partsupfunc=0x2f70058, partcollation=0x2fd3c98, outer_rel=0x2f86e88, inner_rel=0x2fd4368, jointype=JOIN_LEFT, outer_parts=0x7ffea91f8cc0, inner_parts=0x7ffea91f8cb8) at partbounds.c:1198 #4 0x000000000082cc5a in compute_partition_bounds (root=0x2f9e910, rel1=0x2f86e88, rel2=0x2fd4368, joinrel=0x2fae388, parent_sjinfo=0x2f7dfa8, parts1=0x7ffea91f8cc0, parts2=0x7ffea91f8cb8) at joinrels.c:1644 #5 0x000000000082c474 in try_partitionwise_join (root=0x2f9e910, rel1=0x2f86e88, rel2=0x2fd4368, joinrel=0x2fae388, parent_sjinfo=0x2f7dfa8, parent_restrictlist=0x2fae650) at joinrels.c:1402 #6 0x000000000082b6e2 in populate_joinrel_with_paths (root=0x2f9e910, rel1=0x2f86e88, rel2=0x2fd4368, joinrel=0x2fae388, sjinfo=0x2f7dfa8, restrictlist=0x2fae650) at joinrels.c:926 #7 0x000000000082b135 in make_join_rel (root=0x2f9e910, rel1=0x2f86e88, rel2=0x2fd4368) at joinrels.c:760 #8 0x000000000082a643 in make_rels_by_clause_joins (root=0x2f9e910, old_rel=0x2f86e88, other_rels_list=0x2f90148, other_rels=0x2f90160) at joinrels.c:312 #9 0x000000000082a119 in join_search_one_level (root=0x2f9e910, level=3) at joinrels.c:123 #10 0x000000000080cd97 in standard_join_search (root=0x2f9e910, levels_needed=3, initial_rels=0x2f90148) at allpaths.c:3020 #11 0x000000000080cd10 in make_rel_from_joinlist (root=0x2f9e910, joinlist=0x2fd7550) at allpaths.c:2951 #12 0x000000000080899a in make_one_rel (root=0x2f9e910, joinlist=0x2fd7550) at allpaths.c:228 #13 0x000000000084516a in query_planner (root=0x2f9e910, qp_callback=0x84ad85 <standard_qp_callback>, qp_extra=0x7ffea91f9140) at planmain.c:276 #14 0x000000000084788d in grouping_planner (root=0x2f9e910, tuple_fraction=0) at planner.c:1447 #15 0x0000000000846f56 in subquery_planner (glob=0x2fa0c08, parse=0x2f56d30, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:1025 #16 0x000000000084578b in standard_planner (parse=0x2f56d30, query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);", cursorOptions=2048, boundParams=0x0) at planner.c:406 #17 0x0000000000845536 in planner (parse=0x2f56d30, query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);", cursorOptions=2048, boundParams=0x0) at planner.c:277 #18 0x0000000000978faf in pg_plan_query (querytree=0x2f56d30, query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);", cursorOptions=2048, boundParams=0x0) at postgres.c:847 #19 0x0000000000693e50 in ExplainOneQuery (query=0x2f56d30, cursorOptions=2048, into=0x0, es=0x2fa0920, queryString=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);", params=0x0, queryEnv=0x0) at explain.c:397 #20 0x00000000006939a5 in ExplainQuery (pstate=0x2f9e0a0, stmt=0x2f56b50, params=0x0, dest=0x2f9e008) at explain.c:281 #21 0x0000000000981de8 in standard_ProcessUtility (pstmt=0x2fd2220, queryString=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x2f9e008, qc=0x7ffea91f9aa0) at utility.c:862 #22 0x0000000000981585 in ProcessUtility (pstmt=0x2fd2220, queryString=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x2f9e008, qc=0x7ffea91f9aa0) at utility.c:527 #23 0x00000000009801ba in PortalRunUtility (portal=0x2f10180, pstmt=0x2fd2220, isTopLevel=true, setHoldSnapshot=true, dest=0x2f9e008, qc=0x7ffea91f9aa0) at pquery.c:1155 #24 0x000000000097ff20 in FillPortalStore (portal=0x2f10180, isTopLevel=true) at pquery.c:1028 #25 0x000000000097f883 in PortalRun (portal=0x2f10180, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2fd2310, altdest=0x2fd2310, qc=0x7ffea91f9c60) at pquery.c:760 #26 0x00000000009795d1 in exec_simple_query ( query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);") at postgres.c:1214 #27 0x000000000097da8d in PostgresMain (dbname=0x2ed8068 "postgres", username=0x2ed8048 "edb") at postgres.c:4497 #28 0x00000000008b9699 in BackendRun (port=0x2ecfd00) at postmaster.c:4560 Thanks & Regards, Rajkumar Raghuwanshi On Mon, Oct 11, 2021 at 11:05 AM Rajkumar Raghuwanshi < rajkumar.raghuwan...@enterprisedb.com> wrote: > Thanks for the patch, it applied cleanly and fixed the reported issue. I > observed another case where > In case of multi-col list partition on the same column query is not > picking partition wise join. Is this expected? > > CREATE TABLE plt1 (a int, b int, c varchar) PARTITION BY LIST(c,c); > CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN > (('0001','0001'),('0002','0002'),('0003','0003')); > CREATE TABLE plt1_p2 PARTITION OF plt1 FOR VALUES IN > (('0004','0004'),('0005','0005'),('0006','0006')); > CREATE TABLE plt1_p3 PARTITION OF plt1 DEFAULT; > INSERT INTO plt1 SELECT i, i % 47, to_char(i % 11, 'FM0000') FROM > generate_series(0, 500) i WHERE i % 11 NOT IN (0,10); > ANALYSE plt1; > CREATE TABLE plt2 (a int, b int, c varchar) PARTITION BY LIST(c,c); > CREATE TABLE plt2_p1 PARTITION OF plt2 FOR VALUES IN > (('0001','0001'),('0002','0002'),('0003','0003')); > CREATE TABLE plt2_p2 PARTITION OF plt2 FOR VALUES IN > (('0004','0004'),('0005','0005'),('0006','0006')); > CREATE TABLE plt2_p3 PARTITION OF plt2 DEFAULT; > INSERT INTO plt2 SELECT i, i % 47, to_char(i % 11, 'FM0000') FROM > generate_series(0, 500) i WHERE i % 11 NOT IN (0,10); > ANALYSE plt2; > SET enable_partitionwise_join TO true; > EXPLAIN (COSTS OFF) SELECT t1.a, t1.c, t2.a, t2.c FROM plt1 t1 INNER JOIN > plt2 t2 ON t1.c = t2.c; > > postgres=# EXPLAIN (COSTS OFF) SELECT t1.a, t1.c, t2.a, t2.c FROM plt1 t1 > INNER JOIN plt2 t2 ON t1.c = t2.c; > QUERY PLAN > -------------------------------------------- > Hash Join > Hash Cond: ((t1.c)::text = (t2.c)::text) > -> Append > -> Seq Scan on plt1_p1 t1_1 > -> Seq Scan on plt1_p2 t1_2 > -> Seq Scan on plt1_p3 t1_3 > -> Hash > -> Append > -> Seq Scan on plt2_p1 t2_1 > -> Seq Scan on plt2_p2 t2_2 > -> Seq Scan on plt2_p3 t2_3 > (11 rows) > > Thanks & Regards, > Rajkumar Raghuwanshi > > > > On Thu, Oct 7, 2021 at 6:03 PM Nitin Jadhav <nitinjadhavpostg...@gmail.com> > wrote: > >> Thanks Rajkumar for testing. >> >> > I think it should throw an error as the partition by list has only 1 >> column but we are giving 2 values. >> >> I also agree that it should throw an error in the above case. Fixed the >> issue in the attached patch. Also added related test cases to the >> regression test suite. >> >> >> > also if you see \d+ showing plt1_p1 partition value as ‘(0001,0001)’ >> instead of ('0001','0001'). >> >> Now throwing errors in the initial stage, this case doesn't arise. >> >> Please share if you find any other issues. >> >> Thanks & Regards, >> Nitin Jadhav >> >> >> >> >> >> On Thu, Oct 7, 2021 at 4:05 PM Rajkumar Raghuwanshi < >> rajkumar.raghuwan...@enterprisedb.com> wrote: >> >>> Thanks Nitin, >>> >>> v4 patches applied cleanly and make check is passing now. While testing >>> further I observed that if multiple values are given for a single >>> column list partition it is not giving error instead it is changing >>> values itself. Please find the example below. >>> >>> postgres=# CREATE TABLE plt1 (a int, b varchar) PARTITION BY LIST(b); >>> CREATE TABLE >>> postgres=# CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN >>> (('0001','0001'),('0002','0002')); >>> CREATE TABLE >>> postgres=# \d+ plt1; >>> Partitioned table "public.plt1" >>> Column | Type | Collation | Nullable | Default | Storage >>> | Compression | Stats target | Description >>> >>> --------+-------------------+-----------+----------+---------+----------+-------------+--------------+------------- >>> a | integer | | | | plain >>> | | | >>> b | character varying | | | | extended >>> | | | >>> Partition key: LIST (b) >>> Partitions: plt1_p1 FOR VALUES IN ('(0001,0001)', '(0002,0002)') >>> >>> I think it should throw an error as the partition by list has only 1 >>> column but we are giving 2 values. >>> also if you see \d+ showing plt1_p1 partition value as ‘(0001,0001)’ >>> instead of ('0001','0001'). >>> >>> Thanks & Regards, >>> Rajkumar Raghuwanshi >>> >>> >>> >>> On Sun, Oct 3, 2021 at 1:52 AM Nitin Jadhav < >>> nitinjadhavpostg...@gmail.com> wrote: >>> >>>> > > On PG head + Nitin's v3 patch + Amit's Delta patch. Make check is >>>> failing with below errors. >>>> > >>>> > Thanks Rajkumar for testing. >>>> > >>>> > Here's a v2 of the delta patch that should fix both of these test >>>> > failures. As I mentioned in my last reply, my delta patch fixed what >>>> > I think were problems in Nitin's v3 patch but were not complete by >>>> > themselves. Especially, I hadn't bothered to investigate various /* >>>> > TODO: handle multi-column list partitioning */ sites to deal with my >>>> > own changes. >>>> >>>> Thanks Rajkumar for testing and Thank you Amit for working on v2 of >>>> the delta patch. Actually I had done the code changes related to >>>> partition-wise join and I was in the middle of fixing the review >>>> comments, So I could not share the patch. Anyways thanks for your >>>> efforts. >>>> >>>> > I noticed that multi-column list partitions containing NULLs don't >>>> > work correctly with partition pruning yet. >>>> > >>>> > create table p0 (a int, b text, c bool) partition by list (a, b, c); >>>> > create table p01 partition of p0 for values in ((1, 1, true), (NULL, >>>> 1, false)); >>>> > create table p02 partition of p0 for values in ((1, NULL, false)); >>>> > explain select * from p0 where a is null; >>>> > QUERY PLAN >>>> > -------------------------------------------------------- >>>> > Seq Scan on p01 p0 (cost=0.00..22.50 rows=6 width=37) >>>> > Filter: (a IS NULL) >>>> > (2 rows) >>>> > >>>> > In the attached updated version, I've dealt with some of those such >>>> > that at least the existing cases exercising partition pruning and >>>> > partition wise joins now pass. >>>> >>>> wrt partition pruning, I have checked the output of the above case >>>> with the v2 version of the delta patch and without that. The output >>>> remains same. Kindly let me know if I am missing something. But I feel >>>> the above output is correct as the partition p01 is the only partition >>>> which contains NULL value for column a, hence it is showing "Seq scan >>>> on p01" in the output. Kindly correct me if I am wrong. I feel the >>>> code changes related to 'null_keys' is not required, hence not >>>> incorporated that in the attached patch. >>>> >>>> wrt partition-wise join, I had run the regression test (with new cases >>>> related to partition-wise join) on v2 of the delta patch and observed >>>> the crash. Hence I have not incorporated the partition-wise join >>>> related code from v2 of delta patch to main v4 patch. Instead I have >>>> added the partition-wise join related code done by me in the attached >>>> patch. Please share your thoughts and if possible we can improvise the >>>> code. Rest of the changes looks good to me and I have incorporated >>>> that in the attached patch. >>>> >>>> >>>> > I guess that may be due to the following newly added code being >>>> incomplete: >>>> > Maybe this function needs to return a "bitmapset" of indexes, because >>>> > multiple partitions can now contain NULL values. >>>> >>>> I feel this function is not required at all as we are not separating >>>> the non null and null partitions now. Removed in the attached patch. >>>> Also removed the "scan_null' variable from the structure >>>> "PruneStepResult" and cleaned up the corresponding code blocks. >>>> >>>> >>>> > This function name may be too generic. Given that it is specific to >>>> > implementing list bound de-duplication, maybe the following signature >>>> > is more appropriate: >>>> > >>>> > static bool >>>> > checkListBoundDuplicated(List *list_bounds, List *new_bound) >>>> >>>> Yes. The function name looks more generic. How about using >>>> "isListBoundDuplicated()"? I have used this name in the patch. Please >>>> let me know if that does not look correct. >>>> >>>> >>>> > Also, better if the function comment mentions those parameter names, >>>> like: >>>> > >>>> > "Returns TRUE if the list bound element 'new_bound' is already present >>>> > in the target list 'list_bounds', FALSE otherwise." >>>> >>>> Fixed. >>>> >>>> >>>> > +/* >>>> > + * transformPartitionListBounds >>>> > + * >>>> > + * Converts the expressions of list partition bounds from the raw >>>> grammar >>>> > + * representation. >>>> > >>>> > A sentence about the result format would be helpful, like: >>>> > >>>> > The result is a List of Lists of Const nodes to account for the >>>> > partition key possibly containing more than one column. >>>> >>>> Fixed. >>>> >>>> >>>> > + int i = 0; >>>> > + int j = 0; >>>> > >>>> > Better to initialize such loop counters closer to the loop. >>>> >>>> Fixed in all the places. >>>> >>>> >>>> > + colname[i] = (char *) palloc0(NAMEDATALEN * sizeof(char)); >>>> > + colname[i] = get_attname(RelationGetRelid(parent), >>>> > + key->partattrs[i], false); >>>> > >>>> > The palloc in the 1st statement is wasteful, because the 2nd statement >>>> > overwrites its pointer by the pointer to the string palloc'd by >>>> > get_attname(). >>>> >>>> Removed the 1st statement as it is not required. >>>> >>>> >>>> > + ListCell *cell2 = NULL; >>>> > >>>> > No need to explicitly initialize the loop variable. >>>> >>>> Fixed in all the places. >>>> >>>> >>>> > + RowExpr *rowexpr = NULL; >>>> > + >>>> > + if (!IsA(expr, RowExpr)) >>>> > + ereport(ERROR, >>>> > + (errcode(ERRCODE_INVALID_TABLE_DEFINITION), >>>> > + errmsg("Invalid list bound specification"), >>>> > + parser_errposition(pstate, exprLocation((Node >>>> > *) spec)))); >>>> > + >>>> > + rowexpr = (RowExpr *) expr; >>>> > >>>> > It's okay to assign rowexpr at the top here instead of the dummy >>>> > NULL-initialization and write the condition as: >>>> > >>>> > if (!IsA(rowexpr, RowExpr)) >>>> >>>> Fixed. >>>> >>>> >>>> > + if (isDuplicate) >>>> > + continue; >>>> > + >>>> > + result = lappend(result, values); >>>> > >>>> > I can see you copied this style from the existing code, but how about >>>> > writing this simply as: >>>> > >>>> > if (!isDuplicate) >>>> > result = lappend(result, values); >>>> >>>> This looks good. I have changed in the patch. >>>> >>>> >>>> > -/* One value coming from some (index'th) list partition */ >>>> > +/* One bound of a list partition */ >>>> > typedef struct PartitionListValue >>>> > { >>>> > int index; >>>> > - Datum value; >>>> > + Datum *values; >>>> > + bool *isnulls; >>>> > } PartitionListValue; >>>> > >>>> > Given that this is a locally-defined struct, I wonder if it makes >>>> > sense to rename the struct while we're at it. Call it, say, >>>> > PartitionListBound? >>>> >>>> Yes. PartitionListBound looks more appropriate and it also matches the >>>> similar structures of the other partition strategies. >>>> >>>> > Also, please keep part of the existing comment that says that the >>>> > bound belongs to index'th partition. >>>> >>>> Retained the old comment. >>>> >>>> >>>> > + * partition_bound_accepts_nulls >>>> > + * >>>> > + * Returns TRUE if partition bound has NULL value, FALSE otherwise. >>>> > */ >>>> > >>>> > I suggest slight rewording, as follows: >>>> > >>>> > "Returns TRUE if any of the partition bounds contains a NULL value, >>>> > FALSE otherwise." >>>> >>>> Fixed. >>>> >>>> >>>> > - PartitionListValue *all_values; >>>> > + PartitionListValue **all_values; >>>> > ... >>>> > - all_values = (PartitionListValue *) >>>> > - palloc(ndatums * sizeof(PartitionListValue)); >>>> > + ndatums = get_list_datum_count(boundspecs, nparts); >>>> > + all_values = (PartitionListValue **) >>>> > + palloc(ndatums * sizeof(PartitionListValue *)); >>>> > >>>> > I don't see the need to redefine all_values's pointer type. No need >>>> > to palloc PartitionListValue repeatedly for every datum as done >>>> > further down as follows: >>>> > >>>> > + all_values[j] = (PartitionListValue *) >>>> > palloc(sizeof(PartitionListValue)); >>>> > >>>> > You do need the following two though: >>>> > >>>> > + all_values[j]->values = (Datum *) palloc0(key->partnatts * >>>> > sizeof(Datum)); >>>> > + all_values[j]->isnulls = (bool *) palloc0(key->partnatts * >>>> > sizeof(bool)); >>>> > >>>> > If you change the above the way I suggest, you'd also need to revert >>>> > the following change: >>>> > >>>> > - qsort_arg(all_values, ndatums, sizeof(PartitionListValue), >>>> > + qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), >>>> > qsort_partition_list_value_cmp, (void *) key); >>>> > >>>> > + int orig_index = all_values[i]->index; >>>> > + boundinfo->datums[i] = (Datum *) palloc(key->partnatts * >>>> sizeof(Datum)); >>>> > >>>> > Missing a newline between these two statements. >>>> >>>> Fixed. Made necessary changes to keep the intent of existing code. >>>> >>>> >>>> > @@ -915,7 +949,7 @@ partition_bounds_equal(int partnatts, int16 >>>> > *parttyplen, bool *parttypbyval, >>>> > if (b1->nindexes != b2->nindexes) >>>> > return false; >>>> > >>>> > - if (b1->null_index != b2->null_index) >>>> > + if (get_partition_bound_null_index(b1) != >>>> > get_partition_bound_null_index(b2)) >>>> > >>>> > As mentioned in the last message, this bit in partition_bounds_equal() >>>> > needs to be comparing "bitmapsets" of null bound indexes, that is >>>> > after fixing get_partition_bound_null_index() as previously mentioned. >>>> >>>> As mentioned earlier, removed the functionality of >>>> get_partition_bound_null_index(), hence the above condition is not >>>> required and removed. >>>> >>>> > But... >>>> > >>>> > @@ -988,7 +1022,22 @@ partition_bounds_equal(int partnatts, int16 >>>> > *parttyplen, bool *parttypbyval, >>>> > * context. datumIsEqual() should be simple enough to >>>> be >>>> > * safe. >>>> > */ >>>> > - if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j], >>>> > + if (b1->isnulls) >>>> > + b1_isnull = b1->isnulls[i][j]; >>>> > + if (b2->isnulls) >>>> > + b2_isnull = b2->isnulls[i][j]; >>>> > + >>>> > + /* >>>> > + * If any of the partition bound has NULL value, then >>>> check >>>> > + * equality for the NULL value instead of comparing >>>> the datums >>>> > + * as it does not contain valid value in case of NULL. >>>> > + */ >>>> > + if (b1_isnull || b2_isnull) >>>> > + { >>>> > + if (b1_isnull != b2_isnull) >>>> > + return false; >>>> > + } >>>> > >>>> > ...if you have this in the main loop, I don't think we need the above >>>> > code stanza which appears to implement a short-cut for this long-form >>>> > logic. >>>> >>>> Yes. May be we could have ignored the above code stanza if we would >>>> have comparing the null indexes using get_partition_bound_null_index() >>>> in the beginning of the function. But hence we are not separating the >>>> non null partitions and null partitions, I would like to keep the >>>> logic in the inner loop as we are doing it for non null bound values >>>> in the above code stanza, just to give a feel that null bound values >>>> are also handled the same way as non null values. Please correct me if >>>> I am wrong. >>>> >>>> >>>> > + (key->strategy != PARTITION_STRATEGY_LIST || >>>> > + !src->isnulls[i][j])) >>>> > >>>> > I think it's better to write this condition as follows just like the >>>> > accompanying condition involving src->kind: >>>> > >>>> > (src->nulls == NULL || !src->isnulls[i][j]) >>>> >>>> Fixed. >>>> >>>> >>>> > In check_new_partition_bound(): >>>> > >>>> > + Datum *values = (Datum *) >>>> > palloc0(key->partnatts * sizeof(Datum)); >>>> > + bool *isnulls = (bool *) >>>> > palloc0(key->partnatts * sizeof(bool)); >>>> > >>>> > Doesn't seem like a bad idea to declare these as: >>>> > >>>> > Datum values[PARTITION_MAX_KEYS]; >>>> > bool isnulls[PARTITION_MAX_KEYS]; >>>> >>>> Thanks for the suggestion. I have changed as above. >>>> >>>> > I looked at get_qual_for_list_multi_column() and immediately thought >>>> > that it may be a bad idea. I think it's better to integrate the logic >>>> > for multi-column case into the existing function even if that makes >>>> > the function appear more complex. Having two functions with the same >>>> > goal and mostly the same code is not a good idea mainly because it >>>> > becomes a maintenance burden. >>>> >>>> Actually I had written a separate function because of the complexity. >>>> Now I have understood that since the objective is same, it should be >>>> done in a single function irrespective of complexity. >>>> >>>> > I have attempted a rewrite such that get_qual_for_list() now handles >>>> > both the single-column and multi-column cases. Changes included in >>>> > the delta patch. The patch updates some outputs of the newly added >>>> > tests for multi-column list partitions, because the new code emits the >>>> > IS NOT NULL tests a bit differently than >>>> > get_qual_for_list_mutli_column() would. Notably, the old approach >>>> > would emit IS NOT NULL for every non-NULL datum matched to a given >>>> > column, not just once for the column. However, the patch makes a few >>>> > other tests fail, mainly because I had to fix >>>> > partition_bound_accepts_nulls() to handle the multi-column case, >>>> > though didn't bother to update all callers of it to also handle the >>>> > multi-column case correctly. I guess that's a TODO you're going to >>>> > deal with at some point anyway. :) >>>> >>>> Thank you very much for your efforts. The changes looks good to me and >>>> I have incorporated these changes in the attached patch. >>>> >>>> I have completed the coding for all the TODOs and hence removed in the >>>> patch. The naming conventions used for function/variable names varies >>>> across the files. Some places it is like 'namesLikeThis' and in some >>>> place it is like 'names_like_this'. I have used the naming conventions >>>> based on the surrounding styles used. I am happy to change those if >>>> required. >>>> >>>> I have verified 'make check' with the attached patch and it is working >>>> fine. >>>> >>>> >>>> Thanks & Regards, >>>> Nitin Jadhav >>>> >>>> >>>> On Mon, Sep 13, 2021 at 3:47 PM Rajkumar Raghuwanshi >>>> <rajkumar.raghuwan...@enterprisedb.com> wrote: >>>> > >>>> > On PG head + Nitin's v3 patch + Amit's Delta patch. Make check is >>>> failing with below errors. >>>> > >>>> > --inherit.sql is failing with error :"ERROR: negative bitmapset >>>> member not allowed" >>>> > update mlparted_tab mlp set c = 'xxx' >>>> > from >>>> > (select a from some_tab union all select a+1 from some_tab) ss (a) >>>> > where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3; >>>> > ERROR: negative bitmapset member not allowed >>>> > >>>> > --partition_join.sql is crashing with enable_partitionwise_join set >>>> to true. >>>> > CREATE TABLE plt1_adv (a int, b int, c text) PARTITION BY LIST (c); >>>> > CREATE TABLE plt1_adv_p1 PARTITION OF plt1_adv FOR VALUES IN ('0001', >>>> '0003'); >>>> > CREATE TABLE plt1_adv_p2 PARTITION OF plt1_adv FOR VALUES IN ('0004', >>>> '0006'); >>>> > CREATE TABLE plt1_adv_p3 PARTITION OF plt1_adv FOR VALUES IN ('0008', >>>> '0009'); >>>> > INSERT INTO plt1_adv SELECT i, i, to_char(i % 10, 'FM0000') FROM >>>> generate_series(1, 299) i WHERE i % 10 IN (1, 3, 4, 6, 8, 9); >>>> > ANALYZE plt1_adv; >>>> > CREATE TABLE plt2_adv (a int, b int, c text) PARTITION BY LIST (c); >>>> > CREATE TABLE plt2_adv_p1 PARTITION OF plt2_adv FOR VALUES IN ('0002', >>>> '0003'); >>>> > CREATE TABLE plt2_adv_p2 PARTITION OF plt2_adv FOR VALUES IN ('0004', >>>> '0006'); >>>> > CREATE TABLE plt2_adv_p3 PARTITION OF plt2_adv FOR VALUES IN ('0007', >>>> '0009'); >>>> > INSERT INTO plt2_adv SELECT i, i, to_char(i % 10, 'FM0000') FROM >>>> generate_series(1, 299) i WHERE i % 10 IN (2, 3, 4, 6, 7, 9); >>>> > ANALYZE plt2_adv; >>>> > -- inner join >>>> > EXPLAIN (COSTS OFF) >>>> > SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 >>>> ON (t1.a = t2.a AND t1.c = t2.c) WHERE t1.b < 10 ORDER BY t1.a; >>>> > server closed the connection unexpectedly >>>> > This probably means the server terminated abnormally >>>> > before or while processing the request. >>>> > connection to server was lost >>>> > >>>> > >>>> > --stack-trace >>>> > Core was generated by `postgres: edb regression [local] EXPLAIN >>>> '. >>>> > Program terminated with signal 6, Aborted. >>>> > #0 0x00007f7d339ba277 in raise () from /lib64/libc.so.6 >>>> > Missing separate debuginfos, use: debuginfo-install >>>> glibc-2.17-222.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 >>>> krb5-libs-1.15.1-19.el7.x86_64 libcom_err-1.42.9-12.el7_5.x86_64 >>>> libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-12.el7.x86_64 >>>> openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 >>>> zlib-1.2.7-17.el7.x86_64 >>>> > (gdb) bt >>>> > #0 0x00007f7d339ba277 in raise () from /lib64/libc.so.6 >>>> > #1 0x00007f7d339bb968 in abort () from /lib64/libc.so.6 >>>> > #2 0x0000000000b0fbc3 in ExceptionalCondition >>>> (conditionName=0xcbda10 "part_index >= 0", errorType=0xcbd1c3 >>>> "FailedAssertion", fileName=0xcbd2fe "partbounds.c", lineNumber=1957) >>>> > at assert.c:69 >>>> > #3 0x0000000000892aa1 in is_dummy_partition (rel=0x19b37c0, >>>> part_index=-1) at partbounds.c:1957 >>>> > #4 0x00000000008919bd in merge_list_bounds (partnatts=1, >>>> partsupfunc=0x1922798, partcollation=0x1922738, outer_rel=0x19b37c0, >>>> inner_rel=0x1922938, jointype=JOIN_INNER, >>>> > outer_parts=0x7fffd67751b0, inner_parts=0x7fffd67751a8) at >>>> partbounds.c:1529 >>>> > #5 0x00000000008910de in partition_bounds_merge (partnatts=1, >>>> partsupfunc=0x1922798, partcollation=0x1922738, outer_rel=0x19b37c0, >>>> inner_rel=0x1922938, jointype=JOIN_INNER, >>>> > outer_parts=0x7fffd67751b0, inner_parts=0x7fffd67751a8) at >>>> partbounds.c:1223 >>>> > #6 0x000000000082c41a in compute_partition_bounds (root=0x1a19ed0, >>>> rel1=0x19b37c0, rel2=0x1922938, joinrel=0x1ab7f30, >>>> parent_sjinfo=0x7fffd67752a0, parts1=0x7fffd67751b0, >>>> > parts2=0x7fffd67751a8) at joinrels.c:1644 >>>> > #7 0x000000000082bc34 in try_partitionwise_join (root=0x1a19ed0, >>>> rel1=0x19b37c0, rel2=0x1922938, joinrel=0x1ab7f30, >>>> parent_sjinfo=0x7fffd67752a0, parent_restrictlist=0x1ab3318) >>>> > at joinrels.c:1402 >>>> > #8 0x000000000082aea2 in populate_joinrel_with_paths >>>> (root=0x1a19ed0, rel1=0x19b37c0, rel2=0x1922938, joinrel=0x1ab7f30, >>>> sjinfo=0x7fffd67752a0, restrictlist=0x1ab3318) >>>> > at joinrels.c:926 >>>> > #9 0x000000000082a8f5 in make_join_rel (root=0x1a19ed0, >>>> rel1=0x19b37c0, rel2=0x1922938) at joinrels.c:760 >>>> > #10 0x0000000000829e03 in make_rels_by_clause_joins (root=0x1a19ed0, >>>> old_rel=0x19b37c0, other_rels_list=0x1ab2970, other_rels=0x1ab2990) at >>>> joinrels.c:312 >>>> > #11 0x00000000008298d9 in join_search_one_level (root=0x1a19ed0, >>>> level=2) at joinrels.c:123 >>>> > #12 0x000000000080c566 in standard_join_search (root=0x1a19ed0, >>>> levels_needed=2, initial_rels=0x1ab2970) at allpaths.c:3020 >>>> > #13 0x000000000080c4df in make_rel_from_joinlist (root=0x1a19ed0, >>>> joinlist=0x199d538) at allpaths.c:2951 >>>> > #14 0x000000000080816b in make_one_rel (root=0x1a19ed0, >>>> joinlist=0x199d538) at allpaths.c:228 >>>> > #15 0x000000000084491d in query_planner (root=0x1a19ed0, >>>> qp_callback=0x84a538 <standard_qp_callback>, qp_extra=0x7fffd6775630) at >>>> planmain.c:276 >>>> > #16 0x0000000000847040 in grouping_planner (root=0x1a19ed0, >>>> tuple_fraction=0) at planner.c:1447 >>>> > #17 0x0000000000846709 in subquery_planner (glob=0x19b39d8, >>>> parse=0x1aaa290, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at >>>> planner.c:1025 >>>> > #18 0x0000000000844f3e in standard_planner (parse=0x1aaa290, >>>> > query_string=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c, >>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c >>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", cursorOptions=2048, >>>> boundParams=0x0) at planner.c:406 >>>> > #19 0x0000000000844ce9 in planner (parse=0x1aaa290, >>>> > query_string=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c, >>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c >>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", cursorOptions=2048, >>>> boundParams=0x0) at planner.c:277 >>>> > #20 0x0000000000978483 in pg_plan_query (querytree=0x1aaa290, >>>> > query_string=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c, >>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c >>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", cursorOptions=2048, >>>> boundParams=0x0) at postgres.c:847 >>>> > #21 0x00000000006937fc in ExplainOneQuery (query=0x1aaa290, >>>> cursorOptions=2048, into=0x0, es=0x19b36f0, >>>> > queryString=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c, >>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c >>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", >>>> > params=0x0, queryEnv=0x0) at explain.c:397 >>>> > #22 0x0000000000693351 in ExplainQuery (pstate=0x197c410, >>>> stmt=0x1aaa0b0, params=0x0, dest=0x197c378) at explain.c:281 >>>> > #23 0x00000000009811fa in standard_ProcessUtility (pstmt=0x1a0bfc8, >>>> > queryString=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c, >>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c >>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", >>>> > readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, >>>> queryEnv=0x0, dest=0x197c378, qc=0x7fffd6775f90) at utility.c:845 >>>> > #24 0x00000000009809ec in ProcessUtility (pstmt=0x1a0bfc8, >>>> > queryString=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c, >>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c >>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", >>>> > readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, >>>> queryEnv=0x0, dest=0x197c378, qc=0x7fffd6775f90) at utility.c:527 >>>> > #25 0x000000000097f636 in PortalRunUtility (portal=0x1893b40, >>>> pstmt=0x1a0bfc8, isTopLevel=true, setHoldSnapshot=true, dest=0x197c378, >>>> qc=0x7fffd6775f90) at pquery.c:1147 >>>> > #26 0x000000000097f3a5 in FillPortalStore (portal=0x1893b40, >>>> isTopLevel=true) at pquery.c:1026 >>>> > #27 0x000000000097ed11 in PortalRun (portal=0x1893b40, >>>> count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x1a0c0b8, >>>> altdest=0x1a0c0b8, qc=0x7fffd6776150) at pquery.c:758 >>>> > #28 0x0000000000978aa5 in exec_simple_query ( >>>> > >>>> > Thanks & Regards, >>>> > Rajkumar Raghuwanshi >>>> > >>>> > >>>> > On Fri, Sep 3, 2021 at 7:17 PM Amit Langote <amitlangot...@gmail.com> >>>> wrote: >>>> >> >>>> >> On Wed, Sep 1, 2021 at 2:31 PM Amit Langote <amitlangot...@gmail.com> >>>> wrote: >>>> >> > On Tue, Aug 31, 2021 at 8:02 PM Nitin Jadhav >>>> >> > <nitinjadhavpostg...@gmail.com> wrote: >>>> >> > > The attached patch also fixes the above comments. >>>> >> > >>>> >> > I noticed that multi-column list partitions containing NULLs don't >>>> >> > work correctly with partition pruning yet. >>>> >> > >>>> >> > create table p0 (a int, b text, c bool) partition by list (a, b, >>>> c); >>>> >> > create table p01 partition of p0 for values in ((1, 1, true), >>>> (NULL, 1, false)); >>>> >> > create table p02 partition of p0 for values in ((1, NULL, false)); >>>> >> > explain select * from p0 where a is null; >>>> >> > QUERY PLAN >>>> >> > -------------------------------------------------------- >>>> >> > Seq Scan on p01 p0 (cost=0.00..22.50 rows=6 width=37) >>>> >> > Filter: (a IS NULL) >>>> >> > (2 rows) >>>> >> > >>>> >> > I guess that may be due to the following newly added code being >>>> incomplete: >>>> >> > >>>> >> > +/* >>>> >> > + * get_partition_bound_null_index >>>> >> > + * >>>> >> > + * Returns the partition index of the partition bound which >>>> accepts NULL. >>>> >> > + */ >>>> >> > +int >>>> >> > +get_partition_bound_null_index(PartitionBoundInfo boundinfo) >>>> >> > +{ >>>> >> > + int i = 0; >>>> >> > + int j = 0; >>>> >> > + >>>> >> > + if (!boundinfo->isnulls) >>>> >> > + return -1; >>>> >> > >>>> >> > - if (!val->constisnull) >>>> >> > - count++; >>>> >> > + for (i = 0; i < boundinfo->ndatums; i++) >>>> >> > + { >>>> >> > + //TODO: Handle for multi-column cases >>>> >> > + for (j = 0; j < 1; j++) >>>> >> > + { >>>> >> > + if (boundinfo->isnulls[i][j]) >>>> >> > + return boundinfo->indexes[i]; >>>> >> > } >>>> >> > } >>>> >> > >>>> >> > + return -1; >>>> >> > +} >>>> >> > >>>> >> > Maybe this function needs to return a "bitmapset" of indexes, >>>> because >>>> >> > multiple partitions can now contain NULL values. >>>> >> > >>>> >> > Some other issues I noticed and suggestions for improvement: >>>> >> > >>>> >> > +/* >>>> >> > + * checkForDuplicates >>>> >> > + * >>>> >> > + * Returns TRUE if the list bound element is already present in >>>> the list of >>>> >> > + * list bounds, FALSE otherwise. >>>> >> > + */ >>>> >> > +static bool >>>> >> > +checkForDuplicates(List *source, List *searchElem) >>>> >> > >>>> >> > This function name may be too generic. Given that it is specific >>>> to >>>> >> > implementing list bound de-duplication, maybe the following >>>> signature >>>> >> > is more appropriate: >>>> >> > >>>> >> > static bool >>>> >> > checkListBoundDuplicated(List *list_bounds, List *new_bound) >>>> >> > >>>> >> > Also, better if the function comment mentions those parameter >>>> names, like: >>>> >> > >>>> >> > "Returns TRUE if the list bound element 'new_bound' is already >>>> present >>>> >> > in the target list 'list_bounds', FALSE otherwise." >>>> >> > >>>> >> > +/* >>>> >> > + * transformPartitionListBounds >>>> >> > + * >>>> >> > + * Converts the expressions of list partition bounds from the raw >>>> grammar >>>> >> > + * representation. >>>> >> > >>>> >> > A sentence about the result format would be helpful, like: >>>> >> > >>>> >> > The result is a List of Lists of Const nodes to account for the >>>> >> > partition key possibly containing more than one column. >>>> >> > >>>> >> > + int i = 0; >>>> >> > + int j = 0; >>>> >> > >>>> >> > Better to initialize such loop counters closer to the loop. >>>> >> > >>>> >> > + colname[i] = (char *) palloc0(NAMEDATALEN * >>>> sizeof(char)); >>>> >> > + colname[i] = get_attname(RelationGetRelid(parent), >>>> >> > + key->partattrs[i], false); >>>> >> > >>>> >> > The palloc in the 1st statement is wasteful, because the 2nd >>>> statement >>>> >> > overwrites its pointer by the pointer to the string palloc'd by >>>> >> > get_attname(). >>>> >> > >>>> >> > + ListCell *cell2 = NULL; >>>> >> > >>>> >> > No need to explicitly initialize the loop variable. >>>> >> > >>>> >> > + RowExpr *rowexpr = NULL; >>>> >> > + >>>> >> > + if (!IsA(expr, RowExpr)) >>>> >> > + ereport(ERROR, >>>> >> > + (errcode(ERRCODE_INVALID_TABLE_DEFINITION), >>>> >> > + errmsg("Invalid list bound specification"), >>>> >> > + parser_errposition(pstate, >>>> exprLocation((Node >>>> >> > *) spec)))); >>>> >> > + >>>> >> > + rowexpr = (RowExpr *) expr; >>>> >> > >>>> >> > It's okay to assign rowexpr at the top here instead of the dummy >>>> >> > NULL-initialization and write the condition as: >>>> >> > >>>> >> > if (!IsA(rowexpr, RowExpr)) >>>> >> > >>>> >> > + if (isDuplicate) >>>> >> > + continue; >>>> >> > + >>>> >> > + result = lappend(result, values); >>>> >> > >>>> >> > I can see you copied this style from the existing code, but how >>>> about >>>> >> > writing this simply as: >>>> >> > >>>> >> > if (!isDuplicate) >>>> >> > result = lappend(result, values); >>>> >> > >>>> >> > -/* One value coming from some (index'th) list partition */ >>>> >> > +/* One bound of a list partition */ >>>> >> > typedef struct PartitionListValue >>>> >> > { >>>> >> > int index; >>>> >> > - Datum value; >>>> >> > + Datum *values; >>>> >> > + bool *isnulls; >>>> >> > } PartitionListValue; >>>> >> > >>>> >> > Given that this is a locally-defined struct, I wonder if it makes >>>> >> > sense to rename the struct while we're at it. Call it, say, >>>> >> > PartitionListBound? >>>> >> > >>>> >> > Also, please keep part of the existing comment that says that the >>>> >> > bound belongs to index'th partition. >>>> >> > >>>> >> > Will send more comments in a bit... >>>> >> >>>> >> + * partition_bound_accepts_nulls >>>> >> + * >>>> >> + * Returns TRUE if partition bound has NULL value, FALSE otherwise. >>>> >> */ >>>> >> >>>> >> I suggest slight rewording, as follows: >>>> >> >>>> >> "Returns TRUE if any of the partition bounds contains a NULL value, >>>> >> FALSE otherwise." >>>> >> >>>> >> - PartitionListValue *all_values; >>>> >> + PartitionListValue **all_values; >>>> >> ... >>>> >> - all_values = (PartitionListValue *) >>>> >> - palloc(ndatums * sizeof(PartitionListValue)); >>>> >> + ndatums = get_list_datum_count(boundspecs, nparts); >>>> >> + all_values = (PartitionListValue **) >>>> >> + palloc(ndatums * sizeof(PartitionListValue *)); >>>> >> >>>> >> I don't see the need to redefine all_values's pointer type. No need >>>> >> to palloc PartitionListValue repeatedly for every datum as done >>>> >> further down as follows: >>>> >> >>>> >> + all_values[j] = (PartitionListValue *) >>>> >> palloc(sizeof(PartitionListValue)); >>>> >> >>>> >> You do need the following two though: >>>> >> >>>> >> + all_values[j]->values = (Datum *) palloc0(key->partnatts >>>> * >>>> >> sizeof(Datum)); >>>> >> + all_values[j]->isnulls = (bool *) palloc0(key->partnatts >>>> * >>>> >> sizeof(bool)); >>>> >> >>>> >> If you change the above the way I suggest, you'd also need to revert >>>> >> the following change: >>>> >> >>>> >> - qsort_arg(all_values, ndatums, sizeof(PartitionListValue), >>>> >> + qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), >>>> >> qsort_partition_list_value_cmp, (void *) key); >>>> >> >>>> >> + int orig_index = all_values[i]->index; >>>> >> + boundinfo->datums[i] = (Datum *) palloc(key->partnatts * >>>> sizeof(Datum)); >>>> >> >>>> >> Missing a newline between these two statements. >>>> >> >>>> >> BTW, I noticed that the boundDatums variable is no longer used in >>>> >> create_list_bounds. I traced back its origin and found that a recent >>>> >> commit 53d86957e98 introduced it to implement an idea to reduce the >>>> >> finer-grained pallocs that were being done in create_list_bounds(). >>>> I >>>> >> don't think that this patch needs to throw away that work. You can >>>> >> make it work as the attached delta patch that applies on top of v3. >>>> >> Please check. >>>> >> >>>> >> @@ -915,7 +949,7 @@ partition_bounds_equal(int partnatts, int16 >>>> >> *parttyplen, bool *parttypbyval, >>>> >> if (b1->nindexes != b2->nindexes) >>>> >> return false; >>>> >> >>>> >> - if (b1->null_index != b2->null_index) >>>> >> + if (get_partition_bound_null_index(b1) != >>>> >> get_partition_bound_null_index(b2)) >>>> >> >>>> >> As mentioned in the last message, this bit in >>>> partition_bounds_equal() >>>> >> needs to be comparing "bitmapsets" of null bound indexes, that is >>>> >> after fixing get_partition_bound_null_index() as previously >>>> mentioned. >>>> >> >>>> >> But... >>>> >> >>>> >> @@ -988,7 +1022,22 @@ partition_bounds_equal(int partnatts, int16 >>>> >> *parttyplen, bool *parttypbyval, >>>> >> * context. datumIsEqual() should be simple enough >>>> to be >>>> >> * safe. >>>> >> */ >>>> >> - if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j], >>>> >> + if (b1->isnulls) >>>> >> + b1_isnull = b1->isnulls[i][j]; >>>> >> + if (b2->isnulls) >>>> >> + b2_isnull = b2->isnulls[i][j]; >>>> >> + >>>> >> + /* >>>> >> + * If any of the partition bound has NULL value, >>>> then check >>>> >> + * equality for the NULL value instead of comparing >>>> the datums >>>> >> + * as it does not contain valid value in case of >>>> NULL. >>>> >> + */ >>>> >> + if (b1_isnull || b2_isnull) >>>> >> + { >>>> >> + if (b1_isnull != b2_isnull) >>>> >> + return false; >>>> >> + } >>>> >> >>>> >> ...if you have this in the main loop, I don't think we need the above >>>> >> code stanza which appears to implement a short-cut for this long-form >>>> >> logic. >>>> >> >>>> >> + (key->strategy != PARTITION_STRATEGY_LIST || >>>> >> + !src->isnulls[i][j])) >>>> >> >>>> >> I think it's better to write this condition as follows just like the >>>> >> accompanying condition involving src->kind: >>>> >> >>>> >> (src->nulls == NULL || !src->isnulls[i][j]) >>>> >> >>>> >> (Skipped looking at merge_list_bounds() and related changes for now >>>> as >>>> >> I see a lot of TODOs remain to be done.) >>>> >> >>>> >> In check_new_partition_bound(): >>>> >> >>>> >> + Datum *values = (Datum *) >>>> >> palloc0(key->partnatts * sizeof(Datum)); >>>> >> + bool *isnulls = (bool *) >>>> >> palloc0(key->partnatts * sizeof(bool)); >>>> >> >>>> >> Doesn't seem like a bad idea to declare these as: >>>> >> >>>> >> Datum values[PARTITION_MAX_KEYS]; >>>> >> bool isnulls[PARTITION_MAX_KEYS]; >>>> >> >>>> >> >>>> >> I looked at get_qual_for_list_multi_column() and immediately thought >>>> >> that it may be a bad idea. I think it's better to integrate the >>>> logic >>>> >> for multi-column case into the existing function even if that makes >>>> >> the function appear more complex. Having two functions with the same >>>> >> goal and mostly the same code is not a good idea mainly because it >>>> >> becomes a maintenance burden. >>>> >> >>>> >> I have attempted a rewrite such that get_qual_for_list() now handles >>>> >> both the single-column and multi-column cases. Changes included in >>>> >> the delta patch. The patch updates some outputs of the newly added >>>> >> tests for multi-column list partitions, because the new code emits >>>> the >>>> >> IS NOT NULL tests a bit differently than >>>> >> get_qual_for_list_mutli_column() would. Notably, the old approach >>>> >> would emit IS NOT NULL for every non-NULL datum matched to a given >>>> >> column, not just once for the column. However, the patch makes a few >>>> >> other tests fail, mainly because I had to fix >>>> >> partition_bound_accepts_nulls() to handle the multi-column case, >>>> >> though didn't bother to update all callers of it to also handle the >>>> >> multi-column case correctly. I guess that's a TODO you're going to >>>> >> deal with at some point anyway. :) >>>> >> >>>> >> I still have more than half of v3 left to look at, so will continue >>>> >> looking. In the meantime, please check the changes I suggested, >>>> >> including the delta patch, and let me know your thoughts. >>>> >> >>>> >> -- >>>> >> Amit Langote >>>> >> EDB: http://www.enterprisedb.com >>>> >>>