Re: Multi-Column List Partitioning

Rajkumar Raghuwanshi Mon, 11 Oct 2021 03:43:11 -0700

Hi Nitin,

While testing further I got a crash with partition wise join enabled for
multi-col list partitions. please find test case & stack-trace below.


SET enable_partitionwise_join TO on;
CREATE TABLE plt1 (c varchar, d varchar) PARTITION BY LIST(c,d);
CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN
(('0001','0001'),('0002','0002'),(NULL,NULL));
CREATE TABLE plt1_p2 PARTITION OF plt1 FOR VALUES IN
(('0004','0004'),('0005','0005'),('0006','0006'));
INSERT INTO plt1 SELECT to_char(i % 11, 'FM0000'), to_char(i % 11,
'FM0000') FROM generate_series(0, 500) i WHERE i % 11 NOT IN (0,10,3,7,8,9);
INSERT INTO plt1 SELECT NULL,NULL FROM generate_series(0, 500) i WHERE i %
11 IN (3);
ANALYSE plt1;
CREATE TABLE plt2 (c varchar, d varchar) PARTITION BY LIST(c,d);
CREATE TABLE plt2_p1 PARTITION OF plt2 FOR VALUES IN
(('0001','0001'),('0002','0002'));
CREATE TABLE plt2_p2 PARTITION OF plt2 FOR VALUES IN
(('0004','0004'),('0005','0005'),('0006','0006'));
CREATE TABLE plt2_p3 PARTITION OF plt2 DEFAULT;
INSERT INTO plt2 SELECT to_char(i % 11, 'FM0000'), to_char(i % 11,
'FM0000') FROM generate_series(0, 500) i WHERE i % 11 NOT IN (0,10,3);
INSERT INTO plt2 SELECT NULL,NULL FROM generate_series(0, 500) i WHERE i %
11 IN (3);
ANALYSE plt2;

EXPLAIN (COSTS OFF)
SELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON
(t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d =
t3.d);

postgres=# EXPLAIN (COSTS OFF)
postgres-# SELECT t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN
plt2 t2 ON (t1.c = t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c
AND t2.d = t3.d);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?> \q
[edb@localhost bin]$ gdb -q -c data/core.66926 postgres
Reading symbols from
/home/edb/WORK/pg_src/PG_TEMP/postgresql/inst/bin/postgres...done.
[New LWP 66926]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: edb postgres [local] EXPLAIN
             '.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000082be39 in is_dummy_rel (rel=0x40) at joinrels.c:1221
1221 if (rel->pathlist == NIL)
(gdb) bt
#0  0x000000000082be39 in is_dummy_rel (rel=0x40) at joinrels.c:1221
#1  0x000000000089341c in is_dummy_partition (rel=0x2f86e88, part_index=2)
at partbounds.c:1959
#2  0x0000000000891d38 in merge_list_bounds (partnatts=2,
partsupfunc=0x2f70058, partcollation=0x2fd3c98, outer_rel=0x2f86e88,
inner_rel=0x2fd4368, jointype=JOIN_LEFT,
    outer_parts=0x7ffea91f8cc0, inner_parts=0x7ffea91f8cb8) at
partbounds.c:1325
#3  0x0000000000891991 in partition_bounds_merge (partnatts=2,
partsupfunc=0x2f70058, partcollation=0x2fd3c98, outer_rel=0x2f86e88,
inner_rel=0x2fd4368, jointype=JOIN_LEFT,
    outer_parts=0x7ffea91f8cc0, inner_parts=0x7ffea91f8cb8) at
partbounds.c:1198
#4  0x000000000082cc5a in compute_partition_bounds (root=0x2f9e910,
rel1=0x2f86e88, rel2=0x2fd4368, joinrel=0x2fae388, parent_sjinfo=0x2f7dfa8,
parts1=0x7ffea91f8cc0,
    parts2=0x7ffea91f8cb8) at joinrels.c:1644
#5  0x000000000082c474 in try_partitionwise_join (root=0x2f9e910,
rel1=0x2f86e88, rel2=0x2fd4368, joinrel=0x2fae388, parent_sjinfo=0x2f7dfa8,
parent_restrictlist=0x2fae650)
    at joinrels.c:1402
#6  0x000000000082b6e2 in populate_joinrel_with_paths (root=0x2f9e910,
rel1=0x2f86e88, rel2=0x2fd4368, joinrel=0x2fae388, sjinfo=0x2f7dfa8,
restrictlist=0x2fae650) at joinrels.c:926
#7  0x000000000082b135 in make_join_rel (root=0x2f9e910, rel1=0x2f86e88,
rel2=0x2fd4368) at joinrels.c:760
#8  0x000000000082a643 in make_rels_by_clause_joins (root=0x2f9e910,
old_rel=0x2f86e88, other_rels_list=0x2f90148, other_rels=0x2f90160) at
joinrels.c:312
#9  0x000000000082a119 in join_search_one_level (root=0x2f9e910, level=3)
at joinrels.c:123
#10 0x000000000080cd97 in standard_join_search (root=0x2f9e910,
levels_needed=3, initial_rels=0x2f90148) at allpaths.c:3020
#11 0x000000000080cd10 in make_rel_from_joinlist (root=0x2f9e910,
joinlist=0x2fd7550) at allpaths.c:2951
#12 0x000000000080899a in make_one_rel (root=0x2f9e910, joinlist=0x2fd7550)
at allpaths.c:228
#13 0x000000000084516a in query_planner (root=0x2f9e910,
qp_callback=0x84ad85 <standard_qp_callback>, qp_extra=0x7ffea91f9140) at
planmain.c:276
#14 0x000000000084788d in grouping_planner (root=0x2f9e910,
tuple_fraction=0) at planner.c:1447
#15 0x0000000000846f56 in subquery_planner (glob=0x2fa0c08,
parse=0x2f56d30, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at
planner.c:1025
#16 0x000000000084578b in standard_planner (parse=0x2f56d30,
    query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);",
cursorOptions=2048, boundParams=0x0) at planner.c:406
#17 0x0000000000845536 in planner (parse=0x2f56d30,
    query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);",
cursorOptions=2048, boundParams=0x0) at planner.c:277
#18 0x0000000000978faf in pg_plan_query (querytree=0x2f56d30,
    query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);",
cursorOptions=2048, boundParams=0x0) at postgres.c:847
#19 0x0000000000693e50 in ExplainOneQuery (query=0x2f56d30,
cursorOptions=2048, into=0x0, es=0x2fa0920,
    queryString=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);",
params=0x0, queryEnv=0x0) at explain.c:397
#20 0x00000000006939a5 in ExplainQuery (pstate=0x2f9e0a0, stmt=0x2f56b50,
params=0x0, dest=0x2f9e008) at explain.c:281
#21 0x0000000000981de8 in standard_ProcessUtility (pstmt=0x2fd2220,
    queryString=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);",
readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
queryEnv=0x0, dest=0x2f9e008, qc=0x7ffea91f9aa0) at utility.c:862
#22 0x0000000000981585 in ProcessUtility (pstmt=0x2fd2220,
    queryString=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);",
readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
queryEnv=0x0, dest=0x2f9e008, qc=0x7ffea91f9aa0) at utility.c:527
#23 0x00000000009801ba in PortalRunUtility (portal=0x2f10180,
pstmt=0x2fd2220, isTopLevel=true, setHoldSnapshot=true, dest=0x2f9e008,
qc=0x7ffea91f9aa0) at pquery.c:1155
#24 0x000000000097ff20 in FillPortalStore (portal=0x2f10180,
isTopLevel=true) at pquery.c:1028
#25 0x000000000097f883 in PortalRun (portal=0x2f10180,
count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2fd2310,
altdest=0x2fd2310, qc=0x7ffea91f9c60) at pquery.c:760
#26 0x00000000009795d1 in exec_simple_query (
    query_string=0x2eadcd0 "EXPLAIN (COSTS OFF)\nSELECT
t1.c,t2.c,t3.c,t1.d,t2.d,t3.d FROM plt1 t1 INNER JOIN plt2 t2 ON (t1.c =
t2.c AND t1.d = t2.d) LEFT JOIN plt1 t3 on (t2.c = t3.c AND t2.d = t3.d);")
at postgres.c:1214
#27 0x000000000097da8d in PostgresMain (dbname=0x2ed8068 "postgres",
username=0x2ed8048 "edb") at postgres.c:4497
#28 0x00000000008b9699 in BackendRun (port=0x2ecfd00) at postmaster.c:4560

Thanks & Regards,
Rajkumar Raghuwanshi



On Mon, Oct 11, 2021 at 11:05 AM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Thanks for the patch, it applied cleanly and fixed the reported issue.  I
> observed another case where
> In case of multi-col list partition on the same column query is not
> picking partition wise join. Is this expected?
>
> CREATE TABLE plt1 (a int, b int, c varchar) PARTITION BY LIST(c,c);
> CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN
> (('0001','0001'),('0002','0002'),('0003','0003'));
> CREATE TABLE plt1_p2 PARTITION OF plt1 FOR VALUES IN
> (('0004','0004'),('0005','0005'),('0006','0006'));
> CREATE TABLE plt1_p3 PARTITION OF plt1 DEFAULT;
> INSERT INTO plt1 SELECT i, i % 47, to_char(i % 11, 'FM0000') FROM
> generate_series(0, 500) i WHERE i % 11 NOT  IN (0,10);
> ANALYSE plt1;
> CREATE TABLE plt2 (a int, b int, c varchar) PARTITION BY LIST(c,c);
> CREATE TABLE plt2_p1 PARTITION OF plt2 FOR VALUES IN
> (('0001','0001'),('0002','0002'),('0003','0003'));
> CREATE TABLE plt2_p2 PARTITION OF plt2 FOR VALUES IN
> (('0004','0004'),('0005','0005'),('0006','0006'));
> CREATE TABLE plt2_p3 PARTITION OF plt2 DEFAULT;
> INSERT INTO plt2 SELECT i, i % 47, to_char(i % 11, 'FM0000') FROM
> generate_series(0, 500) i WHERE i % 11 NOT  IN (0,10);
> ANALYSE plt2;
> SET enable_partitionwise_join TO true;
> EXPLAIN (COSTS OFF) SELECT t1.a, t1.c, t2.a, t2.c FROM plt1 t1 INNER JOIN
> plt2 t2 ON t1.c = t2.c;
>
> postgres=# EXPLAIN (COSTS OFF) SELECT t1.a, t1.c, t2.a, t2.c FROM plt1 t1
> INNER JOIN plt2 t2 ON t1.c = t2.c;
>                  QUERY PLAN
> --------------------------------------------
>  Hash Join
>    Hash Cond: ((t1.c)::text = (t2.c)::text)
>    ->  Append
>          ->  Seq Scan on plt1_p1 t1_1
>          ->  Seq Scan on plt1_p2 t1_2
>          ->  Seq Scan on plt1_p3 t1_3
>    ->  Hash
>          ->  Append
>                ->  Seq Scan on plt2_p1 t2_1
>                ->  Seq Scan on plt2_p2 t2_2
>                ->  Seq Scan on plt2_p3 t2_3
> (11 rows)
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
>
> On Thu, Oct 7, 2021 at 6:03 PM Nitin Jadhav <nitinjadhavpostg...@gmail.com>
> wrote:
>
>> Thanks Rajkumar for testing.
>>
>> > I think it should throw an error as the partition by list has only 1
>> column but we are giving 2 values.
>>
>> I also agree that it should throw an error in the above case. Fixed the
>> issue in the attached patch. Also added related test cases to the
>> regression test suite.
>>
>>
>> > also if you see \d+ showing plt1_p1 partition value as ‘(0001,0001)’
>> instead of ('0001','0001').
>>
>> Now throwing errors in the initial stage, this case doesn't arise.
>>
>> Please share if you find any other issues.
>>
>> Thanks & Regards,
>> Nitin Jadhav
>>
>>
>>
>>
>>
>> On Thu, Oct 7, 2021 at 4:05 PM Rajkumar Raghuwanshi <
>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>
>>> Thanks Nitin,
>>>
>>> v4 patches applied cleanly and make check is passing now. While testing
>>> further I observed that if multiple values are given for a single
>>> column list partition it is not giving error instead it is changing
>>> values itself. Please find the example below.
>>>
>>> postgres=# CREATE TABLE plt1 (a int, b varchar) PARTITION BY LIST(b);
>>> CREATE TABLE
>>> postgres=# CREATE TABLE plt1_p1 PARTITION OF plt1 FOR VALUES IN
>>> (('0001','0001'),('0002','0002'));
>>> CREATE TABLE
>>> postgres=# \d+ plt1;
>>>                                           Partitioned table "public.plt1"
>>>  Column |       Type        | Collation | Nullable | Default | Storage
>>>  | Compression | Stats target | Description
>>>
>>> --------+-------------------+-----------+----------+---------+----------+-------------+--------------+-------------
>>>  a      | integer           |           |          |         | plain
>>>  |             |              |
>>>  b      | character varying |           |          |         | extended
>>> |             |              |
>>> Partition key: LIST (b)
>>> Partitions: plt1_p1 FOR VALUES IN ('(0001,0001)', '(0002,0002)')
>>>
>>> I think it should throw an error as the partition by list has only 1
>>> column but we are giving 2 values.
>>> also if you see \d+ showing plt1_p1 partition value as ‘(0001,0001)’
>>> instead of ('0001','0001').
>>>
>>> Thanks & Regards,
>>> Rajkumar Raghuwanshi
>>>
>>>
>>>
>>> On Sun, Oct 3, 2021 at 1:52 AM Nitin Jadhav <
>>> nitinjadhavpostg...@gmail.com> wrote:
>>>
>>>> > > On PG head + Nitin's v3 patch + Amit's Delta patch.  Make check is
>>>> failing with below errors.
>>>> >
>>>> > Thanks Rajkumar for testing.
>>>> >
>>>> > Here's a v2 of the delta patch that should fix both of these test
>>>> > failures.  As I mentioned in my last reply, my delta patch fixed what
>>>> > I think were problems in Nitin's v3 patch but were not complete by
>>>> > themselves.  Especially, I hadn't bothered to investigate various /*
>>>> > TODO: handle multi-column list partitioning */ sites to deal with my
>>>> > own changes.
>>>>
>>>> Thanks Rajkumar for testing and Thank you Amit for working on v2 of
>>>> the delta patch. Actually I had done the code changes related to
>>>> partition-wise join and I was in the middle of fixing the review
>>>> comments, So I could not share the patch. Anyways thanks for your
>>>> efforts.
>>>>
>>>> > I noticed that multi-column list partitions containing NULLs don't
>>>> > work correctly with partition pruning yet.
>>>> >
>>>> > create table p0 (a int, b text, c bool) partition by list (a, b, c);
>>>> > create table p01 partition of p0 for values in ((1, 1, true), (NULL,
>>>> 1, false));
>>>> > create table p02 partition of p0 for values in ((1, NULL, false));
>>>> > explain select * from p0 where a is null;
>>>> >                        QUERY PLAN
>>>> > --------------------------------------------------------
>>>> > Seq Scan on p01 p0  (cost=0.00..22.50 rows=6 width=37)
>>>> >  Filter: (a IS NULL)
>>>> > (2 rows)
>>>> >
>>>> > In the attached updated version, I've dealt with some of those such
>>>> > that at least the existing cases exercising partition pruning and
>>>> > partition wise joins now pass.
>>>>
>>>> wrt partition pruning, I have checked the output of the above case
>>>> with the v2 version of the delta patch and without that. The output
>>>> remains same. Kindly let me know if I am missing something. But I feel
>>>> the above output is correct as the partition p01 is the only partition
>>>> which contains NULL value for column a, hence it is showing "Seq scan
>>>> on p01" in the output. Kindly correct me if I am wrong. I feel the
>>>> code changes related to 'null_keys' is not required, hence not
>>>> incorporated that in the attached patch.
>>>>
>>>> wrt partition-wise join, I had run the regression test (with new cases
>>>> related to partition-wise join) on v2 of the delta patch and observed
>>>> the crash. Hence I have not incorporated the partition-wise join
>>>> related code from v2 of delta patch to main v4 patch. Instead I have
>>>> added the partition-wise join related code done by me in the attached
>>>> patch. Please share your thoughts and if possible we can improvise the
>>>> code. Rest of the changes looks good to me and I have incorporated
>>>> that in the attached patch.
>>>>
>>>>
>>>> > I guess that may be due to the following newly added code being
>>>> incomplete:
>>>> > Maybe this function needs to return a "bitmapset" of indexes, because
>>>> > multiple partitions can now contain NULL values.
>>>>
>>>> I feel this function is not required at all as we are not separating
>>>> the non null and null partitions now. Removed in the attached patch.
>>>> Also removed the "scan_null' variable from the structure
>>>> "PruneStepResult" and cleaned up the corresponding code blocks.
>>>>
>>>>
>>>> > This function name may be too generic.  Given that it is specific to
>>>> > implementing list bound de-duplication, maybe the following signature
>>>> > is more appropriate:
>>>> >
>>>> > static bool
>>>> > checkListBoundDuplicated(List *list_bounds, List *new_bound)
>>>>
>>>> Yes. The function name looks more generic. How about using
>>>> "isListBoundDuplicated()"? I have used this name in the patch. Please
>>>> let me know if that does not look correct.
>>>>
>>>>
>>>> > Also, better if the function comment mentions those parameter names,
>>>> like:
>>>> >
>>>> > "Returns TRUE if the list bound element 'new_bound' is already present
>>>> > in the target list 'list_bounds', FALSE otherwise."
>>>>
>>>> Fixed.
>>>>
>>>>
>>>> > +/*
>>>> > + * transformPartitionListBounds
>>>> > + *
>>>> > + * Converts the expressions of list partition bounds from the raw
>>>> grammar
>>>> > + * representation.
>>>> >
>>>> > A sentence about the result format would be helpful, like:
>>>> >
>>>> > The result is a List of Lists of Const nodes to account for the
>>>> > partition key possibly containing more than one column.
>>>>
>>>> Fixed.
>>>>
>>>>
>>>> > +   int             i = 0;
>>>> > +   int             j = 0;
>>>> >
>>>> > Better to initialize such loop counters closer to the loop.
>>>>
>>>> Fixed in all the places.
>>>>
>>>>
>>>> > +           colname[i] = (char *) palloc0(NAMEDATALEN * sizeof(char));
>>>> > +           colname[i] = get_attname(RelationGetRelid(parent),
>>>> > +                                    key->partattrs[i], false);
>>>> >
>>>> > The palloc in the 1st statement is wasteful, because the 2nd statement
>>>> > overwrites its pointer by the pointer to the string palloc'd by
>>>> > get_attname().
>>>>
>>>> Removed the 1st statement as it is not required.
>>>>
>>>>
>>>> > +           ListCell   *cell2 = NULL;
>>>> >
>>>> > No need to explicitly initialize the loop variable.
>>>>
>>>> Fixed in all the places.
>>>>
>>>>
>>>> > +           RowExpr     *rowexpr = NULL;
>>>> > +
>>>> > +           if (!IsA(expr, RowExpr))
>>>> > +               ereport(ERROR,
>>>> > +                       (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
>>>> > +                       errmsg("Invalid list bound specification"),
>>>> > +                       parser_errposition(pstate, exprLocation((Node
>>>> > *) spec))));
>>>> > +
>>>> > +           rowexpr = (RowExpr *) expr;
>>>> >
>>>> > It's okay to assign rowexpr at the top here instead of the dummy
>>>> > NULL-initialization and write the condition as:
>>>> >
>>>> >    if (!IsA(rowexpr, RowExpr))
>>>>
>>>> Fixed.
>>>>
>>>>
>>>> > +       if (isDuplicate)
>>>> > +           continue;
>>>> > +
>>>> > +       result = lappend(result, values);
>>>> >
>>>> > I can see you copied this style from the existing code, but how about
>>>> > writing this simply as:
>>>> >
>>>> >    if (!isDuplicate)
>>>> >        result = lappend(result, values);
>>>>
>>>> This looks good. I have changed in the patch.
>>>>
>>>>
>>>> > -/* One value coming from some (index'th) list partition */
>>>> > +/* One bound of a list partition */
>>>> > typedef struct PartitionListValue
>>>> > {
>>>> >    int         index;
>>>> > -   Datum       value;
>>>> > +   Datum      *values;
>>>> > +   bool       *isnulls;
>>>> >  } PartitionListValue;
>>>> >
>>>> > Given that this is a locally-defined struct, I wonder if it makes
>>>> > sense to rename the struct while we're at it.  Call it, say,
>>>> > PartitionListBound?
>>>>
>>>> Yes. PartitionListBound looks more appropriate and it also matches the
>>>> similar structures of the other partition strategies.
>>>>
>>>> > Also, please keep part of the existing comment that says that the
>>>> > bound belongs to index'th partition.
>>>>
>>>> Retained the old comment.
>>>>
>>>>
>>>> > + * partition_bound_accepts_nulls
>>>> > + *
>>>> > + * Returns TRUE if partition bound has NULL value, FALSE otherwise.
>>>> >  */
>>>> >
>>>> > I suggest slight rewording, as follows:
>>>> >
>>>> > "Returns TRUE if any of the partition bounds contains a NULL value,
>>>> > FALSE otherwise."
>>>>
>>>> Fixed.
>>>>
>>>>
>>>> > -   PartitionListValue *all_values;
>>>> > +   PartitionListValue **all_values;
>>>> > ...
>>>> > -   all_values = (PartitionListValue *)
>>>> > -       palloc(ndatums * sizeof(PartitionListValue));
>>>> > +   ndatums = get_list_datum_count(boundspecs, nparts);
>>>> > +   all_values = (PartitionListValue **)
>>>> > +       palloc(ndatums * sizeof(PartitionListValue *));
>>>> >
>>>> > I don't see the need to redefine all_values's pointer type.  No need
>>>> > to palloc PartitionListValue repeatedly for every datum as done
>>>> > further down as follows:
>>>> >
>>>> > +           all_values[j] = (PartitionListValue *)
>>>> > palloc(sizeof(PartitionListValue));
>>>> >
>>>> > You do need the following two though:
>>>> >
>>>> > +           all_values[j]->values = (Datum *) palloc0(key->partnatts *
>>>> > sizeof(Datum));
>>>> > +           all_values[j]->isnulls = (bool *) palloc0(key->partnatts *
>>>> > sizeof(bool));
>>>> >
>>>> > If you change the above the way I suggest, you'd also need to revert
>>>> > the following change:
>>>> >
>>>> > -   qsort_arg(all_values, ndatums, sizeof(PartitionListValue),
>>>> > +   qsort_arg(all_values, ndatums, sizeof(PartitionListValue *),
>>>> >              qsort_partition_list_value_cmp, (void *) key);
>>>> >
>>>> > +       int         orig_index = all_values[i]->index;
>>>> > +       boundinfo->datums[i] = (Datum *) palloc(key->partnatts *
>>>> sizeof(Datum));
>>>> >
>>>> > Missing a newline between these two statements.
>>>>
>>>> Fixed. Made necessary changes to keep the intent of existing code.
>>>>
>>>>
>>>> > @@ -915,7 +949,7 @@ partition_bounds_equal(int partnatts, int16
>>>> > *parttyplen, bool *parttypbyval,
>>>> >    if (b1->nindexes != b2->nindexes)
>>>> >        return false;
>>>> >
>>>> > -   if (b1->null_index != b2->null_index)
>>>> > +   if (get_partition_bound_null_index(b1) !=
>>>> > get_partition_bound_null_index(b2))
>>>> >
>>>> > As mentioned in the last message, this bit in partition_bounds_equal()
>>>> > needs to be comparing "bitmapsets" of null bound indexes, that is
>>>> > after fixing get_partition_bound_null_index() as previously mentioned.
>>>>
>>>> As mentioned earlier, removed the functionality of
>>>> get_partition_bound_null_index(), hence the above condition is not
>>>> required and removed.
>>>>
>>>> > But...
>>>> >
>>>> > @@ -988,7 +1022,22 @@ partition_bounds_equal(int partnatts, int16
>>>> > *parttyplen, bool *parttypbyval,
>>>> >                 * context.  datumIsEqual() should be simple enough to
>>>> be
>>>> >                 * safe.
>>>> >                 */
>>>> > -               if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
>>>> > +               if (b1->isnulls)
>>>> > +                   b1_isnull = b1->isnulls[i][j];
>>>> > +               if (b2->isnulls)
>>>> > +                   b2_isnull = b2->isnulls[i][j];
>>>> > +
>>>> > +               /*
>>>> > +                * If any of the partition bound has NULL value, then
>>>> check
>>>> > +                * equality for the NULL value instead of comparing
>>>> the datums
>>>> > +                * as it does not contain valid value in case of NULL.
>>>> > +                */
>>>> > +               if (b1_isnull || b2_isnull)
>>>> > +               {
>>>> > +                   if (b1_isnull != b2_isnull)
>>>> > +                       return false;
>>>> > +               }
>>>> >
>>>> > ...if you have this in the main loop, I don't think we need the above
>>>> > code stanza which appears to implement a short-cut for this long-form
>>>> > logic.
>>>>
>>>> Yes. May be we could have ignored the above code stanza if we would
>>>> have comparing the null indexes using get_partition_bound_null_index()
>>>> in the beginning of the function. But hence we are not separating the
>>>> non null partitions and null partitions, I would like to keep the
>>>> logic in the inner loop as we are doing it for non null bound values
>>>> in the above code stanza, just to give a feel that null bound values
>>>> are also handled the same way as non null values. Please correct me if
>>>> I am wrong.
>>>>
>>>>
>>>> > +               (key->strategy != PARTITION_STRATEGY_LIST ||
>>>> > +                !src->isnulls[i][j]))
>>>> >
>>>> > I think it's better to write this condition as follows just like the
>>>> > accompanying condition involving src->kind:
>>>> >
>>>> >    (src->nulls == NULL || !src->isnulls[i][j])
>>>>
>>>> Fixed.
>>>>
>>>>
>>>> > In check_new_partition_bound():
>>>> >
>>>> > +                       Datum      *values = (Datum *)
>>>> > palloc0(key->partnatts * sizeof(Datum));
>>>> > +                       bool       *isnulls = (bool *)
>>>> > palloc0(key->partnatts * sizeof(bool));
>>>> >
>>>> > Doesn't seem like a bad idea to declare these as:
>>>> >
>>>> >     Datum    values[PARTITION_MAX_KEYS];
>>>> >    bool        isnulls[PARTITION_MAX_KEYS];
>>>>
>>>> Thanks for the suggestion. I have changed as above.
>>>>
>>>> > I looked at get_qual_for_list_multi_column() and immediately thought
>>>> > that it may be a bad idea.  I think it's better to integrate the logic
>>>> > for multi-column case into the existing function even if that makes
>>>> > the function appear more complex.  Having two functions with the same
>>>> > goal and mostly the same code is not a good idea mainly because it
>>>> > becomes a maintenance burden.
>>>>
>>>> Actually I had written a separate function because of the complexity.
>>>> Now I have understood that since the objective is same, it should be
>>>> done in a single function irrespective of complexity.
>>>>
>>>> > I have attempted a rewrite such that get_qual_for_list() now handles
>>>> > both the single-column and multi-column cases.  Changes included in
>>>> > the delta patch.  The patch updates some outputs of the newly added
>>>> > tests for multi-column list partitions, because the new code emits the
>>>> > IS NOT NULL tests a bit differently than
>>>> > get_qual_for_list_mutli_column() would.  Notably, the old approach
>>>> > would emit IS NOT NULL for every non-NULL datum matched to a given
>>>> > column, not just once for the column.  However, the patch makes a few
>>>> > other tests fail, mainly because I had to fix
>>>> > partition_bound_accepts_nulls() to handle the multi-column case,
>>>> > though didn't bother to update all callers of it to also handle the
>>>> > multi-column case correctly.  I guess that's a TODO you're going to
>>>> > deal with at some point anyway. :)
>>>>
>>>> Thank you very much for your efforts. The changes looks good to me and
>>>> I have incorporated these changes in the attached patch.
>>>>
>>>> I have completed the coding for all the TODOs and hence removed in the
>>>> patch. The naming conventions used for function/variable names varies
>>>> across the files. Some places it is like 'namesLikeThis' and in some
>>>> place it is like 'names_like_this'. I have used the naming conventions
>>>> based on the surrounding styles used. I am happy to change those if
>>>> required.
>>>>
>>>> I have verified 'make check' with the attached patch and it is working
>>>> fine.
>>>>
>>>>
>>>> Thanks & Regards,
>>>> Nitin Jadhav
>>>>
>>>>
>>>> On Mon, Sep 13, 2021 at 3:47 PM Rajkumar Raghuwanshi
>>>> <rajkumar.raghuwan...@enterprisedb.com> wrote:
>>>> >
>>>> > On PG head + Nitin's v3 patch + Amit's Delta patch.  Make check is
>>>> failing with below errors.
>>>> >
>>>> > --inherit.sql is failing with error :"ERROR:  negative bitmapset
>>>> member not allowed"
>>>> > update mlparted_tab mlp set c = 'xxx'
>>>> > from
>>>> >   (select a from some_tab union all select a+1 from some_tab) ss (a)
>>>> > where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3;
>>>> > ERROR:  negative bitmapset member not allowed
>>>> >
>>>> > --partition_join.sql is crashing with enable_partitionwise_join set
>>>> to true.
>>>> > CREATE TABLE plt1_adv (a int, b int, c text) PARTITION BY LIST (c);
>>>> > CREATE TABLE plt1_adv_p1 PARTITION OF plt1_adv FOR VALUES IN ('0001',
>>>> '0003');
>>>> > CREATE TABLE plt1_adv_p2 PARTITION OF plt1_adv FOR VALUES IN ('0004',
>>>> '0006');
>>>> > CREATE TABLE plt1_adv_p3 PARTITION OF plt1_adv FOR VALUES IN ('0008',
>>>> '0009');
>>>> > INSERT INTO plt1_adv SELECT i, i, to_char(i % 10, 'FM0000') FROM
>>>> generate_series(1, 299) i WHERE i % 10 IN (1, 3, 4, 6, 8, 9);
>>>> > ANALYZE plt1_adv;
>>>> > CREATE TABLE plt2_adv (a int, b int, c text) PARTITION BY LIST (c);
>>>> > CREATE TABLE plt2_adv_p1 PARTITION OF plt2_adv FOR VALUES IN ('0002',
>>>> '0003');
>>>> > CREATE TABLE plt2_adv_p2 PARTITION OF plt2_adv FOR VALUES IN ('0004',
>>>> '0006');
>>>> > CREATE TABLE plt2_adv_p3 PARTITION OF plt2_adv FOR VALUES IN ('0007',
>>>> '0009');
>>>> > INSERT INTO plt2_adv SELECT i, i, to_char(i % 10, 'FM0000') FROM
>>>> generate_series(1, 299) i WHERE i % 10 IN (2, 3, 4, 6, 7, 9);
>>>> > ANALYZE plt2_adv;
>>>> > -- inner join
>>>> > EXPLAIN (COSTS OFF)
>>>> > SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2
>>>> ON (t1.a = t2.a AND t1.c = t2.c) WHERE t1.b < 10 ORDER BY t1.a;
>>>> > server closed the connection unexpectedly
>>>> >         This probably means the server terminated abnormally
>>>> >         before or while processing the request.
>>>> > connection to server was lost
>>>> >
>>>> >
>>>> > --stack-trace
>>>> > Core was generated by `postgres: edb regression [local] EXPLAIN
>>>>                                 '.
>>>> > Program terminated with signal 6, Aborted.
>>>> > #0  0x00007f7d339ba277 in raise () from /lib64/libc.so.6
>>>> > Missing separate debuginfos, use: debuginfo-install
>>>> glibc-2.17-222.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
>>>> krb5-libs-1.15.1-19.el7.x86_64 libcom_err-1.42.9-12.el7_5.x86_64
>>>> libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-12.el7.x86_64
>>>> openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64
>>>> zlib-1.2.7-17.el7.x86_64
>>>> > (gdb) bt
>>>> > #0  0x00007f7d339ba277 in raise () from /lib64/libc.so.6
>>>> > #1  0x00007f7d339bb968 in abort () from /lib64/libc.so.6
>>>> > #2  0x0000000000b0fbc3 in ExceptionalCondition
>>>> (conditionName=0xcbda10 "part_index >= 0", errorType=0xcbd1c3
>>>> "FailedAssertion", fileName=0xcbd2fe "partbounds.c", lineNumber=1957)
>>>> >     at assert.c:69
>>>> > #3  0x0000000000892aa1 in is_dummy_partition (rel=0x19b37c0,
>>>> part_index=-1) at partbounds.c:1957
>>>> > #4  0x00000000008919bd in merge_list_bounds (partnatts=1,
>>>> partsupfunc=0x1922798, partcollation=0x1922738, outer_rel=0x19b37c0,
>>>> inner_rel=0x1922938, jointype=JOIN_INNER,
>>>> >     outer_parts=0x7fffd67751b0, inner_parts=0x7fffd67751a8) at
>>>> partbounds.c:1529
>>>> > #5  0x00000000008910de in partition_bounds_merge (partnatts=1,
>>>> partsupfunc=0x1922798, partcollation=0x1922738, outer_rel=0x19b37c0,
>>>> inner_rel=0x1922938, jointype=JOIN_INNER,
>>>> >     outer_parts=0x7fffd67751b0, inner_parts=0x7fffd67751a8) at
>>>> partbounds.c:1223
>>>> > #6  0x000000000082c41a in compute_partition_bounds (root=0x1a19ed0,
>>>> rel1=0x19b37c0, rel2=0x1922938, joinrel=0x1ab7f30,
>>>> parent_sjinfo=0x7fffd67752a0, parts1=0x7fffd67751b0,
>>>> >     parts2=0x7fffd67751a8) at joinrels.c:1644
>>>> > #7  0x000000000082bc34 in try_partitionwise_join (root=0x1a19ed0,
>>>> rel1=0x19b37c0, rel2=0x1922938, joinrel=0x1ab7f30,
>>>> parent_sjinfo=0x7fffd67752a0, parent_restrictlist=0x1ab3318)
>>>> >     at joinrels.c:1402
>>>> > #8  0x000000000082aea2 in populate_joinrel_with_paths
>>>> (root=0x1a19ed0, rel1=0x19b37c0, rel2=0x1922938, joinrel=0x1ab7f30,
>>>> sjinfo=0x7fffd67752a0, restrictlist=0x1ab3318)
>>>> >     at joinrels.c:926
>>>> > #9  0x000000000082a8f5 in make_join_rel (root=0x1a19ed0,
>>>> rel1=0x19b37c0, rel2=0x1922938) at joinrels.c:760
>>>> > #10 0x0000000000829e03 in make_rels_by_clause_joins (root=0x1a19ed0,
>>>> old_rel=0x19b37c0, other_rels_list=0x1ab2970, other_rels=0x1ab2990) at
>>>> joinrels.c:312
>>>> > #11 0x00000000008298d9 in join_search_one_level (root=0x1a19ed0,
>>>> level=2) at joinrels.c:123
>>>> > #12 0x000000000080c566 in standard_join_search (root=0x1a19ed0,
>>>> levels_needed=2, initial_rels=0x1ab2970) at allpaths.c:3020
>>>> > #13 0x000000000080c4df in make_rel_from_joinlist (root=0x1a19ed0,
>>>> joinlist=0x199d538) at allpaths.c:2951
>>>> > #14 0x000000000080816b in make_one_rel (root=0x1a19ed0,
>>>> joinlist=0x199d538) at allpaths.c:228
>>>> > #15 0x000000000084491d in query_planner (root=0x1a19ed0,
>>>> qp_callback=0x84a538 <standard_qp_callback>, qp_extra=0x7fffd6775630) at
>>>> planmain.c:276
>>>> > #16 0x0000000000847040 in grouping_planner (root=0x1a19ed0,
>>>> tuple_fraction=0) at planner.c:1447
>>>> > #17 0x0000000000846709 in subquery_planner (glob=0x19b39d8,
>>>> parse=0x1aaa290, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at
>>>> planner.c:1025
>>>> > #18 0x0000000000844f3e in standard_planner (parse=0x1aaa290,
>>>> >     query_string=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c,
>>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c
>>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", cursorOptions=2048,
>>>> boundParams=0x0) at planner.c:406
>>>> > #19 0x0000000000844ce9 in planner (parse=0x1aaa290,
>>>> >     query_string=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c,
>>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c
>>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", cursorOptions=2048,
>>>> boundParams=0x0) at planner.c:277
>>>> > #20 0x0000000000978483 in pg_plan_query (querytree=0x1aaa290,
>>>> >     query_string=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c,
>>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c
>>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;", cursorOptions=2048,
>>>> boundParams=0x0) at postgres.c:847
>>>> > #21 0x00000000006937fc in ExplainOneQuery (query=0x1aaa290,
>>>> cursorOptions=2048, into=0x0, es=0x19b36f0,
>>>> >     queryString=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c,
>>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c
>>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;",
>>>> >     params=0x0, queryEnv=0x0) at explain.c:397
>>>> > #22 0x0000000000693351 in ExplainQuery (pstate=0x197c410,
>>>> stmt=0x1aaa0b0, params=0x0, dest=0x197c378) at explain.c:281
>>>> > #23 0x00000000009811fa in standard_ProcessUtility (pstmt=0x1a0bfc8,
>>>> >     queryString=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c,
>>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c
>>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;",
>>>> >     readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
>>>> queryEnv=0x0, dest=0x197c378, qc=0x7fffd6775f90) at utility.c:845
>>>> > #24 0x00000000009809ec in ProcessUtility (pstmt=0x1a0bfc8,
>>>> >     queryString=0x1830fa0 "EXPLAIN (COSTS OFF)\nSELECT t1.a, t1.c,
>>>> t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c
>>>> = t2.c) WHERE t1.b < 10 ORDER BY t1.a;",
>>>> >     readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
>>>> queryEnv=0x0, dest=0x197c378, qc=0x7fffd6775f90) at utility.c:527
>>>> > #25 0x000000000097f636 in PortalRunUtility (portal=0x1893b40,
>>>> pstmt=0x1a0bfc8, isTopLevel=true, setHoldSnapshot=true, dest=0x197c378,
>>>> qc=0x7fffd6775f90) at pquery.c:1147
>>>> > #26 0x000000000097f3a5 in FillPortalStore (portal=0x1893b40,
>>>> isTopLevel=true) at pquery.c:1026
>>>> > #27 0x000000000097ed11 in PortalRun (portal=0x1893b40,
>>>> count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x1a0c0b8,
>>>> altdest=0x1a0c0b8, qc=0x7fffd6776150) at pquery.c:758
>>>> > #28 0x0000000000978aa5 in exec_simple_query (
>>>> >
>>>> > Thanks & Regards,
>>>> > Rajkumar Raghuwanshi
>>>> >
>>>> >
>>>> > On Fri, Sep 3, 2021 at 7:17 PM Amit Langote <amitlangot...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> On Wed, Sep 1, 2021 at 2:31 PM Amit Langote <amitlangot...@gmail.com>
>>>> wrote:
>>>> >> > On Tue, Aug 31, 2021 at 8:02 PM Nitin Jadhav
>>>> >> > <nitinjadhavpostg...@gmail.com> wrote:
>>>> >> > > The attached patch also fixes the above comments.
>>>> >> >
>>>> >> > I noticed that multi-column list partitions containing NULLs don't
>>>> >> > work correctly with partition pruning yet.
>>>> >> >
>>>> >> > create table p0 (a int, b text, c bool) partition by list (a, b,
>>>> c);
>>>> >> > create table p01 partition of p0 for values in ((1, 1, true),
>>>> (NULL, 1, false));
>>>> >> > create table p02 partition of p0 for values in ((1, NULL, false));
>>>> >> > explain select * from p0 where a is null;
>>>> >> >                        QUERY PLAN
>>>> >> > --------------------------------------------------------
>>>> >> >  Seq Scan on p01 p0  (cost=0.00..22.50 rows=6 width=37)
>>>> >> >    Filter: (a IS NULL)
>>>> >> > (2 rows)
>>>> >> >
>>>> >> > I guess that may be due to the following newly added code being
>>>> incomplete:
>>>> >> >
>>>> >> > +/*
>>>> >> > + * get_partition_bound_null_index
>>>> >> > + *
>>>> >> > + * Returns the partition index of the partition bound which
>>>> accepts NULL.
>>>> >> > + */
>>>> >> > +int
>>>> >> > +get_partition_bound_null_index(PartitionBoundInfo boundinfo)
>>>> >> > +{
>>>> >> > +   int i = 0;
>>>> >> > +   int j = 0;
>>>> >> > +
>>>> >> > +   if (!boundinfo->isnulls)
>>>> >> > +       return -1;
>>>> >> >
>>>> >> > -           if (!val->constisnull)
>>>> >> > -               count++;
>>>> >> > +   for (i = 0; i < boundinfo->ndatums; i++)
>>>> >> > +   {
>>>> >> > +       //TODO: Handle for multi-column cases
>>>> >> > +       for (j = 0; j < 1; j++)
>>>> >> > +       {
>>>> >> > +           if (boundinfo->isnulls[i][j])
>>>> >> > +               return boundinfo->indexes[i];
>>>> >> >         }
>>>> >> >     }
>>>> >> >
>>>> >> > +   return -1;
>>>> >> > +}
>>>> >> >
>>>> >> > Maybe this function needs to return a "bitmapset" of indexes,
>>>> because
>>>> >> > multiple partitions can now contain NULL values.
>>>> >> >
>>>> >> > Some other issues I noticed and suggestions for improvement:
>>>> >> >
>>>> >> > +/*
>>>> >> > + * checkForDuplicates
>>>> >> > + *
>>>> >> > + * Returns TRUE if the list bound element is already present in
>>>> the list of
>>>> >> > + * list bounds, FALSE otherwise.
>>>> >> > + */
>>>> >> > +static bool
>>>> >> > +checkForDuplicates(List *source, List *searchElem)
>>>> >> >
>>>> >> > This function name may be too generic.  Given that it is specific
>>>> to
>>>> >> > implementing list bound de-duplication, maybe the following
>>>> signature
>>>> >> > is more appropriate:
>>>> >> >
>>>> >> > static bool
>>>> >> > checkListBoundDuplicated(List *list_bounds, List *new_bound)
>>>> >> >
>>>> >> > Also, better if the function comment mentions those parameter
>>>> names, like:
>>>> >> >
>>>> >> > "Returns TRUE if the list bound element 'new_bound' is already
>>>> present
>>>> >> > in the target list 'list_bounds', FALSE otherwise."
>>>> >> >
>>>> >> > +/*
>>>> >> > + * transformPartitionListBounds
>>>> >> > + *
>>>> >> > + * Converts the expressions of list partition bounds from the raw
>>>> grammar
>>>> >> > + * representation.
>>>> >> >
>>>> >> > A sentence about the result format would be helpful, like:
>>>> >> >
>>>> >> > The result is a List of Lists of Const nodes to account for the
>>>> >> > partition key possibly containing more than one column.
>>>> >> >
>>>> >> > +   int             i = 0;
>>>> >> > +   int             j = 0;
>>>> >> >
>>>> >> > Better to initialize such loop counters closer to the loop.
>>>> >> >
>>>> >> > +           colname[i] = (char *) palloc0(NAMEDATALEN *
>>>> sizeof(char));
>>>> >> > +           colname[i] = get_attname(RelationGetRelid(parent),
>>>> >> > +                                    key->partattrs[i], false);
>>>> >> >
>>>> >> > The palloc in the 1st statement is wasteful, because the 2nd
>>>> statement
>>>> >> > overwrites its pointer by the pointer to the string palloc'd by
>>>> >> > get_attname().
>>>> >> >
>>>> >> > +           ListCell   *cell2 = NULL;
>>>> >> >
>>>> >> > No need to explicitly initialize the loop variable.
>>>> >> >
>>>> >> > +           RowExpr     *rowexpr = NULL;
>>>> >> > +
>>>> >> > +           if (!IsA(expr, RowExpr))
>>>> >> > +               ereport(ERROR,
>>>> >> > +                       (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
>>>> >> > +                       errmsg("Invalid list bound specification"),
>>>> >> > +                       parser_errposition(pstate,
>>>> exprLocation((Node
>>>> >> > *) spec))));
>>>> >> > +
>>>> >> > +           rowexpr = (RowExpr *) expr;
>>>> >> >
>>>> >> > It's okay to assign rowexpr at the top here instead of the dummy
>>>> >> > NULL-initialization and write the condition as:
>>>> >> >
>>>> >> >     if (!IsA(rowexpr, RowExpr))
>>>> >> >
>>>> >> > +       if (isDuplicate)
>>>> >> > +           continue;
>>>> >> > +
>>>> >> > +       result = lappend(result, values);
>>>> >> >
>>>> >> > I can see you copied this style from the existing code, but how
>>>> about
>>>> >> > writing this simply as:
>>>> >> >
>>>> >> >     if (!isDuplicate)
>>>> >> >         result = lappend(result, values);
>>>> >> >
>>>> >> > -/* One value coming from some (index'th) list partition */
>>>> >> > +/* One bound of a list partition */
>>>> >> >  typedef struct PartitionListValue
>>>> >> >  {
>>>> >> >     int         index;
>>>> >> > -   Datum       value;
>>>> >> > +   Datum      *values;
>>>> >> > +   bool       *isnulls;
>>>> >> >  } PartitionListValue;
>>>> >> >
>>>> >> > Given that this is a locally-defined struct, I wonder if it makes
>>>> >> > sense to rename the struct while we're at it.  Call it, say,
>>>> >> > PartitionListBound?
>>>> >> >
>>>> >> > Also, please keep part of the existing comment that says that the
>>>> >> > bound belongs to index'th partition.
>>>> >> >
>>>> >> > Will send more comments in a bit...
>>>> >>
>>>> >> + * partition_bound_accepts_nulls
>>>> >> + *
>>>> >> + * Returns TRUE if partition bound has NULL value, FALSE otherwise.
>>>> >>   */
>>>> >>
>>>> >> I suggest slight rewording, as follows:
>>>> >>
>>>> >> "Returns TRUE if any of the partition bounds contains a NULL value,
>>>> >> FALSE otherwise."
>>>> >>
>>>> >> -   PartitionListValue *all_values;
>>>> >> +   PartitionListValue **all_values;
>>>> >> ...
>>>> >> -   all_values = (PartitionListValue *)
>>>> >> -       palloc(ndatums * sizeof(PartitionListValue));
>>>> >> +   ndatums = get_list_datum_count(boundspecs, nparts);
>>>> >> +   all_values = (PartitionListValue **)
>>>> >> +       palloc(ndatums * sizeof(PartitionListValue *));
>>>> >>
>>>> >> I don't see the need to redefine all_values's pointer type.  No need
>>>> >> to palloc PartitionListValue repeatedly for every datum as done
>>>> >> further down as follows:
>>>> >>
>>>> >> +           all_values[j] = (PartitionListValue *)
>>>> >> palloc(sizeof(PartitionListValue));
>>>> >>
>>>> >> You do need the following two though:
>>>> >>
>>>> >> +           all_values[j]->values = (Datum *) palloc0(key->partnatts
>>>> *
>>>> >> sizeof(Datum));
>>>> >> +           all_values[j]->isnulls = (bool *) palloc0(key->partnatts
>>>> *
>>>> >> sizeof(bool));
>>>> >>
>>>> >> If you change the above the way I suggest, you'd also need to revert
>>>> >> the following change:
>>>> >>
>>>> >> -   qsort_arg(all_values, ndatums, sizeof(PartitionListValue),
>>>> >> +   qsort_arg(all_values, ndatums, sizeof(PartitionListValue *),
>>>> >>               qsort_partition_list_value_cmp, (void *) key);
>>>> >>
>>>> >> +       int         orig_index = all_values[i]->index;
>>>> >> +       boundinfo->datums[i] = (Datum *) palloc(key->partnatts *
>>>> sizeof(Datum));
>>>> >>
>>>> >> Missing a newline between these two statements.
>>>> >>
>>>> >> BTW, I noticed that the boundDatums variable is no longer used in
>>>> >> create_list_bounds.  I traced back its origin and found that a recent
>>>> >> commit 53d86957e98 introduced it to implement an idea to reduce the
>>>> >> finer-grained pallocs that were being done in create_list_bounds().
>>>> I
>>>> >> don't think that this patch needs to throw away that work.  You can
>>>> >> make it work as the attached delta patch that applies on top of v3.
>>>> >> Please check.
>>>> >>
>>>> >> @@ -915,7 +949,7 @@ partition_bounds_equal(int partnatts, int16
>>>> >> *parttyplen, bool *parttypbyval,
>>>> >>     if (b1->nindexes != b2->nindexes)
>>>> >>         return false;
>>>> >>
>>>> >> -   if (b1->null_index != b2->null_index)
>>>> >> +   if (get_partition_bound_null_index(b1) !=
>>>> >> get_partition_bound_null_index(b2))
>>>> >>
>>>> >> As mentioned in the last message, this bit in
>>>> partition_bounds_equal()
>>>> >> needs to be comparing "bitmapsets" of null bound indexes, that is
>>>> >> after fixing get_partition_bound_null_index() as previously
>>>> mentioned.
>>>> >>
>>>> >> But...
>>>> >>
>>>> >> @@ -988,7 +1022,22 @@ partition_bounds_equal(int partnatts, int16
>>>> >> *parttyplen, bool *parttypbyval,
>>>> >>                  * context.  datumIsEqual() should be simple enough
>>>> to be
>>>> >>                  * safe.
>>>> >>                  */
>>>> >> -               if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
>>>> >> +               if (b1->isnulls)
>>>> >> +                   b1_isnull = b1->isnulls[i][j];
>>>> >> +               if (b2->isnulls)
>>>> >> +                   b2_isnull = b2->isnulls[i][j];
>>>> >> +
>>>> >> +               /*
>>>> >> +                * If any of the partition bound has NULL value,
>>>> then check
>>>> >> +                * equality for the NULL value instead of comparing
>>>> the datums
>>>> >> +                * as it does not contain valid value in case of
>>>> NULL.
>>>> >> +                */
>>>> >> +               if (b1_isnull || b2_isnull)
>>>> >> +               {
>>>> >> +                   if (b1_isnull != b2_isnull)
>>>> >> +                       return false;
>>>> >> +               }
>>>> >>
>>>> >> ...if you have this in the main loop, I don't think we need the above
>>>> >> code stanza which appears to implement a short-cut for this long-form
>>>> >> logic.
>>>> >>
>>>> >> +               (key->strategy != PARTITION_STRATEGY_LIST ||
>>>> >> +                !src->isnulls[i][j]))
>>>> >>
>>>> >> I think it's better to write this condition as follows just like the
>>>> >> accompanying condition involving src->kind:
>>>> >>
>>>> >>     (src->nulls == NULL || !src->isnulls[i][j])
>>>> >>
>>>> >> (Skipped looking at merge_list_bounds() and related changes for now
>>>> as
>>>> >> I see a lot of TODOs remain to be done.)
>>>> >>
>>>> >> In check_new_partition_bound():
>>>> >>
>>>> >> +                       Datum      *values = (Datum *)
>>>> >> palloc0(key->partnatts * sizeof(Datum));
>>>> >> +                       bool       *isnulls = (bool *)
>>>> >> palloc0(key->partnatts * sizeof(bool));
>>>> >>
>>>> >> Doesn't seem like a bad idea to declare these as:
>>>> >>
>>>> >>     Datum    values[PARTITION_MAX_KEYS];
>>>> >>     bool        isnulls[PARTITION_MAX_KEYS];
>>>> >>
>>>> >>
>>>> >> I looked at get_qual_for_list_multi_column() and immediately thought
>>>> >> that it may be a bad idea.  I think it's better to integrate the
>>>> logic
>>>> >> for multi-column case into the existing function even if that makes
>>>> >> the function appear more complex.  Having two functions with the same
>>>> >> goal and mostly the same code is not a good idea mainly because it
>>>> >> becomes a maintenance burden.
>>>> >>
>>>> >> I have attempted a rewrite such that get_qual_for_list() now handles
>>>> >> both the single-column and multi-column cases.  Changes included in
>>>> >> the delta patch.  The patch updates some outputs of the newly added
>>>> >> tests for multi-column list partitions, because the new code emits
>>>> the
>>>> >> IS NOT NULL tests a bit differently than
>>>> >> get_qual_for_list_mutli_column() would.  Notably, the old approach
>>>> >> would emit IS NOT NULL for every non-NULL datum matched to a given
>>>> >> column, not just once for the column.  However, the patch makes a few
>>>> >> other tests fail, mainly because I had to fix
>>>> >> partition_bound_accepts_nulls() to handle the multi-column case,
>>>> >> though didn't bother to update all callers of it to also handle the
>>>> >> multi-column case correctly.  I guess that's a TODO you're going to
>>>> >> deal with at some point anyway. :)
>>>> >>
>>>> >> I still have more than half of v3 left to look at, so will continue
>>>> >> looking.   In the meantime, please check the changes I suggested,
>>>> >> including the delta patch, and let me know your thoughts.
>>>> >>
>>>> >> --
>>>> >> Amit Langote
>>>> >> EDB: http://www.enterprisedb.com
>>>>
>>>

Re: Multi-Column List Partitioning

Reply via email to