On Mon, Apr 21, 2025 at 2:42 PM jian he <jian.universal...@gmail.com> wrote:
> On Mon, Apr 21, 2025 at 4:02 PM Fujii Masao <masao.fu...@oss.nttdata.com> > wrote: > > > > > > > > On 2025/04/21 11:30, jian he wrote: > > > hi. > > > While trying to make the virtual generated column be part of the > partition key, > > > I found this bug. > > I noticed that the check for a generated column in the partition key expression already exists a few lines below. Had that check placed before the if .. else ... block, we wouldn't have had this problem. if (IsA(expr, Var) && ((Var *) expr)->varattno > 0) { ... } else { I admit that pull_varattnos() + loop through bms_next_member() is a bit wasteful when the expression is just a Var. But probably it's worth taking that hit for the sake of avoiding code duplication. Further, I believe that the two loops: one for system attribute check and the other for generated attribute check can be combined, something like attached. Then it's not as wasteful as it looks and we probably save a loop. While looking at this I realised that a generated column may end up being part of the partition key if the partition key expression contains a whole row reference. Attached patch also has a fix and a testcase for the same. PARTITION BY RANGE ((gtest_part_key is not null)) expression in the test is kinda silly, but it tests the whole-row reference as part of an expression. I haven't looked for more sensible expressions. I have included your original tests, but ended up rewriting code. Please let me know what do you think. -- Best Wishes, Ashutosh Bapat
From 2adb796b1fdfa5e72c7302e4ecaf7d69b105d0f3 Mon Sep 17 00:00:00 2001 From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Date: Mon, 21 Apr 2025 18:06:58 +0530 Subject: [PATCH] Tighten check for generated column in partition key expression A generated column may end up being part of the partition key expression, if it's specified as an expression e.g. "(<generated column name>)" or if the partition key expression contains a whole-row reference, even though we do not allow a generated column to be part of partition key expression. Fix this hole. Discussion: https://postgr.es/m/CACJufxF=WDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw@mail.gmail.com --- src/backend/commands/tablecmds.c | 96 +++++++++++-------- .../regress/expected/generated_stored.out | 15 +++ .../regress/expected/generated_virtual.out | 15 +++ src/test/regress/sql/generated_stored.sql | 3 + src/test/regress/sql/generated_virtual.sql | 3 + 5 files changed, 92 insertions(+), 40 deletions(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 265b1c397fb..42610f50b0b 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19768,6 +19768,8 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu /* Expression */ Node *expr = pelem->expr; char partattname[16]; + Bitmapset *expr_attrs = NULL; + int i; Assert(expr != NULL); atttype = exprType(expr); @@ -19791,43 +19793,42 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu while (IsA(expr, CollateExpr)) expr = (Node *) ((CollateExpr *) expr)->arg; - if (IsA(expr, Var) && - ((Var *) expr)->varattno > 0) + /* + * Examine all the columns in the partition key expression. When + * the whole-row reference is present, examine all the columns of + * the partitioned table. + */ + pull_varattnos(expr, 1, &expr_attrs); + if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs)) { - /* - * User wrote "(column)" or "(column COLLATE something)". - * Treat it like simple attribute anyway. - */ - partattrs[attn] = ((Var *) expr)->varattno; + expr_attrs = bms_add_range(expr_attrs, + 1 - FirstLowInvalidHeapAttributeNumber, + RelationGetNumberOfAttributes(rel) - FirstLowInvalidHeapAttributeNumber); + expr_attrs = bms_del_member(expr_attrs, 0 - FirstLowInvalidHeapAttributeNumber); } - else - { - Bitmapset *expr_attrs = NULL; - int i; - partattrs[attn] = 0; /* marks the column as expression */ - *partexprs = lappend(*partexprs, expr); + i = -1; + while ((i = bms_next_member(expr_attrs, i)) >= 0) + { + AttrNumber attno = i + FirstLowInvalidHeapAttributeNumber; - /* - * transformPartitionSpec() should have already rejected - * subqueries, aggregates, window functions, and SRFs, based - * on the EXPR_KIND_ for partition expressions. - */ + Assert(attno != 0); /* * Cannot allow system column references, since that would * make partition routing impossible: their values won't be * known yet when we need to do that. */ - pull_varattnos(expr, 1, &expr_attrs); - for (i = FirstLowInvalidHeapAttributeNumber; i < 0; i++) - { - if (bms_is_member(i - FirstLowInvalidHeapAttributeNumber, - expr_attrs)) - ereport(ERROR, - (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), - errmsg("partition key expressions cannot contain system column references"))); - } + if (attno < 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("partition key expressions cannot contain system column references"))); + + /* + * We do not check dropped columns explicitly since they will + * be eliminated when expanding the the whole row expression + * anyway. + */ /* * Stored generated columns cannot work: They are computed @@ -19837,20 +19838,35 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu * SET EXPRESSION would need to check whether the column is * used in partition keys). Seems safer to prohibit for now. */ - i = -1; - while ((i = bms_next_member(expr_attrs, i)) >= 0) - { - AttrNumber attno = i + FirstLowInvalidHeapAttributeNumber; + if (TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("cannot use generated column in partition key"), + errdetail("Column \"%s\" is a generated column.", + get_attname(RelationGetRelid(rel), attno, false)), + parser_errposition(pstate, pelem->location))); + } - if (attno > 0 && - TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated) - ereport(ERROR, - (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), - errmsg("cannot use generated column in partition key"), - errdetail("Column \"%s\" is a generated column.", - get_attname(RelationGetRelid(rel), attno, false)), - parser_errposition(pstate, pelem->location))); - } + if (IsA(expr, Var) && + ((Var *) expr)->varattno > 0) + { + + /* + * User wrote "(column)" or "(column COLLATE something)". + * Treat it like simple attribute anyway. + */ + partattrs[attn] = ((Var *) expr)->varattno; + } + else + { + partattrs[attn] = 0; /* marks the column as expression */ + *partexprs = lappend(*partexprs, expr); + + /* + * transformPartitionSpec() should have already rejected + * subqueries, aggregates, window functions, and SRFs, based + * on the EXPR_KIND_ for partition expressions. + */ /* * Preprocess the expression before checking for mutability. diff --git a/src/test/regress/expected/generated_stored.out b/src/test/regress/expected/generated_stored.out index 16de30ab191..eb981b0689b 100644 --- a/src/test/regress/expected/generated_stored.out +++ b/src/test/regress/expected/generated_stored.out @@ -1074,11 +1074,26 @@ ERROR: cannot use generated column in partition key LINE 1: ...ENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3); ^ DETAIL: Column "f3" is a generated column. +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3)); +ERROR: cannot use generated column in partition key +LINE 1: ...ERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3)); + ^ +DETAIL: Column "f3" is a generated column. CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3)); ERROR: cannot use generated column in partition key LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3)); ^ DETAIL: Column "f3" is a generated column. +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key)); +ERROR: cannot use generated column in partition key +LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_par... + ^ +DETAIL: Column "f3" is a generated column. +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key is not null)); +ERROR: cannot use generated column in partition key +LINE 1: ...ED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_par... + ^ +DETAIL: Column "f3" is a generated column. -- ALTER TABLE ... ADD COLUMN CREATE TABLE gtest25 (a int PRIMARY KEY); INSERT INTO gtest25 VALUES (3), (4); diff --git a/src/test/regress/expected/generated_virtual.out b/src/test/regress/expected/generated_virtual.out index 6300e7c1d96..75a1730e4d2 100644 --- a/src/test/regress/expected/generated_virtual.out +++ b/src/test/regress/expected/generated_virtual.out @@ -1027,11 +1027,26 @@ ERROR: cannot use generated column in partition key LINE 1: ...NERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3); ^ DETAIL: Column "f3" is a generated column. +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3)); +ERROR: cannot use generated column in partition key +LINE 1: ...RATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3)); + ^ +DETAIL: Column "f3" is a generated column. CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3)); ERROR: cannot use generated column in partition key LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3)); ^ DETAIL: Column "f3" is a generated column. +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key)); +ERROR: cannot use generated column in partition key +LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par... + ^ +DETAIL: Column "f3" is a generated column. +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key is not null)); +ERROR: cannot use generated column in partition key +LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par... + ^ +DETAIL: Column "f3" is a generated column. -- ALTER TABLE ... ADD COLUMN CREATE TABLE gtest25 (a int PRIMARY KEY); INSERT INTO gtest25 VALUES (3), (4); diff --git a/src/test/regress/sql/generated_stored.sql b/src/test/regress/sql/generated_stored.sql index 4ec155f2da9..296429fbb23 100644 --- a/src/test/regress/sql/generated_stored.sql +++ b/src/test/regress/sql/generated_stored.sql @@ -500,7 +500,10 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3; -- generated columns in partition key (not allowed) CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE (f3); +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3)); CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((f3 * 3)); +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key)); +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) STORED) PARTITION BY RANGE ((gtest_part_key is not null)); -- ALTER TABLE ... ADD COLUMN CREATE TABLE gtest25 (a int PRIMARY KEY); diff --git a/src/test/regress/sql/generated_virtual.sql b/src/test/regress/sql/generated_virtual.sql index b4eedeee2fb..033a251fb20 100644 --- a/src/test/regress/sql/generated_virtual.sql +++ b/src/test/regress/sql/generated_virtual.sql @@ -534,7 +534,10 @@ SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3; -- generated columns in partition key (not allowed) CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE (f3); +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3)); CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((f3 * 3)); +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key)); +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_part_key is not null)); -- ALTER TABLE ... ADD COLUMN CREATE TABLE gtest25 (a int PRIMARY KEY); base-commit: 80b727eb9deab589a8648750bc20f1623d5acd3e -- 2.34.1