Re: bug: virtual generated column can be partition key

2025-04-22 Thread Ashutosh Bapat
Sorry I missed this email while sending the patches - our emails crossed in
the air.

On Tue, Apr 22, 2025 at 2:15 PM jian he  wrote:

> On Tue, Apr 22, 2025 at 3:02 PM jian he 
> wrote:
> > Other than that, it looks good to me for fixing this bug.
>
> The error message seems not that intuitive.
>
> +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
> GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
> ((gtest_part_key));
> +ERROR:  cannot use generated column in partition key
> +LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
> + ^
> +DETAIL:  Column "f3" is a generated column.
>

Yes. It's not apparent to a lay user where f3 appears in the partition key.
Users who explicitly use a whole-row expression in a partition key would know
that a whole-row expression contains all columns. And the error location
points to the whole-row reference. So the current state isn't that bad.


>
> with the attached patch. now,
> +CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
> GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
> ((gtest_part_key));
> ERROR:  cannot use generated column in partition key
> LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
>  ^
> DETAIL:  Generated column "f3" is part of the partition key of
> relation "gtest_part_key"
>
>
The DETAIL here is just mentioning what's already known from the command,
so the change in the DETAIL may not be useful.

If I understand your intention correctly, we have to mention something to the
effect of "the partition key expression contains a whole-row reference which
in turn contains a generated column." But that might be too verbose.

-- 
Best Wishes,
Ashutosh Bapat


amcheck support for BRIN indexes

2025-04-22 Thread Arseniy Mukhin
Hi,

It's not for v18; I just wanted to share it with the community and register
it in the upcoming Commitfest 2025-07.

Here is the patch with amcheck support for BRIN indexes.

The patch uses the common amcheck infrastructure that was introduced in [1].
It works well, and I was able to delete all the code that
I had initially copied from the btree check. Great.

During the check we hold ShareUpdateExclusiveLock, so we don't block
regular reads/inserts/updates,
and at the same time range summarizations/desummarizations are impossible
during the check, which simplifies the check logic.
While checking we ereport(ERROR) on the first issue, the same way
the btree and gin checks do.

There are two parts:

The first part is the 'index structure check':
1) Meta page check.
2) Revmap check. Walk the revmap and check that every valid revmap item points
to an index tuple with the expected range blkno,
and that the index tuple is consistent with the tuple descriptor. Also, it's
not documented, but known from the brin code, that
every empty range should have allnulls = true and hasnulls =
false, so this is also checked here.
3) Regular pages check. Walk the regular pages and check that every index
tuple has a corresponding revmap item that points to it.
We don't check index tuple structure here, because that was already
done in step 2.

The regular pages check is optional. An orphan index tuple error in this
check doesn't necessarily mean that the index is corrupted,
but AFAIS brin is not supposed to have such orphan index tuples now,
so if we encounter one then probably something is
wrong with the index.

The second part is the 'all consistent check':
We check that all heap tuples are consistent with the index. It's the most
expensive part and it's optional.
Here we call the consistent functions for every heap tuple. We also
check that fields like 'has_nulls', 'all_nulls',
and 'empty_range' are consistent with what we see in the heap.


There are two patch files:

0001-brin-refactoring.patch

It contains just two tiny changes to the brin code.
1) For the index tuple structure check we need to know what on-disk index
tuples look like.
The function that provides this, 'brtuple_disk_tupdesc', is internal. This
patch makes it extern.
2) Add a BRIN_MAX_PAGES_PER_RANGE macro, for the meta page check.

0002-amechek-brin-support.patch

It's the check functionality itself + regression tests + amcheck extension updates.


Some open questions:

1) How to get the correct strategy number for the scan key in the "all
heap consistent" check. The consistent function wants
a scan key, and to generate it we have to pick a strategy number. We
can't just use the same strategy number for all
indexes because its meaning differs from opclass to opclass (for
example, the equality strategy number is 1 in bloom
and 3 in minmax). We also can't pick an operator and use it for every
check, because opclasses don't have any requirements
about which operators they should support. The solution was to let the user
define the check operator
(parameter consistent_operator_names). It's an array, as we can have a
multicolumn index. We can use '=' as the default check
operator, because AFAIS the '=' operator is supported by all core
opclasses except 'box_inclusion_ops', and IMHO it's the
most obvious candidate for such a check. So if the
'consistent_operator_names' parameter is an empty array (the default),
then the '=' operator is used for all attributes. In most cases the '='
operator does the job and you don't need to worry about the
consistent_operator_names parameter. Not sure about it, what do you think?
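
To make this concrete, here is a minimal usage sketch. The function name
brin_index_check() and its argument list are my assumptions for illustration
only; the real SQL signature is whatever the patch defines, and only the
consistent_operator_names parameter comes from the description above.

    -- Hypothetical usage sketch; brin_index_check() and its arguments are assumptions.
    CREATE EXTENSION IF NOT EXISTS amcheck;

    -- Empty consistent_operator_names (the default): '=' is used for every attribute.
    SELECT brin_index_check('some_brin_index', '{}');

    -- Multicolumn index with explicit per-attribute check operators.
    SELECT brin_index_check('another_brin_index', '{=,=}');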

2) The current implementation of "all heap consistent" uses a single scan
of the entire heap. So if we have a lot of
unsummarized ranges, we do a lot of wasted work because we can't use
the tuples that belong to the unsummarized ranges.
Instead of having one scan for the entire heap, we could walk the
revmap, take only the summarized ranges, and
scan only the pages that belong to those ranges, so we have one scan
per range. This approach helps us avoid touching
heap tuples that we can't use for the index check. But I'm not sure
whether we should worry about that, because every
autovacuum summarizes all the unsummarized ranges, so I don't expect a
lot of unsummarized ranges on average.

TODO list:

1) add TAP tests
2) update SGML docs for amcheck (I think it's better to do this after the patch
is reviewed and more or less finalized)
3) pg_amcheck integration

Big thanks to Tomas Vondra for the first patch idea and initial review.


[1] 
https://www.postgresql.org/message-id/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru


Best regards,
Arseniy Mukhin
From 7de60e8da824a97327a5c28ccd9d8a384c07b997 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin 
Date: Tue, 22 Apr 2025 11:00:36 +0300
Subject: [PATCH 2/2] amcheck - brin support

---
 contrib/amcheck/Makefile|5 +-
 contrib/amcheck/amcheck--1.5--1.6.sql   |   20 +
 contrib/amcheck/amcheck.control |2 +-
 contrib/amcheck/expected/check_brin.out |  134 +++
 contrib/amcheck/meson.build |3 +
 contrib/amcheck/sql/check_brin.sql  |  10

Re: Check for tuplestorestate nullness before dereferencing

2025-04-22 Thread Nikolay Shaplov
In a message from Friday, October 18, 2024, 02:28:22 MSK, David Rowley wrote:

> It would be good to know if the optimisation added in d2c555ee5 ever
> applies with today's code. If it does apply, we should likely add a
> test case for it and if it never does, then we should just remove the
> optimisation and always create the tuplestore when it's NULL.

That sounds reasonable. It looks like removing the "node->eflags != 0" check
is more logical than adding a not-null check.

> If it does apply, we should likely add a test case for it 

Do you have any idea how to test it?


-- 
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su


signature.asc
Description: This is a digitally signed message part.


Re: [PATCH] Check for TupleTableSlot nullness before dereferencing

2025-04-22 Thread Nikolay Shaplov
In a message from Friday, December 13, 2024, 11:54:35 MSK, Alexander Kuznetsov wrote:
> Hello,
> 
> ping. What do you think about reasoning below? Maybe we should consider
> proposing different patch for removing redundant check there?

Hi!

Please note that the commitfest entry for this patch
https://commitfest.postgresql.org/patch/5662/
reports problems with the Windows build.

There is a small chance that this is a false alarm; sometimes the checkers fail for
their own reasons. But most probably this is a persistent error, and if it is,
this problem should be investigated and fixed first. Only after that
can there be any discussion of whether this null-related problem should be fixed or
not.


> 
> 09.10.2024 18:23, Alexander Kuznetsov wrote:
> > 03.10.2024 12:48, Daniel Gustafsson wrote:
> >>  From a quick reading we can only reach there after evaluating an
> >> expression, so can it really be null though?  This code hasn't changed
> >> all that much since 2009, if there was a reachable segfault on a null
> >> pointer deref I have a feeling we'd heard about it by now so some extra
> >> care seems warranted to ensure it's not a static analyzer false
> >> positive.
> > 
> > Thanks for your response!
> > It seems to me that dereferencing is possible under the following
> > scenario:
> > [...]
> > This entire reasoning is based on the assumption that slot2 can
> > theoretically be NULL, as there is such a check at line 968. Is it
> > possible that no errors have occurred because this condition has always
> > been satisfied and is, perhaps, redundant, or maybe I'm misunderstanding
> > something?


-- 
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su


signature.asc
Description: This is a digitally signed message part.


Re: jsonapi: scary new warnings with LTO enabled

2025-04-22 Thread Daniel Gustafsson
> On 21 Apr 2025, at 20:58, Jacob Champion  
> wrote:

> Personally, I'm fine with can't-fail APIs, as long as they're
> documented as such. (I think the deferred error API was probably
> chosen so that src/common JSON clients could be written without a lot
> of pain?)

My preference is that no operation can silently work on a failed object, but
it's not a hill to die on (even more so given where we are in the cycle).  The attached
v3 allocates via the JSON API; no specific error handling should be required as
it's already handled today.

--
Daniel Gustafsson



v3-0001-Allocate-JsonLexContexts-on-the-heap-to-avoid-war.patch
Description: Binary data


Re: pgsql: Add function to get memory context stats for processes

2025-04-22 Thread Daniel Gustafsson
> On 17 Apr 2025, at 16:42, Robert Haas  wrote:
> 
> On Tue, Apr 15, 2025 at 6:11 AM Andres Freund  wrote:
>> There very well could be a CFI - but it better be somewhere where the
>> in-memory state is consistent. Otherwise an error inside raised in the CFI
>> would lead the in-memory state inconsistent which then would cause problems
>> when cleaning up the dsa during resowner release or process exit.
>> 
>> What am I missing here?
> 
> I think maybe you're only thinking about gathering the data. What
> about publishing it? If the DSA code were interrupted at a CFI and the
> interrupting code went and tried to perform a DSA allocation to store
> the resulting data and then returned to the interrupted DSA operation,
> would you expect the code to cope with that? I do not believe we have
> anywhere enough guarantees about reentrancy for that to be safe.

Do you mean that an interrupt handler will make DSA allocations?  I don't think
that would be something we'd want to allow regardless of this patch.  Or did I
misunderstand the above?

--
Daniel Gustafsson





disallow alter individual column if partition key contains wholerow reference

2025-04-22 Thread jian he
hi.

While reviewing "disallow generated column as partition key" in [1],
I found this bug.

demo:
drop table if exists t4;
CREATE TABLE t4(f1 int, f2 bigint) PARTITION BY list ((t4));
create table t4_1 partition of t4 for values in ((1,2));
alter table t4 alter column f2 set data type text using f2;

insert into t4 select 1, '2';
ERROR: invalid memory alloc request size 18446744073709551615

It turns out the fix seems pretty simple, mainly in has_partition_attrs.
has_partition_attrs is used to
check whether any of the 'attnums' is a partition key attribute of rel.
If the partition key expression contains a whole-row reference, then
has_partition_attrs should return true for any column.

[1]: 
https://postgr.es/m/CACJufxF=wdgthxsaqr9thyusfx_1_t9e6n8te3b8eqxcvov...@mail.gmail.com
From 7a4c9bc1cb65c3aedc92a4bf31352ba19f1135b9 Mon Sep 17 00:00:00 2001
From: jian he 
Date: Tue, 22 Apr 2025 19:37:24 +0800
Subject: [PATCH v1 1/1] fix wholerow as partition key reference.

If the partition key contains wholerow reference, individual columns cannot be altered.

discussion: https://postgr.es/m/
---
 src/backend/catalog/partition.c   | 12 
 src/test/regress/expected/alter_table.out | 14 ++
 src/test/regress/sql/alter_table.sql  |  7 +++
 3 files changed, 33 insertions(+)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 93d72157a46..e0ae3d2abe1 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -290,6 +290,18 @@ has_partition_attrs(Relation rel, Bitmapset *attnums, bool *used_in_expr)
 
 			/* Find all attributes referenced */
 			pull_varattnos(expr, 1, &expr_attrs);
+
+			/*
+			 * If this is a wholerow reference, then assume unconditionally that
+			 * 'attnums' is included as part of the partition key attributes.
+			*/
+			if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs))
+			{
+if (used_in_expr)
+	*used_in_expr = true;
+return true;
+			}
+
 			partexprs_item = lnext(partexprs, partexprs_item);
 
 			if (bms_overlap(attnums, expr_attrs))
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 476266e3f4b..fef20e5ae7c 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -3984,6 +3984,20 @@ LINE 1: ALTER TABLE partitioned ALTER COLUMN b TYPE char(5);
 ALTER TABLE partitioned SET (fillfactor=100);
 ERROR:  cannot specify storage parameters for a partitioned table
 HINT:  Specify storage parameters for its leaf partitions instead.
+-- partition expression is whole row
+CREATE TABLE partitioned1(a int, b int) PARTITION BY list((partitioned1));
+ALTER TABLE partitioned1 DROP COLUMN a; --error
+ERROR:  cannot drop column "a" because it is part of the partition key of relation "partitioned1"
+ALTER TABLE partitioned1 DROP COLUMN b; --error
+ERROR:  cannot drop column "b" because it is part of the partition key of relation "partitioned1"
+ALTER TABLE partitioned1 ALTER COLUMN a TYPE text; --error
+ERROR:  cannot alter column "a" because it is part of the partition key of relation "partitioned1"
+LINE 1: ALTER TABLE partitioned1 ALTER COLUMN a TYPE text;
+  ^
+ALTER TABLE partitioned1 ALTER COLUMN b TYPE text; --error
+ERROR:  cannot alter column "b" because it is part of the partition key of relation "partitioned1"
+LINE 1: ALTER TABLE partitioned1 ALTER COLUMN b TYPE text;
+  ^
 -- partitioned table cannot participate in regular inheritance
 CREATE TABLE nonpartitioned (
 	a int,
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 5ce9d1e429f..f5745f448ae 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -2390,6 +2390,13 @@ ALTER TABLE partitioned ALTER COLUMN b TYPE char(5);
 -- specifying storage parameters for partitioned tables is not supported
 ALTER TABLE partitioned SET (fillfactor=100);
 
+-- partition expression is whole row
+CREATE TABLE partitioned1(a int, b int) PARTITION BY list((partitioned1));
+ALTER TABLE partitioned1 DROP COLUMN a; --error
+ALTER TABLE partitioned1 DROP COLUMN b; --error
+ALTER TABLE partitioned1 ALTER COLUMN a TYPE text; --error
+ALTER TABLE partitioned1 ALTER COLUMN b TYPE text; --error
+
 -- partitioned table cannot participate in regular inheritance
 CREATE TABLE nonpartitioned (
 	a int,
-- 
2.34.1



Re: Large expressions in indexes can't be stored (non-TOASTable)

2025-04-22 Thread Nisha Moond
On Mon, Apr 21, 2025 at 9:26 PM Nathan Bossart  wrote:
>
> Apparently replication origins in tests are supposed to be prefixed with
> "regress_".  Here is a new patch with that fixed.
>

Hi Nathan,
Thanks for the patch! I tested it for functionality and here are a few comments:

1) While testing, I noticed an unexpected behavior with the new 512-byte
length restriction in place. Since there’s no explicit
restriction on the column roname’s type, it’s possible to insert an
origin name exceeding 512 bytes. For instance, one can use the COPY
command -- similar to how pg_dump might dump and restore catalog
tables.

For example:
  postgres=# COPY pg_catalog.pg_replication_origin (roident, roname) FROM stdin;
  Enter data to be copied followed by a newline.
  End with a backslash and a period on a line by itself, or an EOF signal.
  >> 44   regress_... [>512 bytes string]
  >> \.
  COPY 1

Once inserted, I was able to use the pg_replication_origin_xxx
functions with this origin name without any errors.
Not sure how common pg_dump/restore of catalog tables is in real
systems, but should we still consider whether this behavior is acceptable?
~~~

Below are a few cosmetic suggestions you might consider.
2)

 Creates a replication origin with the given external
 name, and returns the internal ID assigned to it.
+The name must be no longer than 512 bytes.


Would it make sense to make the phrasing a bit more formal, perhaps
something like:
“The name must not exceed 512 bytes.”?

3) For the code comments -
+ /*
+ * To avoid needing a TOAST table for pg_replication_origin, we restrict
+ * replication origin names to 512 bytes.  This should be more than enough
+ * for all practical use.
+ */
+ if (strlen(roname) > MAX_RONAME_LEN)

  a) /bytes.  This/bytes. This/ (there is an extra space before “This”)
  b) /all practical use/all practical uses/
  c) Both (a) and (b) above, also apply to the comment above the macro
“MAX_RONAME_LEN”.

4) The "Detail" part of the error message could be made a bit more
formal and precise -

Current:
DETAIL:  Replication origin names must be no longer than 512 bytes.

Suggestion:
DETAIL:  Replication origin names must not exceed 512 bytes.

5) As Euler pointed out, logical replication already has a built-in
restriction for internally assigned origin names, limiting them to
NAMEDATALEN. Given that, only the SQL function
pg_replication_origin_create() is capable of creating longer origin
names. Would it make sense to consider moving the check into
pg_replication_origin_create() itself, since it seems redundant for
all other callers?
That said, if there's a possibility of future callers needing this
restriction at a lower level, it may be reasonable to leave it as is.
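
For reference, here is a minimal sketch of the SQL-level path that the
512-byte check targets (the origin name below is arbitrary, and the exact
error wording would be whatever the patch emits):

    -- pg_replication_origin_create() is the only SQL-level way to choose an
    -- arbitrarily long origin name; with the patch, a name longer than 512
    -- bytes is expected to be rejected.
    SELECT pg_replication_origin_create('regress_' || repeat('x', 600));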

--
Thanks,
Nisha




Re: bug: virtual generated column can be partition key

2025-04-22 Thread jian he
On Tue, Apr 22, 2025 at 3:02 PM jian he  wrote:
> Other than that, it looks good to me for fixing this bug.

The error message seems not that intuitive.

+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
+ERROR:  cannot use generated column in partition key
+LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
+ ^
+DETAIL:  Column "f3" is a generated column.

with the attached patch. now,
+CREATE TABLE gtest_part_key (f1 date NOT NULL, f2 bigint, f3 bigint
GENERATED ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE
((gtest_part_key));
ERROR:  cannot use generated column in partition key
LINE 1: ...D ALWAYS AS (f2 * 2) VIRTUAL) PARTITION BY RANGE ((gtest_par...
 ^
DETAIL:  Generated column "f3" is part of the partition key of
relation "gtest_part_key"


what do you think?
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 42610f50b0b..a2908fdeef9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19770,6 +19770,7 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			char		partattname[16];
 			Bitmapset  *expr_attrs = NULL;
 			int			i;
+			bool		whole_row = false;
 
 			Assert(expr != NULL);
 			atttype = exprType(expr);
@@ -19799,7 +19800,9 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			 * the partitioned table.
 			 */
 			pull_varattnos(expr, 1, &expr_attrs);
-			if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs))
+			whole_row = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs);
+
+			if (whole_row)
 			{
 expr_attrs = bms_add_range(expr_attrs,
 		   1 - FirstLowInvalidHeapAttributeNumber,
@@ -19839,12 +19842,23 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
  * used in partition keys).  Seems safer to prohibit for now.
  */
 if (TupleDescAttr(RelationGetDescr(rel), attno - 1)->attgenerated)
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
-			 errmsg("cannot use generated column in partition key"),
-			 errdetail("Column \"%s\" is a generated column.",
-	   get_attname(RelationGetRelid(rel), attno, false)),
-			 parser_errposition(pstate, pelem->location)));
+{
+	if (!whole_row)
+		ereport(ERROR,
+errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+errmsg("cannot use generated column in partition key"),
+errdetail("Column \"%s\" is a generated column.",
+		get_attname(RelationGetRelid(rel), attno, false)),
+parser_errposition(pstate, pelem->location));
+	else
+		ereport(ERROR,
+errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+errmsg("cannot use generated column in partition key"),
+errdetail("Generated column \"%s\" is part of the partition key of relation \"%s\"",
+		  get_attname(RelationGetRelid(rel), attno, false),
+		  RelationGetRelationName(rel)),
+parser_errposition(pstate, pelem->location));
+}
 			}
 
 			if (IsA(expr, Var) &&


Re: bug: virtual generated column can be partition key

2025-04-22 Thread Ashutosh Bapat
Thanks Jian for your review.

On Tue, Apr 22, 2025 at 12:32 PM jian he 
wrote:

> On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
>  wrote:
> >
> > I have included your original tests, but ended up rewriting code. Please
> let me know what do you think.
> >
>
> + if (attno < 0)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> + errmsg("partition key expressions cannot contain system column
> references")));
> +
> + /*
> + * We do not check dropped columns explicitly since they will
> + * be eliminated when expanding the the whole row expression
> + * anyway.
> + */
> typo: "the the".
> I am confused by the above comments.
> ComputePartitionAttrs only called in DefineRelation.
> DefineRelation will only CREATE a table, there will be no dropped
> column via DefineRelation.
>

You are right! Fixed.


>
>
> + /*
> + * transformPartitionSpec() should have already rejected
> + * subqueries, aggregates, window functions, and SRFs, based
> + * on the EXPR_KIND_ for partition expressions.
> + */
> "EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION
> ?
>

That's an existing comment which appears to be moved in the diff but is
actually untouched by the patch. Maybe you are right, IDK, but since it's
not related to the fix, let's leave it untouched. I did wonder whether that
comment has any relation to the pull_varattnos call which has been moved
earlier, since pull_varattnos doesn't expect any Query node in the tree. But
pondering it more, I think the comment is related to the number of rows
participating in the partition key computation - there should be exactly
one key for one tuple. Having SRFs or subqueries in a partition expression means a
possibility of producing more than one set of partition keys; aggregates
and window functions, OTOH, will produce one partition key for more than one
row - all of them breaking the 1:1 mapping between a tuple and a partition key.
Hence I think the comment is at the right place.

PFA revised patch.

-- 
Best Wishes,
Ashutosh Bapat
From 5467147f6b37898263c78ce64b086072c76caaad Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Mon, 21 Apr 2025 18:06:58 +0530
Subject: [PATCH] Tighten check for generated column in partition key
 expression

A generated column may end up being part of the partition key
expression, if it's specified as an expression e.g. "()" or if the partition key expression contains a whole-row
reference, even though we do not allow a generated column to be part of
partition key expression. Fix this hole.

Discussion: https://postgr.es/m/CACJufxF=wdgthxsaqr9thyusfx_1_t9e6n8te3b8eqxcvov...@mail.gmail.com
---
 src/backend/commands/tablecmds.c  | 90 ++-
 .../regress/expected/generated_stored.out | 15 
 .../regress/expected/generated_virtual.out| 15 
 src/test/regress/sql/generated_stored.sql |  3 +
 src/test/regress/sql/generated_virtual.sql|  3 +
 5 files changed, 86 insertions(+), 40 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 265b1c397fb..f7b7162a99c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19768,6 +19768,8 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			/* Expression */
 			Node	   *expr = pelem->expr;
 			char		partattname[16];
+			Bitmapset  *expr_attrs = NULL;
+			int			i;
 
 			Assert(expr != NULL);
 			atttype = exprType(expr);
@@ -19791,43 +19793,36 @@ ComputePartitionAttrs(ParseState *pstate, Relation rel, List *partParams, AttrNu
 			while (IsA(expr, CollateExpr))
 expr = (Node *) ((CollateExpr *) expr)->arg;
 
-			if (IsA(expr, Var) &&
-((Var *) expr)->varattno > 0)
+			/*
+			 * Examine all the columns in the partition key expression. When
+			 * the whole-row reference is present, examine all the columns of
+			 * the partitioned table.
+			 */
+			pull_varattnos(expr, 1, &expr_attrs);
+			if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs))
 			{
-/*
- * User wrote "(column)" or "(column COLLATE something)".
- * Treat it like simple attribute anyway.
- */
-partattrs[attn] = ((Var *) expr)->varattno;
+expr_attrs = bms_add_range(expr_attrs,
+		   1 - FirstLowInvalidHeapAttributeNumber,
+		   RelationGetNumberOfAttributes(rel) - FirstLowInvalidHeapAttributeNumber);
+expr_attrs = bms_del_member(expr_attrs, 0 - FirstLowInvalidHeapAttributeNumber);
 			}
-			else
-			{
-Bitmapset  *expr_attrs = NULL;
-int			i;
 
-partattrs[attn] = 0;	/* marks the column as expression */
-*partexprs = lappend(*partexprs, expr);
+			i = -1;
+			while ((i = bms_next_member(expr_attrs, i)) >= 0)
+			{
+AttrNumber	attno = i + FirstLowInvalidHeapAttributeNumber;
 
-/*
- * transformPartitionSpec() should have already rejected
- * subqueries, aggregates, window functions, and SRFs, based
- * on the EXPR_KIND_ for partition expressions.
- */
+Assert(attno != 0);
 
 		

Re: Fix slot synchronization with two_phase decoding enabled

2025-04-22 Thread shveta malik
On Fri, Apr 11, 2025 at 1:03 PM Nisha Moond  wrote:
>
>
> Patch "v5_aprch_3-0001" implements the above Approach 3.
>
> Thanks Hou-san for implementing approach-2 and providing the patch.
>

I find Approach 2 better than Approach 1. I have yet to review Approach 3.
Please find my initial comments:


Approach#1:

1)
When a slot is created with failover enabled and later we try to create
a subscription using that pre-created slot with two_phase enabled, it
does not error out. Instead, it silently retains the existing slot's
two_phase as false. Please check whether an error is possible in this case too.


Approach #2:

1)
+ ReorderBufferSkipPrepare(reorder, parsed.twophase_xid);

In xact_decode, why do we have a new call to ReorderBufferSkipPrepare,
can you please add some comments here?

2)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("\"%s\" and \"%s\" are mutually exclusive options when \"%s\"
is specified",

So, like approach 1, here as well we disallow creating subscriptions
with both failover and two_phase enabled when create_slot and
copy_data are true? But users can alter the subscription later to allow failover
for a two_phase-enabled slot, provided there are no pending PREPARE txns
which are not yet consumed by the subscriber. Is my understanding
correct? Can you please list all the ways in which the user can
enable both failover and two_phase together? The patch message is not that
clear; it would be good to add such details to the patch message.

3)

+ /*
+ * Do not allow users to enable the failover and two_phase options together
+ * when create_slot is specified.
+ *
+ * See comments atop the similar check in DecodingContextFindStartpoint()
+ * for a detailed reason.
+ */
+ if (!opts.create_slot && opts.twophase && opts.failover)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("\"%s\" and \"%s\" are mutually exclusive options when \"%s\"
is specified",
+

Is the check '!opts.create_slot' correct? The error msg and comment
say otherwise.

thanks
Shveta




Re: Fix slot synchronization with two_phase decoding enabled

2025-04-22 Thread Amit Kapila
On Mon, Apr 21, 2025 at 8:44 AM Zhijie Hou (Fujitsu)
 wrote:
>
> On Sat, Apr 19, 2025 at 2:19 AM Masahiko Sawada wrote:
> >
> > On Tue, Apr 8, 2025 at 10:14 PM Zhijie Hou (Fujitsu)
> >  wrote:
> > >
> > >
> > > --
> > > Approach 2
> > > --
> > >
> > > Instead of disallowing the use of two-phase and failover together, a more
> > > flexible strategy could be only restrict failover for slots with two-phase
> > > enabled when there's a possibility of existing prepared transactions 
> > > before
> > the
> > > two_phase_at that are not yet replicated. During slot creation with
> > two-phase
> > > and failover, we could check for any decoded prepared transactions when
> > > determining the decoding start point (DecodingContextFindStartpoint). For
> > > subsequent attempts to alter failover to true, we ensure that 
> > > two_phase_at is
> > > less than restart_lsn, indicating that all prepared transactions have been
> > > committed and replicated, thus the bug would not happen.
> > >
> > > pros:
> > >
> > > This method minimizes restrictions for users. Especially during slot 
> > > creation
> > > with (two_phase=on, failover=on), as it’s uncommon for transactions to
> > prepare
> > > during consistent snapshot creation, the restriction becomes almost
> > > unnoticeable.
> >
> > I think this approach can work for the transactions that are prepared
> > while the slot is created. But if I understand the problem correctly,
> > while the initial table sync is performing, the slot's two_phase is
> > still false, so we need to deal with the transactions that are
> > prepared during the initial table sync too. What do you think?
> >
>
> Yes, I agree that we need to restrict this case too. Given that we haven't
> started decoding when setting two_phase=true during CreateDecodingContext()
> after tablesync, we could check prepared transactions afterwards during
> decoding. This could involve reporting an ERROR when skipping a prepared
> transaction during decoding if its prepare LSN is less than two_phase_at.
>

It will make it difficult for users to detect, as this happens at a
later point in time.

> Alternatively, a simpler method would be to prevent this situation entirely
> during the CREATE SUBSCRIPTION command. For example, we could restrict slots
> created with failover set to true and twophase is later modified to true after
> tablesync. Although the simpler check is more user-visible, it may offer less
> flexibility.
>

I agree with your point, but OTOH, I am also afraid of adding too many
smart checks in the back-branches. If we follow what you say here, then
users have the following ways in PG17 to enable both failover and
two_phase: (a) during CREATE SUBSCRIPTION, users can set both
'failover' and 'two_phase' if 'copy_data' is false; or (b), if
'copy_data' is true, during CREATE SUBSCRIPTION users can enable
'two_phase' and wait for it to be enabled, then use ALTER SUBSCRIPTION
to set 'failover', as sketched below.
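
A minimal sketch of option (b), with illustrative subscription, publication,
and connection names:

    -- (b): create the subscription with two_phase, wait for it to be enabled,
    -- then turn on failover; sub1/pub1 and the connection string are examples.
    CREATE SUBSCRIPTION sub1
        CONNECTION 'host=primary dbname=postgres'
        PUBLICATION pub1
        WITH (two_phase = true, copy_data = true);

    -- ... wait until pg_subscription.subtwophasestate reports two_phase as enabled ...

    ALTER SUBSCRIPTION sub1 SET (failover = true);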


-- 
With Regards,
Amit Kapila.




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-04-22 Thread Daniel Gustafsson
> On 22 Apr 2025, at 01:19, Jacob Champion  
> wrote:

> v8 also makes the following changes:

Thanks for this version, a few small comments:

+  if oauth_flow_supported
+cdata.set('USE_LIBCURL', 1)
+  elif libcurlopt.enabled()
+error('client OAuth is not supported on this platform')
+  endif
We already know that libcurlopt.enabled() is true here so maybe just doing
if-else-endif would make it more readable and save readers thinking it might
have changed?  Also, "client OAuth" reads a bit strange, how about "client-side
OAuth" or "OAuth flow module"?


-   appendPQExpBufferStr(&conn->errorMessage,
-libpq_gettext(actx->errctx));
-   appendPQExpBufferStr(&conn->errorMessage, ": ");
+   appendPQExpBufferStr(errbuf, libpq_gettext(actx->errctx));
+   appendPQExpBufferStr(errbuf, ": ");
I think we should take this opportunity to turn this into an appendPQExpBuffer()
with a format string instead of two calls.


+   len = errbuf->len;
+   if (len >= 2 && errbuf->data[len - 2] == '\n')
Now that the actual variable, errbuf->len, is short and very descriptive I
wonder if we shouldn't just use this as it makes the code even clearer IMO.

--
Daniel Gustafsson





Re: bug: virtual generated column can be partition key

2025-04-22 Thread jian he
On Tue, Apr 22, 2025 at 11:45 AM Ashutosh Bapat
 wrote:
>
> I have included your original tests, but ended up rewriting code. Please let 
> me know what do you think.
>

+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("partition key expressions cannot contain system column references")));
+
+ /*
+ * We do not check dropped columns explicitly since they will
+ * be eliminated when expanding the the whole row expression
+ * anyway.
+ */
typo: "the the".
I am confused by the above comments.
ComputePartitionAttrs only called in DefineRelation.
DefineRelation will only CREATE a table, there will be no dropped
column via DefineRelation.


+ /*
+ * transformPartitionSpec() should have already rejected
+ * subqueries, aggregates, window functions, and SRFs, based
+ * on the EXPR_KIND_ for partition expressions.
+ */
"EXPR_KIND_" can change to EXPR_KIND_PARTITION_EXPRESSION
?


Other than that, it looks good to me for fixing this bug.




Fix premature xmin advancement during fast forward decoding

2025-04-22 Thread Zhijie Hou (Fujitsu)
Hi,

When analyzing some issues in another thread[1], I found a bug where fast forward
decoding could lead to premature advancement of catalog_xmin, resulting in
required catalog data being removed by vacuum.

The issue arises because we do not build a base snapshot when decoding changes
during fast forward mode, preventing reference to the minimum transaction ID
that remains visible in the snapshot when determining the candidate for
catalog_xmin. As a result, catalog_xmin was directly advanced to the oldest
running transaction ID found in the latest running_xacts record.

At the code level, SnapBuildProcessChange -> ReorderBufferSetBaseSnapshot cannot
be reached during fast forward decoding, resulting in
rb->txns_by_base_snapshot_lsn being NULL. When advancing catalog_xmin, the
system attempts to refer to rb->txns_by_base_snapshot_lsn via
ReorderBufferGetOldestXmin(). However, since rb->txns_by_base_snapshot_lsn is
NULL, it defaults to directly using running->oldestRunningXid as the candidate
for catalog_xmin. For reference, see the details in
SnapBuildProcessRunningXacts.

See the attachment for a test (0002) that proves that catalog data which is
still required would be removed after premature catalog_xmin advancement during
fast forward decoding.

To fix this, I think we can allow the base snapshot to be built during fast
forward decoding, as implemented in patch 0001 (we already build a base
snapshot in fast-forward mode for logical messages in logicalmsg_decode()).

I also explored the possibility of further optimizing the fix to reduce the
overhead associated with building a snapshot in fast-forward mode. E.g., to
maintain a single base_snapshot_xmin rather than generating a full snapshot for
each transaction, and use this base_snapshot_xmin when determining the
candidate catalog_xmin. However, I am not feeling confident about maintaining
some fast-forward dedicated fields in common structures and
perhaps employing different handling for catalog_xmin.

Moreover, I conducted a basic test[2] to test the patch's impact, noting that
advancing the slot incurs roughly a 4% increase in processing time after
applying the patch, which appears to be acceptable. Additionally, the cost
associated with building the snapshot via SnapBuildBuildSnapshot() did not show
up in the profile. Therefore, I think it's reasonable to refrain from
further optimization at this stage.

[1] 
https://www.postgresql.org/message-id/OS3PR01MB5718ED3F0123737C7380BBAD94BB2%40OS3PR01MB5718.jpnprd01.prod.outlook.com

[2]
Create a table test(a int);
Create a slot A.
pgbench postgres -T 60 -j 100 -c 100 -f simple-insert

simple-insert:
BEGIN;
INSERT INTO test VALUES(1);
COMMIT;

use pg_replication_slot_advance to advance the slot A to the latest position
and count the time.

HEAD: Time: 4189.257 ms
PATCH: Time: 4371.358 ms
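
For reference, the measurement step in [2] can be sketched like this (the
slot name and the use of the test_decoding plugin are illustrative):

    -- create the slot before running the pgbench workload above
    SELECT pg_create_logical_replication_slot('slot_a', 'test_decoding');

    -- after the workload finishes, time the advance to the latest position
    \timing on
    SELECT pg_replication_slot_advance('slot_a', pg_current_wal_lsn());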

Best Regards,
Hou zj



v1-0002-Add-a-test.patch
Description: v1-0002-Add-a-test.patch


v1-0001-Fix-premature-xmin-advancement-during-fast-forwar.patch
Description:  v1-0001-Fix-premature-xmin-advancement-during-fast-forwar.patch


Re: [BUG] temporary file usage report with extended protocol and unnamed portals

2025-04-22 Thread Frédéric Yhuel




On 4/21/25 07:46, Frédéric Yhuel wrote:

On 4/20/25 00:42, Sami Imseih wrote:

(In my testing, the "temporary file:" message comes
out without any attached STATEMENT most of the time already, so this
isn't losing much as far as that's concerned.)



Indeed, this happens when using autocommit / implicit transactions.


Actually, this also happens with Java-style cursors, i.e. using the 
setFetchSize() method, which pgJDBC converts to using named 
portals and EXECUTE   protocol messages.


The explanation is probably very similar to the one Sami gave for the 
implicit transaction case.


In any case, my v3 patch seems to fix all these cases.

(I'm not saying it's good enough to be committed as is. I think I should 
at least add some comments. Anything else?)






Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-04-22 Thread Peter Smith
Hi Ajin,

Some review comments for patch v17-0001

==
Commit message

1.
Additionally, transactions containing WAL records (INTERNAL_SNAPSHOT,
COMMAND_ID, or INVALIDATION) that modify the historical snapshot
constructed during logical decoding are deemed unfilterable. This
is necessary because constructing a correct historical snapshot
for searching publication information requires processing these WAL record.

~

/these WAL record./these WAL records./

==
src/backend/replication/logical/reorderbuffer.c

ReorderBufferFilterByRelFileLocator:

2.
+ if (found)
+ {
+ rb->try_to_filter_change = entry->filterable;
+ return entry->filterable;
+ }
+

Bad indentation.

==
src/include/replication/reorderbuffer.h

3.
+extern bool ReorderBufferCanFilterChanges(ReorderBuffer *rb);
+

Why is this extern here? This function is not implemented in patch 0001.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Proposal: Adding compression of temporary files

2025-04-22 Thread Filip Janus
Since the patch was prepared months ago, it needs to be rebased.

-Filip-


On Sunday, April 13, 2025 at 21:53, Dmitry Dolgov <9erthali...@gmail.com> wrote:

> > On Fri, Mar 28, 2025 at 09:23:13AM GMT, Filip Janus wrote:
> > > +else
> > > +if (nbytesOriginal - file->pos != 0)
> > > +/* curOffset must be corrected also if compression is
> > > + * enabled, nbytes was changed by compression but we
> > > + * have to use the original value of nbytes
> > > + */
> > > +file->curOffset-=bytestowrite;
> > >
> > > It's not something introduced by the compression patch - the first part
> > > is what we used to do before. But I find it a bit confusing - isn't it
> > > mixing the correction of "logical file position" adjustment we did
> > > before, and also the adjustment possibly needed due to compression?
> > >
> > > In fact, isn't it going to fail if the code gets multiple loops in
> > >
> > > while (wpos < file->nbytes)
> > > {
> > >...
> > > }
> > >
> > > because bytestowrite will be the value from the last loop? I haven't
> > > tried, but I guess writing wide tuples (more than 8k) might fail.
> > >
> >
> > I will definitely test it with larger tuples than 8K.
> >
> > Maybe I don't understand it correctly,
> > the adjustment is performed in the case that file->nbytes and file->pos
> > differ.
> > So it must persist also if we are working with the compressed data, but
> the
> > problem is that data stored and compressed on disk has different sizes
> than
> > data incoming uncompressed ones, so what should be the correction value.
> > By debugging, I realized that the correction should correspond to the
> size
> > of
> > bytestowrite from the last iteration of the loop.
>
> I agree, this looks strange. If the idea is to set curOffset to its
> original value + pos, and the original value was advanced multiple times
> by bytestowrite, it seems incorrect to adjust it by bytestowrite, it
> seems incorrect to adjust it only once. From what I see current tests do
> not exercise a case where the while will get multiple loops, so it looks
> fine.
>
> At the same time maybe I'm missing something, but how exactly such test
> for 8k tuples and multiple loops in the while block should look like?
> E.g. when I force a hash join on a table with a single wide text column,
> the minimal tuple that is getting written to the temporary file still
> has rather small length, I assume due to toasting. Is there some other
> way to achieve that?
>
>


0002-Add-test-for-temporary-files-compression-this-commit.patch
Description: Binary data


0001-This-commit-adds-support-for-temporary-files-compres.patch
Description: Binary data


Allow to collect statistics on virtual generated columns

2025-04-22 Thread Yugo Nagata
Hi hackers,

Now we can create a table with a virtual generated column, but
when a condition in a WHERE clause references a virtual generated column,
the estimated rows are not correct since no statistics on it are
collected.

[Ex.1]

 test=# CREATE TABLE t (i int, v int GENERATED ALWAYS AS (i+1) VIRTUAL);
 CREATE TABLE

 test=# INSERT INTO t SELECT generate_series(1,1000);
 INSERT 0 1000

 test=# INSERT INTO t SELECT 1 FROM generate_series(1,1000);
 INSERT 0 1000

 test=# EXPLAIN ANALYZE SELECT * FROM t WHERE v = 2;
                                             QUERY PLAN
 --------------------------------------------------------------------------------------------------
  Seq Scan on t  (cost=0.00..36.02 rows=9 width=8) (actual time=0.093..3.059 rows=1001.00 loops=1)
Filter: ((i + 1) = 2)
Rows Removed by Filter: 999
Buffers: shared hit=9
  Planning:
Buffers: shared hit=26
  Planning Time: 1.142 ms
  Execution Time: 3.434 ms
 (8 rows)

Therefore, I would like to allow collecting statistics on virtual generated
columns.

I think there are at least three approaches for this.

(1) Allow the normal ANALYZE to collect statistics on virtual generated columns

ANALYZE expands the virtual generated columns' expressions and collects statistics
on the evaluated values. In this approach, the statistics on virtual generated
columns are collected by default, but ANALYZE on the table would become a bit more expensive.

(2) Allow creating an index on a virtual generated column

This is proposed in [1]. While this proposal itself would be useful, I believe it is
better to provide a way to collect statistics without the cost of creating an index.

[1] 
https://www.postgresql.org/message-id/flat/CACJufxGao-cypdNhifHAdt8jHfK6-HX=trbovbkgruxw063...@mail.gmail.com

(3) Allow creating extended statistics on virtual generated columns

In this approach, ANALYZE processes virtual generated columns only if
corresponding extended statistics are defined. Although we can create extended
statistics on the expressions of virtual generated columns even in the current
implementation, this enables users to create useful statistics just by specifying
a column name, without spelling out the complex expression.

I can also think of two variations for this approach.

(3a)
At the time an extended statistic is created, virtual generated columns are
expanded, and the statistic is defined on the expanded expression.

(3b)
At the time an extended statistic is created, virtual generated columns are
NOT expanded. The statistic is defined on the virtual generated column itself, and
the expression is expanded when ANALYZE processes the extended statistic.

I've attached a draft patch based on (3a).  However, if it is possible that we could
change the definition of generated columns in the future (as proposed in [2]), (3b)
might be preferred.

[2] 
https://www.postgresql.org/message-id/flat/cacjufxh3vetr7orf5rw29gndk3n1wwboe3wdkhyd3ipgrq9...@mail.gmail.com

Here is an example of how the patch works.

[Ex.2]

 test=# CREATE STATISTICS exstat ON v FROM t;
 CREATE STATISTICS
 test=# ANALYZE t;
 ANALYZE
 test=# EXPLAIN ANALYZE SELECT * FROM t WHERE v = 2;
                                               QUERY PLAN
 ---------------------------------------------------------------------------------------------------
  Seq Scan on t  (cost=0.00..41.50 rows=1001 width=8) (actual time=0.067..2.422 rows=1001.00 loops=1)
Filter: ((i + 1) = 2)
Rows Removed by Filter: 999
Buffers: shared hit=9
  Planning:
Buffers: shared hit=14
  Planning Time: 0.785 ms
  Execution Time: 2.744 ms
 (8 rows)


What do you think of this?  Which of the approaches (1), (3a), or (3b) is good?
Or is a completely different approach better?
With your feedback, I would like to progress or rework the patch.

Regards,
Yugo Nagata

-- 
Yugo Nagata 
>From a6b0be714f6d4e4e0e7423f07432d3135c807a63 Mon Sep 17 00:00:00 2001
From: Yugo Nagata 
Date: Tue, 22 Apr 2025 17:03:50 +0900
Subject: [PATCH v1] Allow to create extended statistics on virtual generated
 columns

---
 src/backend/commands/statscmds.c | 86 +++-
 1 file changed, 40 insertions(+), 46 deletions(-)

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index e24d540cd45..9b7f27fec28 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -29,6 +29,7 @@
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
+#include "rewrite/rewriteHandler.h"
 #include "statistics/statistics.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
@@ -240,28 +241,27 @@ CreateStatistics(CreateStatsStmt *stmt)
 attname)));
 			attForm = (Form_pg_attribute) GETSTRUCT(atttuple);
 
-			/* Disallow use of system attributes in extended stats */
-			if (attForm->attnum <= 0)
-ereport(ERROR,
-		(errcode(ERRCODE_FEATU

Re: What's our minimum supported Python version?

2025-04-22 Thread Daniel Gustafsson
> On 19 Apr 2025, at 18:17, Tom Lane  wrote:

> The reason I bring this up is that I found out the hard way
> that src/test/modules/oauth_validator fails on RHEL8, because
> its oauth_server.py script is not compatible with the 3.6.8
> version of Python supplied by this distro.

Do you have the error message/log for the failure handy?

--
Daniel Gustafsson





Re: Cygwin support

2025-04-22 Thread Aleksander Alekseev
Hi Andrew,

> Last year the old Windows machine where I was running the buildfarm
> member lorikeet died, and since then we've had no buildfarm coverage for
> Cygwin. I now have a new (but slow) W11pro machine and I have been
> testing out Cygwin builds on it. I wanted to have it running the TAP
> tests, unlike lorikeet. Attached is a set of very small patches aimed at
> enabling this.
>
> [...]

Thanks for the patches.

I wonder though if it is advisable to support Cygwin if this requires
extra effort from our side, compared to a regular Linux or Windows
environment. I actually like the project but I wouldn't recommend
running Postgres on prod, and for development these days it's simpler
to use WSL / VirtualBox / Raspberry Pi, or running Postgres natively
on Windows.

-- 
Best regards,
Aleksander Alekseev




Re: Adding pg_dump flag for parallel export to pipes

2025-04-22 Thread Hannu Krosing
If there are no objections, we will add this to the commitfest.

On Mon, Apr 7, 2025 at 9:48 PM Hannu Krosing  wrote:
>
>
> Just to bring this out separately : Does anybody have any idea why pipe 
> commands close inside tests ?
>
> Re: 003-pg_dump_basic_tests has a few basic validation tests for
> correctmflag combinations. We need to write more automated tests in
> 002_pg_dump.pl but have been running into some issues with environment
> setup due to which certain pipe commands result in the shell process
> becoming defunct. These same commands are working fine in manual
> testing. We are still looking into this.
>
> 
> Hannu
>
>
> On Mon, Apr 7, 2025 at 7:17 PM Nitin Motiani  wrote:
>>
>> Hi Hackers,
>>
>> We are proposing the ability to specify a pipe command to pg_dump by a
>> flag. And attaching the patch set.
>>
>> Why : Currently it is quite simple to pipe the output of pg_dump for
>> text format to a pipe at command line and do any manipulations
>> necessary. Following is an example :
>>
>>pg_dump   | lz4 | pv -L 10k | ssh remote.host
>> "cat - > remote.dump.lz4"
>>
>> Here we first compress the stream using lz4 and then send it over ssh
>> to a remote host to be saved as a file while rate-limiting the network
>> usage to 10KB/s.
>>
>> Something like this is not possible for format=directory (-Fd) since
>> all you can provide is the directory name to store the individual
>> files. Note it is not possible to do this irrespective of the usage of
>> the parallel dump option ('--jobs' flag).
>>
>> While the directory format supports compression using a flag, the rest
>> of the operations in the above example are not possible. And a pipe
>> command provides more flexibility in what compression algorithm one
>> wants to use.
>>
>> This patch set provides pg_dump the ability to pipe the data in the
>> directory mode by using a new flag '--pipe-command' (in both parallel
>> and non-parallel mode).
>>
>> We also add a similar option to pg_restore.
>>
>> The following can be the major use cases of these changes :
>>   1. Stream pg_dump output to a cloud storage
>>   2. SSH the data to a remote host (with or without throttling)
>>   3. Custom compression options
>>
>>
>> Usage Examples : Here is an example of how the pipe-command will look like.
>>
>>  pg_dump -Fd mydb --pipe-command="cat > dumpdir/%f" (dumpdir
>> should exist beforehand.)
>>
>> This is equivalent to
>>
>>  pg_dump -Fd mydb --file=dumpdir
>>
>> (Please note that the flags '--file' or '--pipe-command' can't be used
>> together.)
>>
>> For the more complex scenario as mentioned above, the command will be
>> (with the parallelism of 5) :
>>
>>   pg_dump -Fd mydb -j 5 --pipe-command="lz4 | pv -L 10k | ssh
>> remote.host "cat > dumpdir/%f""
>>
>> Please note the use of %f in the above examples. As a user would
>> almost always want to write the post-processing output to a file (or
>> perhaps a cloud location), we provide a format specifier %f in the
>> command. The implementation of pipe-command replaces these format
>> specifiers with the corresponding file names. These file names are the
>> same as they would be in the current usage of directory format with
>> '--file' flag (.dat, toc.dat, blob_NNN.toc,
>> blob_.dat).
>>
>> The usage of this flag with pg_restore will also be similar. Here is
>> an example of restoring from a gzip compressed dump directory.
>>
>> pg_restore -C -Fd -d postgres --pipe-command="cat
>> dumpdir/%f.gz | gunzip"
>>
>> The new flag in pg_restore also works with '-l' and '-L' options
>>
>> pg_restore -C -Fd -d postgres --pipe-command="cat dumpdir/%f" -L 
>> db.list
>>
>>
>> Implementation Details : Here are the major changes :
>>1. We reuse the same variables which store the file name to store
>> the pipe command. And add a new bool fSpecIsPipe in _archiveHandle
>> (similar bools in pg_dump.c and pg_restore.c) to specify if it's a
>> pipe command.
>> 2. In the cases when the above bool is set to true, we use popen
>> and pclose instead of fopen and fclose.
>>  3. To enable the format specifier %f in the pipe-command, we make
>> changes to the file name creation logic in a few places. Currently the
>> file name (corresponding to a table or large object) is appended to
>> the directory name provided by '--file' command. In case of
>> '--pipe-command', we use 'replace_percent_placeholders' to replace %f
>> with the corresponding file name. This change is made for both table
>> files and LO TOC files.
>>
>> With these core changes, the rest of the code continues working as-is.
>>
>> We are attaching 4 patches for this change :
>>
>>   001-pg_dump_pipe has the pg_dump pipe support code.
>>   002-pg_restore_pipe has the pg_restore pipe support.
>>   003-pg_dump_basic_tests has a few basic validation tests for
>> correctmflag combinations. We need to write more automated tests in
>> 002_pg_dump.pl but have been running into some issues with environment
>> setup due to which certa

Re: Cygwin support

2025-04-22 Thread Tom Lane
Andrew Dunstan  writes:
> I agree that I would not normally use Cygwin to run a Postgres instance. 
> But its psql is nicer than others on Windows because unlike the native 
> builds we build it with readline. That's why I've kept a buildfarm 
> animal going all these years. If the maintenance burden were high I 
> wouldn't do so - but it's not really. And these patches are tiny.

I vaguely recall some discussion about whether building with readline
has become possible under MSVC.  I think it'd make a lot of people
happy if that could happen (but I hasten to add that I'm not
volunteering).

regards, tom lane




Re: Add Pipelining support in psql

2025-04-22 Thread Anthonin Bonnefoy
On Tue, Apr 22, 2025 at 2:06 AM Michael Paquier  wrote:
> I am wondering if we could not be smarter with the handling of
> the counters, but I really doubt that there is much more we can do
> under a PGRES_FATAL_ERROR.

One thing that bothers me is that the reported error is silently
discarded within discardAbortedPipelineResults.

psql -f bug_assertion.sql
psql:bug_assertion.sql:7: ERROR:  unexpected message type 0x50 during
COPY from stdin
CONTEXT:  COPY psql_pipeline, line 1
psql:bug_assertion.sql:7: Pipeline aborted, command did not run

This should ideally report the "FATAL:  terminating connection because
protocol synchronization was lost" sent by the backend process.

Also, we still have a triggered assertion failure with the following:
CREATE TABLE psql_pipeline(a text);
\startpipeline
COPY psql_pipeline FROM STDIN;
SELECT 'val1';
\syncpipeline
\endpipeline
...
Assertion failed: (pset.piped_syncs == 0), function
ExecQueryAndProcessResults, file common.c, line 2153.

A possible alternative could be to abort discardAbortedPipelineResults
when we encounter a res != NULL + FATAL error and let the outer loop
handle it. As you said, the pipeline flow is borked so there's not
much to salvage. The outer loop would read and print all error
messages until the closed connection is detected. Then,
CheckConnection will reset the connection which will reset the
pipeline state.

While testing this change, I was initially looking for the "FATAL:
terminating connection because protocol synchronization was lost"
message in the tests. However, this was failing on Windows[1] as the
FATAL message wasn't reported on stderr. I'm not sure why yet.

[1]: https://cirrus-ci.com/task/5051031505076224?logs=check_world#L240-L246


v02-0001-PATCH-psql-Fix-assertion-failure-with-pipeline-m.patch
Description: Binary data


Re: Cygwin support

2025-04-22 Thread Andrew Dunstan



On 2025-04-22 Tu 8:10 AM, Aleksander Alekseev wrote:

Hi Andrew,


Last year the old Windows machine where I was running the buildfarm
member lorikeet died, and since then we've had no buildfarm coverage for
Cygwin. I now have a new (but slow) W11pro machine and I have been
testing out Cygwin builds on it. I wanted to have it running the TAP
tests, unlike lorikeet. Attached is a set of very small patches aimed at
enabling this.

[...]

Thanks for the patches.

I wonder though if it is advisable to support Cygwin if this requires
extra effort from our side, compared to a regular Linux or Windows
environment. I actually like the project but I wouldn't recommend
running Postgres on prod, and for development these days it's simpler
to use WSL / VirtualBox / Raspberry Pi, or running Postgres natively
on Windows.



I agree that I would not normally use Cygwin to run a Postgres instance. 
But its psql is nicer than others on Windows because unlike the native 
builds we build it with readline. That's why I've kept a buildfarm 
animal going all these years. If the maintenance burden were high I 
wouldn't do so - but it's not really. And these patches are tiny.


What I might do with the new animal is run just enough TAP tests to 
exercise psql. That would reduce the errors we get, so it would be less 
bother to anyone.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Tom Lane
Pavel Borisov  writes:
> It's not a big problem, but I propose a simple fix for the tests. It
> just adds ORDER BY 1 to all relevant float4 and float8 queries.

You do realize that this problem is hardly confined to the
float4 and float8 tests?  To a first approximation, a table AM
that fails to preserve insertion order would break every single
one of our regression tests.  (I speak from experience, as
Salesforce had to deal with this when I was there...)

The reason why we don't simply add ORDER BY to everything is
explained at [1]:

You might wonder why we don't order all the regression test
queries explicitly to get rid of this issue once and for all.  The
reason is that that would make the regression tests less useful,
not more, since they'd tend to exercise query plan types that
produce ordered results to the exclusion of those that don't.

At some point we will probably have to think through what we
want to do about running the regression tests with table AMs that
don't wish to preserve ordering as much as heapam does.  It's going
to need careful balancing of multiple concerns, and I don't think
"blindly slap ORDER BY everywhere" is going to be an acceptable
answer.

regards, tom lane

[1] 
https://www.postgresql.org/docs/devel/regress-evaluation.html#REGRESS-EVALUATION-ORDERING-DIFFERENCES




Re: Check for tuplestorestate nullness before dereferencing

2025-04-22 Thread Nikolay Shaplov
In a message from Tuesday, April 22, 2025, 18:50:49 MSK, Tom Lane wrote:

> The reason that the subsequent bit of code is safe is that !forward
> should not possibly be true unless EXEC_FLAG_BACKWARD was given,
> which'd cause us to create a tuplestore.  So if we were going
> to change anything, I'd propose adding something more like
> 
> if (!forward && eof_tuplestore)
> {
> +   Assert(node->eflags & EXEC_FLAG_BACKWARD);
> if (!node->eof_underlying)
> {
> /*
> 
> or perhaps more directly, Assert that tuplestore is not null.

I like the Assert(node->eflags & EXEC_FLAG_BACKWARD);
solution; it gives more information: "here we expect that the BACKWARD eflag is
set". (I am not quite familiar with this part of the code, so this knowledge is
not obvious to me.)
Assert(tuplestorestate != NULL), by contrast, gives much less information about
what is happening here.

And I think it is better to add this Assert there. People will continue using 
static analyzers on postgres code, finding this place again and again. Better 
to close this issue once and for all :-)

-- 
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su


signature.asc
Description: This is a digitally signed message part.


Re: Cygwin support

2025-04-22 Thread Andrew Dunstan



On 2025-04-22 Tu 10:26 AM, Tom Lane wrote:

Andrew Dunstan  writes:

I agree that I would not normally use Cygwin to run a Postgres instance.
But its psql is nicer than others on Windows because unlike the native
builds we build it with readline. That's why I've kept a buildfarm
animal going all these years. If the maintenance burden were high I
wouldn't do so - but it's not really. And these patches are tiny.

I vaguely recall some discussion about whether building with readline
has become possible under MSVC.  I think it'd make a lot of people
happy if that could happen (but I hasten to add that I'm not
volunteering).





Neither am I, although I'll test it if somebody sends in a patch. If 
that happens I'll be happy enough to retire Cygwin support.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Pavel Borisov
On Tue, 22 Apr 2025 at 18:40, Pavel Borisov  wrote:
>
> Hi, Tom!
>
> On Tue, 22 Apr 2025 at 18:23, Tom Lane  wrote:
> >
> > Pavel Borisov  writes:
> > > It's not a big problem, but I propose a simple fix for the tests. It
> > > just adds ORDER BY 1 to all relevant float4 and float8 queries.
> >
> > You do realize that this problem is hardly confined to the
> > float4 and float8 tests?  To a first approximation, a table AM
> > that fails to preserve insertion order would break every single
> > one of our regression tests.  (I speak from experience, as
> > Salesforce had to deal with this when I was there...)
> >
> > The reason why we don't simply add ORDER BY to everything is
> > explained at [1]:
> >
> > You might wonder why we don't order all the regression test
> > queries explicitly to get rid of this issue once and for all.  The
> > reason is that that would make the regression tests less useful,
> > not more, since they'd tend to exercise query plan types that
> > produce ordered results to the exclusion of those that don't.
> >
> > At some point we will probably have to think through what we
> > want to do about running the regression tests with table AMs that
> > don't wish to preserve ordering as much as heapam does.  It's going
> > to need careful balancing of multiple concerns, and I don't think
> > "blindly slap ORDER BY everywhere" is going to be an acceptable
> > answer.
> >
>
> I agree we might need to elaborate different AM support in regression tests.
> Also, I agree that these are not the only tests to be fixed.
>
> What I hardly agree with is the assumption that it's right for regression
> tests to require a fixed result ordering for such simple cases.
> Maybe for more complicated plans it's useful, but for the float tests
> mentioned in the patch this requirement is a total loss IMO.
>
> I understand your sentiment against blindly slapping order by's, but I
> don't see a useful alternative for this time. Also a large number of
> tests in PG were already fortified with ORDER BY 1.

I forgot to mention that it's not only a question of INSERT ordering.
An extension AM can have some internal ordering, e.g. index-organized
tables. And besides the fact that SELECT results following the internal
AM ordering are simply allowed, more importantly they are good
performance-wise and justify table AM extensibility.

Regards,
Pavel.




Re: What's our minimum supported Python version?

2025-04-22 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 19 Apr 2025, at 18:17, Tom Lane  wrote:
>> The reason I bring this up is that I found out the hard way
>> that src/test/modules/oauth_validator fails on RHEL8, because
>> its oauth_server.py script is not compatible with the 3.6.8
>> version of Python supplied by this distro.

> Do you have the error message/log for the failure handy?

The first problem is that this Python version seems not to
like assignments embedded in if statements:

  File "t/oauth_server.py", line 319
if err := self._get_param("error_code", None):
^
SyntaxError: invalid syntax

I was able to work around that with the attached quick hack.
But then I get

Traceback (most recent call last):
  File "t/oauth_server.py", line 19, in 
class OAuthHandler(http.server.BaseHTTPRequestHandler):
  File "t/oauth_server.py", line 26, in OAuthHandler
JsonObject = dict[str, object]  # TypeAlias is not available until 3.10
TypeError: 'type' object is not subscriptable

which I have no idea how to work around.

regards, tom lane

diff --git a/src/test/modules/oauth_validator/t/oauth_server.py b/src/test/modules/oauth_validator/t/oauth_server.py
index 5bc30be87fd..133fe496cfc 100755
--- a/src/test/modules/oauth_validator/t/oauth_server.py
+++ b/src/test/modules/oauth_validator/t/oauth_server.py
@@ -316,11 +316,13 @@ class OAuthHandler(http.server.BaseHTTPRequestHandler):
 return resp
 
 def token(self) -> JsonObject:
-if err := self._get_param("error_code", None):
+err = self._get_param("error_code", None)
+if err:
 self._response_code = self._get_param("error_status", 400)
 
 resp = {"error": err}
-if desc := self._get_param("error_desc", ""):
+desc = self._get_param("error_desc", "")
+if desc:
 resp["error_description"] = desc
 
 return resp


Re: [BUG] temporary file usage report with extended protocol and unnamed portals

2025-04-22 Thread Sami Imseih
> In any case, my v3 patch seems to fix all these cases.
>
> (I'm not saying it's good enough to be committed as is. I think I should
> at least add some comments. Anything else?)

the patch relies on looking up queryDesc->sourceText inside DropPortal,
which Tom raised concerns about earlier in the thread [0]

So, it seems to me we are better off just setting debug_query_string
to NULL in DropPortal, or alternatively, why not just log the statement
automatically at the start of execution whenever we have log_temp_files > 0?
That would allow us to safely capture the statement to blame for the
temp files and would cover all cases.

[0] 
https://www.postgresql.org/message-id/CAA5RZ0ssqRTz_9T0Gy74SiTViiX3V0rSFxc4N_4GNcbEBK9wow%40mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Aleksander Alekseev
Hi Pavel,

> It's not a big problem, but I propose a simple fix for the tests. It
> just adds ORDER BY 1 to all relevant float4 and float8 queries.

Thanks for the patch. That's a good change IMO.

> I don't
> have a strong opinion about backpatching this, but as the patch
> changes only regression tests, it's maybe also worth backpatching.

I don't see a reason for backpatching this but I'm not strongly
opposed to the idea either.

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details

2025-04-22 Thread Robin Haberkorn
I forgot the patch of course...

-- 
Robin Haberkorn
Senior Software Engineer

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
From cc5be302aa4305403b8131f6c22c39e4dfb75bd1 Mon Sep 17 00:00:00 2001
From: Robin Haberkorn 
Date: Thu, 17 Apr 2025 13:15:48 +0300
Subject: [PATCH] contrib/xml2: xslt_process() now reports XSLT-related error
 details

* It sets (and restores) the libxslt error handler (xsltSetGenericErrorFunc()).
  Since it only supports old-school "generic" error handlers, which are no longer
  used in PG's libxml code, we reintroduced a "generic" error handler
  xml_generic_error_handler() in xml.c.
* The alternative would have been to expose PgXmlErrorContext in xml.h,
  so we could implement a generic handler in xslt_proc.c.
  This is obviously not desirable, as it would depend on USE_LIBXML.
* No longer use the PG_XML_STRICTNESS_LEGACY for error handling,
  but query the error_occurred-flag via pg_xml_error_occurred().
  The existance of this getter is another hint, that we are not supposed
  to expose PgXmlErrorContext in xml.h.
* This change means that xslt_process() now reports not only details
  about XML parsing errors, but XSLT-schema deviations and missing
  stylesheet parameters as well.
* The XSLT error messages now also contain line numbers.
  For that to work, we had to set a dummy "SQL" URL when parsing XML strings.
  This is important, since xsltPrintErrorContext() includes
  line numbers only if a URL is set.
* The xsltSaveResultToString() error handling has been removed.
  It can practically only fail in OOM situations and there is no reason
  to handle them any different than with the other libxslt functions.
* Updated test suite and added test case for detecting missing
  stylesheet parameters.
  This was initially reported here but has obviously been fixed in the
  meantime:
  https://www.postgresql.org/message-id/4C5ECEF9.3030806%40mlfowler.com
---
 contrib/xml2/expected/xml2.out | 17 +++
 contrib/xml2/sql/xml2.sql  |  8 +++
 contrib/xml2/xslt_proc.c   | 40 --
 src/backend/utils/adt/xml.c| 37 +++
 src/include/utils/xml.h|  2 ++
 5 files changed, 92 insertions(+), 12 deletions(-)

diff --git a/contrib/xml2/expected/xml2.out b/contrib/xml2/expected/xml2.out
index 3d97b14..157d584 100644
--- a/contrib/xml2/expected/xml2.out
+++ b/contrib/xml2/expected/xml2.out
@@ -261,3 +261,20 @@ $$
 $$);
 ERROR:  failed to apply stylesheet
+DETAIL:  runtime error: file SQL line 7 element output
+File write for 0wn3d.txt refused
+runtime error: file SQL line 7 element output
+xsltDocumentElem: write rights for 0wn3d.txt denied
+-- detecting missing stylesheet parameter
+SELECT xslt_process('',
+$$http://www.w3.org/1999/XSL/Transform";>
+  
+
+  
+$$)::xml;
+ERROR:  failed to apply stylesheet
+DETAIL:  runtime error: file SQL line 3 element value-of
+Variable 'n1' has not been declared.
+Undefined variable
+runtime error: file SQL line 3 element value-of
+XPath evaluation returned no result.
diff --git a/contrib/xml2/sql/xml2.sql b/contrib/xml2/sql/xml2.sql
index ef99d16..9d42ac8 100644
--- a/contrib/xml2/sql/xml2.sql
+++ b/contrib/xml2/sql/xml2.sql
@@ -153,3 +153,11 @@ $$
   
 $$);
+
+-- detecting missing stylesheet parameter
+SELECT xslt_process('',
+$$http://www.w3.org/1999/XSL/Transform";>
+  
+
+  
+$$)::xml;
diff --git a/contrib/xml2/xslt_proc.c b/contrib/xml2/xslt_proc.c
index b720d89..f17cf9f 100644
--- a/contrib/xml2/xslt_proc.c
+++ b/contrib/xml2/xslt_proc.c
@@ -61,6 +61,10 @@ xslt_process(PG_FUNCTION_ARGS)
 	xmlChar*resstr = NULL;
 	int			reslen = 0;
 
+	/* the previous libxslt error context */
+	xmlGenericErrorFunc saved_errfunc;
+	void	   *saved_errcxt;
+
 	if (fcinfo->nargs == 3)
 	{
 		paramstr = PG_GETARG_TEXT_PP(2);
@@ -74,35 +78,46 @@ xslt_process(PG_FUNCTION_ARGS)
 	}
 
 	/* Setup parser */
-	xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_LEGACY);
+	xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_ALL);
+
+	/*
+	 * Save the previous libxslt error context.
+	 */
+	saved_errfunc = xsltGenericError;
+	saved_errcxt = xsltGenericErrorContext;
+	xsltSetGenericErrorFunc(xmlerrcxt, xml_generic_error_handler);
 
 	PG_TRY();
 	{
 		xmlDocPtr	ssdoc;
 		bool		xslt_sec_prefs_error;
 
-		/* Parse document */
+		/*
+		 * Parse document.
+		 * It's important to set an "URL", so libxslt includes
+		 * line numbers in error messages (cf. xsltPrintErrorContext()).
+		 */
 		doctree = xmlReadMemory((char *) VARDATA_ANY(doct),
-VARSIZE_ANY_EXHDR(doct), NULL, NULL,
+VARSIZE_ANY_EXHDR(doct), "SQL", NULL,
 XML_PARSE_NOENT);
 
-		if (doctree == NULL)
+		if (doctree == NULL || pg_xml_error_occurred(xmlerrcxt))
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_DOCUMENT,
 		"error parsing XML document");
 
 		/* Same for stylesheet */
 		ssdoc = xmlReadMem

Re: Large expressions in indexes can't be stored (non-TOASTable)

2025-04-22 Thread Nathan Bossart
On Tue, Apr 22, 2025 at 05:17:17PM +0530, Nisha Moond wrote:
> Thanks for the patch! I tested it for functionality and here are a few 
> comments:

Thank you for reviewing.

> 1) While testing, I noticed an unexpected behavior with the new 512-byte
> length restriction in place. Since there's no explicit
> restriction on the column roname's type, it's possible to insert an
> origin name exceeding 512 bytes. For instance, one can use the COPY
> command -- similar to how pg_dump might dump and restore catalog
> tables.
> 
> For example:
>   postgres=# COPY pg_catalog.pg_replication_origin (roident, roname) FROM 
> stdin;
>   Enter data to be copied followed by a newline.
>   End with a backslash and a period on a line by itself, or an EOF signal.
>   >> 44   regress_... [>512 bytes string]
>   >> \.
>   COPY 1
> 
> Once inserted, I was able to use  the pg_replication_origin_xxx
> functions with this origin name without any errors.
> Not sure how common pg_dump/restore of catalog tables is in real
> systems, but should we still consider if this behavior is acceptable?

I'm personally not too worried about this.  The limit is primarily there to
provide a nicer ERROR than "row is too big", which AFAICT is the worst-case
scenario if you go to the trouble of setting allow_system_table_mods and
modifying shared catalogs directly.

> 5) As Euler pointed out, logical replication already has a built-in
> restriction for internally assigned origin names, limiting them to
> NAMEDATALEN. Given that, only the SQL function
> pg_replication_origin_create() is capable of creating longer origin
> names. Would it make sense to consider moving the check into
> pg_replication_origin_create() itself, since it seems redundant for
> all other callers?
> That said, if there's a possibility of future callers needing this
> restriction at a lower level, it may be reasonable to leave it as is.

pg_replication_origin_create() might be a better place.  After all, that's
where we're enforcing the name doesn't start with "pg_" and, for testing,
_does_ start with "regress_".  Plus, it seems highly unlikely that any
other callers of replorigin_create() will ever provide names longer than
512 bytes.

-- 
nathan




Re: long-standing data loss bug in initial sync of logical replication

2025-04-22 Thread Masahiko Sawada
On Tue, Mar 18, 2025 at 2:55 AM Amit Kapila  wrote:
>
> On Mon, Mar 17, 2025 at 4:56 PM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > > Regarding the PG13, it cannot be
> > > applied
> > > as-is thus some adjustments are needed. I will share it in upcoming posts.
> >
> > Here is a patch set for PG13. Apart from PG14-17, the patch could be 
> > created as-is,
> > because...
> >
> > 1. WAL record for invalidation messages (XLOG_XACT_INVALIDATIONS) does not 
> > exist.
> > 2. Thus the ReorderBufferChange for the invalidation does not exist.
> >Our patch tries to distribute it but cannot be done as-is.
> > 3. Codes assumed that invalidation messages can be added only once.
> > 4. The timing when invalidation messages are consumed is limited:
> >   a. COMMAND_ID change is popped,
> >   b. start of decoding a transaction, or
> >   c. end of decoding a transaction.
> >
> > Above means that invalidations cannot be executed while being decoded.
> > I created two patch sets to resolve the data loss issue. 0001 has less code
> > changes but could resolve a part of issue, 0002 has huge changes but 
> > provides a
> > complete solution.
> >
> > 0001 - mostly same as patches for other versions. 
> > ReorderBufferAddInvalidations()
> >was adjusted to allow being called several times. As I said above,
> >0001 cannot execute inval messages while decoding the transactions.
> > 0002 - introduces new ReorderBufferChange type to indicate inval messages.
> >It would be handled like PG14+.
> >
> > Here is an example. Assuming that the table foo exists on both nodes, a
> > publication "pub" which publishes all tables, and a subscription "sub" which
> > subscribes "pub". What if the workload is executed?
> >
> > ```
> > S1  S2
> > BEGIN;
> > INSERT INTO foo VALUES (1)
> > ALTER PUBLICATION pub RENAME TO pub_renamed;
> > INSERT INTO foo VALUES (2)
> > COMMIT;
> > LR -> ?
> > ```
> >
> > With 0001, tuples (1) and (2) would be replicated to the subscriber.
> > An error "publication "pub" does not exist" would be raised when new
> > changes are made later.
> >
> > 0001+0002 works more aggressively; the error would be raised when the S1
> > transaction is decoded.
> > The behavior is same as for patched PG14-PG17.
> >
>
> I understand that with 0001 the fix is partial in the sense that
> because invalidations are processed at the transaction end, the
> changes of concurrent DDL will only be reflected for the next
> transaction. Now, on one hand, it is prudent to not add a new type of
> ReorderBufferChange in the backbranch (v13) but the change is not that
> invasive, so we can go with it as well. My preference would be to go
> with just 0001 for v13 to minimize the risk of adding new bugs or
> breaking something unintentionally.
>
> Sawada-San, and others involved here, do you have any suggestions on
> this matter?

Sorry for the late response.

I agree with just 0001 for v13 as 0002 seems invasive. Given that v13
would have only a few releases until EOL and 0001 can deal with some
cases in question, I'd like to avoid such invasive changes in v13. It
would not be advisable to change the ReorderBufferChange format in a
minor release even though it would not change the struct size.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Pavel Borisov
Hi, Tom!

On Tue, 22 Apr 2025 at 21:13, Tom Lane  wrote:
>
> Pavel Borisov  writes:
> > On Tue, 22 Apr 2025 at 20:22, Alexander Korotkov  
> > wrote:
> >> On Tue, Apr 22, 2025 at 7:20 PM Tom Lane  wrote:
> >>> Alexander Korotkov  writes:
>  I'd like to add that float4.out not only assumes that insert-ordering is
>  preserved (this could be more-or-less portable between table AMs).  It 
>  also
>  assumes the way UPDATE moves updated rows.  That seems quite
>  heap-specific.  You can see in the following fragment, updated rows jump 
>  to
>  the bottom.
>
> >>> I'd be willing to consider a policy that we don't want to depend on
> >>> exactly where UPDATE moves rows to.  The proposed patch is not that,
> >>> however.
>
> >> OK, that makes sense for me.
>
> > Thanks for this input!
> > This was my first intention: to fix only the test that was affected by
> > UPDATE-order specifics, which broke when running on an extension AM.
> > Maybe I was too liberal and added more ORDER BYs than needed.
> > I definitely agree with the proposal.
>
> On further reflection, this policy makes plenty of sense because even
> heapam is not all that stable about row order after UPDATE.  With
> tables large enough to draw the attention of autovacuum, we've had
> to use ORDER BY or other techniques to stabilize the results, because
> the timing of background autovacuums affected where rows got put.
> These tests are different only because the tables are small enough
> and the updates few enough that autovacuum doesn't get involved.
> I think it's reasonable that at a project-policy level we should
> not consider that an excuse.  For example, a change in autovacuum's
> rules could probably break these tests even with heapam.
>
> On the other hand, as I mentioned earlier, a table AM that doesn't
> reproduce original insertion order would break a whole lot more
> tests than these.  So that's a bridge I'd rather not cross,
> at least not without a lot of consideration.
>
> BTW, you forgot to update expected/float4-misrounded-input.out.
Added in v3. Thanks for a mention!

Regards,
Pavel Borisov


v3-0001-Fortify-float4-and-float8-regression-tests-agains.patch
Description: Binary data


Re: What's our minimum supported Python version?

2025-04-22 Thread Tom Lane
Jacob Champion  writes:
> As for picking a version... 3.6 will have been EOL for almost three
> years by the time 18 releases. It seems like we would drop it happily,
> were it not for RHEL8.

Agreed, but RHEL8 is out there and I don't think we can just drop
support for it.  I'm also not excited by the idea that an incidental
test script gets to dictate what the cutoff is.

I do think we should stop claiming that Python 3.2 will work.
(Maybe it does, but we don't know that.)  I see that the configure
script only checks for Python >= 3, and it doesn't look like the
meson scripts check explicitly at all, although there's a comment
saying that our meson version cutoff is intended to allow working
with Python 3.5.

Maybe it's sufficient to make a documentation change here, and
say we support Python >= 3.5?  I'd be okay with saying 3.6.8
too, on the grounds that if anything older fails to work we'd
almost certainly just say "too bad".  But RHEL8 is widespread
enough that I think we need to keep making the effort for 3.6.8.

regards, tom lane




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Alexander Korotkov
On Tue, Apr 22, 2025 at 5:57 PM Pavel Borisov 
wrote:
> On Tue, 22 Apr 2025 at 18:40, Pavel Borisov  wrote:
> > On Tue, 22 Apr 2025 at 18:23, Tom Lane  wrote:
> > >
> > > Pavel Borisov  writes:
> > > > It's not a big problem, but I propose a simple fix for the tests. It
> > > > just adds ORDER BY 1 to all relevant float4 and float8 queries.
> > >
> > > You do realize that this problem is hardly confined to the
> > > float4 and float8 tests?  To a first approximation, a table AM
> > > that fails to preserve insertion order would break every single
> > > one of our regression tests.  (I speak from experience, as
> > > Salesforce had to deal with this when I was there...)
> > >
> > > The reason why we don't simply add ORDER BY to everything is
> > > explained at [1]:
> > >
> > > You might wonder why we don't order all the regression test
> > > queries explicitly to get rid of this issue once and for all.  The
> > > reason is that that would make the regression tests less useful,
> > > not more, since they'd tend to exercise query plan types that
> > > produce ordered results to the exclusion of those that don't.
> > >
> > > At some point we will probably have to think through what we
> > > want to do about running the regression tests with table AMs that
> > > don't wish to preserve ordering as much as heapam does.  It's going
> > > to need careful balancing of multiple concerns, and I don't think
> > > "blindly slap ORDER BY everywhere" is going to be an acceptable
> > > answer.
> > >
> >
> > I agree we might need to elaborate different AM support in regression tests.
> > Also, I agree that these are not the only tests to be fixed.
> >
> > What I hardly agree with is the assumption that it's right for regression
> > tests to require a fixed result ordering for such simple cases.
> > Maybe for more complicated plans it's useful, but for the float tests
> > mentioned in the patch this requirement is a total loss IMO.
> >
> > I understand your sentiment against blindly slapping order by's, but I
> > don't see a useful alternative for this time. Also a large number of
> > tests in PG were already fortified with ORDER BY 1.
>
> I forgot to mention that it's not only a question of INSERT ordering.
> An extension AM can have some internal ordering, e.g. index-organized
> tables. And besides the fact that SELECT results following the internal
> AM ordering are simply allowed, more importantly they are good
> performance-wise and justify table AM extensibility.

+1,
I'd like to add that float4.out not only assumes that insert-ordering is
preserved (this could be more-or-less portable between table AMs).  It also
assumes the way UPDATE moves updated rows.  That seems quite
heap-specific.  You can see in the following fragment, updated rows jump to
the bottom.

-- test the unary float4abs operator
SELECT f.f1, @f.f1 AS abs_f1 FROM FLOAT4_TBL f;
  f1   |abs_f1
---+---
 0 | 0
1004.3 |1004.3
-34.84 | 34.84
 1.2345679e+20 | 1.2345679e+20
 1.2345679e-20 | 1.2345679e-20
(5 rows)

UPDATE FLOAT4_TBL
   SET f1 = FLOAT4_TBL.f1 * '-1'
   WHERE FLOAT4_TBL.f1 > '0.0';
SELECT * FROM FLOAT4_TBL;
   f1

  0
 -34.84
-1004.3
 -1.2345679e+20
 -1.2345679e-20
(5 rows)

--
Regards,
Alexander Korotkov
Supabase


Re: What's our minimum supported Python version?

2025-04-22 Thread Tom Lane
Jacob Champion  writes:
> On Tue, Apr 22, 2025 at 9:04 AM Tom Lane  wrote:
>> But RHEL8 is widespread
>> enough that I think we need to keep making the effort for 3.6.8.

> Even if we can get side-by-side versions working?

That would be off-loading our problem onto the users (and the
buildfarm owners).  If we really had little other choice,
we could go there; but it doesn't sound like this is that
hard to work around.

regards, tom lane




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Pavel Borisov
Hi!

On Tue, 22 Apr 2025 at 20:22, Alexander Korotkov  wrote:
>
> On Tue, Apr 22, 2025 at 7:20 PM Tom Lane  wrote:
> > Alexander Korotkov  writes:
> > > I'd like to add that float4.out not only assumes that insert-ordering is
> > > preserved (this could be more-or-less portable between table AMs).  It 
> > > also
> > > assumes the way UPDATE moves updated rows.  That seems quite
> > > heap-specific.  You can see in the following fragment, updated rows jump 
> > > to
> > > the bottom.
> >
> > I'd be willing to consider a policy that we don't want to depend on
> > exactly where UPDATE moves rows to.  The proposed patch is not that,
> > however.
>
> OK, that makes sense for me.

Thanks for this input!
This was my first intention: to fix only the test that was affected by
UPDATE-order specifics, which broke when running on an extension AM.
Maybe I was too liberal and added more ORDER BYs than needed.
I definitely agree with the proposal.

Please find attached v2 that fixes only UPDATE-specific part of
float4/float8 test.


v2-0001-Fortify-float4-and-float8-regression-tests-agains.patch
Description: Binary data


Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Pavel Borisov
Hi, hackers!

I noticed that SELECT results in float4 and float8 tests lack ORDER BY
clauses. This makes test results depend on the current heap/MVCC
implementation.

If I try to run the float8 test on a table created with a different
access method provided by an extension, I'm getting results ordered
differently.

It's not a big problem, but I propose a simple fix for the tests. It
just adds ORDER BY 1 to all relevant float4 and float8 queries. I don't
have a strong opinion about backpatching this, but as the patch
changes only regression tests, it's maybe also worth backpatching.

Regards,
Pavel Borisov,
Supabase.


v1-0001-Fortify-float4-and-float8-tests-by-ordering-test-.patch
Description: Binary data


Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Pavel Borisov
Hi, Tom!

On Tue, 22 Apr 2025 at 18:23, Tom Lane  wrote:
>
> Pavel Borisov  writes:
> > It's not a big problem, but I propose a simple fix for the tests. It
> > just adds ORDER BY 1 to all relevant float4 and float8 queries.
>
> You do realize that this problem is hardly confined to the
> float4 and float8 tests?  To a first approximation, a table AM
> that fails to preserve insertion order would break every single
> one of our regression tests.  (I speak from experience, as
> Salesforce had to deal with this when I was there...)
>
> The reason why we don't simply add ORDER BY to everything is
> explained at [1]:
>
> You might wonder why we don't order all the regression test
> queries explicitly to get rid of this issue once and for all.  The
> reason is that that would make the regression tests less useful,
> not more, since they'd tend to exercise query plan types that
> produce ordered results to the exclusion of those that don't.
>
> At some point we will probably have to think through what we
> want to do about running the regression tests with table AMs that
> don't wish to preserve ordering as much as heapam does.  It's going
> to need careful balancing of multiple concerns, and I don't think
> "blindly slap ORDER BY everywhere" is going to be an acceptable
> answer.
>

I agree we might need to elaborate different AM support in regression tests.
Also, I agree that these are not the only tests to be fixed.

What I hardly agree with is the assumption that it's right for regression
tests to require a fixed result ordering for such simple cases.
Maybe for more complicated plans it's useful, but for the float tests
mentioned in the patch this requirement is a total loss IMO.

I understand your sentiment against blindly slapping order by's, but I
don't see a useful alternative for this time. Also a large number of
tests in PG were already fortified with ORDER BY 1.

Regards,
Pavel Borisov
Supabase




Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details

2025-04-22 Thread Tom Lane
"Robin Haberkorn"  writes:
> I forgot the patch of course...

Hi Robin, thank you for the patch!  This has arrived too late to
be considered for PG v18, but please add the thread to the open
commitfest

https://commitfest.postgresql.org/53/

so that we don't forget about it for v19, and so that it will
receive automated testing from our cfbot.

Just from a quick eyeballing of the patch, it seems generally
sane, but I wonder why you have xml_generic_error_handler
creating and then destroying an errorBuf instead of just
appending the results directly to xmlerrcxt->err_buf.  The
extra buffer doesn't seem to be doing much except increasing
our odds of an OOM failure right there.
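For illustration, a rough sketch of the "append directly" version (assuming the
handler receives the PgXmlErrorContext as its generic-error context, as the
patch's xsltSetGenericErrorFunc() call suggests; error flagging omitted):

static void
xml_generic_error_handler(void *data, const char *msg,...)
{
    PgXmlErrorContext *xmlerrcxt = (PgXmlErrorContext *) data;

    /* format the message straight into err_buf, growing it as needed */
    for (;;)
    {
        va_list     args;
        int         needed;

        va_start(args, msg);
        needed = appendStringInfoVA(&xmlerrcxt->err_buf, msg, args);
        va_end(args);

        if (needed == 0)
            break;
        enlargeStringInfo(&xmlerrcxt->err_buf, needed);
    }
}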

I'm also a little suspicious of

-   xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_LEGACY);
+   xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_ALL);

as that looks like it is probably changing the behavior of
the module in ways bigger than just providing more error
details.  This code has been in legacy status for so long
that I feel a little hesitant about doing that.  If we are
willing to do it, should we go further and clean out the use
of PG_XML_STRICTNESS_LEGACY in xml2/xpath.c as well?

regards, tom lane




Re: What's our minimum supported Python version?

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 8:05 AM Tom Lane  wrote:
> The first problem is that this Python version seems not to
> like assignments embedded in if statements: [...]
>
> I was able to work around that with the attached quick hack.
> But then I get [...]

Thanks, I'll put a patch together. Sorry -- IIRC, both of those
features are probably too new for me to have used, regardless of what
we decide is the minimum version for 18.
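(For the dict-subscripting failure, the fix will presumably be to spell the
alias with typing.Dict, which is available since Python 3.5 -- a quick sketch:)

from typing import Dict

JsonObject = Dict[str, object]  # dict[str, object] needs Python >= 3.9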

As for picking a version... 3.6 will have been EOL for almost three
years by the time 18 releases. It seems like we would drop it happily,
were it not for RHEL8. But RHEL also supports side-by-side Python,
right? In general, I'd be more than willing to try plumbing that
through the new tests, if it lets us drop end-of-life versions for new
code.

(I don't know if we'll need to drop 3.6 for PG18, specifically, but I
also think we shouldn't plan to support 3.6 for the full RHEL8
lifetime.)

Thanks,
--Jacob




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Alexander Korotkov
On Tue, Apr 22, 2025 at 7:20 PM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > I'd like to add that float4.out not only assumes that insert-ordering is
> > preserved (this could be more-or-less portable between table AMs).  It also
> > assumes the way UPDATE moves updated rows.  That seems quite
> > heap-specific.  You can see in the following fragment, updated rows jump to
> > the bottom.
>
> I'd be willing to consider a policy that we don't want to depend on
> exactly where UPDATE moves rows to.  The proposed patch is not that,
> however.

OK, that makes sense for me.

--
Regards,
Alexander Korotkov
Supabase




Re: ZStandard (with dictionaries) compression support for TOAST compression

2025-04-22 Thread Andres Freund
Hi,

On 2025-04-18 12:22:18 -0400, Robert Haas wrote:
> On Tue, Apr 15, 2025 at 2:13 PM Nikhil Kumar Veldanda
>  wrote:
> > Addressing Compressed Datum Leaks problem (via CTAS, INSERT INTO ... SELECT 
> > ...)
> >
> > As compressed datums can be copied to other unrelated tables via CTAS,
> > INSERT INTO ... SELECT, or CREATE TABLE ... EXECUTE, I’ve introduced a
> > method inheritZstdDictionaryDependencies. This method is invoked at
> > the end of such statements and ensures that any dictionary
> > dependencies from source tables are copied to the destination table.
> > We determine the set of source tables using the relationOids field in
> > PlannedStmt.
> 
> With the disclaimer that I haven't opened the patch or thought
> terribly deeply about this issue, at least not yet, my fairly strong
> suspicion is that this design is not going to work out, for multiple
> reasons. In no particular order:
> 
> 1. I don't think users will like it if dependencies on a zstd
> dictionary spread like kudzu across all of their tables. I don't think
> they'd like it even if it were 100% accurate, but presumably this is
> going to add dependencies any time there MIGHT be a real dependency
> rather than only when there actually is one.
> 
> 2. Inserting into a table or updating it only takes RowExclusiveLock,
> which is not even self-exclusive. I doubt that it's possible to change
> system catalogs in a concurrency-safe way with such a weak lock. For
> instance, if two sessions tried to do the same thing in concurrent
> transactions, they could both try to add the same dependency at the
> same time.
> 
> 3. I'm not sure that CTAS, INSERT INTO...SELECT, and CREATE
> TABLE...EXECUTE are the only ways that datums can creep from one table
> into another. For example, what if I create a plpgsql function that
> gets a value from one table and stores it in a variable, and then use
> that variable to drive an INSERT into another table? I seem to recall
> there are complex cases involving records and range types and arrays,
> too, where the compressed object gets wrapped inside of another
> object; though maybe that wouldn't matter to your implementation if
> INSERT INTO ... SELECT uses a sufficiently aggressive strategy for
> adding dependencies.

+1 to all of these.


> I think we could add plain-old zstd compression without really
> tackling this issue

+1


> I'm now also curious to know whether Andres would agree that it's bad
> if zstd dictionaries are un-droppable. After all, I thought it would
> be bad if there was no way to eliminate a dependency on a compression
> method, and he disagreed.

I still am not too worried about that aspect. However:


> So maybe he would also think undroppable dictionaries are fine.

I'm much less sanguine about this. Imagine a schema based multi-tenancy setup,
where tenants come and go, and where a few of the tables use custom
dictionaries. Whereas not being able to get rid of lz4 at all has basically no
cost whatsoever, collecting more and more unusable dictionaries can imply a
fair amount of space usage after a while. I don't see any argument why that
would be ok, really.


> But maybe not. It seems even worse to me than undroppable compression
> methods, because you'll probably not have that many compression methods
> ever, but you could have a large number of dictionaries eventually.

Agreed on the latter.

Greetings,

Andres Freund




Re: index prefetching

2025-04-22 Thread Peter Geoghegan
On Tue, Apr 22, 2025 at 6:46 AM Tomas Vondra  wrote:
> here's an improved (rebased + updated) version of the patch series, with
> some significant fixes and changes. The patch adds infrastructure and
> modifies btree indexes to do prefetching - and AFAIK it passes all tests
> (no results, correct results).

Cool!

> Compared to the last patch version [1] shared on list (in November),
> there's a number of significant design changes - a lot of this is based
> on a number of off-list discussions I had with Peter Geoghegan, which
> was very helpful.

Thanks for being so receptive to my feedback. I know that I wasn't
particularly clear. I mostly only gave you my hand-wavy, caveat-laden
ideas about how best to layer things. But you were willing to give
them full and fair consideration.

> 1) patch now relies on read_stream

> So I squashed these two parts, and the patch now does read_stream (for
> the table reads) from the beginning.

Make sense.

> Based on the discussions with Peter I decided to make this a bit more
> ambitious, moving the whole batch management from the index AM to the
> indexam.c level. So now there are two callbacks - amgetbatch and
> amfreebatch, and it's up to indexam.c to manage the batches - decide how
> many batches to allow, etc. The index AM is responsible merely for
> loading the next batch, but does not decide when to load or free a
> batch, how many to keep in memory, etc.
>
> There's a section in indexam.c with a more detailed description of the
> design, I'm not going to explain all the design details here.

To me, the really important point about this high-level design is that
it provides a great deal of flexibility around reordering work, while
still preserving the appearance of an index scan that performs work in
the same old fixed order. All relevant kinds of work (whether table AM
and index AM related work) are under the direct control of one single
module. There's one central place for a mechanism that weighs both
costs and benefits, keeping things in balance.

(I realize that there's still some sense in which that isn't true,
partly due to the read stream interface, but for now the important
thing is that we're agreed on this high level direction.)

> I think the indexam.c is a sensible layer for this. I was hoping doing
> this at the "executor level" would mean no need for AM code changes, but
> that turned out not possible - the AM clearly needs to know about the
> batch boundaries, so that it can e.g. do killtuples, etc. That's why we
> need the two callbacks (not just the "amgetbatch" one). At least this
> way it's "hidden" by the indexam.c API, like index_getnext_slot().

Right. But (if I'm not mistaken) the index AM doesn't actually need to
know *when* to do killtuples. It still needs to have some handling for
this, since we're actually modifying index pages, and we need to have
handling for certain special cases (e.g., posting list tuples) on the
scan side. But it can be made to work in a way that isn't rigidly tied
to the progress of the scan -- it's perfectly fine to do this work
somewhat out of order, if that happens to make sense. It doesn't have
to happen in perfect lockstep with the scan, right after the items
from the relevant leaf page have all been returned.

It should also eventually be possible to do things like perform
killtuples in a different process (perhaps even thread?) to the one
that originally read the corresponding leaf page items. That's the
kind of long term goal to keep in mind, I feel.

> (You could argue indexam.c is "executor" and maybe it is - I don't know
> where exactly to draw the line. I don't think it matters, really. The
> "hidden in indexam API" is the important bit.)

The term that I've used is "index scan manager", since it subsumes
some of the responsibilities related to scheduling work that has
traditionally been under the control of index AMs. I'm not attached to
that name, but we should agree upon some name for this new concept. It
is a new layer, above the index AM but below the executor proper, and
so it feels like it needs to be clearly distinguished from the two
adjoining layers.

> Or maybe we should
> even rip out the amgettuple() entirely, and only support one of those
> for each AM? That's what Peter suggested, but I'm not convinced we
> should do that.

Just to be clear, for other people reading along: I never said that we
should fully remove amgettuple as an interface. What I said was that I
think that we should remove btgettuple(), and any other amgettuple
routine within index AMs that switch over to using the new interface.

I'm not religious about removing amgettuple() from index AMs that also
support the new batch interface. It's probably useful to keep around
for now, for debugging purposes. My point was only this: I know of no
good reason to keep around btgettuple in the first committed version
of the patch. So if you're going to keep it around, you should surely
have at least one explicit reason for doing so.

Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Tom Lane
Alexander Korotkov  writes:
> I'd like to add that float4.out not only assumes that insert-ordering is
> preserved (this could be more-or-less portable between table AMs).  It also
> assumes the way UPDATE moves updated rows.  That seems quite
> heap-specific.  You can see in the following fragment, updated rows jump to
> the bottom.

I'd be willing to consider a policy that we don't want to depend on
exactly where UPDATE moves rows to.  The proposed patch is not that,
however.

regards, tom lane




Re: What's our minimum supported Python version?

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 9:04 AM Tom Lane  wrote:
> I'm also not excited by the idea that an incidental
> test script gets to dictate what the cutoff is.

I agree, and I don't intend for the script to dictate that.

> Maybe it's sufficient to make a documentation change here, and
> say we support Python >= 3.5?  I'd be okay with saying 3.6.8
> too, on the grounds that if anything older fails to work we'd
> almost certainly just say "too bad".

I'm definitely on board with pulling the minimum up, as high as we
think we can get away with. If you're comfortable with 3.6(.8), I'd
say let's start there.

> But RHEL8 is widespread
> enough that I think we need to keep making the effort for 3.6.8.

Even if we can get side-by-side versions working? RHEL8 has a bunch of
newer Python 3 versions available, according to the docs. And
virtualenvs can do a lot of lifting for us in practice.
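For example, something along these lines should work (not verified here, and
the exact package names depend on the RHEL8 minor release):

dnf install python3.11
./configure PYTHON=/usr/bin/python3.11 ...        # autoconf build
meson setup build -DPYTHON=/usr/bin/python3.11    # meson build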

Thanks,
--Jacob




Re: High CPU consumption in cascade replication with large number of walsenders and ConditionVariable broadcast issues

2025-04-22 Thread Alexey Makhmutov
After playing with the patch a little more, we’ve come to the 
conclusion that the idea of signaling the CV broadcast from the timer 
handler is a risky one, as it creates a large number of execution paths 
which are not present in the current code base. It's hard to prove the 
correctness of application behavior in each such case, so we've decided 
to use a different approach.


In the new version of the patch we use the timer handler only to set a 
flag, which is then checked in the ProcessStartupProcInterrupts 
function. This allows us to send the signal on timeout if the startup 
process is waiting for the arrival of new WAL records (in ReadRecord), 
but the notification may be delayed while a record is being applied 
(during redo handler invocation from ApplyWalRecord). This could 
increase the delay for some corner cases with non-trivial WAL records 
like ‘drop database’, but this should be a rare case, and the walsender 
process has its own limit on the wait time, so the delay won’t be 
indefinite even in this case.
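The shape of the change is roughly the following (illustrative names only,
see the attached patch for the actual code):

static volatile sig_atomic_t notify_walsenders_pending = false;

/* timeout handler: only set a flag, do not touch CVs here */
static void
WalNotifyTimeoutHandler(void)
{
    notify_walsenders_pending = true;
}

/* ... and later, in ProcessStartupProcInterrupts(): */
if (notify_walsenders_pending)
{
    notify_walsenders_pending = false;
    WalSndWakeup(true, true);   /* wake physical and logical walsenders */
}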


A new variant of the patch is attached in case anybody else wants to 
play with this patch and approach. A slightly improved test case and 
formatted CV patch (which is not strictly required anymore for this 
case) are attached as well.


Thanks,
Alexey

From b6749b210cceabcc0c382935ec412182b047fbb4 Mon Sep 17 00:00:00 2001
From: Alexey Makhmutov 
Date: Fri, 28 Mar 2025 16:16:36 +0300
Subject: [PATCH 1/2] Ensure that content of the wait list on CV is aligned
 with cv_sleep_target value

Wait on the ConditionVariable could be interleaved with interruption handlers,
such as timers. If such handler uses CV calls (i.e. ConditionVariableBroadcast),
then value of the cv_sleep_target could be null or point to a different CV after
wakeup. In this case we should not try to add ourselves to the wakeup wait list,
as subsequent call to ConditionVariableCancelSleep won't be able to find our CV
to clear its list. We should instead return control to the client, so the exit
condition for the CV external wait loop could be checked. If wait is still
required, then next invocation of ConditionVariableTimedSleep will perform the
required setup procedures for CV. Behavior before this change results in the
mismatch between cv_sleep_target value and content of the wakeup list, so the
next invocation of ConditionVariableBroadcast method may fail due to presence of
current process in the list.
---
 src/backend/storage/lmgr/condition_variable.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 228303e77f7..ab4ad7e5641 100644
--- a/src/backend/storage/lmgr/condition_variable.c
+++ b/src/backend/storage/lmgr/condition_variable.c
@@ -165,6 +165,17 @@ ConditionVariableTimedSleep(ConditionVariable *cv, long timeout,
 		/* Reset latch before examining the state of the wait list. */
 		ResetLatch(MyLatch);
 
+		/*
+		 * First, check that we are still expected to wait on this variable.
+		 * If sleep target has changed, then we need to return spuriously to
+		 * allow caller to check the exit condition and invoke the wait
+		 * function again to properly prepare for the next sleep. Note, that
+		 * we don't need to change wait list in this case, as we should be
+		 * deleted from the list in case of cv_sleep_target change.
+		 */
+		if (cv != cv_sleep_target)
+			return false;
+
 		/*
 		 * If this process has been taken out of the wait list, then we know
 		 * that it has been signaled by ConditionVariableSignal (or
-- 
2.49.0

From 6eaeb7a7fbb40fc3a8fb04a2ea22df9a54747a0a Mon Sep 17 00:00:00 2001
From: Rustam Khamidullin 
Date: Fri, 14 Mar 2025 18:18:34 +0700
Subject: [PATCH 2/2] Implement batching for WAL records notification during
 cascade replication

Currently standby server notifies walsenders after applying of each WAL
record during cascade replication. This creates a bottleneck in case of large
number of sender processes during WalSndWakeup invocation. This change
introduces batching for such notifications, which are now sent either after
certain number of applied records or specified time interval (whichever comes
first).

Co-authored-by: Alexey Makhmutov 
---
 src/backend/access/transam/xlogrecovery.c | 99 ++-
 src/backend/postmaster/startup.c  |  5 ++
 src/backend/utils/misc/guc_tables.c   | 22 +
 src/include/access/xlogrecovery.h |  8 ++
 src/include/utils/timeout.h   |  1 +
 5 files changed, 133 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6ce979f2d8b..43e95602223 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -64,6 +64,7 @@
 #include "utils/pg_lsn.h"
 #include "utils/ps_status.h"
 #include "utils/pg_rusage.h"
+#include "utils/timeout.h"
 
 /* Unsupported old recovery command file names (relative to $PGDATA) */
 #define

[PATCH] contrib/xml2: xslt_process() should report XSLT-related error details

2025-04-22 Thread Robin Haberkorn
Dear PG hackers,

here is a patch that improves the xslt_process() error handling. Error
handling was one of the points criticized earlier [1] and as far as I
understand it is one of the reasons that xslt_process() was never adopted
into core and contrib/xml2 has been considered deprecated since PG 8. However, the
point is also still on the TODO list ("Report errors returned by the XSLT
library") and xslt_process() has been asked for several times on the mailing
list. Memory handling bugs that were reported earlier [2] have obviously
long since been fixed.

The point raised in [1] was that xslt_process() would not report missing
stylesheet parameters at all. This is no longer the case at the current HEAD.
However, xslt_process() still doesn't report any details about errors
related to XSLT processing. This is what is fixed by this patch. It now
installs an XSLT error handler. See the patch comment for details. Since the
goal is to eventually get xslt_process() adopted into core, I also got rid of
PG_XML_STRICTNESS_LEGACY.

Perhaps you can tell me what else is preventing adoption into core. I believe
that xslt_process() should also accept the `xml` type as an alternative to
strings. Strings should be kept for backwards compatibility, though. Also,
the stylesheet parameter passing is very crude and limited. I suggest that
the current method of passing them as strings (see parse_params()) be
deprecated and that we don't try to tackle its limitations, but instead
accept hstores. If you agree, I will work on these two tasks next.

Yours sincerely,
Robin Haberkorn

[1]: https://www.postgresql.org/message-id/4C5ECEF9.3030806%40mlfowler.com
[2]: https://www.postgresql.org/message-id/flat/4A6A276A.6090405%40dunslane.net

-- 
Robin Haberkorn
Senior Software Engineer

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537




Re: Vacuum statistics

2025-04-22 Thread Andrei Lepikhov

On 10/28/24 14:40, Alexander Korotkov wrote:

On Sun, Aug 25, 2024 at 6:59 PM Alena Rybakina

If I missed something or misunderstood, can you explain in more detail?


Actually, I mean why do we need a possibility to return statistics for
all tables/indexes in one function call?  User anyway is supposed to
use pg_stat_vacuum_indexes/pg_stat_vacuum_tables view, which do
function calls one per relation.  I suppose we can get rid of
possibility to get all the objects in one function call and just
return a tuple from the functions like other pgstatfuncs.c functions
do.
I suppose it was designed this way because databases may contain 
thousands of tables and indexes - remember partitions, at least. But it 
may be okay to use the SRF_FIRSTCALL_INIT / SRF_RETURN_NEXT API. I think 
that by registering a prosupport routine predicting the cost and row 
count of these calls, we may let the planner build adequate plans for 
queries involving those stats - people will definitely join them with 
something else in the database.
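For illustration, a bare-bones skeleton of that API (hypothetical function
name; the actual stats collection is left out):

#include "postgres.h"
#include "access/htup_details.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(pg_stat_vacuum_tables_sketch);

Datum
pg_stat_vacuum_tables_sketch(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;
        TupleDesc   tupdesc;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
            elog(ERROR, "return type must be a row type");
        funcctx->tuple_desc = BlessTupleDesc(tupdesc);

        /* collect per-relation stats here and stash them in user_fctx */
        funcctx->max_calls = 0;     /* number of relations collected */

        MemoryContextSwitchTo(oldcontext);
    }

    funcctx = SRF_PERCALL_SETUP();

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        Datum       values[3] = {0};
        bool        nulls[3] = {true, true, true};
        HeapTuple   tuple;

        /* fill values[]/nulls[] from the stashed state for this call */
        tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
        SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
    }

    SRF_RETURN_DONE(funcctx);
}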


--
regards, Andrei Lepikhov





Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Tom Lane
Pavel Borisov  writes:
> On Tue, 22 Apr 2025 at 21:13, Tom Lane  wrote:
>> BTW, you forgot to update expected/float4-misrounded-input.out.

> Added in v3. Thanks for a mention!

v3 LGTM, pushed.

regards, tom lane




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Pavel Borisov
On Tue, 22 Apr 2025 at 22:25, Tom Lane  wrote:
>
> Pavel Borisov  writes:
> > On Tue, 22 Apr 2025 at 21:13, Tom Lane  wrote:
> >> BTW, you forgot to update expected/float4-misrounded-input.out.
>
> > Added in v3. Thanks for a mention!
>
> v3 LGTM, pushed.
Thank you!

Regards,
Pavel




Re: index prefetching

2025-04-22 Thread Tomas Vondra



On 4/22/25 18:26, Peter Geoghegan wrote:
> On Tue, Apr 22, 2025 at 6:46 AM Tomas Vondra  wrote:
>> here's an improved (rebased + updated) version of the patch series, with
>> some significant fixes and changes. The patch adds infrastructure and
>> modifies btree indexes to do prefetching - and AFAIK it passes all tests
>> (no results, correct results).
> 
> Cool!
> 
>> Compared to the last patch version [1] shared on list (in November),
>> there's a number of significant design changes - a lot of this is based
>> on a number of off-list discussions I had with Peter Geoghegan, which
>> was very helpful.
> 
> Thanks for being so receptive to my feedback. I know that I wasn't
> particularly clear. I mostly only gave you my hand-wavy, caveat-laden
> ideas about how best to layer things. But you were willing to give
> them full and fair consideration.
> 
>> 1) patch now relies on read_stream
> 
>> So I squashed these two parts, and the patch now does read_stream (for
>> the table reads) from the beginning.
> 
> Make sense.
> 
>> Based on the discussions with Peter I decided to make this a bit more
>> ambitious, moving the whole batch management from the index AM to the
>> indexam.c level. So now there are two callbacks - amgetbatch and
>> amfreebatch, and it's up to indexam.c to manage the batches - decide how
>> many batches to allow, etc. The index AM is responsible merely for
>> loading the next batch, but does not decide when to load or free a
>> batch, how many to keep in memory, etc.
>>
>> There's a section in indexam.c with a more detailed description of the
>> design, I'm not going to explain all the design details here.
> 
> To me, the really important point about this high-level design is that
> it provides a great deal of flexibility around reordering work, while
> still preserving the appearance of an index scan that performs work in
> the same old fixed order. All relevant kinds of work (whether table AM
> and index AM related work) are under the direct control of one single
> module. There's one central place for a mechanism that weighs both
> costs and benefits, keeping things in balance.
> 
> (I realize that there's still some sense in which that isn't true,
> partly due to the read stream interface, but for now the important
> thing is that we're agreed on this high level direction.)
> 

Yeah, that makes sense, although I've been thinking about this a bit
differently. I haven't been trying to establish a new "component" to
manage prefetching. For me the question was what's the right layer, so
that unnecessary details don't leak into AM and/or executor.

The AM could issue fadvise prefetches, or perhaps even feed blocks into
a read_stream, but it doesn't seem like the right place to ever make more
decisions. OTOH we don't want every place in the executor to reimplement
the prefetching, and indexam.c seems like a good place in between.

It requires exchanging some additional details with the AM, provided by
the new callbacks.

It seems indexam.c achieves both your goals and mine, more or less.

>> I think the indexam.c is a sensible layer for this. I was hoping doing
>> this at the "executor level" would mean no need for AM code changes, but
>> that turned out not possible - the AM clearly needs to know about the
>> batch boundaries, so that it can e.g. do killtuples, etc. That's why we
>> need the two callbacks (not just the "amgetbatch" one). At least this
>> way it's "hidden" by the indexam.c API, like index_getnext_slot().
> 
> Right. But (if I'm not mistaken) the index AM doesn't actually need to
> know *when* to do killtuples. It still needs to have some handling for
> this, since we're actually modifying index pages, and we need to have
> handling for certain special cases (e.g., posting list tuples) on the
> scan side. But it can be made to work in a way that isn't rigidly tied
> to the progress of the scan -- it's perfectly fine to do this work
> somewhat out of order, if that happens to make sense. It doesn't have
> to happen in perfect lockstep with the scan, right after the items
> from the relevant leaf page have all been returned.
> 
> It should also eventually be possible to do things like perform
> killtuples in a different process (perhaps even thread?) to the one
> that originally read the corresponding leaf page items. That's the
> kind of long term goal to keep in mind, I feel.
> 

Right. amfreebatch() does not mean the batch needs to be freed
immediately; it's just handed back to the AM, and it's up to the AM
to do the necessary cleanup at some point. It might queue it for later,
or perhaps even do that in a separate thread ...

>> (You could argue indexam.c is "executor" and maybe it is - I don't know
>> where exactly to draw the line. I don't think it matters, really. The
>> "hidden in indexam API" is the important bit.)
> 
> The term that I've used is "index scan manager", since it subsumes
> some of the responsibilities related to scheduling work that has
> 

Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2025-04-22 Thread Peter Smith
Hi Ajin.

Here are some v17-0003 review comments (aka some v16-0003 comments
that were not yet addressed or rejected)

On Fri, Apr 11, 2025 at 4:05 PM Peter Smith  wrote:
...
> ==
> Commit message
>
> 2. missing commit message

Not yet addressed.

> ~~~
>
> 8.
> # Insert, delete, and update tests for restricted publication tables
> $log_location = -s $node_publisher->logfile;
> $node_publisher->safe_psql('postgres', "INSERT INTO insert_only_table
> VALUES (1, 'to be inserted')");
> $node_publisher->safe_psql('postgres', "UPDATE insert_only_table SET
> data = 'updated' WHERE id = 1");
> $logfile = slurp_file($node_publisher->logfile, $log_location);
> ok($logfile =~ qr/Filtering UPDATE/,
> 'unpublished UPDATE is filtered');
>
> $log_location = -s $node_publisher->logfile;
> $node_publisher->safe_psql('postgres', "DELETE FROM insert_only_table
> WHERE id = 1");
> $logfile = slurp_file($node_publisher->logfile, $log_location);
> ok($logfile =~ qr/Filtering DELETE/,
> 'unpublished DELETE is filtered');
>
> $log_location = -s $node_publisher->logfile;
> $node_publisher->safe_psql('postgres', "INSERT INTO delete_only_table
> VALUES (1, 'to be deleted')");
> $logfile = slurp_file($node_publisher->logfile, $log_location);
> ok($logfile =~ qr/Filtering INSERT/,
> 'unpublished INSERT is filtered');
>
> ~
...
> 8b.
> Change the comment or rearrange the tests so they are in the same
> order as described
>

The comment was changed, and now says "Insert, update, and delete
tests ..." but still, the "Filtering INSERT" test is last. IMO, that
test should come first to match the comment.

> ~
>
> 8c.
> Looking at the expected logs I wondered if it might be nicer for the
> pgoutput_filter_change doing this logging to also emit the relation
> name.
>

Not yet addressed

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Recent pg_rewind test failures in buildfarm

2025-04-22 Thread Michael Paquier
On Tue, Apr 22, 2025 at 06:22:51AM +, Bertrand Drouvot wrote:
> Yeah, unless that might come from fc415edf8ca but I don't think that's the 
> case.

I've been eyeing this whole area of the code for a few hours to
convince myself on HEAD, and I cannot really find a defect directly
related to it.

I am wondering if we should do a PGSTAT_BACKEND_FLUSH_WAL in the WAL
sender, but the end of any transaction happening in a logical WAL
sender would make sure that this happens on a periodic basis.

Anyway, sorry for making everybody wait here.  I've now removed the
assertion down to v15.
--
Michael


signature.asc
Description: PGP signature


Re: Correct documentation for protocol version

2025-04-22 Thread Dave Cramer
On Fri, 11 Apr 2025 at 11:02, Jelte Fennema-Nio  wrote:

> On Fri, 11 Apr 2025 at 22:57, Dave Cramer  wrote:
> > Well this isn't quite true since if you request 3.0 and have invalid
> options it will return 3.0, which is not the highest supported minor
> version.
>
> Probably good to update this section too then to be similarly correct
> as your already updated section. Maybe also good to clarify further
> that the version that the server responds with is the protocol version
> that will be used during the following communication.
>

I've updated the wording to specify that the negotiateProtocol message will
only be sent if the client requests a major version the server supports.
Also added a note saying that this will be the protocol version that will
be used for the duration of the connection.

See attached.


protocol-6.patch
Description: Binary data


Re: Check for tuplestorestate nullness before dereferencing

2025-04-22 Thread Tom Lane
Nikolay Shaplov  writes:
> В письме от пятница, 18 октября 2024 г. 02:28:22 MSK пользователь David 
> Rowley 
> написал:

>> It would be good to know if the optimisation added in d2c555ee5 ever
>> applies with today's code. If it does apply, we should likely add a
>> test case for it and if it never does, then we should just remove the
>> optimisation and always create the tuplestore when it's NULL.

> That sounds reasonable. It looks like removing the "node->eflags != 0" check
> is more logical than adding a not-null check.

I don't believe that the patch as-proposed is necessary or reasonable.

The case of node->eflags == 0 is reachable; I don't know why David's
test didn't see this, but when I try asserting the contrary I get
a regression crash on

(gdb) p debug_query_string 
$1 = 0x1d03160 "explain (costs off) declare c1 scroll cursor for select (select 
42) as x;"

The reason that the subsequent bit of code is safe is that !forward
should not possibly be true unless EXEC_FLAG_BACKWARD was given,
which'd cause us to create a tuplestore.  So if we were going
to change anything, I'd propose adding something more like

if (!forward && eof_tuplestore)
{
+   Assert(node->eflags & EXEC_FLAG_BACKWARD);
if (!node->eof_underlying)
{
/*

or perhaps more directly, Assert that tuplestore is not null.
But I don't like the proposed patch because if tuplestore is
null here, something's wrong, and we should complain, not
silently mask the problem.

regards, tom lane




Re: [BUG] temporary file usage report with extended protocol and unnamed portals

2025-04-22 Thread Sami Imseih
> So, It seems to me we are better off just setting debug_query_string
> to NULL in DropPortal, or alternatively why not just log the statement
> automatically at the start of execution whenever we have log_temp_files > 0.
> That will allow us to safely capture the statement to blame for the
> temp files and will cover all cases?

Of course, we only know about the temp file usage after execution,
so this would probably end up producing unnecessary statement logging;
it may not be such a good idea after all.

So this tells me that not logging STATEMENT along with log_temp_files is
the better alternative; a statement can still be logged with other existing
mechanisms like log_min_duration_statement.
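
For example (the values below are purely illustrative), the existing
settings referred to here would look roughly like:

    -- illustrative values only
    SET log_temp_files = 0;                  -- report every temporary file that is spilled
    SET log_min_duration_statement = '1s';   -- logs the statement text once it runs that long

so the statement text still ends up in the log for the queries one
usually cares about, without tying it to log_temp_files.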

--
Sami Imseih
Amazon Web Services (AWS)




Re: long-standing data loss bug in initial sync of logical replication

2025-04-22 Thread Amit Kapila
On Tue, Apr 22, 2025 at 10:57 PM Masahiko Sawada  wrote:
>
> On Tue, Mar 18, 2025 at 2:55 AM Amit Kapila  wrote:
> >
> > On Mon, Mar 17, 2025 at 4:56 PM Hayato Kuroda (Fujitsu)
> >  wrote:
> > >
> > > > Regarding the PG13, it cannot be
> > > > applied
> > > > as-is thus some adjustments are needed. I will share it in upcoming 
> > > > posts.
> > >
> > > Here is a patch set for PG13. Apart from PG14-17, the patch could be 
> > > created as-is,
> > > because...
> > >
> > > 1. WAL record for invalidation messages (XLOG_XACT_INVALIDATIONS) does 
> > > not exist.
> > > 2. Thus the ReorderBufferChange for the invalidation does not exist.
> > >Our patch tries to distribute it but cannot be done as-is.
> > > 3. Codes assumed that invalidation messages can be added only once.
> > > 4. The timing when invalidation messages are consumed is limited:
> > >   a. COMMAND_ID change is popped,
> > >   b. start of decoding a transaction, or
> > >   c. end of decoding a transaction.
> > >
> > > The above means that invalidations cannot be executed while being decoded.
> > > I created two patch sets to resolve the data loss issue. 0001 has fewer
> > > code changes but resolves only part of the issue; 0002 has huge changes
> > > but provides a complete solution.
> > >
> > > 0001 - mostly same as patches for other versions. 
> > > ReorderBufferAddInvalidations()
> > >was adjusted to allow being called several times. As I said above,
> > >0001 cannot execute inval messages while decoding the transactions.
> > > 0002 - introduces new ReorderBufferChange type to indicate inval messages.
> > >It would be handled like PG14+.
> > >
> > > Here is an example. Assuming that the table foo exists on both nodes, a
> > > publication "pub" which publishes all tables, and a subscription "sub" 
> > > which
> > > subscribes "pub". What if the workload is executed?
> > >
> > > ```
> > > S1                                  S2
> > > BEGIN;
> > > INSERT INTO foo VALUES (1)
> > >                                     ALTER PUBLICATION pub RENAME TO pub_renamed;
> > > INSERT INTO foo VALUES (2)
> > > COMMIT;
> > > LR -> ?
> > > ```
> > >
> > > With 0001, tuples (1) and (2) would be replicated to the subscriber.
> > > An error "publication "pub" does not exist" would be raised when new
> > > changes are made later.
> > >
> > > 0001+0002 works more aggressively; the error would be raised when the S1
> > > transaction is decoded.
> > > The behavior is the same as for patched PG14-PG17.
> > >
> >
> > I understand that with 0001 the fix is partial in the sense that
> > because invalidations are processed at the transaction end, the
> > changes of concurrent DDL will only be reflected for the next
> > transaction. Now, on one hand, it is prudent to not add a new type of
> > ReorderBufferChange in the backbranch (v13) but the change is not that
> > invasive, so we can go with it as well. My preference would be to go
> > with just 0001 for v13 to minimize the risk of adding new bugs or
> > breaking something unintentionally.
> >
> > Sawada-San, and others involved here, do you have any suggestions on
> > this matter?
>
> Sorry for the late response.
>
> I agree with just 0001 for v13 as 0002 seems invasive. Given that v13
> would have only a few releases until EOL and 0001 can deal with some
> cases in question, I'd like to avoid such invasive  changes in v13.
>

Fair enough. OTOH, we could leave the 13 branch alone, considering the
following: (a) it is near EOL, (b) the bug happens only in rare cases
(when DDLs that don't take a strong lock on the table, like ALTER
PUBLICATION ... ADD TABLE ... or ALTER TYPE ..., run concurrently with
DMLs on the tables involved in the DDL), and (c) the complete fix is
invasive, and even the partial fix is not simple. I have a slight fear
that if we make any mistake in fixing it partially (of course, we can't
see any today), we may not even get a chance to fix it.

Now, if the above convinces you or someone else not to push the
partial fix in 13, then fine; otherwise, I'll push 0001 to 13 the day
after tomorrow.

-- 
With Regards,
Amit Kapila.




Re: Fix premature xmin advancement during fast forward decoding

2025-04-22 Thread shveta malik
On Tue, Apr 22, 2025 at 12:36 PM Zhijie Hou (Fujitsu)
 wrote:
>
> Hi,
>
> To fix this, I think we can allow the base snapshot to be built during fast
> forward decoding, as implemented in the patch 0001 (We already built base
> snapshot in fast-forward mode for logical message in logicalmsg_decode()).

The idea and code look okay to me, and the performance impact is also
not that huge.

IIUC, fast_forward decoding mode is used in only two cases: 1)
pg_replication_slot_advance, and 2) in the upgrade flow, to check whether
there are any pending WAL changes that are not yet replicated. See
'binary_upgrade_logical_slot_has_caught_up'-->'LogicalReplicationSlotHasPendingWal'.
It seems like this change will not have any negative impact on the
upgrade flow either (in terms of performance or anything else).
Thoughts?
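
For reference, case 1 is just the SQL-callable advance function, e.g.
something like this (slot name made up); fast-forward decoding is what
runs underneath it:

    -- advance a logical slot without producing output (uses fast-forward decoding)
    SELECT pg_replication_slot_advance('my_slot', pg_current_wal_lsn());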

>
> Moreover, I conducted a basic test[2] to test the patch's impact, noting that
> advancing the slot incurs roughly a 4% increase in processing time after
> applying the patch, which appears to be acceptable. Additionally, the cost
> associated with building the snapshot via SnapBuildBuildSnapshot() did not 
> show
> up in the profile. Therefore, I think it's reasonable to refrain from
> further optimization at this stage.

I agree on this.

thanks
Shveta




Re: pgsql: Update guidance for running vacuumdb after pg_upgrade.

2025-04-22 Thread Nathan Bossart
On Tue, Apr 22, 2025 at 09:43:56PM +0200, Christoph Berg wrote:
> Re: Nathan Bossart
>> Update guidance for running vacuumdb after pg_upgrade.
>> 
>> Now that pg_upgrade can carry over most optimizer statistics, we
>> should recommend using vacuumdb's new --missing-stats-only option
>> to only analyze relations that are missing statistics.
> 
> I've been looking at vacuumdb --missing-stats-only because Debian's
> pg_upgradecluster is using that now.
> 
> I am wondering if this is really good advice in the pg_upgrade
> documentation. Sure it's nice that optimizer statistics are carried
> over by pg_upgrade, but the pg_stat_user_tables statistics are not
> carried over, and afaict these are the numbers that determine when the
> next autovacuum or autoanalyze run is going to happen. By removing the
> "please run vacuumdb on all tables" step from the pg_upgrade docs, we
> are effectively telling everyone that they should be starting with
> these numbers all 0, postponing the next run to some indeterminate
> point. Running `vacuumdb --missing-stats-only` does not fix that
> because it's skipping the tables. Is that the message we want to send?
> 
> (If I am misinterpreting the situation the docs should still explain
> why this is ok.)

relation_needs_vacanalyze() uses dead_tuples, ins_since_vacuum, and
mod_since_analyze.  IIUC a full post-upgrade vacuumdb run would only set
dead_tuples to a nonzero value, so the worst-case scenario is that it would
take longer before a vacuum is triggered based on
autovacuum_vacuum_{threshold,max_threshold,scale_factor}.  To address this,
I think we'd need to recommend using "vacuumdb --all --analyze-only"
instead.  We could alternatively suggest first running "vacuumdb --all
--analyze-in-stages --missing-stats-only" (to fill in any missing stats)
followed by "vacuumdb --all --analyze-only" (to update dead_tuples).

However, I'm not sure how concerned to be about this.  It does seem bad
that it might take longer for tables to be vacuumed for the first time
after upgrade, but I believe that's already the case for any type of
unclean shutdown (e.g., immediate shutdown, server crash, starting from a
base backup, point-in-time recovery).  I see that we do recommend running
ANALYZE after pg_stat_reset(), though.  In any case, IMO it's unfortunate
that we might end up recommending roughly the same post-upgrade steps as
before even though the optimizer statistics are carried over.

-- 
nathan




Re: pgsql: Update guidance for running vacuumdb after pg_upgrade.

2025-04-22 Thread Christoph Berg
Re: Nathan Bossart
> In any case, IMO it's unfortunate
> that we might end up recommending roughly the same post-upgrade steps as
> before even though the optimizer statistics are carried over.

Maybe the docs (and the pg_upgrade scripts) should recommend the old
procedure by default until this gap is closed? People could then still
opt to use the new procedure in specific cases.

Christoph




Re: What's our minimum supported Python version?

2025-04-22 Thread Jelte Fennema-Nio
tl;dr I think requiring support of Python 3.9 for PG18 would probably
be reasonable and save us a bunch of maintenance burden over the next
5 years.

On Tue, 22 Apr 2025 at 21:41, Tom Lane  wrote:
> Yeah, that.  The distros that are still shipping older Pythons
> are LTS distros where part of the deal is that the vendor
> is supposed to back-patch security fixes, even if the upstream
> has moved on.  Maybe a particular vendor is falling down on that
> job, but it's not for us to rate their performance.  So I don't
> buy that "but security!" is a good argument here.

I totally agree that "security" is not the discussion that we should
be having here. I think the Python version decision (or really any
minimum version decision) should be based on some analysis of costs to
us for maintaining support vs benefits to the users.

I'm pretty sure most users of RHEL don't expect modern software to
compile/work completely effortlessly on their distro without some
effort on their part or on the part of RHEL packagers. That's kinda
what you're signing up for if you choose a distro like that.

It might very well be that supporting Python 3.6 does not require many
code changes. But it's so old that installing it on most of the recent
OSes is not trivial, e.g. uv does not support installing it[1]. By
saying we support it, it means that all developers need to go through
serious hoops to be able to test that their python scripts work
correctly on their own machines. And given our own support policy,
we'd probably have to do that for 5 more years (unless we want to
change supported python versions in minor PG releases, which I highly
doubt we want).

So that leaves the question, how much are we actually helping users by
spending all that effort on keeping support for old Python versions?
Based on the Red Hat docs, it's pretty easy to install newer Python3
versions on RHEL8. Depending on the X in RHEL 8.X you can install
different Python3 versions from the official repos[2]. And based on
the official RHEL EOL policy[3], only RHEL 8.4, 8.6, 8.8 and 8.10 are
still getting some level of support. With 8.4 and 8.6 only getting the
final level of support called "Update Services for SAP Solutions", and
with 8.8 reaching that same final support level before PG18 is
released (and 8.4 becoming unsupported before PG18).

Based on that information (only taking into account only RHEL8, not
other distros). It seems like we could at least bump the minimum
Python support to Python 3.9, which is the newest python that's
installable in RHEL 8.4 and 8.6. Anyone that's running a RHEL version
that's still supported by Red Hat can then simply do "yum install
python39" and run the tests (assuming we set up meson/autoconf to
correctly detect the "python3.9" binary).

I'd even be curious how much resistance we'd get from RHEL users if
we'd bump the Python requirement for PG18 to Python 3.12, which is the
newest version that RHEL 8.10 supports. RHEL 8.10 is the only RHEL 8
version that is getting regular support from Red Hat.

(Full disclosure: I've used close-to-EOL versions of RHEL7
professionally, and I hated every second of it. I constantly felt like I
was walking through a software museum, having to work around bugs that
had been fixed upstream for ages.)

Looking a little further than RHEL:
- Ubuntu 20.04 can easily install Python3.9 from the official package
repos. Currently this is still supported by Christoph's pgdg repos, but
it looks like that support might be removed before PG18, given its
transition to "Expanded Security Maintenance" next month.
- Ubuntu 22.04 can easily install Python3.11 from the official package
repos. Still has full upstream support for 2 years.
- Debian Bullseye (EOL in 1.5 years) can only install Python3.9 from
the official package repos.
- Debian Bookworm (EOL in 3 years) has Pytho
- Python 3.8 itself is declared EOL by upstream
- Python 3.10 will be declared EOL by upstream around the release of PG19

Based on all this info, I think that it sounds quite reasonable to
start requiring Python3.9 for PG18. And for PG19 we could probably
start requiring Python3.11

[1]: https://github.com/astral-sh/uv/issues/9833
[2]: 
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/configuring_basic_system_settings/installing-and-using-dynamic-programming-languages_configuring-basic-system-settings#assembly_installing-and-using-python_installing-and-using-dynamic-programming-languages
[3]: 
https://access.redhat.com/support/policy/updates/errata#RHEL8_Planning_Guide

P.S. I CC-ed Devrim and Christoph, for insights into this from the
packagers perspective.




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-04-22 Thread Ivan Kush

Hi Jacob, thank you for the detailed explanation and links!

Am I right that the classic OAuth flow of "create a user account based on a
token" is implemented using custom validators?


1) In pg_hba.conf set user to all and  "delegate_ident_mapping=1"

"local all all oauth issuer=$issuer scope=$scope delegate_ident_mapping=1"

2) Write a custom validator that will "execute" in C `CREATE USER
token.name WITH token.listofOptions` after verifying the token.


On 25-04-21 19:57, Jacob Champion wrote:

We have some options for dealing with them, since their documentation
instructs clients to hardcode their API entry points instead of using
discovery. (That makes it easy for us to figure out when we're talking
to Google, and potentially switch to a quirks mode.)


What do you mean by "discovery"? The OpenID link that returns the endpoints?

Google has this link

https://accounts.google.com/.well-known/openid-configuration

OUTPUT:
    {
      "issuer": "https://accounts.google.com",
      "authorization_endpoint": "https://accounts.google.com/o/oauth2/v2/auth",
      "device_authorization_endpoint": "https://oauth2.googleapis.com/device/code",
      "token_endpoint": "https://oauth2.googleapis.com/token",
      "userinfo_endpoint": "https://openidconnect.googleapis.com/v1/userinfo",
      "revocation_endpoint": "https://oauth2.googleapis.com/revoke",
      "jwks_uri": "https://www.googleapis.com/oauth2/v3/certs",

    }

It's described here:

https://developers.google.com/identity/openid-connect/openid-connect


But! Before we do that: How do you intend to authorize tokens issued
by Google? Last I checked, they still had no way to register an
application-specific scope, making it very dangerous IMO to use a
public flow [2].


I've also thought, as Antonin did, about
https://www.googleapis.com/oauth2/v3/userinfo for verification.


As I understand from [2], the current problem is security: Google
doesn't want to add new scopes.


--
Best wishes,
Ivan Kush
Tantor Labs LLC





Re: pgsql: Update guidance for running vacuumdb after pg_upgrade.

2025-04-22 Thread Nathan Bossart
On Tue, Apr 22, 2025 at 11:03:29PM +0200, Christoph Berg wrote:
> Re: Nathan Bossart
>> In any case, IMO it's unfortunate
>> that we might end up recommending roughly the same post-upgrade steps as
>> before even though the optimizer statistics are carried over.
> 
> Maybe the docs (and the pg_upgrade scripts) should recommend the old
> procedure by default until this gap is closed? People could then still
> opt to use the new procedure in specific cases.

I think we'd still want to modify the --analyze-in-stages recommendation
(from what is currently recommended for supported versions).  If we don't,
you'll wipe out the optimizer stats you brought over from the old version.

Here is a rough draft of what I am thinking.

-- 
nathan
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index df13365b287..648c6e2967c 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -833,17 +833,19 @@ psql --username=postgres --file=script.sql postgres
 
 
  Because not all statistics are not transferred by
- pg_upgrade, you will be instructed to run a command to
+ pg_upgrade, you will be instructed to run commands to
  regenerate that information at the end of the upgrade.  You might need to
  set connection parameters to match your new cluster.
 
 
 
- Using vacuumdb --all --analyze-only 
--missing-stats-only
- can efficiently generate such statistics.  Alternatively,
+ First, use
  vacuumdb --all --analyze-in-stages --missing-stats-only
- can be used to generate minimal statistics quickly.  For either command,
- the use of --jobs can speed it up.
+ to quickly generate minimal optimizer statistics for relations without
+ any.  Then, use vacuumdb --all --analyze-only to ensure
+ all relations have updated cumulative statistics for triggering vacuum and
+ analyze.  For both commands, the use of --jobs can speed
+ it up.
  If vacuum_cost_delay is set to a non-zero
  value, this can be overridden to speed up statistics generation
  using PGOPTIONS, e.g., PGOPTIONS='-c
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 18c2d652bb6..f1b90c5957e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -814,9 +814,12 @@ output_completion_banner(char *deletion_script_file_name)
}
 
pg_log(PG_REPORT,
-  "Some optimizer statistics may not have been transferred by 
pg_upgrade.\n"
+  "Some statistics are not transferred by pg_upgrade.\n"
   "Once you start the new server, consider running:\n"
-  "%s/vacuumdb %s--all --analyze-in-stages 
--missing-stats-only", new_cluster.bindir, user_specification.data);
+  "%s/vacuumdb %s--all --analyze-in-stages 
--missing-stats-only\n"
+  "%s/vacuumdb %s--all --analyze-only",
+  new_cluster.bindir, user_specification.data,
+  new_cluster.bindir, user_specification.data);
 
if (deletion_script_file_name)
pg_log(PG_REPORT,


Re: What's our minimum supported Python version?

2025-04-22 Thread Christoph Berg
Re: Jelte Fennema-Nio
> - Ubuntu 20.04 can easily install Python3.9 from the official package
> repos. Currently this is still supported by Cristoph his pgdg repos,
> but looks like that support might be removed before PG18 given it's
> transition to "Expanded Security Maintenance" next month.

20.04 (focal) is already too old for PG18 because of llvm version
woes, so we are already not building it there. (I forgot what the
precise problem was, but I figured focal would be EOL before the PG18
release anyway.)

> - Ubuntu 22.04 can easily install Python3.11 from the official package
> repos. Still has full upstream support for 2 years.
> - Debian Bullseye (EOL in 1.5 years) can only install Python3.9 from
> the official package repos.
> - Debian Bookworm (EOL in 3 years) has Pytho
  ^ n 3.11
> - Python 3.8 itself is declared EOL by upstream
> - Python 3.10 will be declared EOL by upstream around the release of PG19

> Based on all this info, I think that it sounds quite reasonable to
> start requiring Python3.9 for PG18. And for PG19 we could probably
> start requiring Python3.11

Ack. Bullseye LTS will be supported until August 2026, so very close
to the PG19 release, but it should work out not to support PG19 there.

https://wiki.debian.org/LTS/

Christoph




Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details

2025-04-22 Thread Robin Haberkorn
Hello Tom!

On Tue Apr 22, 2025 at 18:29:02 GMT +03, Tom Lane wrote:
> Hi Robin, thank you for the patch!  This has arrived too late to
> be considered for PG v18, but please add the thread to the open
> commitfest

I will do that. But what about the other changes I proposed (xml data type
and hstore support)? Should they all be discussed in this thread or should I
create separate threads and consequently separate patch entries for the
Commitfest? And should they all be based on the master branch or can they be
based upon each other? Do you think they will help to bring xslt_process()
closer to adoption into core?

> Just from a quick eyeballing of the patch, it seems generally
> sane, but I wonder why you have xml_generic_error_handler
> creating and then destroying an errorBuf instead of just
> appending the results directly to xmlerrcxt->err_buf.  The
> extra buffer doesn't seem to be doing much except increasing
> our odds of an OOM failure right there.

You are right. It was a leftover from thinning out xml_errorHandler().
I attached a new version of the patch where this is fixed.
I also ran all touched files through pgindent.

> I'm also a little suspicious of
>
> - xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_LEGACY);
> + xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_ALL);
>
> as that looks like it is probably changing the behavior of
> the module in ways bigger than just providing more error
> details.  This code has been in legacy status for so long
> that I feel a little hesitant about doing that.

You wouldn't actually need that change to get more error details, as the
details are provided by xml_generic_error_handler(), which ignores the
strictness level. The reason is that libxml generic error handlers cannot
discern between warnings and errors. The only thing the strictness level
does is affect how xml_errorHandler() works: with STRICTNESS_ALL, it can
log warnings without appending anything to err_buf, and it can set the
err_occurred flag, which is apparently useful because some libxml functions
can produce error-level messages without the function calls signalling
failure. At least, otherwise the code doesn't make sense to me. In other
words, using STRICTNESS_ALL should allow you to see warnings you
wouldn't otherwise see and catch errors you wouldn't otherwise catch -- at
least if they occur in libxml itself. As far as I understand from the code
comment in xml_errorHandler(), STRICTNESS_LEGACY was introduced just so you
don't have to touch the existing contrib/xml2 code that did not check
err_occurred:

/*
 * Legacy error handling mode.  err_occurred is never set, we just add the
 * message to err_buf.  This mode exists because the xml2 contrib module
 * uses our error-handling infrastructure, but we don't want to change its
 * behaviour since it's deprecated anyway.  This is also why we don't
 * distinguish between notices, warnings and errors here --- the old-style
 * generic error handler wouldn't have done that either.
 */

Since we now check that flag in xslt_proc.c, we should be fine.
But I might be overlooking something.

> If we are willing to do it, should we go further and clean out the use
> of PG_XML_STRICTNESS_LEGACY in xml2/xpath.c as well?

I think this is a separate construction site. We don't necessarily have
to touch xpath.c, and I didn't plan to do so in the near future.
We should first determine whether the XPath support is actually worth
keeping (i.e. should be adopted into core) or should rather stay deprecated.

Best regards,
Robin Haberkorn

-- 
Robin Haberkorn
Senior Software Engineer

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
From b4d9a6594142cf85e4e38352f6be3cea260962f7 Mon Sep 17 00:00:00 2001
From: Robin Haberkorn 
Date: Thu, 17 Apr 2025 13:15:48 +0300
Subject: [PATCH v2] contrib/xml2: xslt_process() now reports XSLT-related
 error details

* It sets (and restores) the libxslt error handler (xsltSetGenericErrorFunc()).
  Since it only supports old-school "generic" error handlers, which are no longer
  used in PG's libxml code, we reintroduced a "generic" error handler
  xml_generic_error_handler() in xml.c.
* The alternative would have been to expose PgXmlErrorContext in xml.h,
  so we could implement a generic handler in xslt_proc.c.
  This is obviously not desirable, as it would depend on USE_LIBXML.
* No longer use the PG_XML_STRICTNESS_LEGACY for error handling,
  but query the error_occurred-flag via pg_xml_error_occurred().
  The existance of this getter is another hint, that we are not supposed
  to expose PgXmlErrorContext in xml.h.
* This change means that xslt_process() now reports not only details
  about XML parsing errors, but XSLT-schema deviations and missing
  stylesheet parameters as well.
* The XSLT error messages now also contain line numbers.
  For that to work, we had to set a dummy "SQL" URL when parsing XML strings.
 

Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Renan Alves Fonseca
Jacob Champion  writes:

>
> This is tested against Python 3.6.15 (3.6 went EOL at the end of
> 2021). I'm working on getting Rocky 8 installed locally to test
> against. If it's decided we want to support downwards to 3.5, I will
> test there too (but I hope we don't; see parent thread).
>

I did a quick test on 3.5. It seems to work after removing
f-strings... At this point, I would be really happy with 3.6.





Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 3:02 AM Daniel Gustafsson  wrote:
> +  if oauth_flow_supported
> +cdata.set('USE_LIBCURL', 1)
> +  elif libcurlopt.enabled()
> +error('client OAuth is not supported on this platform')
> +  endif
> We already know that libcurlopt.enabled() is true here so maybe just doing
> if-else-endif would make it more readable and save readers thinking it might
> have changed?

Features are tri-state, so libcurlopt.disabled() and
libcurlopt.enabled() can both be false. :( My intent is to fall
through nicely in the case where -Dlibcurl=auto.

(Our minimum version of Meson is too old to switch to syntax that
makes this more readable, like .allowed(), .require(), .disable_if(),
etc...)

> Also, "client OAuth" reads a bit strange, how about "client-side
> OAuth" or "OAuth flow module"?
> ...
> I think we should take this opportunity to turn this into a 
> appendPQExpBuffer()
> with a format string instead of two calls.
> ...
> Now that the actual variable, errbuf->len, is short and very descriptive I
> wonder if we shouldn't just use this as it makes the code even clearer IMO.

All three done in v9, attached.

Thanks!
--Jacob
1:  5f87f11b18e = 1:  5f87f11b18e Add minor-version counterpart to 
(PG_)MAJORVERSION
2:  4c9cc7f69af ! 2:  9e37fd7c217 oauth: Move the builtin flow into a separate 
module
@@ configure.ac: if test "$PORTNAME" = "win32" ; then
 +if test "$with_libcurl" = yes ; then
 +  # Error out early if this platform can't support libpq-oauth.
 +  if test "$ac_cv_header_sys_event_h" != yes -a 
"$ac_cv_header_sys_epoll_h" != yes; then
-+AC_MSG_ERROR([client OAuth is not supported on this platform])
++AC_MSG_ERROR([client-side OAuth is not supported on this platform])
 +  fi
 +fi
 +
@@ meson.build: if not libcurlopt.disabled()
 +  if oauth_flow_supported
 +cdata.set('USE_LIBCURL', 1)
 +  elif libcurlopt.enabled()
-+error('client OAuth is not supported on this platform')
++error('client-side OAuth is not supported on this platform')
 +  endif
 +
  else
@@ src/interfaces/libpq-oauth/oauth-curl.c: 
pg_fe_run_oauth_flow_impl(PGconn *conn)
 * also the documentation for struct async_ctx.
 */
if (actx->errctx)
-   {
+-  {
 -  appendPQExpBufferStr(&conn->errorMessage,
 -   
libpq_gettext(actx->errctx));
 -  appendPQExpBufferStr(&conn->errorMessage, ": ");
-+  appendPQExpBufferStr(errbuf, libpq_gettext(actx->errctx));
-+  appendPQExpBufferStr(errbuf, ": ");
-   }
+-  }
++  appendPQExpBuffer(errbuf, "%s: ", libpq_gettext(actx->errctx));
  
if (PQExpBufferDataBroken(actx->errbuf))
 -  appendPQExpBufferStr(&conn->errorMessage,
@@ src/interfaces/libpq-oauth/oauth-curl.c: 
pg_fe_run_oauth_flow_impl(PGconn *conn)
  
if (actx->curl_err[0])
{
-   size_t  len;
- 
+-  size_t  len;
+-
 -  appendPQExpBuffer(&conn->errorMessage,
 -" (libcurl: %s)", 
actx->curl_err);
 +  appendPQExpBuffer(errbuf, " (libcurl: %s)", actx->curl_err);
@@ src/interfaces/libpq-oauth/oauth-curl.c: 
pg_fe_run_oauth_flow_impl(PGconn *conn)
/* Sometimes libcurl adds a newline to the error buffer. :( */
 -  len = conn->errorMessage.len;
 -  if (len >= 2 && conn->errorMessage.data[len - 2] == '\n')
-+  len = errbuf->len;
-+  if (len >= 2 && errbuf->data[len - 2] == '\n')
++  if (errbuf->len >= 2 && errbuf->data[errbuf->len - 2] == '\n')
{
 -  conn->errorMessage.data[len - 2] = ')';
 -  conn->errorMessage.data[len - 1] = '\0';
 -  conn->errorMessage.len--;
-+  errbuf->data[len - 2] = ')';
-+  errbuf->data[len - 1] = '\0';
++  errbuf->data[errbuf->len - 2] = ')';
++  errbuf->data[errbuf->len - 1] = '\0';
 +  errbuf->len--;
}
}


v9-0001-Add-minor-version-counterpart-to-PG_-MAJORVERSION.patch
Description: Binary data


v9-0002-oauth-Move-the-builtin-flow-into-a-separate-modul.patch
Description: Binary data


DOCS - create publication (tweak for generated columns)

2025-04-22 Thread Peter Smith
Hi,

While re-reading documentation about logical replication of generated
columns, I came across this paragraph in the CREATE PUBLICATION ...
FOR TABLE description [1].


When a column list is specified, only the named columns are
replicated. The column list can contain stored generated columns as
well. If no column list is specified, all table columns (except
generated columns) are replicated through this publication, including
any columns added later. It has no effect on TRUNCATE commands. See
Section 29.5 for details about column lists.


That "except generated columns" part is not strictly correct because
it fails to account for the 'publish_generated_columns' parameter.
I've suggested a more accurate description below.


SUGGESTION

When a column list is specified, only the named columns are
replicated. Stored generated columns may be included in the list.
Specifying a column list has no effect on TRUNCATE commands. See
Section 29.5 for details about column lists. If no column list is
specified, all table columns are replicated through this publication,
including any columns added later. Generated columns are included in
this case only if publish_generated_columns is set to stored.


I've attached a patch to implement this suggested wording.

Thoughts?
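
For example, the case the suggested wording describes would be roughly
this (object names made up):

    -- with no column list, the stored generated column is published
    -- only because of the publication parameter below
    CREATE TABLE t (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
    CREATE PUBLICATION pub FOR TABLE t WITH (publish_generated_columns = 'stored');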

==
[1] 
https://www.postgresql.org/docs/devel/sql-createpublication.html#SQL-CREATEPUBLICATION-PARAMS-FOR-TABLE

Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-DOCS-create-publication-for-table.patch
Description: Binary data


Re: jsonapi: scary new warnings with LTO enabled

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 3:10 AM Daniel Gustafsson  wrote:
> My preference is that no operation can silently work on a failed object, but
> it's not a hill (even more so given where we are in the cycle).

Hm, okay. Something to talk about with Andrew and Peter, maybe.

> The attached
> v3 allocates via the JSON api, no specific error handling should be required 
> as
> it's already handled today.

pgindent shows one whitespace change on my machine (comment
indentation), but other than that, LGTM! Fuzzers are happy too.

Thanks,
--Jacob




Re: pgsql: Add function to get memory context stats for processes

2025-04-22 Thread Andres Freund
Hi,

On 2025-04-17 10:42:45 -0400, Robert Haas wrote:
> On Tue, Apr 15, 2025 at 6:11 AM Andres Freund  wrote:
> > There very well could be a CFI - but it better be somewhere where the
> > in-memory state is consistent. Otherwise an error inside raised in the CFI
> > would lead the in-memory state inconsistent which then would cause problems
> > when cleaning up the dsa during resowner release or process exit.
> >
> > What am I missing here?
> 
> I think maybe you're only thinking about gathering the data. What
> about publishing it? If the DSA code were interrupted at a CFI and the
> interrupting code went and tried to perform a DSA allocation to store
> the resulting data and then returned to the interrupted DSA operation,
> would you expect the code to cope with that?

I would expect the DSA code to not call CFI() in such a place - afaict nearly
all such cases would also not be safe against errors being raised in the CFI()
and then later again allocating memory from the DSA (or resetting
it). Similarly, aset.c better not accept interrupts in the middle of, e.g.,
AllocSetAllocFromNewBlock(), since we'd lose track of the allocation.

Greetings,

Andres Freund




Re: Cygwin support

2025-04-22 Thread Laurenz Albe
On Tue, 2025-04-22 at 10:26 -0400, Tom Lane wrote:
> I vaguely recall some discussion about whether building with readline
> has become possible under MSVC.  I think it'd make a lot of people
> happy if that could happen (but I hasten to add that I'm not
> volunteering).

https://postgr.es/m/CACLU5mThm-uCtERMVHMoGED2EPfyS54j83WB20s_BmzVQuMkpg%40mail.gmail.com

Seems like Kirk didn't pursue this further.

Yours,
Laurenz Albe




Re: Fortify float4 and float8 regression tests by ordering test results

2025-04-22 Thread Tom Lane
Pavel Borisov  writes:
> On Tue, 22 Apr 2025 at 20:22, Alexander Korotkov  wrote:
>> On Tue, Apr 22, 2025 at 7:20 PM Tom Lane  wrote:
>>> Alexander Korotkov  writes:
 I'd like to add that float4.out not only assumes that insert-ordering is
 preserved (this could be more-or-less portable between table AMs).  It also
 assumes the way UPDATE moves updated rows.  That seems quite
 heap-specific.  You can see in the following fragment, updated rows jump to
 the bottom.

>>> I'd be willing to consider a policy that we don't want to depend on
>>> exactly where UPDATE moves rows to.  The proposed patch is not that,
>>> however.

>> OK, that makes sense for me.

> Thanks for this input!
> This was my first intention to fix only the test that was affected by
> UPDATE-order specifics, broke when runnung on an extension AM.
> Maybe I was too liberal and added ORDER BY's more than needed.
> I definitely agree to the proposal.

On further reflection, this policy makes plenty of sense because even
heapam is not all that stable about row order after UPDATE.  With
tables large enough to draw the attention of autovacuum, we've had
to use ORDER BY or other techniques to stabilize the results, because
the timing of background autovacuums affected where rows got put.
These tests are different only because the tables are small enough
and the updates few enough that autovacuum doesn't get involved.
I think it's reasonable that at a project-policy level we should
not consider that an excuse.  For example, a change in autovacuum's
rules could probably break these tests even with heapam.

On the other hand, as I mentioned earlier, a table AM that doesn't
reproduce original insertion order would break a whole lot more
tests than these.  So that's a bridge I'd rather not cross,
at least not without a lot of consideration.

BTW, you forgot to update expected/float4-misrounded-input.out.

regards, tom lane




Re: Logical Replication of sequences

2025-04-22 Thread Peter Smith
Hi Vignesh.

Review comments for patch v20250422-0001.

==
Commit message

1.
This patch introduces a new function: pg_sequence_state function
allows retrieval of sequence values including LSN.

SUGGESTION
This patch introduces a new function, 'pg_sequence_state', which
allows retrieval of sequence values, including the associated LSN.

==
src/backend/commands/sequence.c

pg_sequence_state:

2.
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied for sequence %s",
+ RelationGetRelationName(seqrel;

Has redundant parentheses.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: What's our minimum supported Python version?

2025-04-22 Thread Renan Alves Fonseca
Tom Lane  writes:

> Florents Tselai  writes:
>> On 19 Apr 2025, at 7:17 PM, Tom Lane  wrote:
>>> I think we need to do some combination of moving our
>>> minimum-supported-version goalposts forward, making sure that
>>> whatever we claim is the minimum Python version is actually
>>> being tested in the buildfarm, and fixing oauth_server.py
>>> so that it works on that version.
>
>> From an Python ecosystem perspective,
>>  3.9 is the usual minimum that people use in CI matrices nowdays.
>> So if it was up to me, that’s what I’d choose. 
>
> Per these numbers, that would be cutting off 31% of the buildfarm,
> including a lot of still-in-support distros such as RHEL8.
> So I would say that's not likely to be our choice.
>
>   regards, tom lane

Also, from a Python-world perspective, we really don't want to run EOL
versions. Besides the security reasons, bugs in third-party
dependencies usually are not fixed and the whole system seems to rot
very quickly.

Of course, this does not happen if one does not rely on third-party
deps. Even in this case, I would say that it is wise to support a
relatively small set because the built-in libs will vary over versions,
and a wide range of versions will cause more troubles like this one.

The oldest non EOL version is 3.9 right now. My suggestion is to follow
the official supported releases. I'm not familiar with the buildfarm,
but I know that python3.6 does not come installed by default in RHEL8,
so instead of installing this version we can simply install 3.9, 3.11 or
3.12.

On a persistent workstation, I believe the following should do the
trick (tested on an 8.5 box):
sudo dnf install python3.12
sudo dnf remove python36

BTW, we are used to creating a virtualenv for everything, but that is another
topic.

Best regards,
Renan Fonseca




Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 2:43 PM Renan Alves Fonseca
 wrote:
> I did a quick test on 3.5.

Thanks!

> It seems to work after removing
> f-strings... At this point, I would be really happy with 3.6.

Ah, makes sense. Well, I'm starting with the 3.6 fix regardless, since
it's preventing RHEL8 testing. If we decide we must support 3.5, I can
patch f-strings too. Fingers crossed...

--Jacob




Re: What's our minimum supported Python version?

2025-04-22 Thread Tom Lane
Jelte Fennema-Nio  writes:
> I totally agree that "security" is not the discussion that we should
> be having here. I think the Python version decision (or really any
> minimum version decision) should be based on some analysis of costs to
> us for maintaining support vs benefits to the users.

[ shrug... ] This thread in itself has already consumed more effort
than we've spent thinking about old-Python problems for the last
several years.  (Otherwise, we would probably have updated that
supported-version claim in installation.sgml some time ago.)
So I doubt that there's all that much in cost savings to us that'd
outweigh benefits to users.  The fact that Jacob was able to fix
oauth_server.py without (it seemed like) very much work seems to
bear out that opinion.

This calculus might change if we start to rely more heavily on
Python for build and test infrastructure than we do today.  But
I think we can have that conversation when it happens.

Worth noting also is that there are different requirements in
different places.  For example,
contrib/unaccent/generate_unaccent_rules.py is probably only ever run
by developers, so I doubt anyone is going to make a fuss if it doesn't
work on hoary Pythons.  Test infrastructure is more of a problem,
since we want to be able to test on any platform that we consider
supported at all.

Anyway, what I propose that we do for now is replace the
installation.sgml text

The minimum required version is Python 3.2.

with

The minimum supported version is Python 3.6.8.

where I use "supported" advisedly, as in "it might work with
something older, but we don't promise to fix it if not".

regards, tom lane




Re: What's our minimum supported Python version?

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 2:28 PM Jelte Fennema-Nio  wrote:
> I'm pretty sure most users of RHEL expect most modern software not to
> compile/work completely effortlessly on their distros without some
> effort on their part or on the part of RHEL packagers. That's kinda
> what you're signing up for if you choose a distro like that.

This is the core of my position, too. I think it's reasonable to
support Python versions for some time after they go EOL, but I don't
think that we need to treat "users who want bleeding-edge Postgres
with long-EOL dependencies" as something to cater to, at the expense
of the committer testing matrix.

Note that Meson itself has updated to Python 3.7 as a minimum version
(as it now warns you, loudly).

Thanks,
--Jacob




Re: DOCS - create publication (tweak for generated columns)

2025-04-22 Thread David G. Johnston
On Tuesday, April 22, 2025, Peter Smith  wrote:

> Hi,
>
> While re-reading documentation about logical replication of generated
> columns, I came across this paragraph in the CREATE PUBLICATION ...
> FOR TABLE description [1].
>
> 
> When a column list is specified, only the named columns are
> replicated. The column list can contain stored generated columns as
> well. If no column list is specified, all table columns (except
> generated columns) are replicated through this publication, including
> any columns added later. It has no effect on TRUNCATE commands. See
> Section 29.5 for details about column lists.
> 
>
> That "except generated columns" part is not strictly correct because
> it fails to account for the 'publish_generated_columns' parameter.
> I've suggested a more accurate description below.
>
>
> SUGGESTION
> 
> When a column list is specified, only the named columns are
> replicated. Stored generated columns may be included in the list.
> Specifying a column list has no effect on TRUNCATE commands. See
> Section 29.5 for details about column lists. If no column list is
> specified, all table columns are replicated through this publication,
> including any columns added later. Generated columns are included in
> this case only if publish_generated_columns is set to stored.
> 
>
> I've attached a patch to implement this suggested wording.
>
> Thoughts?
>
> ==
> [1] https://www.postgresql.org/docs/devel/sql-createpublication.html#SQL-
> CREATEPUBLICATION-PARAMS-FOR-TABLE
>

For the column list I would stick to mentioning what cannot be specified,
since it would be assumed by default that any column on the table is fair
game.  I believe that means not mentioning stored generated columns and
instead mentioning virtual generated ones.  It really seems odd to mention
that stored generated columns are allowed before mentioning that they are
optional and excluded by default.

Alternative:

The optional column list may include any non-virtual columns of the table.
If omitted, tables publish all (including future) non-generated columns by
default, and may publish stored generated columns if the publication option
publish_generated_columns is set to stored.  See Section 29.5 for details
about column lists.


I don’t get why truncate is mentioned here.  I omitted it intentionally.

David J.


Re: Row pattern recognition

2025-04-22 Thread Tatsuo Ishii
> Attached are the v30 patches, just adding a patch to change the
> default io_method parameter to sync, expecting this affects something
> to the recent CFbot failure.
> https://commitfest.postgresql.org/patch/4460/
> https://cirrus-ci.com/task/6078653164945408
> which is similar to:
> https://www.postgresql.org/message-id/20250422.39.1502127917165838403.ishii%40postgresql.org

CFBot has just run tests against the v30 patches and now it has turned
green again!  I guess io_method = sync definitely did the trick.  Note that
previously only the "Windows Server 2019, VS 2019 - Meson & ninja" test
had failed.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Allow database owners to CREATE EVENT TRIGGER

2025-04-22 Thread David G. Johnston
On Tuesday, April 22, 2025, Steve Chavez  wrote:

> >  alter event trigger command which doesn’t need to be exercised here
>
> That part does need to be tested, I modified `AlterEventTriggerOwner_internal`
> to allow altering owners to regular users. Before it was only restricted to
> superusers.
>
> > Actually, leave the other member around, but not granted ownership, and
> both create tables, to demonstrate that a non-superuser and non-owner is
> unaffected by the trigger.
>
> I've updated the tests accordingly. Please let me know if that's what you
> had in mind.
>

Pretty much.  You have a bad drop table cleanup command, and I’d drop the
entire alter event trigger owner test.

The other thing I’m wondering, but haven’t gotten around to testing, is
whether a role affected by the event trigger is able to just drop the
trigger given this implementation.  I always get member/member-of dynamics
confused.  Having a member (possibly via SET ROLE…) try to drop the
trigger would be good, to prove that it isn’t allowed.

David J.


Re: Allow database owners to CREATE EVENT TRIGGER

2025-04-22 Thread David G. Johnston
On Tuesday, April 22, 2025, David G. Johnston 
wrote:

> On Tuesday, April 22, 2025, Steve Chavez  wrote:
>
>> >  alter event trigger command which doesn’t need to be exercised here
>>
>> That part does need to be tested, I modified
>> `AlterEventTriggerOwner_internal` to allow altering owners to regular
>> users. Before it was only restricted to superusers.
>>
>
Ok.  I missed this.

David J.


Re: Allow database owners to CREATE EVENT TRIGGER

2025-04-22 Thread David G. Johnston
On Tuesday, April 22, 2025, David G. Johnston 
wrote:

> On Tuesday, April 22, 2025, David G. Johnston 
> wrote:
>
>> On Tuesday, April 22, 2025, Steve Chavez  wrote:
>>
>>> >  alter event trigger command which doesn’t need to be exercised here
>>>
>>> That part does need to be tested, I modified
>>> `AlterEventTriggerOwner_internal` to allow altering owners to regular
>>> users. Before it was only restricted to superusers.
>>>
>>
> Ok.  I missed this.
>

Sorry for the self-reply but this nagged at me.

It’s probably not a big deal either way, but the prior test existed to
ensure that a superuser couldn’t do something they are otherwise always
permitted to do - assign objects to whomever they wish.  So
event_trigger.sql had a test that errored, showing this anomaly.  You moved
the test and now are proving it doesn’t error.  But it is not expected to
error; and immediately above you already show that a non-superuser can be
an owner.  We don’t need a test to show a superuser demonstrating their
normal abilities.

IOW, select test cases around the feature as it is implemented now, not its
history.  A personal one-off test to ensure that no super-user prohibitions
remained will suffice to make sure all such code that needed to be removed
is gone.

David J.


Re: disabled SSL log_like tests

2025-04-22 Thread Tom Lane
I wrote:
> The fact that mamba hasn't been failing on the other affected
> tests is a bit puzzling.  mamba isn't running kerberos or ldap
> or oauth_validator, so the lack of failures there is expected.
> authentication/t/001_password.pl appears accidentally not vulnerable:
> it's not using log_like in any connect_fails tests.  (It is using
> log_unlike, so those tests could be giving silent false negatives.)
> But authentication/t/003_peer.pl has 8 test cases that look
> vulnerable.  I guess there must be some extra dollop of timing
> weirdness in the ssl tests that's not there in 003_peer.pl.

After pushing this patch, the probable explanation for the "extra
timing weirdness" finally penetrated my brain: the error reports
we are looking for are cases that are detected by OpenSSL.  So
it's plausible that OpenSSL has told the connecting client to
go away before it returns the error indication to the backend's
calling code, which would be what would log the message.  That is
what provides the window for a race condition.  You still need
a bunch of assumptions about the kernel's subsequent scheduling
of psql and the TAP script versus the backend, but the whole
thing is certainly dependent on psql getting the word sooner
than the backend.

(Not all of the tests disabled by 55828a6b6 meet this description,
but a bunch of them do, and I believe that I just zapped every
log_like condition in the script rather than trying to be selective
about which ones were known to have failed.)

It seems at least plausible that the Kerberos tests could have
a similar problem.  I'm less sure about LDAP or OAuth.  But
authentication/t/003_peer.pl would not, because elog.c emits errors
to the log before sending them to the client.  So that explains
the lack of buildfarm reports from mamba, and it's unclear that
we have any other animals with the right timing behavior to
see this.
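
(As an aside, the window exists because the test checks the log as soon as
psql reports the failure.  The generic way to close it is to poll the log
until the expected entry shows up rather than checking once.  A rough
sketch of that idea, in Python purely for illustration (the real tests are
Perl TAP, and this is not their code):

    import re
    import time

    def wait_for_log(path, pattern, timeout=10.0, interval=0.1):
        """Poll a log file until `pattern` appears or the timeout expires."""
        regex = re.compile(pattern)
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with open(path, errors="replace") as f:
                if regex.search(f.read()):
                    return True
            time.sleep(interval)
        return False

The point is only that "the client saw the error" does not imply "the
server has logged it yet", so any single immediate check of the log can
race.)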

regards, tom lane




Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 12:36 PM Tom Lane  wrote:
> Thanks!  I confirm this works for me on RHEL8 with 3.6.8.

Great, thanks for the quick review!

I'm considering committing this myself. Do you mind if it takes a day
or so to get this in, while I quadruple-check everything, or are you
blocked on this?

--Jacob




Re: Cygwin support

2025-04-22 Thread Andrew Dunstan


On 2025-04-22 Tu 12:55 PM, Laurenz Albe wrote:

On Tue, 2025-04-22 at 10:26 -0400, Tom Lane wrote:

I vaguely recall some discussion about whether building with readline
has become possible under MSVC.  I think it'd make a lot of people
happy if that could happen (but I hasten to add that I'm not
volunteering).

https://postgr.es/m/CACLU5mThm-uCtERMVHMoGED2EPfyS54j83WB20s_BmzVQuMkpg%40mail.gmail.com

Seems like Kirk didn't pursue this further.



It's worth looking at, but I don't think it's possible for release 18 
let alone the older releases. We should look at retiring Cygwin when all 
the live branches have psql with a working readline on native Windows 
builds.


cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: AIO v2.5

2025-04-22 Thread Tomas Vondra
On 3/18/25 21:12, Andres Freund wrote:
> Hi,
> 
> Attached is v2.10, with the following changes:
> 
>  ...
> 
> - committed io_method=worker
> 

There's an open item related to this commit (247ce06b883d), based on:

For now the default io_method is changed to "worker". We should re-
evaluate that around beta1, we might want to be careful and set the
default to "sync" for 18.

FWIW the open item is in the "recheck mid-beta" section, but the commit
message says "around beta1" which is not that far away (especially if we
choose to do beta1 on May 8, with the minor releases).

What information do we need to gather to make the decision, and who's
expected to do it? I assume someone needs to run some benchmarks, but
is anyone working on it already? What benchmarks, what platforms?

FWIW I'm asking because of the RMT, but I'm also willing to do some of
the tests if needed - it'd be good to get some guidance, though.


regards

-- 
Tomas Vondra





Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 3:17 PM Jelte Fennema-Nio  wrote:
> The way you replaced this does not have the same behaviour when the
> prefix/suffix is not part of the string. removeprefix()/removesuffix()
> will not remove any characters in that case, but your code will always
> remove as many characters as the prefix/suffix is long. Maybe
> your other checks and the definition of the OAuth spec ensure that these
> prefixes/suffixes are present when you remove them

Correct. Directly above the changed code are the prefix checks, which
set self._alt_issuer/_parameterized.

Thanks!
--Jacob




Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Tom Lane
Jacob Champion  writes:
> I'm considering committing this myself.

Please do.

> Do you mind if it takes a day
> or so to get this in, while I quadruple-check everything, or are you
> blocked on this?

I'm not in a hurry.  Push when you're ready.

regards, tom lane




Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Tom Lane
Jacob Champion  writes:
> This is tested against Python 3.6.15 (3.6 went EOL at the end of
> 2021). I'm working on getting Rocky 8 installed locally to test
> against. If it's decided we want to support downwards to 3.5, I will
> test there too (but I hope we don't; see parent thread).

Thanks!  I confirm this works for me on RHEL8 with 3.6.8.

regards, tom lane




Re: What's our minimum supported Python version?

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 12:09 PM Renan Alves Fonseca
 wrote:
> The oldest non-EOL version is 3.9 right now. My suggestion is to follow
> the official supported releases.

I think this policy is too aggressive. Many operating systems we
support are going to ship Python versions past their EOL date (and
keep them supported for a long while with security patches). There's a
middle ground to be found, IMO.

--Jacob




[PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 8:29 AM Jacob Champion
 wrote:
> On Tue, Apr 22, 2025 at 8:05 AM Tom Lane  wrote:
> > The first problem is that this Python version seems not to
> > like assignments embedded in if statements: [...]
> >
> > I was able to work around that with the attached quick hack.
> > But then I get [...]
>
> Thanks, I'll put a patch together. Sorry -- IIRC, both of those
> features are probably too new for me to have used, regardless of what
> we decide is the minimum version for 18.

Splitting this off into its own $SUBJECT.

I coupled oauth_server to new Python stuff without realizing it, sorry:
- urllib's reclassification of `~` as a safe URL character (3.7)
- walrus operator `:=` (3.8)
- dict[type, type] annotations (3.9)
- str.removeprefix/suffix() (3.9)

Attached patch gets rid of all that.
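
For anyone following along, the replacements are mechanical.  A rough
sketch of 3.6-compatible equivalents (illustrative only; these are made-up
snippets, not the hunks from the attached patch):

    import urllib.parse
    from typing import Dict

    # Pre-3.7, quote() percent-encodes '~' unless it is listed as safe.
    assert urllib.parse.quote("/a~b", safe="/~") == "/a~b"

    # Walrus operator: assign first, then test.
    params = {"scope": "openid"}
    scope = params.get("scope")
    if scope is not None:
        print(scope)

    # dict[str, str] annotations: use typing.Dict before 3.9.
    headers: Dict[str, str] = {"Content-Type": "application/json"}

    # str.removeprefix(): check for the prefix, then slice it off.
    path = "/param/token"
    if path.startswith("/param"):
        path = path[len("/param"):]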

This is tested against Python 3.6.15 (3.6 went EOL at the end of
2021). I'm working on getting Rocky 8 installed locally to test
against. If it's decided we want to support downwards to 3.5, I will
test there too (but I hope we don't; see parent thread).

Thanks,
--Jacob


0001-oauth-Support-Python-3.6-in-tests.patch
Description: Binary data


Re: What's our minimum supported Python version?

2025-04-22 Thread Tom Lane
Jacob Champion  writes:
> On Tue, Apr 22, 2025 at 12:09 PM Renan Alves Fonseca
>  wrote:
>> The oldest non-EOL version is 3.9 right now. My suggestion is to follow
>> the official supported releases.

> I think this policy is too aggressive. Many operating systems we
> support are going to ship Python versions past their EOL date (and
> keep them supported for a long while with security patches).

Yeah, that.  The distros that are still shipping older Pythons
are LTS distros where part of the deal is that the vendor
is supposed to back-patch security fixes, even if the upstream
has moved on.  Maybe a particular vendor is falling down on that
job, but it's not for us to rate their performance.  So I don't
buy that "but security!" is a good argument here.

(Full disclosure: I worked for Red Hat for more than a decade,
and still preferentially use their distros.  So I've pretty well
bought into that model.)

regards, tom lane




Re: [PATCH] Support older Pythons in oauth_server.py

2025-04-22 Thread Jelte Fennema-Nio
On Tue, 22 Apr 2025 at 21:26, Jacob Champion
 wrote:
> - str.removeprefix/suffix() (3.9)

The way you replaced this does not have the same behaviour when the
prefix/suffix is not part of the string. removeprefix()/removesuffix()
will not remove any characters in that case, but your code will always
remove as many characters as the prefix/suffix is long. Maybe your
other checks and the definition of the OAuth spec ensure that these
prefixes/suffixes are present when you remove them, but the code is
not the same in the general case (you'd need to add a
startswith()/endswith() check before removing the characters).

+# Strip off the magic path segment. (The more readable
+# str.removeprefix()/removesuffix() aren't available until Py3.9.)
 if self._alt_issuer:
 # The /alternate issuer uses IETF-style .well-known URIs.
 if self.path.startswith("/.well-known/"):
-self.path = self.path.removesuffix("/alternate")
+self.path = self.path[: -len("/alternate")]
 else:
-self.path = self.path.removeprefix("/alternate")
+self.path = self.path[len("/alternate") :]
 elif self._parameterized:
-self.path = self.path.removeprefix("/param")
+self.path = self.path[len("/param") :]
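
To make the difference concrete (a standalone illustration, not code from
the patch; the removeprefix() lines need Python 3.9+):

    # removeprefix() only strips the prefix when it is actually present.
    assert "/param/token".removeprefix("/param") == "/token"
    assert "/token".removeprefix("/param") == "/token"   # unchanged

    # Unconditional slicing always drops len(prefix) characters.
    assert "/param/token"[len("/param"):] == "/token"
    assert "/token"[len("/param"):] == ""                # silently truncated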




Re: Enable data checksums by default

2025-04-22 Thread Tomas Vondra
On 10/16/24 08:54, Peter Eisentraut wrote:
> On 14.10.24 11:28, Peter Eisentraut wrote:
>> On 03.10.24 23:13, Nathan Bossart wrote:
>>> On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:
 I have committed 0001 (the new option) and 0004 (the docs tweak).  I
 think
 there is consensus for the rest, too, but I'll leave it for a few
 more days
 to think about.  I guess the test failure has to be addressed.
>>>
>>> Here is a rebased patch with the test fix (for cfbot).  I have made no
>>> other changes.
>>
>> I have committed the test changes (patch 0002).  (I renamed the option
>> to no_data_checksums to keep the wording consistent with the initdb
>> option.)
>>
>> Right now, with checksums off by default, this doesn't do much, but
>> you can test this like
>>
>>  PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...
>>
>> and everything will pass.  To make that work, I had to adjust the
>> order of how the initdb options are assembled in Cluster.pm a bit.
>>
>> I will work on the patch that flips the default next.
> 
> The patch that flips the default has been committed.
> 
> I also started a PG18 open items page and made a note that we follow up
> on the upgrade experience, as was discussed in this thread.
> 
> https://wiki.postgresql.org/wiki/PostgreSQL_18_Open_Items
> 

Regarding the open item, can someone explain what exactly we are
planning to evaluate mid-beta?

We went through the open items at the RMT meeting today, and my
impression was that the questions are mostly about the performance of
having checksums by default, but now I realize the thread talks about
the "upgrade experience", which seems fairly broad.

So, what kind of data do we expect to gather in order to evaluate this?
Who's expected to collect and evaluate it?

regards

-- 
Tomas Vondra





Re: What's our minimum supported Python version?

2025-04-22 Thread Jacob Champion
On Tue, Apr 22, 2025 at 3:04 PM Tom Lane  wrote:
> The fact that Jacob was able to fix
> oauth_server.py without (it seemed like) very much work seems to
> bear out that opinion.

I think my ability to do it should not be used as evidence of "ease".

I knew where to download 3.6, that I should build it with Clang to
avoid GCC segfaults, that I should set the rpaths appropriately, and
to wrap it all in a virtualenv for Meson to find. And I had to kick
Meson to get it to clear its cached Python information, or else the
contrib modules refused to build...

I suspect the only way most developers are going to want to test this
is by running EL8.

> This calculus might change if we start to rely more heavily on
> Python for build and test infrastructure than we do today.  But
> I think we can have that conversation when it happens.

As long as the need to backport to PG18 doesn't freeze that
conversation in place, I suppose.

> Anyway, what I propose that we do for now is replace the
> installation.sgml text
>
> The minimum required version is Python 3.2.
>
> with
>
> The minimum supported version is Python 3.6.8.
>
> where I use "supported" advisedly, as in "it might work with
> something older, but we don't promise to fix it if not".

No objection to that as a first step. I'm still hopeful that Devrim
has some evidence in favor of bumping to 3.8 or 3.9. :)

Thanks,
--Jacob




Re: Allow database owners to CREATE EVENT TRIGGER

2025-04-22 Thread Steve Chavez
>  alter event trigger command which doesn’t need to be exercised here

That part does need to be tested: I modified
`AlterEventTriggerOwner_internal` to allow altering owners to regular
users. Before, it was restricted to superusers.

> Actually, leave the other member around, but not granted ownership, and
> have both create tables, to demonstrate that a non-superuser and non-owner
> is unaffected by the trigger.

I've updated the tests accordingly. Please let me know if that's what you
had in mind.

Best regards,
Steve Chavez

On Sun, 20 Apr 2025 at 23:13, David G. Johnston 
wrote:

> On Sunday, April 20, 2025, Steve Chavez  wrote:
>
>> > Also, this looks unconventional…
>> > EventTriggerCacheItem *item = (EventTriggerCacheItem*) lfirst_oid(lc);
>>
>> Just noticed the mistake there, I would have expected a compilation
>> error. New patch attached with the following change:
>>
>>   EventTriggerCacheItem *item = lfirst(lc);
>>
>> On Sun, 20 Apr 2025 at 22:55, Steve Chavez  wrote:
>>
>>> Sorry, attached the output file.
>>>
>>>
> You can remove role member_1 and trigger..1 and “create table foo” from
> the nosuper script without any loss of test coverage.  Or member2 trigger2
> table_bar along with the alter event trigger command, which doesn’t need to
> be exercised here.  Ownership is all that matters, whether it comes about
> directly or via ALTER.
>
> Actually, leave the other member around, but not granted ownership, and
> have both create tables, to demonstrate that a non-superuser and non-owner
> is unaffected by the trigger.
>
> David J.
>
>
From e9a73cda8145a04b8375d21e37a6e713af1bed34 Mon Sep 17 00:00:00 2001
From: steve-chavez 
Date: Sun, 20 Apr 2025 19:46:00 -0500
Subject: [PATCH v2] Allow regular users to create event triggers

---
 src/backend/commands/event_trigger.c  | 33 +++--
 src/backend/utils/cache/evtcache.c|  1 +
 src/include/utils/evtcache.h  |  1 +
 src/test/regress/expected/event_trigger.out   | 13 +---
 .../expected/event_trigger_nosuper.out| 71 +++
 src/test/regress/parallel_schedule|  4 ++
 src/test/regress/sql/event_trigger.sql| 11 +--
 .../regress/sql/event_trigger_nosuper.sql | 59 +++
 8 files changed, 147 insertions(+), 46 deletions(-)
 create mode 100644 src/test/regress/expected/event_trigger_nosuper.out
 create mode 100644 src/test/regress/sql/event_trigger_nosuper.sql

diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index edc2c988e29..7ae0636f7c2 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -126,18 +126,6 @@ CreateEventTrigger(CreateEventTrigStmt *stmt)
 	ListCell   *lc;
 	List	   *tags = NULL;
 
-	/*
-	 * It would be nice to allow database owners or even regular users to do
-	 * this, but there are obvious privilege escalation risks which would have
-	 * to somehow be plugged first.
-	 */
-	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("permission denied to create event trigger \"%s\"",
-		stmt->trigname),
- errhint("Must be superuser to create an event trigger.")));
-
 	/* Validate event name. */
 	if (strcmp(stmt->eventname, "ddl_command_start") != 0 &&
 		strcmp(stmt->eventname, "ddl_command_end") != 0 &&
@@ -545,14 +533,6 @@ AlterEventTriggerOwner_internal(Relation rel, HeapTuple tup, Oid newOwnerId)
 		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_EVENT_TRIGGER,
 	   NameStr(form->evtname));
 
-	/* New owner must be a superuser */
-	if (!superuser_arg(newOwnerId))
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("permission denied to change owner of event trigger \"%s\"",
-		NameStr(form->evtname)),
- errhint("The owner of an event trigger must be a superuser.")));
-
 	form->evtowner = newOwnerId;
 	CatalogTupleUpdate(rel, &tup->t_self, tup);
 
@@ -698,7 +678,7 @@ EventTriggerCommonSetup(Node *parsetree,
 		if (unfiltered || filter_event_trigger(tag, item))
 		{
 			/* We must plan to fire this trigger. */
-			runlist = lappend_oid(runlist, item->fnoid);
+			runlist = lappend(runlist, item);
 		}
 	}
 
@@ -1084,11 +1064,16 @@ EventTriggerInvoke(List *fn_oid_list, EventTriggerData *trigdata)
 	foreach(lc, fn_oid_list)
 	{
 		LOCAL_FCINFO(fcinfo, 0);
-		Oid			fnoid = lfirst_oid(lc);
+		EventTriggerCacheItem *item = lfirst(lc);
 		FmgrInfo	flinfo;
 		PgStat_FunctionCallUsage fcusage;
+		Oid current_user = GetUserId();
+
+		if (!is_member_of_role_nosuper(current_user, item->owneroid)) {
+			continue;
+		}
 
-		elog(DEBUG1, "EventTriggerInvoke %u", fnoid);
+		elog(DEBUG1, "EventTriggerInvoke %u", item->fnoid);
 
 		/*
 		 * We want each event trigger to be able to see the results of the
@@ -1102,7 +1087,7 @@ EventTriggerInvoke(List *fn_oid_list, EventTriggerData *trigdata)
 			CommandCounterIncrement();
 
 		/* Look up the function */
-		fmgr_info(fnoid, &flinfo);
+		fmgr_info(item->fnoid, &flinfo);
 
 		
