Trigger violates foreign key constraint

2023-10-02 Thread Laurenz Albe
CREATE TABLE parent (id integer PRIMARY KEY);

CREATE TABLE child (id integer REFERENCES parent ON DELETE CASCADE);

CREATE FUNCTION silly() RETURNS trigger LANGUAGE plpgsql
AS 'BEGIN RETURN NULL; END;';

CREATE TRIGGER silly BEFORE DELETE ON child
FOR EACH ROW EXECUTE FUNCTION silly();

INSERT INTO parent VALUES (1);

INSERT INTO child VALUES (1);

DELETE FROM parent WHERE id = 1;

TABLE child;
 id 
----
  1
(1 row)

The trigger function cancels the cascaded delete on "child", and we are
left with a row in "child" that references no row in "parent".
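
If you want to check an existing database for rows orphaned this way,
an anti-join makes them visible (a sketch against the tables above;
NULL keys are excluded because they don't violate the constraint):

SELECT c.*
  FROM child c
  LEFT JOIN parent p ON p.id = c.id
 WHERE c.id IS NOT NULL
   AND p.id IS NULL;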

Yours,
Laurenz Albe




Clean up some pg_dump tests

2023-10-02 Thread Peter Eisentraut
Following [0], I did a broader analysis of some dubious or nonsensical 
like/unlike combinations in the pg_dump tests.


This includes

1) Remove useless entries from "unlike" lists.  Runs that are not
   listed in "like" don't need to be excluded in "unlike".

2) Ensure there is always a "like" list, even if it is empty.  This
   makes the test more self-documenting.

3) Use predefined lists such as %full_runs where appropriate, instead
   of listing all runs separately.

I also added code that checks 1 and 2 automatically and issues a message
for violations.  (This is currently done with "diag".  We could also 
make it an error.)
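
For readers who don't have the file in front of them, the check amounts
to something like this (a sketch only; the real code in 002_pg_dump.pl
iterates over its %tests hash and may differ in detail):

	foreach my $test (sort keys %tests)
	{
		# rule 2: every test should carry a "like" list, even an empty one
		diag("$test: missing like list")
		  unless defined $tests{$test}->{like};

		# rule 1: excluding a run that is not in "like" is useless
		foreach my $run (sort keys %{ $tests{$test}->{unlike} })
		{
			diag("$test: useless unlike entry: $run")
			  unless $tests{$test}->{like}{$run};
		}
	}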


The results are in the attached patch.

[0]: 
https://www.postgresql.org/message-id/3ddf79f2-8b7b-a093-11d2-5c739bc64...@eisentraut.org

From bf13888e63431c513e440982a72a76d1d7193777 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 2 Oct 2023 08:58:23 +0200
Subject: [PATCH] Clean up some pg_dump tests

1) Remove useless entries from "unlike" lists.  Runs that are not
   listed in "like" don't need to be excluded in "unlike".

2) Ensure there is always a "like" list, even if it is empty.  This
   makes the test more self-documenting.

3) Use predefined lists such as %full_runs where appropriate, instead
   of listing all runs separately.

Also add code that checks 1 and 2 automatically and issues a message
for violations.
---
 src/bin/pg_dump/t/002_pg_dump.pl | 72 +++-
 1 file changed, 15 insertions(+), 57 deletions(-)

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 55e98ec8e3..8a93910269 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -818,7 +818,7 @@
regexp => qr/^\QALTER COLLATION public.test0 OWNER TO \E.+;/m,
collation => 1,
like => { %full_runs, section_pre_data => 1, },
-   unlike => { %dump_test_schema_runs, no_owner => 1, },
+   unlike => { no_owner => 1, },
},
 
'ALTER FOREIGN DATA WRAPPER dummy OWNER TO' => {
@@ -977,7 +977,7 @@
create_sql =>
  'ALTER SCHEMA public OWNER TO "regress_quoted  \"" role";',
regexp => qr/^(GRANT|REVOKE)/m,
-   unlike => { defaults_public_owner => 1 },
+   like => {},
},
 
'ALTER SEQUENCE test_table_col1_seq' => {
@@ -1285,9 +1285,7 @@
  { %full_runs, %dump_test_schema_runs, section_pre_data => 1, 
},
unlike => {
exclude_dump_test_schema => 1,
-   only_dump_test_table => 1,
no_owner => 1,
-   role => 1,
only_dump_measurement => 1,
},
},
@@ -1351,7 +1349,6 @@
binary_upgrade => 1,
no_large_objects => 1,
schema_only => 1,
-   section_pre_data => 1,
},
},
 
@@ -3210,7 +3207,6 @@
binary_upgrade => 1,
exclude_dump_test_schema => 1,
schema_only => 1,
-   only_dump_measurement => 1,
},
},
 
@@ -3457,7 +3453,6 @@
'Disabled trigger on partition is not created' => {
regexp => qr/CREATE TRIGGER test_trigger.*ON 
dump_test_second_schema/,
like => {},
-   unlike => { %full_runs, %dump_test_schema_runs },
},
 
# Triggers on partitions should not be dropped individually
@@ -3834,35 +3829,12 @@
\QCREATE INDEX measurement_city_id_logdate_idx ON ONLY 
dump_test.measurement USING\E
/xm,
like => {
-   binary_upgrade => 1,
-   clean => 1,
-   clean_if_exists => 1,
-   compression => 1,
-   createdb => 1,
-   defaults => 1,
-   exclude_test_table => 1,
-   exclude_test_table_data => 1,
-   no_toast_compression => 1,
-   no_large_objects => 1,
-   no_privs => 1,
-   no_owner => 1,
-   no_table_access_method => 1,
-   only_dump_test_schema => 1,
-   pg_dumpall_dbprivs => 1,
-   pg_dumpall_exclude => 1,
-   schema_only => 1,
+   %full_runs,
+   %dump_test_schema_runs,
section_post_data => 1,
-   test_schema_plus_large_objects => 1,
-   only_dump_measurement => 1,
-   exclude_measurement_data => 1,
},
unlike => {
exclude_dump_test_schema => 1,
-  

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-02 Thread Drouvot, Bertrand

Hi,

On 9/29/23 8:19 AM, Michael Paquier wrote:

On Thu, Sep 28, 2023 at 02:37:02PM +0200, Drouvot, Bertrand wrote:

This patch allows the role provided in BackgroundWorkerInitializeConnection()
and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.


Interesting.  Yes, there would be use cases for that, I suppose.


Thanks for looking at it!




+uint32 flags,
 char *out_dbname)
  {


This may be better suited to a bits32 for the flags.


Done that way in v2 attached.




+# Ask the background workers to connect with this role with the flag in place.
+$node->append_conf(
+'postgresql.conf', q{
+worker_spi.role = 'nologrole'
+worker_spi.bypass_login_check = true
+});
+$node->restart;
+
+# An error message should not be issued.
+ok( !$node->log_contains(
+"role \"nologrole\" is not permitted to log in", $log_start),
+"nologrole allowed to connect if BGWORKER_BYPASS_ROLELOGINCHECK is set");
+
  done_testing();


It would be cheaper to use a dynamic background worker for such tests.
Something that I've been tempted to do in this module is to extend the
amount of data that's given to bgw_main_arg when launching a worker
with worker_spi_launch().  How about extending the SQL function so
that it is possible to give as input a role name (or a regrole), a
database name (or a database OID) and a text[] for the flags?  This
would require a bit more refactoring, but it would be beneficial to
show how one can pass down a full structure from the registration to
the main() routine.  On top of that, it would make the addition of the
new GUCs worker_spi.bypass_login_check and worker_spi.role unnecessary.
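
As an illustration only, the extended call might look something like
this (a hypothetical signature; the function takes no such arguments
today):

    SELECT worker_spi_launch(1, 'postgres', 'nologrole',
                             '{BYPASS_ROLELOGINCHECK}'::text[]);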



I think it would make sense to have more flexibility in the worker_spi
module. That could be done in a dedicated patch, though; it makes more
sense to have the current patch "focusing" on this new flag (while adding
a test about it without too much refactoring). What about doing the
worker_spi module refactoring as a follow-up of this one?


+# return the size of logfile of $node in bytes
+sub get_log_size
+{
+my ($node) = @_;
+
+return (stat $node->logfile)[7];
+}


Just use -s here.


Done in v2 attached.


See other tests that want to check the contents of
the logs from an offset.



Oh right. Would it be worth modifying 019_replslot_limit.pl, 002_corrupted.pl
and 001_pg_controldata.pl in a separate patch for consistency? (They are using
"(stat $node->logfile)[7]" or "(stat($pg_control))[7]".)


- * Allow bypassing datallowconn restrictions when connecting to database
+ * Allow bypassing datallowconn restrictions and login check when connecting
+ * to database
   */
-#define BGWORKER_BYPASS_ALLOWCONN 1
+#define BGWORKER_BYPASS_ALLOWCONN 0x0001
+#define BGWORKER_BYPASS_ROLELOGINCHECK 0x0002


The structure of the patch is inconsistent.  These flags are in
bgworker.h, but they are used also by InitPostgres().  Perhaps a
second boolean flag would be OK rather than a second set of flags for
InitPostgres() mapping with the bgworker set.


Initially I did not want to add an extra parameter to InitPostgres(), but
I agree it would make more sense. So done that way in v2.
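
For reference, a worker combining both bypass flags would then do
something like this (a sketch; "mydb" and "nologrole" are placeholder
names):

	/* connect to mydb as nologrole, skipping datallowconn and login checks */
	BackgroundWorkerInitializeConnection("mydb", "nologrole",
										 BGWORKER_BYPASS_ALLOWCONN |
										 BGWORKER_BYPASS_ROLELOGINCHECK);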

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 59de41f3c305b009babfa6d05c054d8ae9e4d730 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Wed, 20 Sep 2023 08:28:59 +
Subject: [PATCH v2] Allow background workers to bypass login check

This adds a new flag value (BGWORKER_BYPASS_ROLELOGINCHECK) to the flags being
used in BackgroundWorkerInitializeConnection() and 
BackgroundWorkerInitializeConnectionByOid().

This flag allows the background workers to bypass the login check done in 
InitializeSessionUserId().
---
 doc/src/sgml/bgworker.sgml|  3 +-
 src/backend/bootstrap/bootstrap.c |  2 +-
 src/backend/postmaster/autovacuum.c   |  5 +--
 src/backend/postmaster/postmaster.c   |  6 ++-
 src/backend/tcop/postgres.c   |  1 +
 src/backend/utils/init/miscinit.c |  4 +-
 src/backend/utils/init/postinit.c |  5 ++-
 src/include/miscadmin.h   |  4 +-
 src/include/postmaster/bgworker.h | 10 +++--
 .../modules/worker_spi/t/001_worker_spi.pl| 40 +++
 src/test/modules/worker_spi/worker_spi.c  | 25 +++-
 11 files changed, 88 insertions(+), 17 deletions(-)
   7.6% doc/src/sgml/
  11.4% src/backend/postmaster/
  12.6% src/backend/utils/init/
  11.8% src/include/postmaster/
  29.2% src/test/modules/worker_spi/t/
  23.2% src/test/modules/worker_spi/
   4.0% src/

diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index 9ad1146ba0..7335c02ed8 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -200,7 +200,8 @@ typedef struct BackgroundWorker
InvalidOid, the process will run as the superuser 

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-02 Thread Michael Paquier
On Mon, Oct 02, 2023 at 10:01:04AM +0200, Drouvot, Bertrand wrote:
> I think it would make sense to have more flexibility in the worker_spi
> module. That could be done in a dedicated patch, though; it makes more
> sense to have the current patch "focusing" on this new flag (while adding
> a test about it without too much refactoring). What about doing the
> worker_spi module refactoring as a follow-up of this one?

I would do that first, as that's what I usually do, but I also see
your point that this is not mandatory.  If you want, I could give it a
shot tomorrow to see where it leads.

> Oh right. Would it be worth modifying 019_replslot_limit.pl, 002_corrupted.pl
> and 001_pg_controldata.pl in a separate patch for consistency? (They are using
> "(stat $node->logfile)[7]" or "(stat($pg_control))[7]".)

Indeed, that's strange.  Let's remove the dependency on stat here.
The other solution is slightly more elegant IMO, as we don't rely on
the position of the result from stat().
--
Michael




Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-10-02 Thread Jakub Wartak
On Fri, Sep 29, 2023 at 4:00 AM Michael Paquier  wrote:
>
> On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote:
> > v3 attached. I had a problem coming up with a better error message,
> > so suggestions are welcome.  The cast still needs to be present as per
> > the above suggestion, as 3GB is still a valid buf size and was still
> > causing integer overflow. We just throw an error on >= 4GB with v3.
>
> +/* Safety net to prevent requesting huge memory by each query to pg_stat_activity */
> +#define PGSTAT_MAX_ACTIVITY_BUF_SIZE 4 * 1024 * 1024 * 1024L
>
> -	size = add_size(size,
> -					mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
> +	pgstat_track_size = mul_size(pgstat_track_activity_query_size,
> +								 NumBackendStatSlots);
> +	if (pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
> +		elog(FATAL, "too big Backend Activity Buffer allocation of %zu bytes", pgstat_track_size);
> +	size = add_size(size, pgstat_track_size);
>
> That should be enough to put in a hardcoded 4GB safety limit, while
> mul_size() detects it at a higher range.  Note, however, that elog()
> is only used for internal errors that users should never face, but
> this one can come from a misconfiguration.  This would be better as an
> ereport(), with ERRCODE_CONFIG_FILE_ERROR as errcode, I guess.
>
> "Backend Activity Buffer" is the name of the internal struct.  Sure,
> it shows up on the system views for shmem, but I wouldn't use this
> term in a user-facing error message.  Perhaps something like "size
> requested for backend status is out of range" would be cleaner.  Other
> ideas are welcome.
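
A minimal sketch of the suggested shape (errcode and message wording
taken from the review paragraphs above):

	if (pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
		ereport(FATAL,
				(errcode(ERRCODE_CONFIG_FILE_ERROR),
				 errmsg("size requested for backend status is out of range")));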

Hi Michael,

I've attached v4 that covers your suggestions.

-J.


v4-0001-Introduce-memory-limit-for-BackendActivityBuffer-.patch
Description: Binary data


Re: TAP tests for psql \g piped into program

2023-10-02 Thread Heikki Linnakangas

On 29/03/2023 21:39, Daniel Verite wrote:

Peter Eisentraut wrote:


So for your patch, I would just do the path adjustment ad hoc in-line.
It's just one additional line.


Here's the patch updated that way.


Committed, thanks!

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Pre-proposal: unicode normalized text

2023-10-02 Thread Peter Eisentraut

On 13.09.23 00:47, Jeff Davis wrote:

The idea is to have a new data type, say "UTEXT", that normalizes the
input so that it can have an improved notion of equality while still
using memcmp().


I think a new type like this would obviously be suboptimal because it's 
nonstandard and most people wouldn't use it.


I think a better direction here would be to work toward making 
nondeterministic collations usable on the global/database level and then 
encouraging users to use those.
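
As a sketch of what users would write (this already works per column
today; the database level is the missing piece, and the ICU locale
string shown is one documented way to get case-insensitive comparisons):

CREATE COLLATION case_insensitive
    (provider = icu, locale = 'und-u-ks-level2', deterministic = false);

CREATE TABLE users (name text COLLATE case_insensitive);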


It's also not clear which way the performance tradeoffs would fall.

Nondeterministic collations are obviously going to be slower, but by how 
much?  People have accepted moving from C locale to "real" locales 
because they needed those semantics.  Would it be any worse moving from 
real locales to "even realer" locales?


On the other hand, a utext type would either require a large set of its 
own functions and operators, or you would have to inject text-to-utext 
casts in places, which would also introduce overhead.





Re: Making the subquery alias optional in the FROM clause

2023-10-02 Thread Dean Rasheed
On Mon, 2 Oct 2023 at 01:01, Tom Lane  wrote:
>
> Erwin Brandstetter  writes:
> > On Mon, 2 Oct 2023 at 00:33, Dean Rasheed  wrote:
> >> The only place that exposes the eref's made-up relation name is the
> >> existing query deparsing code in ruleutils.c, which uniquifies it and
> >> generates SQL spec-compliant output. For example:
>
> > I ran into one other place: error messages.
> > SELECT unnamed_subquery.a
> > FROM (SELECT 1 AS a)
> > ERROR:  invalid reference to FROM-clause entry for table "unnamed_subquery"
> > DETAIL:  There is an entry for table "unnamed_subquery", but it cannot be
> > referenced from this part of the query.
>
> Yeah, that's exposing more of the implementation than we really want.
>

Note that this isn't a new issue, specific to unnamed subqueries. The
same thing happens for unnamed joins:

create table foo(a int);
create table bar(a int);
select unnamed_join.a from foo join bar using (a);

ERROR:  invalid reference to FROM-clause entry for table "unnamed_join"
LINE 1: select unnamed_join.a from foo join bar using (a);
   ^
DETAIL:  There is an entry for table "unnamed_join", but it cannot be
referenced from this part of the query.


And there's a similar problem with VALUES RTEs:

insert into foo values (1),(2) returning "*VALUES*".a;

ERROR:  invalid reference to FROM-clause entry for table "*VALUES*"
LINE 1: insert into foo values (1),(2) returning "*VALUES*".a;
 ^
DETAIL:  There is an entry for table "*VALUES*", but it cannot be
referenced from this part of the query.

> I'm inclined to think we should avoid letting "unnamed_subquery"
> appear in the parse tree, too.  It might not be a good idea to
> try to leave the eref field null, but could we set it to an
> empty string instead, that is
>
> -   eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL);
> +   eref = alias ? copyObject(alias) : makeAlias("", NIL);
>
> and then let ruleutils replace that with "unnamed_subquery"?

Hmm, I think that there would be other side-effects if we did that --
at least doing it for VALUES RTEs would also require additional
changes to retain current EXPLAIN output. I think perhaps it would be
better to try for a more targeted fix of the parser error reporting.

In searchRangeTableForRel() we try to find any RTE that could possibly
match the RangeVar, but certain kinds of RTE don't naturally have
names, and if they also haven't been given aliases, then they can't
possibly match anywhere in the query (and thus it's misleading to
report that they can't be referred to from specific places).

So I think perhaps it's better to just have searchRangeTableForRel()
exclude these kinds of RTE, if they haven't been given an alias.

Regards,
Dean
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
new file mode 100644
index 864ea9b..0afe177
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -395,6 +395,20 @@ searchRangeTableForRel(ParseState *pstat
 		{
 			RangeTblEntry *rte = (RangeTblEntry *) lfirst(l);
 
+			/*
+			 * Ignore RTEs that do not automatically take their names from
+			 * database objects, and have not been supplied with aliases
+			 * (e.g., unnamed joins).  Such RTEs cannot possibly match the
+			 * RangeVar, and cannot be referenced from anywhere in the query.
+			 * Thus, it would be misleading to complain that they can't be
+			 * referenced from particular parts of the query.
+			 */
+			if (!rte->alias &&
+				(rte->rtekind == RTE_SUBQUERY ||
+				 rte->rtekind == RTE_JOIN ||
+				 rte->rtekind == RTE_VALUES))
+				continue;
+
 			if (rte->rtekind == RTE_RELATION &&
 OidIsValid(relId) &&
 rte->relid == relId)
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
new file mode 100644
index 9b8638f..2bbf3bf
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2895,6 +2895,26 @@ select bar.*, unnamed_join.* from
 ERROR:  FOR UPDATE cannot be applied to a join
 LINE 3:   for update of bar;
 ^
+-- Test error reporting of references to internal aliases
+select unnamed_join.* from
+  t1 join t2 on (t1.a = t2.a);
+ERROR:  missing FROM-clause entry for table "unnamed_join"
+LINE 1: select unnamed_join.* from
+   ^
+select unnamed_join.* from
+  (select * from t1 join t2 using (a)) ss join t3 on (t3.x = ss.a);
+ERROR:  missing FROM-clause entry for table "unnamed_join"
+LINE 1: select unnamed_join.* from
+   ^
+select unnamed_subquery.* from
+  (select * from t1);
+ERROR:  missing FROM-clause entry for table "unnamed_subquery"
+LINE 1: select unnamed_subquery.* from
+   ^
+insert into t1 values (1, 10), (2, 20) returning "*VALUES*".*;
+ERROR:  missing FROM-clause entry for table "*VALUES*"
+LINE 1: insert into t1 values (1, 10), (2, 20) returning "*VALUES*"
+ ^

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-02 Thread Drouvot, Bertrand

Hi,

On 10/2/23 10:17 AM, Michael Paquier wrote:

On Mon, Oct 02, 2023 at 10:01:04AM +0200, Drouvot, Bertrand wrote:

I think it would make sense to have more flexibility in the worker_spi
module. That could be done in a dedicated patch, though; it makes more
sense to have the current patch "focusing" on this new flag (while adding
a test about it without too much refactoring). What about doing the
worker_spi module refactoring as a follow-up of this one?


I would do that first, as that's what I usually do,


The reason I was thinking of not doing that first is that there is no real
use case in the current worker_spi module test.


but I also see
your point that this is not mandatory.  If you want, I could give it a
shot tomorrow to see where it leads.


Oh yeah that would be great (and maybe you already see a use case in the
current test). Thanks!


Oh right. Would it be worth modifying 019_replslot_limit.pl, 002_corrupted.pl
and 001_pg_controldata.pl in a separate patch for consistency? (They are using
"(stat $node->logfile)[7]" or "(stat($pg_control))[7]".)


Indeed, that's strange.  Let's remove the dependency on stat here.
The other solution is slightly more elegant IMO, as we don't rely on
the position of the result from stat().


Agreed, I will propose a new patch for this.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Replace (stat())[7] in TAP tests with -s

2023-10-02 Thread Drouvot, Bertrand

Hi hackers,

Please find attached a tiny patch to $SUBJECT.

It:

 - provides more consistency in the way we get file sizes in TAP tests
 - seems more elegant than relying on a hardcoded result position

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From a227eb957ca32dcb1b53a7786992732b1d02c130 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Mon, 2 Oct 2023 09:01:20 +
Subject: [PATCH v1] Be consistent in TAP tests for the way we get files size

-s is mainly used to get file sizes, but some places make use of "stat"
and its 7th result position.

Replace those with -s instead, as it's more elegant than relying on this
hardcoded 7th position.
---
 src/bin/pg_controldata/t/001_pg_controldata.pl |  2 +-
 src/bin/pg_resetwal/t/002_corrupted.pl |  2 +-
 src/test/recovery/t/019_replslot_limit.pl  | 14 +++---
 3 files changed, 5 insertions(+), 13 deletions(-)
  12.5% src/bin/pg_controldata/t/
  12.5% src/bin/pg_resetwal/t/
  74.8% src/test/recovery/t/

diff --git a/src/bin/pg_controldata/t/001_pg_controldata.pl 
b/src/bin/pg_controldata/t/001_pg_controldata.pl
index 4918ea8944..9db8015d0b 100644
--- a/src/bin/pg_controldata/t/001_pg_controldata.pl
+++ b/src/bin/pg_controldata/t/001_pg_controldata.pl
@@ -24,7 +24,7 @@ command_like([ 'pg_controldata', $node->data_dir ],
 # check with a corrupted pg_control
 
 my $pg_control = $node->data_dir . '/global/pg_control';
-my $size = (stat($pg_control))[7];
+my $size = -s $pg_control;
 
 open my $fh, '>', $pg_control or BAIL_OUT($!);
 binmode $fh;
diff --git a/src/bin/pg_resetwal/t/002_corrupted.pl 
b/src/bin/pg_resetwal/t/002_corrupted.pl
index 6d19a1efd5..b3a37728a4 100644
--- a/src/bin/pg_resetwal/t/002_corrupted.pl
+++ b/src/bin/pg_resetwal/t/002_corrupted.pl
@@ -14,7 +14,7 @@ my $node = PostgreSQL::Test::Cluster->new('main');
 $node->init;
 
 my $pg_control = $node->data_dir . '/global/pg_control';
-my $size = (stat($pg_control))[7];
+my $size = -s $pg_control;
 
 # Read out the head of the file to get PG_CONTROL_VERSION in
 # particular.
diff --git a/src/test/recovery/t/019_replslot_limit.pl 
b/src/test/recovery/t/019_replslot_limit.pl
index 33e50ad933..7d94f15778 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -173,7 +173,7 @@ $node_primary->safe_psql('postgres',
"ALTER SYSTEM SET max_wal_size='40MB'; SELECT pg_reload_conf()");
 
 # Advance WAL again. The slot loses the oldest segment by the next checkpoint
-my $logstart = get_log_size($node_primary);
+my $logstart = -s $node_primary->logfile;
 advance_wal($node_primary, 7);
 
 # Now create another checkpoint and wait until the WARNING is issued
@@ -229,7 +229,7 @@ $node_primary->safe_psql('postgres',
 is($oldestseg, $redoseg, "check that segments have been removed");
 
 # The standby no longer can connect to the primary
-$logstart = get_log_size($node_standby);
+$logstart = -s $node_standby->logfile;
 $node_standby->start;
 
 my $failed = 0;
@@ -368,7 +368,7 @@ my $receiverpid = $node_standby3->safe_psql('postgres',
"SELECT pid FROM pg_stat_activity WHERE backend_type = 'walreceiver'");
 like($receiverpid, qr/^[0-9]+$/, "have walreceiver pid $receiverpid");
 
-$logstart = get_log_size($node_primary3);
+$logstart = -s $node_primary3->logfile;
 # freeze walsender and walreceiver. Slot will still be active, but walreceiver
 # won't get anything anymore.
 kill 'STOP', $senderpid, $receiverpid;
@@ -433,12 +433,4 @@ sub advance_wal
return;
 }
 
-# return the size of logfile of $node in bytes
-sub get_log_size
-{
-   my ($node) = @_;
-
-   return (stat $node->logfile)[7];
-}
-
 done_testing();
-- 
2.34.1



Re: wal recycling problem

2023-10-02 Thread Christoph Moench-Tegeder
Hi,

## Fabrice Chapuis (fabrice636...@gmail.com):

> on the other hand there are 2 slots for logical replication which display
> status extended. I don't understand why given that the confirmed_flush_lsn
> field that is up to date. The restart_lsn remains frozen, for what reason?

There you have it - "extended" means "holding wal". And as long as the
restart_lsn does not advance, checkpointer cannot free any wal beyond
that lsn. My first idea would be some long-running (or huge) transaction
which is in process (active or still being streamed). I'd recommend
looking into what the clients on these slots are doing.
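
To see how much WAL each slot is holding back, something like this
works (all standard columns of pg_replication_slots):

SELECT slot_name, slot_type, wal_status, restart_lsn, confirmed_flush_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
           AS retained_wal
  FROM pg_replication_slots;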

Regards,
Christoph

-- 
Spare Space




Re: Trigger violates foreign key constraint

2023-10-02 Thread Laurenz Albe
Perhaps it would be enough to run "RI_FKey_noaction_del" after
"RI_FKey_cascade_del", although that would impact the performance.

Yours,
Laurenz Albe




Re: Replace (stat())[7] in TAP tests with -s

2023-10-02 Thread Dagfinn Ilmari Mannsåker
"Drouvot, Bertrand"  writes:

> Hi hackers,
>
> Please find attached a tiny patch to $SUBJECT.
>
> It:
>
>  - provides more consistency to the way we get files size in TAP tests
>  - seems more elegant that relying on a hardcoded result position

I approve of removing use of the list form of stat, it's a horrible API.

If we weren't already using -s everywhere else, I would prefer
File::stat, which makes stat (in scalar context) return an object with
methods for the fields, so you'd do stat($file)->size.  It's included in
Perl core since 5.4, and we're already using it in several places for
other fields (mode and ino at least).
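
For reference, that style looks like this (not what the patch does,
since the codebase standardises on -s):

    use File::stat;

    my $st = stat($file) or die "stat failed: $!";
    printf "size=%d mtime=%d mode=%o\n", $st->size, $st->mtime, $st->mode;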

I see another use of stat array positions (for mtime) in
src/tools/msvc/Solution.pm, but that's on the chopping block, so not
much point in fixing.

- ilmari




Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-02 Thread Pavel Borisov
Hi!

On Mon, 2 Oct 2023 at 03:34, Peter Geoghegan  wrote:
>
> On Sun, Oct 1, 2023 at 11:46 AM Peter Eisentraut  wrote:
> > What is the status of this patch discussion now?  It had been set as
> > Ready for Committer for some months.  Do these recent discussions
> > invalidate that?  Does it need more discussion?
>
> I don't think that recent discussion invalidated anything. I meant to
> follow-up on investigating the extent to which anything could hold up
> OldestMXact without also holding up OldestXmin/removable cutoff, but
> that doesn't seem essential.
>
> This patch does indeed seem "ready for committer". John?
>
> --
> Peter Geoghegan

FWIW I think the patch is still in good shape and worth committing.

Regards,
Pavel Borisov
Supabase




Re: Remove ParallelReadyList and worker_spi_state from typedefs.list

2023-10-02 Thread vignesh C
On Sat, 30 Sept 2023 at 21:44, Tom Lane  wrote:
>
> vignesh C  writes:
> > Remove ParallelReadyList and worker_spi_state from typedefs.list;
> > these structures were removed by earlier commits:
> > ParallelReadyList was removed as part of
> > 9bfd44bbde4261181bf94738f3b041c629c65a7e, and worker_spi_state was
> > removed as part of af720b4c50a122647182f4a030bb0ea8f750fe2f.
> > Attached is a patch fixing this.
>
> I don't think we need to trouble with removing such entries by hand.
> I still anticipate updating typedefs.list from the buildfarm at least
> once per release cycle, and that will take care of cleaning out
> obsolete entries.

Thanks, that will take care of removal at that time.

Regards,
Vignesh




Re: Eager page freeze criteria clarification

2023-10-02 Thread Robert Haas
On Fri, Sep 29, 2023 at 8:50 PM Peter Geoghegan  wrote:
> While pgbench makes a fine stress-test, for the most part its workload
> is highly unrealistic. And yet we seem to think that it's just about
> the most important benchmark of all. If we're not willing to get over
> even small regressions in pgbench, I fear we'll never make significant
> progress in this area. It's at least partly a cultural problem IMV.

I think it's true that the fact that pgbench does what pgbench does
makes us think more about that workload than about some other, equally
plausible workload. It's the test we have, so we end up running it a
lot. If we had some other test, we'd run that one.

But I don't think I buy that it's "highly unrealistic." It is true
that pgbench with default options is limited only by the speed of the
database, while real update-heavy workloads are typically limited by
something external, like the number of users hitting the web site that
the database is backing. It's also true that real workloads tend to
involve some level of non-uniform access. But I'm not sure that either
of those things really matter that much in the end.

The problem I have with the external rate-limiting argument is that it
ignores hardware selection, architectural decision-making, and
workload growth. Sure, people are unlikely to stand up a database that
can do 10,000 transactions per second and hit it with a workload that
requires doing 20,000 transactions per second, because they're going
to find out in testing that it doesn't work. Then they will either buy
more hardware or rearchitect the system to reduce the required number
of transactions per second or give up on using PostgreSQL. So when
they do put it into production, it's unlikely to be overloaded on day
one. But that's just because all of the systems that would have been
overloaded on day one never make it to day one. They get killed off or
changed before they get there. So it isn't like higher maximum
throughput wouldn't have been beneficial. And over time, systems that
initially started out not running maxed out can and, in my experience,
fairly often do end up maxed out because once you've got the thing in
production, it's hard to change anything, and load often does grow
over time.

As for non-uniform access, that is real and does matter, but there are
certainly installations with tables where no rows survive long enough
to need freezing, either because the table is regularly emptied, or
just because the update load is high enough to hit all the rows fairly
quickly.

Maybe I'm misunderstanding your point here, in which case all of the
above may be irrelevant. But my feeling is that we can't simply ignore
cases where all/many rows are short-lived and say, well, those are
unrealistic, so let's just freeze everything super-aggressively and
that should be fine. I don't think that's OK. We can (and I think we
should) treat that situation as a special case rather than as the
typical case, but I think it would be a bad idea to dismiss it as a
case that we don't need to worry about at all.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Making the subquery alias optional in the FROM clause

2023-10-02 Thread Tom Lane
Dean Rasheed  writes:
> On Mon, 2 Oct 2023 at 01:01, Tom Lane  wrote:
>> Yeah, that's exposing more of the implementation than we really want.

> Note that this isn't a new issue, specific to unnamed subqueries. The
> same thing happens for unnamed joins:

True, and we've had few complaints about that.  Still, if we can
clean it up without too much effort, let's do so.

> So I think perhaps it's better to just have searchRangeTableForRel()
> exclude these kinds of RTE, if they haven't been given an alias.

Would we need a new flag in the ParseNamespaceItem data structure,
or will the existing data serve?  I see how to do this if we add
a "doesn't really have a name" flag, but it's not clear to me that
we can reliably identify them otherwise.  Maybe a test involving
the rtekind and whether the "alias" field is set would do, but
that way seems a bit ugly.

regards, tom lane




Implementing LRU cache for postgresql extension

2023-10-02 Thread Lakshmi Narayana Velayudam
Hi, I am trying to implement an LRU cache in a PostgreSQL extension.
Please find the details below. Looking forward to hearing from you.

I am trying to write an LRU cache in C for a PostgreSQL extension that
can hold any type of data item. Everything is fine except that I am
relying on the PostgreSQL hashmap, which itself allocates the memory
for each entry and also keeps that memory on a free list when an entry
is removed.

The struct I defined:
```c
typedef void *(*KeyConstructFunction)(void *data);
typedef void *(*EntryConstructFunction)(void *data);
typedef void (*DestroyEntryFunction)(void *data);

struct LRU_Cache
{
	dlist_head *cacheList;
	HTAB *cacheHashmap;

	KeyConstructFunction keyConstructFunction;
	EntryConstructFunction valueConstructFunction;
	// will be called when an entry is removed from the cache
	DestroyEntryFunction elemDestroyFunction;
};
```

So the problem is: since I need to maintain the access order, each entry embeds a dlist_node, which is handed to the dlist functions to add the entry to cacheList. But since I don't know the type of the data passed in, I can't use the dlist functions, because I don't know which type to cast to. Consider the function below.
```c
dlist_container(ENTRY_TYPE, node, dlist_head_node(cache->cacheList));
```
This could be solved by having the caller provide yet another function pointer, but I would then have to do the same for several other dlist functions too, so I decided that was not a good approach and avoided it. Since I don't know the entry type, I am unable to perform this operation, which is what hands me back the entire data item. I thought of using a standard struct like the one below.
```c
typedef struct Cache_Entry
{
	void *key;
	void *value;
	dlist_node node;
} Cache_Entry;
```
But the problem is that if I register the key size and entry size with the hashmap, they will be something like 8 bytes and 32 bytes respectively; because of that, the free list can only hold those 32 bytes, and I have to tell callers never to free the key and value they pass in.

It seems like I have painted myself into a corner. Is there a way to solve this design issue without giving up on genericity?
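
One observation, as a sketch: because `Cache_Entry` hides the payload
behind void pointers, the list code never needs the payload type at
all; `dlist_container()` only needs the wrapper type. For example:

```c
/* Fetch the most-recently-used value; the caller casts it back. */
static void *
lru_peek(struct LRU_Cache *cache)
{
	Cache_Entry *entry;

	if (dlist_is_empty(cache->cacheList))
		return NULL;

	/* the list links Cache_Entry nodes, so the payload type is irrelevant */
	entry = dlist_container(Cache_Entry, node,
							dlist_head_node(cache->cacheList));
	return entry->value;
}
```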

[PATCH] Fix memory leak in memoize for numeric key

2023-10-02 Thread Orlov Aleksej
Hello, all!

I found a query which consumes a lot of memory and triggers the OOM killer.
The memory leak occurs in the memoize node for a numeric key.
The PostgreSQL version is 14.9. The problem is very similar to
https://www.postgresql.org/message-id/17844-d2f6f9e75a622...@postgresql.org

I attached to the backend with a debugger and set a breakpoint in AllocSetAlloc 

(gdb) bt 10
#0  AllocSetAlloc (context=0x5c55086dc2f0, size=12) at aset.c:722
#1  0x5c5507d886e0 in palloc (size=size@entry=12) at mcxt.c:1082
#2  0x5c5507890bba in detoast_attr (attr=0x715d5daa04c9) at detoast.c:184
#3  0x5c5507d62375 in pg_detoast_datum (datum=<optimized out>) at fmgr.c:1725
#4  0x5c5507cc94ea in hash_numeric (fcinfo=<optimized out>) at numeric.c:2554
#5  0x5c5507d61570 in FunctionCall1Coll (flinfo=flinfo@entry=0x5c5508b93d00, collation=<optimized out>, arg1=<optimized out>) at fmgr.c:1138
#6  0x5c5507aadc16 in MemoizeHash_hash (key=0x0, tb=<optimized out>) at nodeMemoize.c:199
#7  0x5c5507aadf22 in memoize_insert (key=0x0, found=<optimized out>, tb=0x5c5508bb4760) at ../../../src/include/lib/simplehash.h:762
#8  cache_lookup (found=<optimized out>, mstate=0x5c5508b91418) at nodeMemoize.c:519
#9  ExecMemoize (pstate=0x5c5508b91418) at nodeMemoize.c:705

I was able to create a reproducible test case on a machine with the default
config and postgresql 14.9:

CREATE TABLE table1 (
id numeric(38) NOT NULL,
col1 text,
CONSTRAINT id2 PRIMARY KEY (id)
);
CREATE TABLE table2 (
id numeric(38) NOT NULL,
id_table1 numeric(38) NULL,
CONSTRAINT id1 PRIMARY KEY (id)
);
ALTER TABLE table2 ADD CONSTRAINT constr1 FOREIGN KEY (id_table1)
  REFERENCES table1(id);

INSERT INTO table1 (id, col1)
SELECT id::numeric, id::text
  FROM generate_series(30, 30 + 60) gs(id);

INSERT INTO table2 (id, id_table1)
SELECT id::numeric, (select floor(random() * 60)::numeric + 30)::numeric
  FROM generate_series(1, 60) gs(id);

set max_parallel_workers_per_gather=0;
set enable_hashjoin = off;

EXPLAIN analyze
select sum(q.id_table1)
   from (
SELECT t2.*
  FROM table1 t1
  JOIN table2 t2
ON t2.id_table1 = t1.id) q;

Plan:
Aggregate  (cost=25744.90..25744.91 rows=1 width=32) (actual time=380.140..380.142 rows=1 loops=1)
  ->  Nested Loop  (cost=0.43..24244.90 rows=60 width=9) (actual time=0.063..310.915 rows=60 loops=1)
        ->  Seq Scan on table2 t2  (cost=0.00..9244.00 rows=60 width=9) (actual time=0.009..38.629 rows=60 loops=1)
        ->  Memoize  (cost=0.43..0.47 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=60)
              Cache Key: t2.id_table1
              Cache Mode: logical
              Hits: 59  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
              ->  Index Only Scan using id2 on table1 t1  (cost=0.42..0.46 rows=1 width=8) (actual time=0.039..0.040 rows=1 loops=1)
                    Index Cond: (id = t2.id_table1)
                    Heap Fetches: 0
Planning Time: 0.445 ms
Execution Time: 380.750 ms

I've attached memoize_memory_leak_numeric_key.patch to address this.

Using the test case, here are the memory stats before and after the
fix (taken during ExecEndMemoize by calling
MemoryContextStatsDetail(TopMemoryContext, 100, 1)).

Before:
 ExecutorState: 25209672 total in 15 blocks; 1134432 free (7 chunks); 24075240 used
  MemoizeHashTable: 8192 total in 1 blocks; 7480 free (1 chunks); 712 used

After:
 ExecutorState: 76616 total in 5 blocks; 1776 free (8 chunks); 74840 used
  MemoizeHashTable: 8192 total in 1 blocks; 7480 free (1 chunks); 712 used


Thanks,
Alexei Orlov
al.or...@cft.ru,
apor...@gmail.com


memoize_memory_leak_numeric_key.patch
Description: memoize_memory_leak_numeric_key.patch


Re: Trigger violates foreign key constraint

2023-10-02 Thread Tom Lane
Laurenz Albe  writes:
> CREATE FUNCTION silly() RETURNS trigger LANGUAGE plpgsql AS 'BEGIN RETURN 
> NULL; END;';
> CREATE TRIGGER silly BEFORE DELETE ON child FOR EACH ROW EXECUTE FUNCTION 
> silly();

> The trigger function cancels the cascaded delete on "child", and we are left 
> with
> a row in "child" that references no row in "parent".

Yes.  This is by design: triggers operate at a lower level than
foreign keys, so an ill-conceived trigger can break an FK constraint.
That's documented somewhere, though maybe not visibly enough.

There are good reasons to want triggers to be able to see and
react to FK-driven updates, so it's unlikely that we'd want to
revisit that design decision, even if it hadn't already stood
for decades.

regards, tom lane




Re: Trigger violates foreign key constraint

2023-10-02 Thread Laurenz Albe
On Mon, 2023-10-02 at 09:49 -0400, Tom Lane wrote:
> Laurenz Albe  writes:
> > CREATE FUNCTION silly() RETURNS trigger LANGUAGE plpgsql AS 'BEGIN RETURN 
> > NULL; END;';
> > CREATE TRIGGER silly BEFORE DELETE ON child FOR EACH ROW EXECUTE FUNCTION 
> > silly();
> 
> > The trigger function cancels the cascaded delete on "child", and we are 
> > left with
> > a row in "child" that references no row in "parent".
> 
> Yes.  This is by design: triggers operate at a lower level than
> foreign keys, so an ill-conceived trigger can break an FK constraint.
> That's documented somewhere, though maybe not visibly enough.
> 
> There are good reasons to want triggers to be able to see and
> react to FK-driven updates, so it's unlikely that we'd want to
> revisit that design decision, even if it hadn't already stood
> for decades.

Thanks for the clarification.  I keep learning.

I didn't find anything about that in the documentation or the
READMEs in the source, but perhaps I didn't look well enough.

Yours,
Laurenz Albe




pg*.dll and *.pdb files in psqlODBC have no version numbers

2023-10-02 Thread Mark Hill
I posted this question to pgsql-gene...@lists.postgresql.org last week
but no one has responded, so I am posting here now.

We download the ODBC source from http://ftp.postgresql.org and build it
on-site; version 13.02 in this case.

A colleague noticed that the following files in the psqlODBC MSI for Windows 
have no version numbers:
pgenlist.dll
pgenlista.dll
pgxalib.dll
pgenlist.pdb
pgenlista.pdb
psqlodbc30a.pdb
psqlodbc35w.pdb

Does anyone know if that is by design or for some other reason?  Should
they have version numbers?

I checked an earlier build and the same holds for ODBC 12.02.
Thanks, Mark



Re: CHECK Constraint Deferrable

2023-10-02 Thread Himanshu Upadhyaya
On Tue, Sep 12, 2023 at 2:56 PM vignesh C  wrote:

> On Thu, 7 Sept 2023 at 17:26, Himanshu Upadhyaya
>  wrote:
> >
> > Attached is v2 of the patch, rebased against the latest HEAD.
>
> Thanks for working on this, few comments:
> 1) "CREATE TABLE check_constr_tbl (i int CHECK(i<>0) DEFERRABLE, t
> text)" is crashing in windows, the same was noticed in CFBot too:
> 2023-09-11 08:11:36.585 UTC [58563][client backend]
> [pg_regress/constraints][13/880:0] LOG:  statement: CREATE TABLE
> check_constr_tbl (i int CHECK(i<>0) DEFERRABLE, t text);
> 2023-09-11 08:11:36.586 UTC [58560][client backend]
> [pg_regress/inherit][15/391:0] LOG:  statement: drop table c1;
> ../src/backend/commands/trigger.c:220:26: runtime error: member access
> within null pointer of type 'struct CreateTrigStmt'
> ==58563==Using libbacktrace symbolizer.
>
> The details of CFBot failure can be seen at [1]
>
I have tried it with my latest patch in a Windows environment and am not
getting any crash with the above statement; I will do further analysis to
check whether this patch also has the same issue.

> 2) Alter of check constraint deferrable is not handled, is this
> intentional?
> CREATE TABLE check_constr_tbl (i int CHECK(i<>0) DEFERRABLE, t text);
> postgres=# alter table check_constr_tbl alter constraint
> check_constr_tbl_i_check not deferrable;
> ERROR:  constraint "check_constr_tbl_i_check" of relation
> "check_constr_tbl" is not a foreign key constraint
>
This is not allowed for any constraint type other than FOREIGN KEY. I am
not very sure whether there is any limitation behind this, so I wanted to
get opinions from other hackers on it.

> 3) Should we handle this scenario for domains too:
> CREATE DOMAIN c1_check AS INT CHECK(VALUE > 10);
> create table test(c1 c1_check);
> alter domain c1_check ADD check (VALUE > 20) DEFERRABLE INITIALLY DEFERRED;
>
> begin;
> -- should this be deffered
> insert into test values(19);
> ERROR:  value for domain c1_check violates check constraint
> "c1_check_check1"
>
We are planning to have a follow-up patch once this initial patch is
committed.

> 4) There is one warning:
> heap.c: In function ‘StoreRelCheck’:
> heap.c:2178:24: warning: implicit declaration of function
> ‘CreateTrigger’ [-Wimplicit-function-declaration]
>  2178 | (void) CreateTrigger(trigger, NULL,
> RelationGetRelid(rel),
>   |

Fixed in V3 patch.

> 5) This should be added to typedefs.list file:
> +typedef enum checkConstraintRecheck
> +{
> +	CHECK_RECHECK_DISABLED, /* Recheck of CHECK constraint is disabled, so
> +							 * DEFERRED CHECK constraint will be considered
> +							 * as non-deferrable check constraint.  */
> +	CHECK_RECHECK_ENABLED,	/* Recheck of CHECK constraint is enabled, so
> +							 * CHECK constraint will be validated but error
> +							 * will not be reported for deferred CHECK
> +							 * constraint. */
> +	CHECK_RECHECK_EXISTING	/* Recheck of existing violated CHECK
> +							 * constraint, indicates that this is a deferred
> +							 * recheck of a row that was reported as a
> +							 * potential violation of CHECK CONSTRAINT */
> +}	checkConstraintRecheck;
>
Fixed in V3 patch.


-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: CHECK Constraint Deferrable

2023-10-02 Thread Himanshu Upadhyaya
On Mon, Oct 2, 2023 at 8:31 PM Himanshu Upadhyaya <
upadhyaya.himan...@gmail.com> wrote:

V3 patch attached.


-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


v3-0001-Implementation-of-CHECK-Constraint-to-make-it.patch
Description: Binary data


Re: Eager page freeze criteria clarification

2023-10-02 Thread Peter Geoghegan
On Mon, Oct 2, 2023 at 6:26 AM Robert Haas  wrote:
> I think it's true that the fact that pgbench does what pgbench does
> makes us think more about that workload than about some other, equally
> plausible workload. It's the test we have, so we end up running it a
> lot. If we had some other test, we'd run that one.

I find pgbench very useful as a stress-test. It usually represents the
most extreme possible adversarial case, which puts certain patches in
context, helping to paint a complete picture.

> As for non-uniform access, that is real and does matter, but there are
> certainly installations with tables where no rows survive long enough
> to need freezing, either because the table is regularly emptied, or
> just because the update load is high enough to hit all the rows fairly
> quickly.

That's true, but I think that they're all queue-like tables. Not large
tables where there just isn't any chance of any row/page not being
updating after a fairly short period of time.

> Maybe I'm misunderstanding your point here, in which case all of the
> above may be irrelevant. But my feeling is that we can't simply ignore
> cases where all/many rows are short-lived and say, well, those are
> unrealistic, so let's just freeze everything super-aggressively and
> that should be fine. I don't think that's OK. We can (and I think we
> should) treat that situation as a special case rather than as the
> typical case, but I think it would be a bad idea to dismiss it as a
> case that we don't need to worry about at all.

I think that my point about pgbench being generally unrepresentative
might have overshadowed the more important point about the VACUUM
requirements of pgbench_accounts in particular. Those are particularly
unrealistic.

If no vacuuming against pgbench_accounts is strictly better than some
vacuuming (unless it's just to advance relfrozenxid, which can't be
avoided), then to what extent is Melanie's freezing patch obligated to
not make the situation worse? I'm not saying that it doesn't matter at
all; just that there needs to be a sense of proportion. If even
pruning (by VACUUM) is kinda, sorta, useless, then it seems like we
might need to revisit basic assumptions. (Or maybe the problem itself
can just be avoided for the most part; whatever works.)

I have a patch that provides a simple way of tracking whether each
PRUNE record was generated by VACUUM or by opportunistic pruning (also
something Melanie is involved with):

https://commitfest.postgresql.org/44/4288/

Although it's really hard (maybe impossible) to track right now, I've
noticed that the vast majority of all PRUNE records are from
opportunistic pruning with just about every write-heavy benchmark,
including pgbench and TPC-C. And that's just going on the raw count of
each type of record, not the utility provided by each pruning
operation. Admittedly "utility" is a very hard thing to measure here,
but it might still matter. VACUUM might be doing some pruning, but
only pruning targeting pages that never particularly "need to be
pruned" by VACUUM itself (since pruning could happen whenever via
opportunistic pruning anyway).

Whatever your priorities might be, workload-wise, it could still be
useful to recognize useless freezing as part of a broader problem of
useless (or at least dubious) work performed by VACUUM -- especially
with pgbench-like workloads. The utility of freezing is a lot easier
to measure than the utility of pruning, but why should we assume that
pruning isn't already just as much of a problem? (Maybe that's not a
problem that particularly interests you right now; I'm bringing it up
because it seems possible that putting it in scope could somehow
clarify what to do about freezing.)

-- 
Peter Geoghegan




Re: Eliminate redundant tuple visibility check in vacuum

2023-10-02 Thread Robert Haas
On Sat, Sep 30, 2023 at 1:02 PM Andres Freund  wrote:
> The only thought I have is that it might be worth to amend the comment in
> lazy_scan_prune() to mention that such a tuple won't need to be frozen,
> because it was visible to another session when vacuum started.

I revised the comment a bit, incorporating that language, and committed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: bgwriter doesn't flush WAL stats

2023-10-02 Thread Heikki Linnakangas
The first patch, to flush the bgwriter's WAL stats to the stats 
collector, seems like a straightforward bug fix, so committed and 
backpatched that. Thank you!


I didn't look at the second patch.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-02 Thread Robert Haas
On Mon, Oct 2, 2023 at 11:52 AM Pavel Borisov  wrote:
> FWIW I think the patch is still in good shape and worth to be committed.

I'm also pretty happy with these patches and would like to see at
least 0001 and 0002 committed, and probably 0003 as well. I am,
however, -1 on back-patching. Perhaps that is overly cautious, but I
don't like changing existing messages in back-branches. It will break
translations, and potentially monitoring scripts, etc.

If John's not available to take this forward, I can volunteer as
substitute committer, unless Peter or Peter would like to handle it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: commitfest app down for repairs

2023-10-02 Thread Robert Haas
On Sat, Sep 30, 2023 at 9:31 AM Joe Conway  wrote:
> We restored to a backup from one day prior. We will take a look at what
> changed in between, but it might be up to folks to redo some things.
>
> A cooling off period was added to the commitfest app for new community
> accounts, similar to what was done with the wiki a few years ago.

Ouch. Thanks for cleaning up the mess.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Vik Fearing

On 9/29/23 03:17, Tom Lane wrote:

Vik Fearing  writes:

On 9/28/23 20:46, Tom Lane wrote:

We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.



This issue comes up regularly (although far from often).  Do we want to
put some comments right where would-be implementors would be sure to see it?


Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
implemented" in the error message, as that wording sounds like we just
haven't got round to it.


I see your point, but should we be dissuading people who might want to 
work on solving those problems?  I intentionally did not document that 
this syntax exists so the only people seeing the message are those who 
just try it, and those wanting to write a patch like Danil did.


No one except you has said anything about this patch.  I think it would 
be good to commit it, wordsmithing aside.

--
Vik Fearing





Re: Show various offset arrays for heap WAL records

2023-10-02 Thread Heikki Linnakangas

On 04/09/2023 23:02, Melanie Plageman wrote:

I might phrase the last bit as "neither the description functions nor
the output format should be considered part of a stable API"

+Guidelines for rmgrdesc output format
+=====================================

I noticed you used === for both headings and wondered if it was
intentional. Other READMEs I looked at in src/backend/access tend to
have a single heading underlined with = and then subsequent
headings are underlined with -. I could see an argument either way
here, but I just thought I would bring it up in case it was not a
conscious choice.

Otherwise, LGTM.


Made these changes and committed. Thank you!

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Dagfinn Ilmari Mannsåker
Vik Fearing  writes:

> On 9/29/23 03:17, Tom Lane wrote:
>> Vik Fearing  writes:
>>> On 9/28/23 20:46, Tom Lane wrote:
 We went through all these points years ago when the enum feature
 was first developed, as I recall.  Nobody thought that the ability
 to remove an enum value was worth the amount of complexity it'd
 entail.
>> 
>>> This issue comes up regularly (although far from often).  Do we want to
>>> put some comments right where would-be implementors would be sure to see it?
>> Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
>> implemented" in the error message, as that wording sounds like we just
>> haven't got round to it.
>
> I see your point, but should we be dissuading people who might want to
> work on solving those problems?  I intentionally did not document that 
> this syntax exists so the only people seeing the message are those who
> just try it, and those wanting to write a patch like Danil did.
>
> No one except you has said anything about this patch.  I think it would
> be good to commit it, wordsmithing aside.

FWIW I'm +1 on this patch, and with Tom on dropping the "yet".  To me it
makes it sound like we intend to implement it soon (fsvo).

- ilmari




Re: On login trigger: take three

2023-10-02 Thread Robert Haas
Sorry to have gone dark on this for a long time after having been
asked for my input back in March. I'm not having a great time trying
to keep up with email, and the threads getting split up makes it a lot
worse for me.

On Fri, Sep 29, 2023 at 6:15 AM Daniel Gustafsson  wrote:
> Running the same pgbench command on my laptop looking at the average 
> connection
> times, and the averaging that over five runs (low/avg/high) I see ~5% increase
> over master with the patched version (compiled without assertions and debug):
>
> Patched event_triggers on:  6.858 ms/7.038 ms/7.434 ms
> Patched event_triggers off: 6.601 ms/6.958 ms/7.539 ms
> Master: 6.676 ms/6.697 ms/6.760 ms

This seems kind of crazy to me. Why does it happen? It sounds to me
like we must be doing a lot of extra catalog access to find out
whether there are any on-login event triggers. Like maybe a sequential
scan of pg_event_trigger. Maybe we need to engineer a way to avoid
that. I don't have a brilliant idea off-hand, but I feel like there
should be something we can do. I think a lot of users would say that
logins on PostgreSQL are too slow already.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Commitfest 2023-09 has finished

2023-10-02 Thread Peter Eisentraut
As September has ended, I have closed commitfest 2023-09.  The status 
before I started moving patches over was


Status summary: Needs review: 189. Waiting on Author: 30. Ready for 
Committer: 28. Committed: 68. Moved to next CF: 1. Returned with 
Feedback: 16. Rejected: 1. Withdrawn: 5. Total: 338.


The final status is now

Status summary: Committed: 68. Moved to next CF: 248. Withdrawn: 5. 
Rejected: 1. Returned with Feedback: 16. Total: 338.


The "Committed" number is lower than in the preceding commitfests, but 
it is actually higher than in the September commitfests of preceding years.


I did some more adjustments of status and removal of stale reviewer 
entries.  But overall, everything looked pretty accurate and up to date.


In the November commitfest right now, there are more than 50 patches 
that are 2 or 3 CFs old (meaning they arrived after PG16 feature freeze) 
and that have no reviewers.  Plus at least 30 patches without reviewers 
that are completely new in the November CF.  Also, it looks like a lot 
of these are in the Performance category, which is typically harder to 
review and get agreement on.  So, while we do have some really old 
patches that, well, are hard to get sorted out, there is a lot of new 
work coming in that really just needs plain reviewer attention.





Re: [DOCS] HOT - correct claim about indexes not referencing old line pointers

2023-10-02 Thread Robert Haas
On Sat, Sep 30, 2023 at 1:05 AM Peter Geoghegan  wrote:
> > This is why I discovered it: it says "indexes do not reference their
> > page item identifiers", which is manifestly not true when talking
> > about the root item, and in fact would defeat the whole purpose of HOT
> > (at least in a old-to-new chain like Postgres uses).
>
> Yeah, but...that's not what was intended. Obviously, the index hasn't
> changed, and we expect index scans to continue to give correct
> answers. So it is pretty strongly implied that it continues to point
> to something valid.

I took a look at this. I agree with James that the current wording is
just plain wrong.

 periodic vacuum operations.  (This is possible because indexes do not
 reference their page item identifiers.)

Here, the antecedent of "their" is "old versions of updated rows". It
is slightly unclear whether we should interpret this to mean (a) the
tuples together with the line pointers which point to them or (b) only
the tuple data to which the line pointers point. If (a), then it's
wrong because we can't actually get rid of the root line pointer;
rather, we have to change it to a redirect. If (b), then it's wrong
because heap_page_prune() removes dead tuples in this sense whether
HOT is involved or not. I can see no interpretation under which this
statement is true as written.

I reviewed James's proposed alternative:

+ periodic vacuum operations. However, because indexes reference the old
+ version's page item identifiers, the line pointer must remain in
+ place. Such a line pointer has its LP_REDIRECT bit set and its offset
+ updated to the page item identifiers of the updated row.

I don't think that's really right either. That's true for the root
line pointer, but the phrasing seems to be referring to old versions
generally, which would seem to include not only the root, for which
this is correct, but also all subsequent now-dead row versions, for
which it is wrong.

Here is my attempt:

When a row is updated multiple times, row versions other than the
oldest and the newest can be completely removed during normal
operation, including SELECTs, instead of requiring
periodic vacuum operations. (Indexes always refer to the page item identifiers of the
original row version. The tuple data associated with that row version
is removed, and its item identifier is converted to a redirect that
points to the oldest version that may still be visible to some
concurrent transaction. Intermediate row versions that are no longer
visible to anyone are completely removed, and the associated page item
identifiers are made available for reuse.)
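
For readers who want to see this directly, the pageinspect extension can
show the redirect. A sketch, assuming a table where the updated column
is not indexed so the update is HOT:

    CREATE EXTENSION IF NOT EXISTS pageinspect;
    CREATE TABLE hot_demo (id int PRIMARY KEY, val text);
    INSERT INTO hot_demo VALUES (1, 'a');
    UPDATE hot_demo SET val = 'b' WHERE id = 1;  -- HOT update
    VACUUM hot_demo;                             -- prunes the chain

    SELECT lp, lp_off, lp_flags
    FROM heap_page_items(get_raw_page('hot_demo', 0));

After pruning, the root item's lp_flags shows 2 (LP_REDIRECT) and its
lp_off holds the offset of the surviving version.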

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: CHECK Constraint Deferrable

2023-10-02 Thread Tom Lane
Himanshu Upadhyaya  writes:
> V3 patch attached.

Sorry for not weighing in on this before, but ... is this a feature
we want at all?  We are very clear in the existing docs that CHECK
conditions must be immutable [1], and that's not something we can
easily relax because if they are not then it's unclear when we need
to recheck them to ensure they stay satisfied.  But here we have a
feature whose only possible use is with constraints that *aren't*
immutable; else we might as well just check them immediately.
So that gives rise to a bunch of subtle questions about exactly what
properties a user-written constraint would need to have to guarantee
sane semantics given this implementation.  Can we define what those
properties are, or what the ensuing semantic guarantees are exactly?
Can we explain those things clearly enough that the average user would
have a shot at writing a valid deferred constraint?  Is a deferred
constraint having those properties likely to be actually useful?

I don't know the answers to these questions, but it troubles me a
lot that zero consideration appears to have been given to them.
I do not think we should put more effort into this patch unless
satisfactory answers are forthcoming.

regards, tom lane

[1] See Note at the bottom of "5.4.1. Check Constraints" here:
https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS




Re: Pre-proposal: unicode normalized text

2023-10-02 Thread Robert Haas
On Mon, Oct 2, 2023 at 3:42 PM Peter Eisentraut  wrote:
> I think a better direction here would be to work toward making
> nondeterministic collations usable on the global/database level and then
> encouraging users to use those.

It seems to me that this overlooks one of the major points of Jeff's
proposal, which is that we don't reject text input that contains
unassigned code points. That decision turns out to be really painful.
Here, Jeff mentions normalization, but I think it's a major issue with
collation support. If new code points are added, users can put them
into the database before they are known to the collation library, and
then when they become known to the collation library the sort order
changes and indexes break. Would we endorse a proposal to make
pg_catalog.text with encoding UTF-8 reject code points that aren't yet
known to the collation library? To do so would be to tighten things up
considerably from where they stand today, and the way things stand
today is already rigid enough to cause problems for some users. But if
we're not willing to do that then I find it easy to understand why
Jeff wants an alternative type that does.

Now, there is still the question of whether such a data type would
properly belong in core or even contrib rather than being an
out-of-core project. It's not obvious to me that such a data type
would get enough traction that we'd want it to be part of PostgreSQL
itself. But at the same time I can certainly understand why Jeff finds
the status quo problematic.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg16: XX000: could not find pathkey item to sort

2023-10-02 Thread David Rowley
On Tue, 19 Sept 2023 at 23:45, Richard Guo  wrote:
> My first thought about the fix is that we artificially add resjunk
> target entries to parse->targetList for the ordered aggregates'
> arguments that are ORDER BY expressions, as attached.  While this can
> fix the given query, it would cause Assert failure for the query in
> sql/triggers.sql.

> Any thoughts?

Unfortunately, we can't do that as it'll lead to target entries
existing in the GroupAggregate's target list that have been
aggregated.

postgres=# explain verbose SELECT a, COUNT(DISTINCT b) FROM rp GROUP BY a;
  QUERY PLAN
--
 Append  (cost=88.17..201.39 rows=400 width=44)
   ->  GroupAggregate  (cost=88.17..99.70 rows=200 width=44)
 Output: rp.a, count(DISTINCT rp.b), rp.b

Your patch adds rp.b as an output column of the GroupAggregate.
Logically, that column cannot exist there as there is no correct
single value of rp.b after aggregation.

I think the fix needs to go into create_agg_path().  The problem is
that for AGG_SORTED we do:

if (aggstrategy == AGG_SORTED)
pathnode->path.pathkeys = subpath->pathkeys; /* preserves order */

which assumes that all of the columns before the aggregate will be
available after the aggregate.  That likely used to work ok before
1349d2790 as the planner wouldn't have requested any Pathkeys for
columns that were not available below the Agg node.

We can no longer take the subpath's pathkeys verbatim. We need to strip
off pathkeys for columns that are not in pathnode's targetlist.

I've attached a patch which adds a new function to pathkeys.c to strip
off any PathKeys in a list that don't have a corresponding item in the
given PathTarget and just return a prefix of the input pathkey list up
until the first expr that can't be found.
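
In outline, the new function does something like this (a sketch only;
the attached patch is the real thing, e.g. it has to cope with
RelabelType and the like):

    static List *
    truncate_pathkeys_for_target(List *pathkeys, PathTarget *target)
    {
        List       *result = NIL;
        ListCell   *lc;

        foreach(lc, pathkeys)
        {
            PathKey    *pathkey = (PathKey *) lfirst(lc);
            bool        found = false;
            ListCell   *lc2;

            /* is a member of this pathkey's EC computable from 'target'? */
            foreach(lc2, pathkey->pk_eclass->ec_members)
            {
                EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);

                if (list_member(target->exprs, em->em_expr))
                {
                    found = true;
                    break;
                }
            }

            if (!found)
                break;          /* keep only the prefix found so far */
            result = lappend(result, pathkey);
        }

        return result;
    }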

I'm concerned that this patch will be too much overhead when creating
paths when a PathKey's EquivalenceClass has a large number of members
from partitioned tables.  I wondered if we should instead just check
if the subpath's pathkeys match root->group_pathkeys and if they do
set the AggPath's pathkeys to list_copy_head(subpath->pathkeys,
root->num_groupby_pathkeys); that would be much cheaper, but it just
feels a bit too much like a special case.

David


strip_aggregated_pathkeys_from_aggpaths.patch
Description: Binary data


Re: CHECK Constraint Deferrable

2023-10-02 Thread David G. Johnston
On Mon, Oct 2, 2023 at 12:25 PM Tom Lane  wrote:

> Himanshu Upadhyaya  writes:
> > V3 patch attached.
>
> Sorry for not weighing in on this before, but ... is this a feature
> we want at all?  We are very clear in the existing docs that CHECK
> conditions must be immutable [1], and that's not something we can
> easily relax because if they are not then it's unclear when we need
> to recheck them to ensure they stay satisfied.


Agreed.  I'm not sold on conforming to the standard being an appropriate
ideal here.  Either we already don't because our check constraints are
immutable, or I'm missing what use case the committee had in mind when they
designed this feature.  In any case, its absence doesn't seem that sorely
missed, and the OP's only actual example would require relaxing the
immutable property which I disagree with.  We have deferrable triggers to
serve that posited use case.

David J.


Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector

2023-10-02 Thread Karl O. Pinc
On Sun, 1 Oct 2023 18:18:07 -0500
"Karl O. Pinc"  wrote:

Version 6

Added:
v6-0015-Trigger-authors-need-not-worry-about-parallelism.patch

Can't say if this is an awesome idea or not.  (Might have saved me time.)  
Read the commit message for a justification.

> > On Mon, 25 Sep 2023 23:37:44 -0500
> > "Karl O. Pinc"  wrote:
> >   
> > > > On Mon, 25 Sep 2023 17:55:59 -0500
> > > > "Karl O. Pinc"  wrote:
> > > >   
> > > > > On Mon, 25 Sep 2023 14:14:37 +0200
> > > > > Daniel Gustafsson  wrote:
> > > >   
> > > > > > Once done you can do "git format-patch origin/master -v 1"
> > > > > > which will generate a set of n patches named v1-0001 through
> > > > > > v1-000n.
> >   
> > > > > I am not particularly confident in the top-line commit
> > > > > descriptions.
> > > >   
> > > > > The bulk of the commit descriptions are very wordy
> > > >   
> > > > > Listing all the attachments here for future discussion: 

v6-0001-Change-section-heading-to-better-reflect-saving-a.patch
v6-0002-Change-section-heading-to-better-describe-referen.patch
v6-0003-Better-section-heading-for-plpgsql-exception-trap.patch
v6-0004-Describe-how-to-raise-an-exception-in-the-excepti.patch
v6-0005-Improve-sentences-in-overview-of-system-configura.patch
v6-0006-Provide-examples-of-listing-all-settings.patch
v6-0007-Cleanup-summary-of-role-powers.patch
v6-0008-Explain-the-difference-between-role-attributes-an.patch
v6-0009-Document-the-oidvector-type.patch
v6-0010-Improve-sentences-about-the-significance-of-the-s.patch
v6-0011-Add-a-sub-section-to-describe-schema-resolution.patch
v6-0012-Explain-role-management.patch
v6-0013-Hyperlink-from-CREATE-FUNCTION-reference-page-to-.patch
v6-0014-Add-index-entries-for-parallel-safety.patch
v6-0015-Trigger-authors-need-not-worry-about-parallelism.patch

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
From 122665c4155698abe88e2bd17639a991791b94e3 Mon Sep 17 00:00:00 2001
From: "Karl O. Pinc" 
Date: Sun, 24 Sep 2023 15:49:30 -0500
Subject: [PATCH v6 01/15] Change section heading to better reflect saving a
 result in variable(s)

The current section title of "Executing a Command with a Single-Row
Result" does not reflect what the section is really about.  Other
sections make clear how to _execute_ commands, single-row result or not.
What this section is about is how to _save_ a single row of results into
variable(s).

It would be nice to talk about saving results into variables in the
section heading but I couldn't come up with anything pithy.  "Saving a
Single-Row of a Command's Result" seems good enough, especially since
there's few other places to save results other than in variables.
---
 doc/src/sgml/plpgsql.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index f55e901c7e..8747e84245 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1126,7 +1126,7 @@ PERFORM create_mv('cs_session_page_requests_mv', my_query);

 

-Executing a Command with a Single-Row Result
+Saving a Single-Row of a Command's Result
 
 
  SELECT INTO
-- 
2.30.2

From 4de4a31d41124dfa793cc5cce0516673811ea414 Mon Sep 17 00:00:00 2001
From: "Karl O. Pinc" 
Date: Sun, 24 Sep 2023 15:52:21 -0500
Subject: [PATCH v6 02/15] Change section heading to better describe reference
 of existing types

The section heading of "Copying Types" does not reflect what the
section is about.  It is not about making copies of data types but
about using the data type of existing columns (or rows) in new type
declarations without having to know what the existing type is.

"Re-Using the Type of Columns and Variables" seems adequate.  Getting
something in there about declarations seems too wordy.  I thought
perhaps "Referencing" instead of "Re-Using", but "referencing" isn't
perfect and "re-using" is generic enough, shorter, and simpler to read.
---
 doc/src/sgml/plpgsql.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 8747e84245..874578265e 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -672,7 +672,7 @@ DECLARE

 
   
-   Copying Types
+   Re-Using the Type of Columns and Variables
 
 
 variable%TYPE
-- 
2.30.2

From 80c2b8ef7ad6e610f5c7bdc61b827983a87110e2 Mon Sep 17 00:00:00 2001
From: "Karl O. Pinc" 
Date: Sun, 24 Sep 2023 16:03:29 -0500
Subject: [PATCH v6 03/15] Better section heading for plpgsql exception
 trapping

The docs seem to use "error" and "exception" interchangeably, perhaps
50% each.  But they never say that they are the same thing, and in the
larger world they are not.  Errors tend to be something that drops on
the floor and usually halts execution, whereas exceptions can be trapped
and give the programmer more control over the flow of the program.
(Although, to be fair, exceptions are a subset of errors.)

Re: Eager page freeze criteria clarification

2023-10-02 Thread Robert Haas
On Mon, Oct 2, 2023 at 11:37 AM Peter Geoghegan  wrote:
> If no vacuuming against pgbench_accounts is strictly better than some
> vacuuming (unless it's just to advance relfrozenxid, which can't be
> avoided), then to what extent is Melanie's freezing patch obligated to
> not make the situation worse? I'm not saying that it doesn't matter at
> all; just that there needs to be a sense of proportion.

I agree. I think Melanie's previous test results showed no measurable
performance regression but a significant regression in WAL volume. I
don't remember how much of a regression it was, but it was enough to
make me think that the extra WAL volume could probably be turned into
a performance loss by adjusting the test scenario (a bit less
hardware, or a bandwidth-constrained standby connection, etc.). I
think I'd be OK to accept a small amount of additional WAL, though. Do
you see it differently?

> Whatever your priorities might be, workload-wise, it could still be
> useful to recognize useless freezing as part of a broader problem of
> useless (or at least dubious) work performed by VACUUM -- especially
> with pgbench-like workloads. The utility of freezing is a lot easier
> to measure than the utility of pruning, but why should we assume that
> pruning isn't already just as much of a problem? (Maybe that's not a
> problem that particularly interests you right now; I'm bringing it up
> because it seems possible that putting it in scope could somehow
> clarify what to do about freezing.)

I wonder how much useless work we're doing in this area today. I mean,
most pruning in that workload gets done by HOT rather than by vacuum,
because vacuum simply can't keep up, but I don't think it's worthless
work: if it wasn't done in the background by VACUUM, it would happen
in the foreground anyway very soon after. I don't have a good feeling
for how much index cleanup we end up doing in a pgbench workload, but
it seems to me that if we don't do index cleanup at least now and
then, we'll lose the ability to recycle line pointers in the wake of
non-HOT updates, and that seems bad. Maybe that's rare in pgbench
specifically, or maybe it isn't, but it seems like you'd only have to
change the workload a tiny bit for that to become a real problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pre-proposal: unicode normalized text

2023-10-02 Thread Nico Williams
On Tue, Sep 12, 2023 at 03:47:10PM -0700, Jeff Davis wrote:
> One of the frustrations with using the "C" locale (or any deterministic
> locale) is that the following returns false:
> 
>   SELECT 'á' = 'á'; -- false
> 
> because those are the unicode sequences U&'\0061\0301' and U&'\00E1',
> respectively, so memcmp() returns non-zero. But it's really the same
> character with just a different representation, and if you normalize
> them they are equal:
> 
>   SELECT normalize('á') = normalize('á'); -- true

I think you misunderstand Unicode normalization and equivalence.  There
is no standard Unicode `normalize()` that would cause the above equality
predicate to be true.  If you normalize to NFD (normal form decomposed)
then a _prefix_ of those two strings will be equal, but that's clearly
not what you're looking for.

PostgreSQL already has Unicode normalization support, though it would be
nice to also have form-insensitive indexing and equality predicates.

There are two ways to write 'á' in Unicode: one is pre-composed (one
codepoint) and the other is decomposed (two codepoints in this specific
case), and it would be nice to be able to preserve input form when
storing strings but then still be able to index and match them
form-insensitively (in the case of 'á' both equivalent representations
should be considered equal, and for UNIQUE indexes they should be
considered the same).

You could also have functions that perform lossy normalization in the
sort of way that soundex does, such as first normalizing to NFD then
dropping all combining codepoints which then could allow 'á' to be eq to
'a'.  But this would not be a Unicode normalization function.
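
For illustration, in a UTF-8 database one can approximate that lossy
mapping by decomposing to NFD and then deleting combining marks; this
sketch removes just the combining acute accent (U+0301), while a general
version would strip all combining code points:

    SELECT translate(normalize(U&'\00E1', NFD), U&'\0301', '');
    -- returns 'a'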

Nico
-- 




Re: [PGDOCS] change function linkend to refer to a more relevant target

2023-10-02 Thread Peter Smith
On Sat, Sep 30, 2023 at 12:04 AM Daniel Gustafsson  wrote:
>
> > On 29 Sep 2023, at 10:54, Daniel Gustafsson  wrote:
> >
> >> On 29 Sep 2023, at 06:51, Peter Smith  wrote:
> >
> >> I found a link in the docs that referred to the stats "views" section,
> >> instead of the more relevant (IMO)  stats "functions" section.
> >
> > Agreed.  This link was added in 2007 (in 7d3b7011b), and back in
> > 7.3/7.4 days the functions were listed in the same section as the
> > views, so the anchor was at the time pointing to the right section.
> > In 2012 it was given its own section (in aebe989477) but the link
> > wasn't updated.
> >
> > Thanks for the patch, I'll go ahead with it.
>
> Applied to all supported branches, thanks!
>

Thank you for pushing.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Fixup some more appendStringInfo misusages

2023-10-02 Thread David Rowley
The attached v1-0001 patch adjusts some code in stringinfo.h to find
misusages of the appendStringInfo functions.  I don't intend to commit
that, but I do intend to commit the patch for the new misusages that
it found, which is also attached.

This is along the same lines as 8b26769bc, f736e188c, 110d81728 and 8abc13a88.
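
For anyone unfamiliar with these cleanups, the typical shape is this (an
illustration of the pattern, not the exact instances found):

    #include "postgres.h"
    #include "lib/stringinfo.h"

    static void
    demo(StringInfo buf, const char *value)
    {
        appendStringInfo(buf, "%s", value); /* wasteful: format parsing */
        appendStringInfoString(buf, value); /* preferred */

        appendStringInfoString(buf, ",");   /* wasteful for one char */
        appendStringInfoChar(buf, ',');     /* preferred */
    }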

David


fixup_some_appendStringInfo_misuages_2023.patch
Description: Binary data


v1-0001-Adjust-code-to-highlight-appendStringInfo-misusag.patch
Description: Binary data


Re: CHECK Constraint Deferrable

2023-10-02 Thread Vik Fearing

On 10/2/23 21:25, Tom Lane wrote:

Himanshu Upadhyaya  writes:

V3 patch attached.


Sorry for not weighing in on this before, but ... is this a feature
we want at all?


For standards conformance, I vote yes.


We are very clear in the existing docs that CHECK
conditions must be immutable [1], and that's not something we can
easily relax because if they are not then it's unclear when we need
to recheck them to ensure they stay satisfied.


That is what the *user* documentation says, but we both know it isn't true.

Here is a short conversation you and I had about five years ago where 
you defended the non-immutability of CHECK constraints:

https://www.postgresql.org/message-id/flat/12539.1544107316%40sss.pgh.pa.us


But here we have a
feature whose only possible use is with constraints that *aren't*
immutable; else we might as well just check them immediately.


I disagree with this.  The whole point of deferring constraints is to be 
able to do some cleanup before the consistency is checked.



So that gives rise to a bunch of subtle questions about exactly what
properties a user-written constraint would need to have to guarantee
sane semantics given this implementation.  Can we define what those
properties are, or what the ensuing semantic guarantees are exactly?
Can we explain those things clearly enough that the average user would
have a shot at writing a valid deferred constraint?


A trivial example is CHECK (c IS NOT NULL) which, according to the 
standard, is the only way to check for such a condition.  The NOT NULL 
syntax is explicitly translated to that by 11.4 <column definition> SR 
17.a.  We implement it a bit differently, but that does not negate the 
usefulness of being able to defer it.  In fact, all of the work Álvaro 
is currently doing is mainly (or even fully) to be able to defer such a 
constraint.



Is a deferred
constraint having those properties likely to be actually useful?


I believe the answer is yes.
--
Vik Fearing





Re: Does anyone ever use OPTIMIZER_DEBUG?

2023-10-02 Thread David Rowley
On Fri, 29 Sept 2023 at 10:59, Tom Lane  wrote:
> We could also discuss keeping the "tracing" aspect of it, but
> replacing debug_print_rel with pprint(rel), which'd still allow
> removal of all the "DEBUG SUPPORT" stuff at the bottom of allpaths.c.
> That's pretty much all of the maintenance-requiring stuff in it.

To assist discussion, I've attached a patch for that.

I likely can't contribute much to that discussion due to being more of
an "attach a debugger" person rather than an "add printf statements"
person.

To eliminate a hurdle for anyone who wants to chip in, I've attached
the old and new debug output from the following query:

select * from pg_class where oid = 1234;

One observation is that the output is quite a bit larger with the
patched version and does not seem as useful if you wanted
OPTIMIZER_DEBUG to help you figure out why a given Path was chosen.

David
After canonicalize_qual()
   {OPEXPR
   :opno 607
   :opfuncid 184
   :opresulttype 16
   :opretset false
   :opcollid 0
   :inputcollid 0
   :args (
  {VAR
  :varno 1
  :varattno 1
  :vartype 26
  :vartypmod -1
  :varcollid 0
  :varnullingrels (b)
  :varlevelsup 0
  :varnosyn 1
  :varattnosyn 1
  :location 29
  }
  {CONST
  :consttype 26
  :consttypmod -1
  :constcollid 0
  :constlen 4
  :constbyval true
  :constisnull false
  :location 33
  :constvalue 4 [ -46 4 0 0 0 0 0 0 ]
  }
   )
   :location 32
   }

RELOPTINFO (pg_class): rows=1 width=273
baserestrictinfo: pg_class.oid = 1234
path list:
IdxScan(pg_class) rows=1 cost=0.27..8.29

cheapest parameterized paths:
IdxScan(pg_class) rows=1 cost=0.27..8.29

cheapest startup path:
IdxScan(pg_class) rows=1 cost=0.27..8.29

cheapest total path:
IdxScan(pg_class) rows=1 cost=0.27..8.29

After canonicalize_qual()
   {OPEXPR
   :opno 607
   :opfuncid 184
   :opresulttype 16
   :opretset false
   :opcollid 0
   :inputcollid 0
   :args (
  {VAR
  :varno 1
  :varattno 1
  :vartype 26
  :vartypmod -1
  :varcollid 0
  :varnullingrels (b)
  :varlevelsup 0
  :varnosyn 1
  :varattnosyn 1
  :location 29
  }
  {CONST
  :consttype 26
  :consttypmod -1
  :constcollid 0
  :constlen 4
  :constbyval true
  :constisnull false
  :location 33
  :constvalue 4 [ -46 4 0 0 0 0 0 0 ]
  }
   )
   :location 32
   }

   {RELOPTINFO
   :reloptkind 0
   :relids (b 1)
   :rows 1
   :consider_startup false
   :consider_param_startup false
   :consider_parallel true
   :reltarget
  {PATHTARGET
  :exprs (
 {VAR
 :varno 1
 :varattno 1
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 1
 :location 7
 }
 {VAR
 :varno 1
 :varattno 2
 :vartype 19
 :vartypmod -1
 :varcollid 950
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 2
 :location 7
 }
 {VAR
 :varno 1
 :varattno 3
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 3
 :location 7
 }
 {VAR
 :varno 1
 :varattno 4
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 4
 :location 7
 }
 {VAR
 :varno 1
 :varattno 5
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 5
 :location 7
 }
 {VAR
 :varno 1
 :varattno 6
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 6
 :location 7
 }
 {VAR
 :varno 1
 :varattno 7
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 7
 :location 7
 }
 {VAR
 :varno 1
 :varattno 8
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 8
 :location 7
 }
 {VAR
 :varno 1
 :varattno 9
 :vartype 26
 :vartypmod -1
 :varcollid 0
 :varnullingrels (b)
 :varlevelsup 0
 :varnosyn 1
 :varattnosyn 9
 :location 7
 }
 {VAR
 :varno 1
 :varattno 10
 :varty

Re: pg*.dll and *.pdb files in psqlODBC have no version numbers

2023-10-02 Thread Michael Paquier
On Mon, Oct 02, 2023 at 02:28:58PM +0000, Mark Hill wrote:
> A colleague noticed that the following files in the psqlODBC MSI for Windows 
> have no version numbers:
> pgenlist.dll
> pgenlista.dll
> pgxalib.dll
> pgenlist.pdb
> pgenlista.pdb
> psqlodbc30a.pdb
> psqlodbc35w.pdb
> 
> Does anyone know if that is by design or some other reason?  Should
> they have version numbers?

Version numbers are critical in MSI installers to make sure that
components get updated, so yes, these are important depending on the
upgrade mode.  (I vaguely remember that there's a hard mode where
things are forcibly replaced, and a soft mode where only components
with newer version numbers are replaced, but this is from memory from
quite a few years ago, so I may recall incorrectly.)

> I checked an earlier build and the same holds for ODBC 12.02.

Perhaps it would be better to discuss that on the pgsql-odbc list,
where the driver is maintained, not pgsql-hackers.
--
Michael


signature.asc
Description: PGP signature


Re: Replace (stat())[7] in TAP tests with -s

2023-10-02 Thread Michael Paquier
On Mon, Oct 02, 2023 at 12:44:59PM +0100, Dagfinn Ilmari Mannsåker wrote:
> I approve of removing use of the list form of stat, it's a horrible API.

Agreed, I've applied the suggestion to use -s, like we do everywhere
else.

> If we weren't already using -s everywhere else, I would prefer
> File::stat, which makes stat (in scalar context) return an object with
> methods for the fields, so you'd do stat($file)->size.  It's included in
> Perl core since 5.4, and we're already using it in several places for
> other fields (mode and ino at least).

Right, like in 017_shm.pl.  I didn't notice that.
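
For the archives, the three styles side by side (illustrative; note
that File::stat's import overrides the built-in stat from that point in
the file, so the list form and the object form don't mix well):

    my $size1 = (stat($file))[7];     # fragile magic index
    my $size2 = -s $file;             # what the TAP tests now use

    use File::stat;                   # stat() now returns an object
    my $size3 = stat($file)->size;    # File::stat's accessor interface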

> I see another use of stat array positions (for mtime) in
> src/tools/msvc/Solution.pm, but that's on the chopping block, so not
> much point in fixing.

The removal of this code depends on a few more things; hopefully we
will be able to get rid of it during this release cycle.
--
Michael


signature.asc
Description: PGP signature


Re: CHECK Constraint Deferrable

2023-10-02 Thread Andreas Joseph Krogh


On Friday, 7 July 2023 at 13:50:44, Dilip Kumar  wrote:
On Wed, Jul 5, 2023 at 3:08 PM Himanshu Upadhyaya
 wrote:
>
> Hi,
>
> Currently, there is no support for CHECK constraint DEFERRABLE in a
> create table statement.
> SQL standard specifies that CHECK constraint can be defined as DEFERRABLE.

I think this is a valid argument: this is part of the SQL standard, so
it would be a good addition to PostgreSQL. So +1 for the feature.

But I am wondering whether there are some real-world use cases for
deferred CHECK/NOT NULL constraints? I mean like for foreign key
constraints if there is a cyclic dependency between two tables then
deferring the constraint is the simplest way to insert without error.


The real-world use case, at least for me, is when using an ORM. For
large object graphs, ORMs have a tendency to INSERT first with NULLs and
then UPDATE the “NOT NULLs” later.

“Rewrite the ORM” is not an option for most of us…
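
With the proposed feature, that pattern would look something like this
(a sketch of the intended behavior under the patch, not committed
syntax; the table and values are made up):

    CREATE TABLE person (
        name  text,
        email text CHECK (email IS NOT NULL) DEFERRABLE INITIALLY DEFERRED
    );

    BEGIN;
    INSERT INTO person (name, email) VALUES ('Alice', NULL); -- ok for now
    UPDATE person SET email = 'alice@example.org' WHERE name = 'Alice';
    COMMIT;  -- the deferred CHECK is verified here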



--

Andreas Joseph Krogh


Re: CHECK Constraint Deferrable

2023-10-02 Thread Tom Lane
Vik Fearing  writes:
> On 10/2/23 21:25, Tom Lane wrote:
>> Sorry for not weighing in on this before, but ... is this a feature
>> we want at all?

> For standards conformance, I vote yes.

Only if we can actually implement it in a defensible way, which this
patch is far short of accomplishing.

>> We are very clear in the existing docs that CHECK
>> conditions must be immutable [1],

> That is what the *user* documentation says, but we both know it isn't true.
> Here is a short conversation you and I had about five years ago where 
> you defended the non-immutability of CHECK constraints:
> https://www.postgresql.org/message-id/flat/12539.1544107316%40sss.pgh.pa.us

What I intended to defend was not *checking* immutability strictly.
Our CHECK constraint implementation is based very much on the assumption
that the constraints are immutable, and nobody has proposed that we
try to remove that assumption AFAIR.  So I think the docs are fine
as-is; anybody who wants to get into monotonically-weakening constraints
is probably smart enough to work out for themselves whether it will
fly or not.

So my problem with this patch is that it does nothing about that
assumption, and yet the feature it adds seems useless without
weakening the assumption.  So what weaker assumption could we
make, and how would we modify the when-to-check rules to match
that, and what would it cost us in performance?  Without good
answers to those questions, this patch is just a facade.

> I disagree with this.  The whole point of deferring constraints is to be 
> able to do some cleanup before the consistency is checked.

What cleanup would you need that couldn't be performed beforehand
(e.g. in a BEFORE INSERT/UPDATE trigger)?  All the practical
examples that occur to me involve cross-row conditions, which
CHECK is unsuitable to enforce --- at least, without doing a
thorough implementation rethink.

I continue to assert that basing this feature on the current
CHECK implementation will produce nothing but a toy feature,
that's not only of little practical use but will be an active
foot-gun for people who expect it to do more than it can.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Vik Fearing

On 10/2/23 20:07, Dagfinn Ilmari Mannsåker wrote:

Vik Fearing  writes:


No one except you has said anything about this patch.  I think it would
be good to commit it, wordsmithing aside.


FWIW I'm +1 on this patch,


Thanks.


and with Tom on dropping the "yet".  To me it
makes it sound like we intend to implement it soon (fsvo).

I am not fundamentally opposed to it, nor to any other wordsmithing the 
committer (probably Tom) wants to do.  The main point of the patch is to 
list at least some of the problems that need to be solved in a correct 
implementation.

--
Vik Fearing





Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-10-02 Thread Peter Smith
Some review comments for v5.

==
src/backend/catalog/pg_subscription.c

1. GetSubscription - comment

+ /* Get superuser for subscription owner */
+ sub->ownersuperuser = superuser_arg(sub->owner);
+

The comment doesn't seem very good.

SUGGESTION
/* Is the subscription owner a superuser? */

==

2. General - consistency

Below are the code fragments using the new Subscription field.

AlterSubscription_refresh:
must_use_password = !sub->ownersuperuser && sub->passwordrequired;

AlterSubscription:
walrcv_check_conninfo(stmt->conninfo, sub->passwordrequired &&
!sub->ownersuperuser);

LogicalRepSyncTableStart:
must_use_password = MySubscription->passwordrequired &&
!MySubscription->ownersuperuser;

run_apply_worker:
must_use_password = MySubscription->passwordrequired &&
!MySubscription->ownersuperuser;

~

It is not a difference caused by this patch, but since you are
modifying these lines anyway, I felt it would be better if all the
expressions were consistent. So, in AlterSubscription_refresh IMO it
would be better like:

BEFORE
must_use_password = !sub->ownersuperuser && sub->passwordrequired;

SUGGESTION
must_use_password = sub->passwordrequired && !sub->ownersuperuser;

==

Other than those trivial things, v5 LGTM.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




[PGDOCS] Inconsistent linkends to "monitoring" views.

2023-10-02 Thread Peter Smith
I noticed one or two "monitoring" links and linkends that are slightly
inconsistent from all the others.

~~~

>From "Dynamic Statistics Views"

pg_stat_activity, linkend="monitoring-pg-stat-activity-view" ==> ok
pg_stat_replication, linkend="monitoring-pg-stat-replication-view" ==> ok
pg_stat_wal_receiver, linkend="monitoring-pg-stat-wal-receiver-view"> ==> ok
pg_stat_recovery_prefetch,
linkend="monitoring-pg-stat-recovery-prefetch" ==> MODIFY
linkend="monitoring-pg-stat-recovery-prefetch-view"
pg_stat_subscription, linkend="monitoring-pg-stat-subscription" ==>
MODIFY linkend="monitoring-pg-stat-subscription-view"
pg_stat_ssl, linkend="monitoring-pg-stat-ssl-view" ==> ok

~~~

>From "Collected Statistics Views"

pg_stat_archiver, linkend="monitoring-pg-stat-archiver-view" ==> ok
pg_stat_bgwriter, linkend="monitoring-pg-stat-bgwriter-view" ==> ok
pg_stat_database, linkend="monitoring-pg-stat-database-view" ==> ok
pg_stat_database_conflicts,
linkend="monitoring-pg-stat-database-conflicts-view" ==> ok
pg_stat_io, linkend="monitoring-pg-stat-io-view"> ==> ok
pg_stat_replication_slots,
linkend="monitoring-pg-stat-replication-slots-view" ==> ok
pg_stat_slru, linkend="monitoring-pg-stat-slru-view" ==> ok
pg_stat_subscription_stats,
linkend="monitoring-pg-stat-subscription-stats" ==> MODIFY
linkend="monitoring-pg-stat-subscription-stats-view"
pg_stat_wal, linkend="monitoring-pg-stat-wal-view" ==> ok
pg_stat_all_tables, linkend="monitoring-pg-stat-all-tables-view" ==> ok
pg_stat_all_indexes, linkend="monitoring-pg-stat-all-indexes-view" ==> ok
pg_stat_user_functions, linkend="monitoring-pg-stat-user-functions-view" ==> ok
pg_statio_all_tables, linkend="monitoring-pg-statio-all-tables-view" ==> ok
pg_statio_all_indexes, linkend="monitoring-pg-statio-all-indexes-view" ==> ok
pg_statio_all_sequences,
linkend="monitoring-pg-statio-all-sequences-view" ==> ok

~~~

PSA a patch to make these few changes.

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Change-some-monitoring-linkends-for-consistency.patch
Description: Binary data


Re: Fixup some more appendStringInfo misusages

2023-10-02 Thread David Rowley
On Tue, 3 Oct 2023 at 11:24, David Rowley  wrote:
> This is along the same lines as 8b26769bc, f736e188c, 110d81728 and 8abc13a88.

I've pushed the patch to fix the misusages of the functions.

David




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-02 Thread Bharath Rupireddy
On Fri, Sep 29, 2023 at 5:27 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Yeah, the approach forces developers to check decodability.
> But the benefit seems smaller than the effort required, because the
> function would be used only by pg_upgrade. Could you tell me if you
> have another use case in mind? We may be able to adopt it if we have...

I'm attaching 0002 patch (on top of v45) which implements the new
decodable callback approach that I have in mind. IMO, this new
approach is extensible, better than the current approach (hard-coding
of certain WAL records that may be generated during pg_upgrade) taken
by the patch, and helps deal with the issue that custom WAL resource
managers can have with the current approach taken by the patch.

> Also, this approach cannot be backported.

Neither can the current patch as-is. I'm not looking at backporting this
feature right now, but making it as robust and extensible as possible
for PG17.

Thoughts?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v45-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description: Binary data


v45-0002-Add-rm_is_record_decodable-callback-for-WAL-rmgr.patch
Description: Binary data


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-02 Thread Dilip Kumar
On Tue, Oct 3, 2023 at 9:58 AM Bharath Rupireddy
 wrote:
>
> On Fri, Sep 29, 2023 at 5:27 PM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > Yeah, the approach forces developers to check decodability.
> > But the benefit seems smaller than the effort required, because the
> > function would be used only by pg_upgrade. Could you tell me if you
> > have another use case in mind? We may be able to adopt it if we have...
>
> I'm attaching 0002 patch (on top of v45) which implements the new
> decodable callback approach that I have in mind. IMO, this new
> approach is extensible, better than the current approach (hard-coding
> of certain WAL records that may be generated during pg_upgrade) taken
> by the patch, and helps deal with the issue that custom WAL resource
> managers can have with the current approach taken by the patch.

I have not seen the patch yet, but I like this approach better.  This
approach does not check which record types are generated during the
upgrade; instead, it directly targets which kinds of records must not be
generated after the confirmed_flush_lsn.  So if the rmgr says that no
decodable record was generated after the confirmed_flush_lsn, then we
are safe to upgrade that slot.  This seems like an extensible approach.
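
To make that concrete, the callback might take a shape like this (a
hypothetical sketch only; the 0002 patch is authoritative on the actual
name and signature):

    /* Per-rmgr callback: does this record produce anything to decode? */
    typedef bool (*rm_is_record_decodable_cb) (XLogReaderState *record);

    static bool
    xlog_is_record_decodable(XLogReaderState *record)
    {
        uint8       info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;

        /* e.g. checkpoint records produce nothing for logical decoding */
        return !(info == XLOG_CHECKPOINT_SHUTDOWN ||
                 info == XLOG_CHECKPOINT_ONLINE);
    }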


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Add support for AT LOCAL

2023-10-02 Thread Vik Fearing

On 9/29/23 09:27, Michael Paquier wrote:

On Sat, Sep 23, 2023 at 12:54:01AM +0200, Vik Fearing wrote:

On 9/22/23 23:46, cary huang wrote:

I think this feature can be a useful addition in dealing with time
zones. I have applied and tried out the patch, The feature works as
described and seems promising. The problem with compilation failure
was probably reported on CirrusCI when compiled on different
platforms. I have run the latest patch on my own Cirrus CI environment
and everything checked out fine.


Thank you for reviewing!


+| a_expr AT LOCAL            %prec AT
+    {
+        /* Use the value of the session's time zone */
+        FuncCall   *tz = makeFuncCall(SystemFuncName("current_setting"),
+                                      list_make1(makeStringConst("TimeZone", -1)),
+                                      COERCE_SQL_SYNTAX,
+                                      -1);
+        $$ = (Node *) makeFuncCall(SystemFuncName("timezone"),
+                                   list_make2(tz, $1),
+                                   COERCE_SQL_SYNTAX,
+                                   @2);
+    }

As the deparsing code introduced by this patch is showing, this leads
to a lot of extra complexity.  And, actually, this can be quite
expensive as well with these two layers of functions.  Note also that
in comparison to SQLValueFunctions, COERCE_SQL_SYNTAX does less
inlining.  So here comes my question: why doesn't this stuff just use
one underlying function to do this job?


I guess I don't understand the question.  Why do you think a single 
function that repeats what these functions do would be preferable?  I am 
not sure how doing it a different way would be better.

--
Vik Fearing





Differences between = ANY and IN?

2023-10-02 Thread Maciek Sakrejda
Hello,

My colleague has been working on submitting a patch [1] to the Ruby
Rails framework to address some of the problems discussed in [2].
Regardless of whether that change lands, the change in Rails would be
useful since people will be running Postgres versions without this
patch for a while.

My colleague's patch changes SQL generated from Ruby expressions like
`where(id: [1, 2])` . This is currently translated to roughly `WHERE
id IN (1, 2)` and would be changed to `id = ANY('{1,2}')`.

As far as we know, the expressions are equivalent, but we wanted to
double-check: are there any edge cases to consider here (other than
the pg_stat_statements behavior, of course)?

Thanks,
Maciek

[1]: https://github.com/rails/rails/pull/49388
[2]: 
https://www.postgresql.org/message-id/flat/20230209172651.cfgrebpyyr72h7fv%40alvherre.pgsql#eef3c77bc28b9922ea6b9660b0221b5d




Re: Making aggregate deserialization (and WAL receive) functions slightly faster

2023-10-02 Thread David Rowley
On Sun, 12 Feb 2023 at 23:43, David Rowley  wrote:
>
> On Sun, 12 Feb 2023 at 19:39, Tom Lane  wrote:
> > It could maybe be okay if we added the capability for StringInfoData
> > to understand (and enforce) that its "data" buffer is read-only.
> > However, that'd add overhead to every existing use-case.
>
> I'm not very excited by that.  I considered just setting maxlen = -1
> in the new function and adding Asserts to check for that in each of
> the appendStringInfo* functions. However, since the performance gains
> are not so great, I'll probably just drop the whole thing given
> there's resistance.

I know I said I'd drop this, but I was reminded of it again today.  I
ended up adjusting the patch so that it no longer adds a helper
function to stringinfo.c and instead just manually assigns the
StringInfo.data field to point to the bytea's buffer.  This follows
what's done in some existing places such as
LogicalParallelApplyLoop(), ReadArrayBinary() and record_recv() to
name a few.
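
The pattern in question looks roughly like this (a sketch; here sstate
stands for the bytea argument being deserialized):

    StringInfoData buf;

    /* Point the StringInfo at the bytea's payload instead of copying. */
    buf.data = VARDATA_ANY(sstate);
    buf.len = VARSIZE_ANY_EXHDR(sstate);
    buf.maxlen = -1;        /* marks the buffer as not appendable */
    buf.cursor = 0;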

I ran a fresh set of benchmarks on today's master with and without the
patch applied. I used the same benchmark as I did in [1].  The average
performance increase from 0 to 12 workers is about 6.6%.

This seems worthwhile to me.  Any objections?

David

[1] 
https://postgr.es/m/CAApHDvr%3De-YOigriSHHm324a40HPqcUhSp6pWWgjz5WwegR%3DcQ%40mail.gmail.com


v1-0001-Optimize-various-aggregate-deserialization-functi.patch
Description: Binary data


Re: Differences between = ANY and IN?

2023-10-02 Thread Tom Lane
Maciek Sakrejda  writes:
> My colleague's patch changes SQL generated from Ruby expressions like
> `where(id: [1, 2])` . This is currently translated to roughly `WHERE
> id IN (1, 2)` and would be changed to `id = ANY('{1,2}')`.

> As far as we know, the expressions are equivalent, but we wanted to
> double-check: are there any edge cases to consider here (other than
> the pg_stat_statements behavior, of course)?

You would find it profitable to read transformAExprIn() in parse_expr.c.
The most important points are in this comment:

 * We try to generate a ScalarArrayOpExpr from IN/NOT IN, but this is only
 * possible if there is a suitable array type available.  If not, we fall
 * back to a boolean condition tree with multiple copies of the lefthand
 * expression.  Also, any IN-list items that contain Vars are handled as
 * separate boolean conditions, because that gives the planner more scope
 * for optimization on such clauses.

If all the values in the IN form were being sent to the backend as
constants of the same datatype, I think you're okay to consider it
as exactly equivalent to =ANY.  It would likely be a good idea to
provide an explicit cast `id = ANY('{1,2}'::int[])` rather than just
hoping an unadorned literal will be taken as the type you want
(see transformAExprOpAny and thence make_scalar_array_op).

regards, tom lane




Re: CHECK Constraint Deferrable

2023-10-02 Thread Himanshu Upadhyaya
On Thu, Sep 14, 2023 at 9:57 AM vignesh C  wrote:

> 2) I was not sure, if the error message change was intentional:
> 2a)
> In Head:
> CREATE FOREIGN TABLE t9(a int CHECK(a<>0) DEFERRABLE) SERVER s1;
> ERROR:  misplaced DEFERRABLE clause
> LINE 1: CREATE FOREIGN TABLE t9(a int CHECK(a<>0) DEFERRABLE) SERVER...
>   ^
> postgres=# CREATE FOREIGN TABLE t9(a int CHECK(a<>0) DEFERRABLE) SERVER s1;
> ERROR:  "t9" is a foreign table
> DETAIL:  Foreign tables cannot have constraint triggers.
>
> 2b)
> In Head:
> postgres=# CREATE FOREIGN TABLE t2(a int CHECK(a<>0)) SERVER s1;
> CREATE FOREIGN TABLE
> postgres=# ALTER FOREIGN TABLE t2 ADD CONSTRAINT t2_chk_1 CHECK(a<>1)
> DEFERRABLE;
> ERROR:  CHECK constraints cannot be marked DEFERRABLE
>
> With patch:
> postgres=# ALTER FOREIGN TABLE t8 ADD CONSTRAINT t8_chk_1 CHECK(a<>1)
> DEFERRABLE;
> ERROR:  "t8" is a foreign table
> DETAIL:  Foreign tables cannot have constraint triggers.
>
We are creating a constraint trigger for the DEFERRED check constraint,
and per the implementation of FOREIGN tables, constraint triggers are
not allowed on them. I need to do more analysis before reaching any
conclusion; I think we can restrict this in gram.y itself.

> 3) Insert check is not deferred to commit:
> This insert check here is deferred to commit:
> postgres=# CREATE TABLE tbl (i int ) partition by range (i);
> CREATE TABLE tbl_1 PARTITION OF tbl FOR VALUES FROM (0) TO (10);
> CREATE TABLE tbl_2 PARTITION OF tbl FOR VALUES FROM (20) TO (30);
> CREATE TABLE
> CREATE TABLE
> CREATE TABLE
> postgres=# ALTER TABLE tbl ADD CONSTRAINT tbl_chk_1 CHECK(i<>1) DEFERRABLE;
> ALTER TABLE
> postgres=# begin;
> BEGIN
> postgres=*# SET CONSTRAINTS tbl_chk_1 DEFERRED;
> SET CONSTRAINTS
> postgres=*# INSERT INTO tbl values (1);
> INSERT 0 1
> postgres=*# commit;
> ERROR:  new row for relation "tbl_1" violates check constraint "tbl_chk_1"
> DETAIL:  Failing row contains (1).
>
> But the check here is not deferred to commit:
> postgres=# CREATE TABLE tbl (i int check(i<>0) DEFERRABLE) partition
> by range (i);
> CREATE TABLE tbl_1 PARTITION OF tbl FOR VALUES FROM (0) TO (10);
> CREATE TABLE tbl_2 PARTITION OF tbl FOR VALUES FROM (20) TO (30);
> CREATE TABLE
> CREATE TABLE
> CREATE TABLE
> postgres=# ALTER TABLE tbl ADD CONSTRAINT tbl_chk_1 CHECK(i<>1) DEFERRABLE;
> ALTER TABLE
> postgres=# begin;
> BEGIN
> postgres=*#  SET CONSTRAINTS tbl_chk_1 DEFERRED;
> SET CONSTRAINTS
> postgres=*# INSERT INTO tbl values (1);
> ERROR:  new row for relation "tbl_1" violates check constraint "tbl_chk_1"
> DETAIL:  Failing row contains (1).
>
Fixed in V3 patch.

> 4) There is a new warning popping up now:
> CREATE TABLE tbl_new_3 (i int check(i<>0)) partition by range (i);
> CREATE FOREIGN TABLE ftbl_new_3 PARTITION OF tbl_new_3 FOR VALUES FROM
> (40) TO (50) server s1;
> postgres=# ALTER TABLE tbl_new_3 ADD CONSTRAINT tbl_new_3_chk
> CHECK(i<>1) DEFERRABLE;
> WARNING:  unexpected pg_constraint record found for relation "tbl_new_3"
> ERROR:  "ftbl_new_3" is a foreign table
> DETAIL:  Foreign tables cannot have constraint triggers.
>
Fixed in V3 patch.

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: pg_stat_statements and "IN" conditions

2023-10-02 Thread Maciek Sakrejda
I've also tried the patch and I see the same results as Jakub, which
make sense to me. I did have issues getting it to apply, though: `git
am` complains about a conflict, although the `patch` utility itself was
able to apply it.




Re: Differences between = ANY and IN?

2023-10-02 Thread Maciek Sakrejda
Great, thanks for the guidance!




Re: Fail hard if xlogreader.c fails on out-of-memory

2023-10-02 Thread Michael Paquier
On Thu, Sep 28, 2023 at 09:36:37AM +0900, Michael Paquier wrote:
> If none, I propose to apply the patch to switch to palloc() instead of
> palloc_extended(NO_OOM) in this code around the beginning of next
> week, down to 12.

Done down to 12 as of 6b18b3fe2c2f, then.
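
For the archives, the change is roughly of this shape (illustrative):

    /* before: returns NULL on OOM, which looked like a bogus record */
    state->readRecordBuf = palloc_extended(newSize, MCXT_ALLOC_NO_OOM);

    /* after: out of memory is reported as an error instead */
    state->readRecordBuf = palloc(newSize);
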
--
Michael


signature.asc
Description: PGP signature


Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-10-02 Thread vignesh C
On Tue, 3 Oct 2023 at 06:09, Peter Smith  wrote:
>
> Some review comments for v5.
>
> ==
> src/backend/catalog/pg_subscription.c
>
> 1. GetSubscription - comment
>
> + /* Get superuser for subscription owner */
> + sub->ownersuperuser = superuser_arg(sub->owner);
> +
>
> The comment doesn't seem very good.
>
> SUGGESTION
> /* Is the subscription owner a superuser? */

Modified

> ==
>
> 2. General - consistency
>
> Below are the code fragments using the new Subscription field.
>
> AlterSubscription_refresh:
> must_use_password = !sub->ownersuperuser && sub->passwordrequired;
>
> AlterSubscription:
> walrcv_check_conninfo(stmt->conninfo, sub->passwordrequired &&
> !sub->ownersuperuser);
>
> LogicalRepSyncTableStart:
> must_use_password = MySubscription->passwordrequired &&
> !MySubscription->ownersuperuser;
>
> run_apply_worker:
> must_use_password = MySubscription->passwordrequired &&
> !MySubscription->ownersuperuser;
>
> ~
>
> It is not a difference caused by this patch, but since you are
> modifying these lines anyway, I felt it would be better if all the
> expressions were consistent. So, in AlterSubscription_refresh IMO it
> would be better like:
>
> BEFORE
> must_use_password = !sub->ownersuperuser && sub->passwordrequired;
>
> SUGGESTION
> must_use_password = sub->passwordrequired && !sub->ownersuperuser;

Modified

Thanks for the comments. The attached v6 patch includes these changes.

Regards,
Vignesh
From 228c1439c722d74d1bc8746541e0fb0fcf127b57 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 22 Sep 2023 15:12:23 +0530
Subject: [PATCH v6] Restart the apply worker if the subscription owner has
 changed from superuser to non-superuser.

Restart the apply worker if the subscription owner has changed from
superuser to non-superuser. This is required so that the subscription
connection string gets revalidated to identify cases where the
password option is not specified as part of the connection string for
non-superuser.
---
 src/backend/catalog/pg_subscription.c   |  3 +++
 src/backend/commands/subscriptioncmds.c |  4 +--
 src/backend/replication/logical/tablesync.c |  3 +--
 src/backend/replication/logical/worker.c| 30 ++---
 src/include/catalog/pg_subscription.h   |  1 +
 src/test/subscription/t/027_nosuperuser.pl  | 24 +
 6 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..d6a978f136 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -108,6 +108,9 @@ GetSubscription(Oid subid, bool missing_ok)
    Anum_pg_subscription_suborigin);
 	sub->origin = TextDatumGetCString(datum);
 
+	/* Is the subscription owner a superuser? */
+	sub->ownersuperuser = superuser_arg(sub->owner);
+
 	ReleaseSysCache(tup);
 
 	return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 6fe111e98d..edc82c11be 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -869,7 +869,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 	load_file("libpqwalreceiver", false);
 
 	/* Try to connect to the publisher. */
-	must_use_password = !superuser_arg(sub->owner) && sub->passwordrequired;
+	must_use_password = sub->passwordrequired && !sub->ownersuperuser;
 	wrconn = walrcv_connect(sub->conninfo, true, must_use_password,
 			sub->name, &err);
 	if (!wrconn)
@@ -1249,7 +1249,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 			load_file("libpqwalreceiver", false);
 			/* Check the connection info string. */
 			walrcv_check_conninfo(stmt->conninfo,
-  sub->passwordrequired && !superuser_arg(sub->owner));
+  sub->passwordrequired && !sub->ownersuperuser);
 
 			values[Anum_pg_subscription_subconninfo - 1] =
 CStringGetTextDatum(stmt->conninfo);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e2cee92cf2..d87ac474b1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1278,9 +1278,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 
 	/* Is the use of a password mandatory? */
 	must_use_password = MySubscription->passwordrequired &&
-		!superuser_arg(MySubscription->owner);
+		!MySubscription->ownersuperuser;
 
-	/* Note that the superuser_arg call can access the DB */
 	CommitTransactionCommand();
 
 	SpinLockAcquire(&MyLogicalRepWorker->relmutex);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 597947410f..d056407419 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3966,6 +3966,24 @@ maybe_reread_subscription(void)
 		apply_worker_exit();
 	}
 
+	/*
+	 * Exit if the owner of the subscription has changed from su

Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-10-02 Thread Peter Smith
On Tue, Oct 3, 2023 at 5:42 PM vignesh C  wrote:
>
> Thanks for the comments, the attached v6 version patch has the changes
> for the same.
>

v6 LGTM.

==
Kind Regards,
Peter Smith.
Fujitsu Australia