Re: Remove unneeded PGDATABASE setting from TAP tests

2024-01-01 Thread Bharath Rupireddy
On Sun, Dec 31, 2023 at 8:36 AM Michael Paquier  wrote:
>
> On Sun, Dec 31, 2023 at 07:24:08AM +0530, Bharath Rupireddy wrote:
> > Some of the TAP tests are explicitly setting PGDATABASE environment
> > variable to 'postgres', which isn't needed because the TAP test's perl
> > library PostgreSQL::Test::Cluster already sets it once and for all.
> > I've attached a patch that removes all such unneeded PGDATABASE
> > settings.
> >
> > Thoughts?
>
> Hmm.  I don't see any reason to not do that as these are not really
> self-documenting either.  How about 011_clusterdb_all.pl for
> PGDATABASE?

Oh, yeah. We can remove that too; after all, PGDATABASE is being set
to 'postgres', the default database, which PostgreSQL::Test::Cluster
does set for us. I was earlier swayed by the comment atop the
PGDATABASE setting in 011_clusterdb_all.pl. Attached is the v2 patch
with this change.

We could perhaps reword and retain the comment in 011_clusterdb_all.pl,
but I think that's unnecessary, as the code around the 'alldb' if-else
condition in clusterdb.c makes it clear.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v2-0001-Remove-unneeded-PGDATABASE-setting-from-TAP-tests.patch
Description: Binary data


Re: Assorted typo fixes

2024-01-01 Thread Shubham Khanna
On Thu, Dec 28, 2023 at 3:21 AM Dagfinn Ilmari Mannsåker
 wrote:
>
> Hi folks,
>
> I was playing around with the `typos` tool
> (https://github.com/crate-ci/typos), and thought I'd run it on the
> postgres repo for fun.  After a bit of tweaking to get rid of most false
> positives (see separately attached .typos.toml file), it came up with a
> useful set of suggestions, some of which I applied verbatim, others
> which needed a bit more rewording.
>
> Attached is a series of patches.  The first one are what I consider
> obvious, unambiguous fixes to code comments.  The subsequent ones are
> fixes for actual code (variable, function, type names) and docs, one
> patch per class of typo.  As far as I can tell, none of the code changes
> (except the ECPG one, see below) affect anything exported, so this
> should not cause any compatibility issues for extensions.
>
> The ECPG change affects the generated C code, but from my reading of the
> callers in descriptor.c and ecpg.trailer, any code that would have
> caused it to encounter the affected enum value would fail to compile, so
> either the case is not possible, or nobody actually uses whatever syntax
> is affected (I don't know enough about ECPG to tell without spending far
> too much time digging in the code).

I was reviewing the Patch and came across a minor issue that the Patch
does not apply on the current Head. Please provide the updated version
of the patch. Also, I found one typo:
0008-ecpg-fix-typo-in-get_dtype-return-value-for-ECPGd_co.patch
All the other enum values return a string mathing the enum label, but
this has had a trailing r since the function was added in commit
339a5bbfb17ecd171ebe076c5bf016c4e66e2c0a

 Here 'mathing' should be 'matching'.

Thanks and Regards,
Shubham Khanna.




Re: speed up a logical replica setup

2024-01-01 Thread vignesh C
On Wed, 6 Dec 2023 at 12:53, Euler Taveira  wrote:
>
> On Thu, Nov 9, 2023, at 8:12 PM, Michael Paquier wrote:
>
> On Thu, Nov 09, 2023 at 03:41:53PM +0100, Peter Eisentraut wrote:
> > On 08.11.23 00:12, Michael Paquier wrote:
> >> - Should the subdirectory pg_basebackup be renamed into something more
> >> generic at this point?  All these things are frontend tools that deal
> >> in some way with the replication protocol to do their work.  Say
> >> a replication_tools?
> >
> > Seems like unnecessary churn.  Nobody has complained about any of the other
> > tools in there.
>
> Not sure.  We rename things across releases in the tree from time to
> time, and here that's straight-forward.
>
>
> Based on this discussion it seems we have a consensus that this tool
> should be in the pg_basebackup directory. (If/when we agree with the
> directory renaming, it could be done in a separate patch.) Besides this
> move, the v3 provides a dry run mode. It basically executes every routine
> but skips the steps that would make modifications. It is a useful option
> to check whether you will be able to run it without having issues with
> connectivity, permissions, and existing objects (replication slots,
> publications, subscriptions). Tests were slightly improved. Messages were
> changed to *not* provide INFO messages by default; --verbose provides
> INFO messages and --verbose --verbose also provides DEBUG messages. I
> also refactored the connect_database() function so that the connection
> will always use the logical replication mode. A bug was fixed in the
> transient replication slot name. Ashutosh's review [1] was included. The
> code was also indented.
>
> There are a few suggestions from Ashutosh [2] that I will reply to in
> another email.
>
> I'm still planning to work on the following points:
>
> 1. improve the cleanup routine to point out leftover objects if there is
>any connection issue.
> 2. remove the physical replication slot if the standby is using one
>(primary_slot_name).
> 3. provide instructions to promote the logical replica into primary, I
>mean, stop the replication between the nodes and remove the
>replication setup (publications, subscriptions, replication slots).
>Or even include another action to do it. We could add both too.
>
> Point 1 should be done. Points 2 and 3 aren't essential but will provide
> a nice UI for users that would like to use it.

1) This Assert can fail if the source server is shut down:
+static void
+drop_replication_slot(PGconn *conn, LogicalRepInfo *dbinfo, const
char *slot_name)
+{
+   PQExpBuffer str = createPQExpBuffer();
+   PGresult   *res;
+
+   Assert(conn != NULL);

I could simulate it by shutting down the primary while trying to reach
the consistent state:
pg_subscriber: postmaster reached the consistent state
pg_subscriber: error: connection to database failed: connection to
server at "localhost" (127.0.0.1), port 5432 failed: Connection
refused
Is the server running on that host and accepting TCP/IP connections?
pg_subscriber: error: connection to database failed: connection to
server at "localhost" (127.0.0.1), port 5432 failed: Connection
refused
Is the server running on that host and accepting TCP/IP connections?
pg_subscriber: error: connection to database failed: connection to
server at "localhost" (127.0.0.1), port 5432 failed: Connection
refused
Is the server running on that host and accepting TCP/IP connections?
pg_subscriber: pg_subscriber.c:692: drop_replication_slot: Assertion
`conn != ((void *)0)' failed.
Aborted

2) Should we have some checks to see if the max replication slot
configuration is ok based on the number of slots that will be created?
We have similar checks when upgrading replication slots, in
check_new_cluster_logical_replication_slots().

3) Should we check if wal_level is set to logical? We have similar
checks when upgrading replication slots, in
check_new_cluster_logical_replication_slots().

4) The physical replication slot that was created will still be present
on the primary node; I felt this should be removed.

5) I felt the target server should be started before pg_subscriber
completes:
+   /*
+* Stop the subscriber.
+*/
+   pg_log_info("stopping the subscriber");
+
+   pg_ctl_cmd = psprintf("\"%s\" stop -D \"%s\" -s", pg_ctl_path,
subscriber_dir);
+   rc = system(pg_ctl_cmd);
+   pg_ctl_status(pg_ctl_cmd, rc, 0);
+
+   /*
+* Change system identifier.
+*/
+   modify_sysid(pg_resetwal_path, subscriber_dir);
+
+   success = true;
+
+   pg_log_info("Done!");
+
+   return 0;

Regards,
Vignesh




Update for copyright messages to 2024 (Happy New Year!)

2024-01-01 Thread Michael Paquier
Hi,
(CC-ing Bruce)

As of this new year, and in the vein of c8e1ba736b2b: Bruce, are you
planning an update of the copyright dates with a run of
./src/tools/copyright.pl on HEAD, and the smallish updates of the back
branches?

And of course, Happy New Year to all!
--
Michael


signature.asc
Description: PGP signature


Re: Track in pg_replication_slots the reason why slots conflict?

2024-01-01 Thread Amit Kapila
On Mon, Jan 1, 2024 at 12:32 PM shveta malik  wrote:
>
> PFA v3 after changing column name to 'conflict_reason'
>

Few minor comments:
===
1.
+  
+   wal_removed = required WAL has been removed.
+  
+ 
+ 
+  
+   rows_removed = required rows have been removed.
+  
+ 
+ 
+  
+   wal_level_insufficient = wal_level
insufficient on the primary server.
+  

Should we use the same style to write the description as we are using
for the wal_status column? For example, wal_removed
means that the required WAL has been removed.

2.
+  
+   The reason of logical slot's conflict with recovery.

My grammar tool says it should be: "The reason for the logical slot's
conflict with recovery."

Other than these minor comments, the patch looks good to me.

-- 
With Regards,
Amit Kapila.




Re: Autonomous transactions 2023, WIP

2024-01-01 Thread Ivan Kush



On 01.01.2024 09:47, Pavel Stehule wrote:
> All use cases of pg_background, except asynchronous execution. If
> later add asynchronous execution, then all =)
>
> For example, also:
>
> * conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.
>
> * possibility to create functions that calls utility statements, like
> VACUUM, etc.
>
>
> almost all these tasks are more or less dirty. It is a serious
> question if we want to integrate pg_background to core.

What do you mean by the "dirty"?

> I don't have good benchmarks now. Some simple, like many INSERTs. Pool
> gives advantage, more tps compared to pg_background with increasing
> number of backends.
>
> The main advantage over pg_background is pool of workers. In this
> patch separate pool is created for each backend. At the current time
> I'm coding one shared pool for all backends.
>
>
> I afraid so this solution can be very significantly slower than
> logging to postgres log or forwarding to syslog

Maybe. Need to benchmark. Also OLAP like ClickHouse is better for
storing logs.

But in this case (log file -> database) a company needs to write a
custom utility to load the log file to the database:

* detect file size,

* load to database

* autorotate file by timeout or filesize

* etc.

Some of our customers use "Autonomous transactions" for logging =)

--
Best wishes,
Ivan Kush
Tantor Labs LLC





Re: Track in pg_replication_slots the reason why slots conflict?

2024-01-01 Thread shveta malik
On Mon, Jan 1, 2024 at 4:30 PM Amit Kapila  wrote:
>
> On Mon, Jan 1, 2024 at 12:32 PM shveta malik  wrote:
> >
> > PFA v3 after changing column name to 'conflict_reason'
> >
>
> Few minor comments:
> ===
> 1.
> +  
> +   wal_removed = required WAL has been removed.
> +  
> + 
> + 
> +  
> +   rows_removed = required rows have been removed.
> +  
> + 
> + 
> +  
> +   wal_level_insufficient = wal_level
> insufficient on the primary server.
> +  
>
> Should we use the same style to write the description as we are using
> for the wal_status column? For example, wal_removed
> means that the required WAL has been removed.
>
> 2.
> +  
> +   The reason of logical slot's conflict with recovery.
>
> My grammar tool says it should be: "The reason for the logical slot's
> conflict with recovery."
>
> Other than these minor comments, the patch looks good to me.

PFA  v4 which addresses the doc comments.

thanks
Shveta


v4-0001-Track-conflicting_reason-in-pg_replication_slots.patch
Description: Binary data


Re: Track in pg_replication_slots the reason why slots conflict?

2024-01-01 Thread shveta malik
On Mon, Jan 1, 2024 at 5:17 PM shveta malik  wrote:
>
> On Mon, Jan 1, 2024 at 4:30 PM Amit Kapila  wrote:
> >
> > On Mon, Jan 1, 2024 at 12:32 PM shveta malik  wrote:
> > >
> > > PFA v3 after changing column name to 'conflict_reason'
> > >
> >
> > Few minor comments:
> > ===
> > 1.
> > +  
> > +   wal_removed = required WAL has been removed.
> > +  
> > + 
> > + 
> > +  
> > +   rows_removed = required rows have been 
> > removed.
> > +  
> > + 
> > + 
> > +  
> > +   wal_level_insufficient = wal_level
> > insufficient on the primary server.
> > +  
> >
> > Should we use the same style to write the description as we are using
> > for the wal_status column? For example, wal_removed
> > means that the required WAL has been removed.
> >
> > 2.
> > +  
> > +   The reason of logical slot's conflict with recovery.
> >
> > My grammar tool says it should be: "The reason for the logical slot's
> > conflict with recovery."
> >
> > Other than these minor comments, the patch looks good to me.
>
> PFA  v4 which addresses the doc comments.

Please ignore the previous patch and PFA new v4 (v4_2). The only
change from the earlier v4 is the subject correction in commit msg.

thanks
Shveta


v4_2-0001-Track-conflict_reason-in-pg_replication_slots.patch
Description: Binary data


Re: add AVX2 support to simd.h

2024-01-01 Thread John Naylor
On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart
 wrote:
> I don't intend for this patch to be
> seriously considered until we have better support for detecting/compiling
> AVX2 instructions and a buildfarm machine that uses them.

That's completely understandable, yet I'm confused why there is a
commitfest entry for it marked "needs review".




Re: warn if GUC set to an invalid shared library

2024-01-01 Thread John Naylor
On Thu, Dec 28, 2023 at 12:27 PM Shubham Khanna
 wrote:
>
> I was reviewing the Patch and came across a minor issue that the Patch
> does not apply on the current Head. Please provide the updated version
> of the patch.

For your information, the commitfest manager has the ability to send
private messages to authors about procedural issues like this. There
is no need to tell the whole list about it.




Re: Proposal to include --exclude-extension Flag in pg_dump

2024-01-01 Thread Ashutosh Bapat
On Mon, Dec 25, 2023 at 3:48 PM Ayush Vatsa  wrote:
>
> Hi PostgreSQL Community,
> Recently I have been working on pg_dump regarding my project and wanted to 
> exclude an extension from the dump generated. I wonder why it doesn't have 
> --exclude-extension type of support whereas --extension exists!
> Since I needed that support, I took the initiative to contribute to the 
> community by adding the --exclude-extension flag.

Aren't extensions excluded by default? That's why we have --extension.
Why do we need to explicitly exclude extensions?

-- 
Best Wishes,
Ashutosh Bapat




Re: Autonomous transactions 2023, WIP

2024-01-01 Thread Pavel Stehule
po 1. 1. 2024 v 12:15 odesílatel Ivan Kush 
napsal:

>
> On 01.01.2024 09:47, Pavel Stehule wrote:
> >
> >
> > All use cases of pg_background, except asynchronous execution. If
> > later
> > add asynchronous execution, then all =)
> >
> > For example, also:
> >
> > * conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.
> >
> > * possibility to create functions that calls utility statements, like
> > VACUUM, etc.
> >
> >
> > almost all these tasks are more or less dirty. It is a serious
> > question if we want to integrate pg_background to core.
>
> What do you mean by the "dirty"?
>

I don't think these tasks should be implemented in stored procedures


>
> >
> > I don't have good benchmarks now. Some simple, like many INSERTs.
> > Pool
> > gives advantage, more tps compared to pg_background with increasing
> > number of backends.
> >
> > The main advantage over pg_background is pool of workers. In this
> > patch
> > separate pool is created for each backend. At the current time I'm
> > coding one shared pool for all backends.
> >
> >
> > I afraid so this solution can be very significantly slower than
> > logging to postgres log or forwarding to syslog
>
> Maybe. Need to benchmark. Also OLAP like ClickHouse is better for
> storing logs.
>
> But in this case (log file -> database) a company needs to write a
> custom utility to load log file to the database:
>
> * detect file size,
>
> * load to database
>
> * autorotate file by timeout of filesize
>
> * etc.
>
> Some of our customers use "Autonomous transactions" for logging =)
>

I understand the motivation. But it was designed 20 years ago, and I don't
see a reason why we need to implement the same "bad" patterns, mainly when
the proposed implementation is not fully robust or can have performance
issues.




>
> --
> Best wishes,
> Ivan Kush
> Tantor Labs LLC
>
>


Re: Proposal to include --exclude-extension Flag in pg_dump

2024-01-01 Thread Ayush Vatsa
Hi,
> Aren't extensions excluded by default? That's why we have --extension.
According to the documentation of pg_dump, when the --extension option
is not specified, all non-system extensions in the target database will
get dumped.
> Why do we need to explicitly exclude extensions?
Hence to include only a few we use --extension, but to exclude a few I am
proposing --exclude-extension.

On Mon, 1 Jan 2024 at 18:18, Ashutosh Bapat 
wrote:

> On Mon, Dec 25, 2023 at 3:48 PM Ayush Vatsa 
> wrote:
> >
> > Hi PostgreSQL Community,
> > Recently I have been working on pg_dump regarding my project and wanted
> to exclude an extension from the dump generated. I wonder why it doesn't
> have --exclude-extension type of support whereas --extension exists!
> > Since I needed that support, I took the initiative to contribute to the
> community by adding the --exclude-extension flag.
>
> Aren't extensions excluded by default? That's why we have --extension.
> Why do we need to explicitly exclude extensions?
>
> --
> Best Wishes,
> Ashutosh Bapat
>


Re: Shared detoast Datum proposal

2024-01-01 Thread Andy Fan

Andy Fan  writes:

>
> Some Known issues:
> --
>
> 1. Currently only Scan & Join nodes are considered for this feature.
> 2. JIT is not adapted for this purpose yet.

JIT is adapted for this feature in v2. Any feedback is welcome.

-- 
Best Regards
Andy Fan

>From ee44d4721147589dbba8366382a18adeee05419b Mon Sep 17 00:00:00 2001
From: "yizhi.fzh" 
Date: Wed, 27 Dec 2023 18:43:56 +0800
Subject: [PATCH v2 1/1] shared detoast feature.

---
 src/backend/executor/execExpr.c  |  65 ++-
 src/backend/executor/execExprInterp.c| 181 +++
 src/backend/executor/execTuples.c| 130 +
 src/backend/executor/execUtils.c |   5 +
 src/backend/executor/nodeHashjoin.c  |   2 +
 src/backend/executor/nodeMergejoin.c |   2 +
 src/backend/executor/nodeNestloop.c  |   1 +
 src/backend/jit/llvm/llvmjit_expr.c  |  22 +
 src/backend/jit/llvm/llvmjit_types.c |   1 +
 src/backend/nodes/bitmapset.c|  13 +
 src/backend/optimizer/plan/createplan.c  |  73 ++-
 src/backend/optimizer/plan/setrefs.c | 536 +++
 src/include/executor/execExpr.h  |   6 +
 src/include/executor/tuptable.h  |  60 +++
 src/include/nodes/bitmapset.h|   1 +
 src/include/nodes/execnodes.h|   5 +
 src/include/nodes/plannodes.h|  50 ++
 src/test/regress/sql/shared_detoast_slow.sql |  70 +++
 18 files changed, 1117 insertions(+), 106 deletions(-)
 create mode 100644 src/test/regress/sql/shared_detoast_slow.sql

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 2c62b0c9c8..749bcc9023 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -935,22 +935,81 @@ ExecInitExprRec(Expr *node, ExprState *state,
 }
 else
 {
+	int			attnum;
+	Plan	   *plan = state->parent ? state->parent->plan : NULL;
+
 	/* regular user column */
 	scratch.d.var.attnum = variable->varattno - 1;
 	scratch.d.var.vartype = variable->vartype;
+	attnum = scratch.d.var.attnum;
+
 	switch (variable->varno)
 	{
 		case INNER_VAR:
-			scratch.opcode = EEOP_INNER_VAR;
+
+			if (is_join_plan(plan) &&
+bms_is_member(attnum,
+			  ((JoinState *) state->parent)->inner_pre_detoast_attrs))
+			{
+/* debug purpose. */
+if (!jit_enabled)
+{
+	elog(INFO,
+		 "EEOP_INNER_VAR_TOAST: flags = %d costs=%.2f..%.2f, attnum: %d",
+		 state->flags,
+		 plan->startup_cost,
+		 plan->total_cost,
+		 attnum);
+}
+scratch.opcode = EEOP_INNER_VAR_TOAST;
+			}
+			else
+			{
+scratch.opcode = EEOP_INNER_VAR;
+			}
 			break;
 		case OUTER_VAR:
-			scratch.opcode = EEOP_OUTER_VAR;
+			if (is_join_plan(plan) &&
+bms_is_member(attnum,
+			  ((JoinState *) state->parent)->outer_pre_detoast_attrs))
+			{
+/* debug purpose. */
+if (!jit_enabled)
+{
+	elog(INFO,
+		 "EEOP_OUTER_VAR_TOAST: flags = %u costs=%.2f..%.2f, attnum: %d",
+		 state->flags,
+		 plan->startup_cost,
+		 plan->total_cost,
+		 attnum);
+}
+scratch.opcode = EEOP_OUTER_VAR_TOAST;
+			}
+			else
+scratch.opcode = EEOP_OUTER_VAR;
 			break;
 
 			/* INDEX_VAR is handled by default case */
 
 		default:
-			scratch.opcode = EEOP_SCAN_VAR;
+			if (is_scan_plan(plan) && bms_is_member(
+	attnum,
+	((ScanState *) state->parent)->scan_pre_detoast_attrs))
+			{
+if (!jit_enabled)
+{
+	elog(INFO,
+		 "EEOP_SCAN_VAR_TOAST: flags = %u costs=%.2f..%.2f, scanId: %d, attnum: %d",
+		 state->flags,
+		 plan->startup_cost,
+		 plan->total_cost,
+		 ((Scan *) plan)->scanrelid,
+		 attnum);
+}
+scratch.opcode = EEOP_SCAN_VAR_TOAST;
+			}
+			else
+scratch.opcode = EEOP_SCAN_VAR;
 			break;
 	}
 }
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 24c2b60c62..b5a464bf80 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -57,6 +57,7 @@
 #include "postgres.h"
 
 #include "access/heaptoast.h"
+#include "access/detoast.h"
 #include "catalog/pg_type.h"
 #include "commands/sequence.h"
 #include "executor/execExpr.h"
@@ -157,6 +158,9 @@ static void ExecEvalRowNullInt(ExprState *state, ExprEvalStep *op,
 static Datum ExecJustInnerVar(ExprState *state, ExprContext *econtext, bool *isnull);
 static Datum ExecJustOuterVar(ExprState *state, ExprContext *econtext, bool *isnull);
 static Datum ExecJustScanVar(ExprState *state, ExprContext *econtext, bool *isnull);
+static Datum ExecJustInnerVarToast(ExprState *state, ExprContext *econtext, boo

Re: Transaction timeout

2024-01-01 Thread Andrey M. Borodin


> On 29 Dec 2023, at 16:15, Andrey M. Borodin  wrote:

PFA v20. Code steps are intact.

Further refactored tests:
1. Check termination of active and idle queries (previously the tests
from Li were testing only termination of an idle query)
2. Check timeout rescheduling (even when the last active query was 'SET
transaction_timeout')
3. Check that the timeout is not rescheduled by new queries (Nik's case)


Do we have any other open items?
I've left 'make check-timeouts' in the isolation directory; it's for
development purposes and I think we should remove it before committing.
Obviously, all patch steps are expected to be squashed before commit.


Best regards, Andrey Borodin.


v20-0001-Introduce-transaction_timeout.patch
Description: Binary data


v20-0004-fix-reschedule-timeout-for-each-commmand.patch
Description: Binary data


v20-0003-Try-to-enable-transaction_timeout-before-next-co.patch
Description: Binary data


v20-0002-Add-better-tests-for-transaction_timeout.patch
Description: Binary data


Re: Commitfest manager January 2024

2024-01-01 Thread Magnus Hagander
On Mon, Jan 1, 2024 at 4:35 AM vignesh C  wrote:
>
> On Sun, 24 Dec 2023 at 18:40, vignesh C  wrote:
> >
> > On Sun, 24 Dec 2023 at 07:16, Michael Paquier  wrote:
> > >
> > > On Sat, Dec 23, 2023 at 08:52:38AM +0530, vignesh C wrote:
> > > > I didn't see anyone volunteering for the January Commitfest, so I'll
> > > > volunteer to be CF manager for January 2024 Commitfest.
> > >
> > > (Adding Magnus in CC.)
> > >
> > > That would be really helpful.  Thanks for helping!  Do you have the
> > > admin rights on the CF app?  You are going to require them in order to
> > > mark the CF as in-process, and you would also need to switch the CF
> > > after that from "Future" to "Open" so as people can still post
> > > patches once January one begins.
> >
> > I don't have admin rights for the CF app. Please provide admin rights.
>
> I have not yet got the admin rights, Kindly provide admin rights for the CF 
> app.

It's been the Christmas holidays...


What's your community username?

//Magnus




Re: add function argument names to regex* functions.

2024-01-01 Thread Dian Fay
On Wed Dec 27, 2023 at 10:28 PM EST, jian he wrote:
> On Thu, Dec 28, 2023 at 6:25 AM Peter Eisentraut  wrote:
> >
> > On 27.12.23 17:53, jian he wrote:
> > > similar to [1], add function argument names to the following functions:
> > > regexp_like, regexp_match,regexp_matches,regexp_replace,
> > > regexp_substr,regexp_split_to_array,regexp_split_to_table,regexp_count
> >
> > Note that these functions are a quasi-standard that is shared with other
> > SQL implementations.  It might be worth looking around if there are
> > already other implementations of this idea.
> >
>
> turns out people do like calling functions via explicitly mentioning
> function argument names, example: [0]
> There are no provisions for the argument names.
>
> I looked around the Oracle implementation in [1], and the Oracle
> regex-related function argument names in [2].
> I used the doc [3] syntax explanation and added the related argument names.
>
> Currently, the regex.* function syntax seems fine, but only the
> parameter `N` seems a little bit weird.
> If we change the function's argument name, we also need to change the
> function syntax explanation in the doc, and vice versa.
>
> QUOTE:
> The regexp_instr function returns the starting or ending position of
> the N'th match of a POSIX regular expression pattern to a string, or
> zero if there is no such match. It has the syntax regexp_instr(string,
> pattern [, start [, N [, endoption [, flags [, subexpr ]]]]]). pattern
> is searched for in string, normally from the beginning of the string,
> but if the start parameter is provided then beginning from that
> character index. If N is specified then the N'th match of the pattern
> is located, otherwise the first match is located.
> END OF QUOTE.
>
> maybe we can change `N` to occurrence. but `occurrence` is kind of verbose.
>
> [0] 
> https://stackoverflow.com/questions/33387348/oracle-named-parameters-in-regular-functions
> [1] 
> https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/REGEXP_SUBSTR.html#GUID-2903904D-455F-4839-A8B2-1731EF4BD099
> [2] https://dbfiddle.uk/h_SBDEKi
> [3] 
> https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP

I've been trying to use named arguments more diligently, so expanding
support for built-in functions is welcome. The patch applies cleanly and
works as advertised.

I agree that the parameter name `n` is not ideal. For example, in
`regexp_replace` it's easy to misinterpret it as "make up to n
replacements". This has not been a problem when `n` only lives in the
documentation which explains exactly what it does, but that context is
not readily available in code expressing `n => 3`.

Another possibility is `index`, which is relatively short and not a
reserved keyword ^1. `position` is not as precise but would avoid the
conceptual overloading of ordinary indices.

1. https://www.postgresql.org/docs/current/sql-keywords-appendix.html




Minor cleanup for search path cache

2024-01-01 Thread Tom Lane
I happened to notice that there is a not-quite-theoretical crash
hazard in spcache_init().  If we see that SPCACHE_RESET_THRESHOLD
is exceeded and decide to reset the cache, but then nsphash_create
fails for some reason (perhaps OOM), an error will be thrown
leaving the SearchPathCache pointer pointing at already-freed
memory.  Next time through, we'll try to dereference that dangling
pointer, potentially causing SIGSEGV, or worse we might find a
value less than SPCACHE_RESET_THRESHOLD and decide that the cache
is okay despite having been freed.

The fix of course is to make sure we reset the pointer variables
*before* the MemoryContextReset.

I also observed that the code seems to have been run through
pgindent without fixing typedefs.list, making various places
uglier than they should be.

The attached proposed cleanup patch fixes those things and in
passing improves (IMO anyway) some comments.  I assume it wasn't
intentional to leave two copies of the same comment block in
check_search_path().

regards, tom lane

diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 37a69e9023..465a4e40bc 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -156,6 +156,11 @@ static Oid	namespaceUser = InvalidOid;
 
 /* The above four values are valid only if baseSearchPathValid */
 static bool baseSearchPathValid = true;
+
+/*
+ * Storage for search path cache.  Clear searchPathCacheValid as a simple
+ * way to invalidate *all* the cache entries, not just the active one.
+ */
 static bool searchPathCacheValid = false;
 static MemoryContext SearchPathCacheContext = NULL;
 
@@ -163,7 +168,7 @@ typedef struct SearchPathCacheKey
 {
 	const char *searchPath;
 	Oid			roleid;
-}			SearchPathCacheKey;
+} SearchPathCacheKey;
 
 typedef struct SearchPathCacheEntry
 {
@@ -176,7 +181,7 @@ typedef struct SearchPathCacheEntry
 
 	/* needed for simplehash */
 	char		status;
-}			SearchPathCacheEntry;
+} SearchPathCacheEntry;
 
 /*
  * myTempNamespace is InvalidOid until and unless a TEMP namespace is set up
@@ -281,8 +286,8 @@ spcachekey_equal(SearchPathCacheKey a, SearchPathCacheKey b)
  */
 #define SPCACHE_RESET_THRESHOLD		256
 
-static nsphash_hash * SearchPathCache = NULL;
-static SearchPathCacheEntry * LastSearchPathCacheEntry = NULL;
+static nsphash_hash *SearchPathCache = NULL;
+static SearchPathCacheEntry *LastSearchPathCacheEntry = NULL;
 
 /*
  * Create or reset search_path cache as necessary.
@@ -296,8 +301,11 @@ spcache_init(void)
 		SearchPathCache->members < SPCACHE_RESET_THRESHOLD)
 		return;
 
-	MemoryContextReset(SearchPathCacheContext);
+	/* make sure we don't leave dangling pointers if nsphash_create fails */
+	SearchPathCache = NULL;
 	LastSearchPathCacheEntry = NULL;
+
+	MemoryContextReset(SearchPathCacheContext);
 	/* arbitrary initial starting size of 16 elements */
 	SearchPathCache = nsphash_create(SearchPathCacheContext, 16, NULL);
 	searchPathCacheValid = true;
@@ -4267,7 +4275,7 @@ recomputeNamespacePath(void)
 {
 	Oid			roleid = GetUserId();
 	bool		pathChanged;
-	const		SearchPathCacheEntry *entry;
+	const SearchPathCacheEntry *entry;
 
 	/* Do nothing if path is already valid. */
 	if (baseSearchPathValid && namespaceUser == roleid)
@@ -4635,9 +4643,7 @@ check_search_path(char **newval, void **extra, GucSource source)
 	 * schemas that don't exist; and often, we are not inside a transaction
 	 * here and so can't consult the system catalogs anyway.  So now, the only
 	 * requirement is syntactic validity of the identifier list.
-	 */
-
-	/*
+	 *
 	 * Checking only the syntactic validity also allows us to use the search
 	 * path cache (if available) to avoid calling SplitIdentifierString() on
 	 * the same string repeatedly.
@@ -4667,19 +4673,10 @@ check_search_path(char **newval, void **extra, GucSource source)
 		list_free(namelist);
 		return false;
 	}
-
-	/*
-	 * We used to try to check that the named schemas exist, but there are
-	 * many valid use-cases for having search_path settings that include
-	 * schemas that don't exist; and often, we are not inside a transaction
-	 * here and so can't consult the system catalogs anyway.  So now, the only
-	 * requirement is syntactic validity of the identifier list.
-	 */
-
 	pfree(rawname);
 	list_free(namelist);
 
-	/* create empty cache entry */
+	/* OK to create empty cache entry */
 	if (use_cache)
 		(void) spcache_insert(searchPath, roleid);
 
@@ -4732,8 +4729,9 @@ InitializeSearchPath(void)
 	}
 	else
 	{
-		SearchPathCacheContext = AllocSetContextCreate(
-	   TopMemoryContext, "search_path processing cache",
+		/* Make the context we'll keep search path cache hashtable in */
+		SearchPathCacheContext = AllocSetContextCreate(TopMemoryContext,
+	   "search_path processing cache",
 	   ALLOCSET_DEFAULT_SIZES);
 
 		/*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3310726cdd..cf8c59a411 10

Re: Assorted typo fixes

2024-01-01 Thread Dagfinn Ilmari Mannsåker
Shubham Khanna  writes:

> I was reviewing the Patch and came across a minor issue that the Patch
> does not apply on the current Head. Please provide the updated version
> of the patch.

Thanks for the heads-up. Commit 5ccb3bb13dcbedc30d015fc06d306d5106701e16
removed one of the instances of "data struture" fixed by the patch.

Rebased patch set attached.  I also squashed the check_decls.m4 change
into the main comment typos commit.

> Also, I found one typo:
> 0008-ecpg-fix-typo-in-get_dtype-return-value-for-ECPGd_co.patch All
> the other enum values return a string mathing the enum label, but this
> has had a trailing r since the function was added in commit
> 339a5bbfb17ecd171ebe076c5bf016c4e66e2c0a
>
>  Here 'mathing' should be 'matching'.

Thanks. I've fixed the commit message (and elaborated a bit more on why
I think it's a valid and safe fix).

> Thanks and Regards,
> Shubham Khanna.

- ilmari

>From 5ccb3bb13dcbedc30d015fc06d306d5106701e16 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 22 Dec 2023 01:46:27 +
Subject: [PATCH v2 01/12] Fix typos in comments

---
 config/check_decls.m4  |  2 +-
 contrib/bloom/bloom.h  |  2 +-
 contrib/pgcrypto/expected/pgp-compression.out  |  2 +-
 contrib/pgcrypto/openssl.c |  2 +-
 contrib/pgcrypto/pgp-encrypt.c |  2 +-
 contrib/pgcrypto/sql/pgp-compression.sql   |  2 +-
 contrib/postgres_fdw/expected/postgres_fdw.out |  2 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql  |  2 +-
 src/backend/access/brin/brin.c |  6 +++---
 src/backend/access/common/heaptuple.c  |  2 +-
 src/backend/access/gist/gistbuild.c|  2 +-
 src/backend/access/heap/heapam.c   |  4 ++--
 src/backend/access/nbtree/nbtree.c |  2 +-
 src/backend/catalog/namespace.c|  2 +-
 src/backend/catalog/pg_constraint.c|  2 +-
 src/backend/commands/event_trigger.c   |  2 +-
 src/backend/executor/execMain.c|  2 +-
 src/backend/optimizer/plan/initsplan.c |  4 ++--
 src/backend/utils/adt/rangetypes.c |  2 +-
 src/backend/utils/cache/catcache.c |  2 +-
 src/backend/utils/sort/tuplesortvariants.c | 10 +-
 src/backend/utils/time/combocid.c  |  2 +-
 src/bin/pg_rewind/t/001_basic.pl   |  2 +-
 src/bin/pg_rewind/t/004_pg_xlog_symlink.pl |  2 +-
 src/bin/pg_rewind/t/007_standby_source.pl  |  2 +-
 src/bin/pg_rewind/t/009_growing_files.pl   |  2 +-
 src/include/pg_config_manual.h |  2 +-
 src/test/isolation/specs/stats.spec| 16 
 src/test/recovery/t/029_stats_restart.pl   |  2 +-
 src/test/regress/expected/boolean.out  |  2 +-
 src/test/regress/expected/brin_multi.out   |  4 ++--
 src/test/regress/expected/join.out |  2 +-
 src/test/regress/sql/boolean.sql   |  2 +-
 src/test/regress/sql/brin_multi.sql|  4 ++--
 src/test/regress/sql/join.sql  |  2 +-
 35 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/config/check_decls.m4 b/config/check_decls.m4
index f1b90c5430..2dfcfe13fb 100644
--- a/config/check_decls.m4
+++ b/config/check_decls.m4
@@ -31,7 +31,7 @@
 # respectively.  If not, see .
 
 # Written by David MacKenzie, with help from
-# Franc,ois Pinard, Karl Berry, Richard Pixley, Ian Lance Taylor,
+# François Pinard, Karl Berry, Richard Pixley, Ian Lance Taylor,
 # Roland McGrath, Noah Friedman, david d zuhn, and many others.
 
 
diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index 330811ec60..7c4407b9ec 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -127,7 +127,7 @@ typedef struct BloomMetaPageData
 	FreeBlockNumberArray notFullPage;
 } BloomMetaPageData;
 
-/* Magic number to distinguish bloom pages among anothers */
+/* Magic number to distinguish bloom pages from others */
 #define BLOOM_MAGICK_NUMBER (0xDBAC0DED)
 
 /* Number of blocks numbers fit in BloomMetaPageData */
diff --git a/contrib/pgcrypto/expected/pgp-compression.out b/contrib/pgcrypto/expected/pgp-compression.out
index d4c57feba3..67e2dce897 100644
--- a/contrib/pgcrypto/expected/pgp-compression.out
+++ b/contrib/pgcrypto/expected/pgp-compression.out
@@ -60,7 +60,7 @@ WITH random_string AS
   -- This generates a random string of 16366 bytes.  This is chosen
   -- as random so that it does not get compressed, and the decompression
   -- would work on a string with the same length as the origin, making the
-  -- test behavior more predictible.  lpad() ensures that the generated
+  -- test behavior more predictable.  lpad() ensures that the generated
   -- hexadecimal value is completed by extra zero characters if random()
   -- has generated a value strictly lower than 16.
   SELECT string_agg(decode(lpad(to_hex((random()*256)::int), 2, '0'), 'hex

Re: Change GUC hashtable to use simplehash?

2024-01-01 Thread jian he
On Tue, Dec 26, 2023 at 4:01 PM John Naylor  wrote:
>
> 0001-0003 are same as earlier
> 0004 takes Jeff's idea and adds in an optimization from NetBSD's
> strlen (I said OpenBSD earlier, but it goes back further). I added
> stub code to simulate big-endian when requested at compile time, but a
> later patch removes it. Since it benched well, I made the extra effort
> to generalize it for other callers. After adding to the hash state, it
> returns the length so the caller can pass it to the finalizer.
> 0005 is the benchmark (not for commit) -- I took the parser keyword
> list and added enough padding to make every string aligned when the
> whole thing is copied to an alloc'd area.
>
> Each of the bench_*.sql files named below are just running the
> similarly-named function, all with the same argument, e.g. "select *
> from bench_pgstat_hash_fh(10);", so not attached.
>
> Strings:
>
> -- strlen + hash_bytes
> pgbench -n -T 20 -f bench_hash_bytes.sql -M prepared | grep latency
> latency average = 1036.732 ms
>
> -- word-at-a-time hashing, with bytewise lookahead
> pgbench -n -T 20 -f bench_cstr_unaligned.sql -M prepared | grep latency
> latency average = 664.632 ms
>
> -- word-at-a-time for both hashing and lookahead (Jeff's aligned
> coding plus a technique from NetBSD strlen)
> pgbench -n -T 20 -f bench_cstr_aligned.sql -M prepared | grep latency
> latency average = 436.701 ms
>
> So, the fully optimized aligned case is worth it if it's convenient.
>
> 0006 adds a byteswap for big-endian so we can reuse little endian
> coding for the lookahead.
>
> 0007 - I also wanted to put numbers to 0003 (pgstat hash). While the
> motivation for that was cleanup, I had a hunch it would shave cycles
> and take up less binary space. It does on both accounts:
>
> -- 3x murmur + hash_combine
> pgbench -n -T 20 -f bench_pgstat_orig.sql -M prepared | grep latency
> latency average = 333.540 ms
>
> -- fasthash32 (simple call, no state setup and final needed for a single 
> value)
> pgbench -n -T 20 -f bench_pgstat_fh.sql -M prepared | grep latency
> latency average = 277.591 ms
>
> 0008 - We can optimize the tail load when it's 4 bytes -- to save
> loads, shifts, and OR's. My compiler can't figure this out for the
> pgstat hash, with its fixed 4-byte tail. It's pretty simple and should
> help other cases:
>
> pgbench -n -T 20 -f bench_pgstat_fh.sql -M prepared | grep latency
> latency average = 226.113 ms


--- /dev/null
+++ b/contrib/bench_hash/bench_hash.c
@@ -0,0 +1,103 @@
+/*-
+ *
+ * bench_hash.c
+ *
+ * Copyright (c) 2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *  src/test/modules/bench_hash/bench_hash.c
+ *
+ *-
+ */
You added this module under contrib (root/contrib); I guess your
intention was to add it under root/src/test/modules.
Later I saw "0005 is the benchmark (not for commit)".


--- /dev/null
+++ b/src/include/common/hashfn_unstable.h
@@ -0,0 +1,213 @@
+/*
+Building blocks for creating fast inlineable hash functions. The
+unstable designation is in contrast to hashfn.h, which cannot break
+compatibility because hashes can be writen to disk and so must produce
+the same hashes between versions.
+
+ *
+ * Portions Copyright (c) 2018-2023, PostgreSQL Global Development Group
+ *
+ * src/include/common/hashfn_unstable.c
+ */
+
The path here should be "src/include/common/hashfn_unstable.h".  Also a
typo: `writen` should be `written`.

In pgbench, I used --no-vacuum --time=20 -M prepared.
My local computer is slow, but here are the test results:

select * from bench_cstring_hash_aligned(10);    7318.893 ms
select * from bench_cstring_hash_unaligned(10); 10383.173 ms
select * from bench_pgstat_hash(10);             4474.989 ms
select * from bench_pgstat_hash_fh(10);          9192.245 ms
select * from bench_string_hash(10);             2048.008 ms
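
To make the aligned variant's trick concrete: the NetBSD-strlen-style
zero-byte detection that the word-at-a-time lookahead relies on can be
sketched in standalone C as follows (an illustration only, not the
patch's actual code):

```c
#include <stdint.h>

/*
 * Zero-byte detection as used by NetBSD's strlen: the expression
 * (v - 0x01..01) & ~v & 0x80..80 has the high bit of byte i set iff
 * byte i of v is zero.  (Bytes above the lowest zero byte may also be
 * flagged due to borrows, which doesn't matter when scanning upward
 * from the least significant byte.)
 */
static inline uint64_t
has_zero_byte(uint64_t v)
{
	return (v - UINT64_C(0x0101010101010101)) & ~v &
		UINT64_C(0x8080808080808080);
}

/*
 * Index (0..7) of the lowest zero byte of v; the caller must ensure one
 * exists.  On a little-endian load of a C string, this is the offset of
 * the terminating NUL within the word.
 */
static inline int
first_zero_byte_index(uint64_t v)
{
	uint64_t	mask = has_zero_byte(v);
	int			i = 0;

	while (((mask >> (8 * i)) & 0x80) == 0)
		i++;
	return i;
}
```

A real implementation would typically replace the final loop with a
count-trailing-zeros intrinsic; the loop just keeps the sketch portable.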




Re: Add pg_basetype() function to obtain a DOMAIN base type

2024-01-01 Thread jian he
On Mon, Dec 4, 2023 at 5:11 PM John Naylor  wrote:
>
> On Thu, Sep 28, 2023 at 12:22 AM Alexander Korotkov
>  wrote:
> > The one thing triggering my perfectionism is that the patch does two
> > syscache lookups instead of one.
>
> For an admin function used interactively, I'm not sure why that
> matters? Or do you see another use case?

I did a minor refactor based on v1-0001.
I think pg_basetype should stay in "9.26.4. System Catalog Information
Functions", so I placed it before pg_char_to_encoding.
Now the functions listed in "Table 9.73. System Catalog Information
Functions" will be in alphabetical order.
I also slightly changed src/include/catalog/pg_proc.dat;
now it looks very similar to pg_typeof:
src6=# \df pg_typeof
   List of functions
   Schema   |   Name| Result data type | Argument data types | Type
+---+--+-+--
 pg_catalog | pg_typeof | regtype  | "any"   | func
(1 row)

src6=# \df pg_basetype
List of functions
   Schema   |Name | Result data type | Argument data types | Type
+-+--+-+--
 pg_catalog | pg_basetype | regtype  | "any"   | func
(1 row)

v2-0001 is as is in the first email thread, 0002 is my changes based on v2-0001.
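
For comparison, without the new function, resolving a chained domain's
base type requires a recursive catalog query along these lines (a
sketch, using the mytext domain from the doc example):

```sql
WITH RECURSIVE chain AS (
    SELECT oid, typbasetype
    FROM pg_type
    WHERE oid = 'mytext'::regtype
  UNION ALL
    SELECT t.oid, t.typbasetype
    FROM pg_type t
    JOIN chain c ON t.oid = c.typbasetype
)
SELECT oid::regtype AS base_type
FROM chain
WHERE typbasetype = 0;  -- typbasetype is 0 for non-domain types
```
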
From a3a180b7074c9196434381d46c636f417089659f Mon Sep 17 00:00:00 2001
From: steve-chavez 
Date: Sat, 9 Sep 2023 00:58:44 -0300
Subject: [PATCH v2 1/2] Add pg_basetype(regtype)

Currently, obtaining the base type of a domain involves a long SQL query,
especially in the case of recursive domain types.

To solve this, use the already available getBaseType() function,
and expose it as the pg_basetype SQL function.
---
 doc/src/sgml/func.sgml   | 25 +++
 src/backend/utils/adt/misc.c | 14 +++
 src/include/catalog/pg_proc.dat  |  3 +++
 src/test/regress/expected/domain.out | 36 
 src/test/regress/sql/domain.sql  | 17 +
 5 files changed, 95 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0a4f8520..7b14c87c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24698,6 +24698,31 @@ SELECT has_function_privilege('joeuser', 'myfunc(int, text)', 'execute');

   
 
+  
+   
+
+ pg_basetype
+
+pg_basetype ( type oid )
+regtype
+   
+   
+   Returns the OID of the base type of a domain or if the argument is a basetype it returns the same type.
+   If there's a chain of domain dependencies, it will recurse until finding the base type.
+   
+   
+For example:
+
+CREATE DOMAIN mytext as text;
+
+SELECT pg_basetype('mytext'::regtype);
+ pg_typeof
+---
+ text
+
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 5d78d6dc..c0c3c9e9 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -43,6 +43,7 @@
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/syscache.h"
 #include "utils/lsyscache.h"
 #include "utils/ruleutils.h"
 #include "utils/timestamp.h"
@@ -566,6 +567,19 @@ pg_typeof(PG_FUNCTION_ARGS)
 	PG_RETURN_OID(get_fn_expr_argtype(fcinfo->flinfo, 0));
 }
 
+/*
+ * Return the base type of the argument.
+ */
+Datum
+pg_basetype(PG_FUNCTION_ARGS)
+{
+	Oid			oid = PG_GETARG_OID(0);
+
+	if (!SearchSysCacheExists1(TYPEOID, ObjectIdGetDatum(oid)))
+		PG_RETURN_NULL();
+
+	PG_RETURN_OID(getBaseType(oid));
+}
 
 /*
  * Implementation of the COLLATE FOR expression; returns the collation
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f526..f84f106b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3877,6 +3877,9 @@
 { oid => '1619', descr => 'type of the argument',
   proname => 'pg_typeof', proisstrict => 'f', provolatile => 's',
   prorettype => 'regtype', proargtypes => 'any', prosrc => 'pg_typeof' },
+{ oid => '6312', descr => 'get the base type of a domain',
+  proname => 'pg_basetype', proisstrict => 'f', provolatile => 's',
+  prorettype => 'regtype', proargtypes => 'oid', prosrc => 'pg_basetype' },
 { oid => '3162',
   descr => 'collation of the argument; implementation of the COLLATION FOR expression',
   proname => 'pg_collation_for', proisstrict => 'f', provolatile => 's',
diff --git a/src/test/regress/expected/domain.out b/src/test/regress/expected/domain.out
index 6d94e844..4f0253cd 100644
--- a/src/test/regress/expected/domain.out
+++ b/src/test/regress/expected/domain.out
@@ -1207,3 +1207,39 @@ create domain testdomain1 as int constraint unsigned check (value > 0);
 alter domain testdomain1 rename constraint unsigned to unsigned_foo;
 alter domain testdomain1 drop constraint unsigned_foo;

Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value

2024-01-01 Thread jian he
On Mon, Nov 6, 2023 at 8:00 AM jian he  wrote:
>
> minor doc issues.
> Returns the chunk id of the TOASTed value, or NULL if the value is not 
> TOASTed.
> Should it be "chunk_id"?
>
> you may place it after pg_create_logical_replication_slot entry to
> make it look like alphabetical order.
>
> There is no test. maybe we can add following to src/test/regress/sql/misc.sql
> create table val(t text);
> INSERT into val(t) SELECT string_agg(
>   chr((ascii('B') + round(random() * 25)) :: integer),'')
> FROM generate_series(1,2500);
> select pg_column_toast_chunk_id(t) is  not null from val;
> drop table val;

Hi,
I didn't change the main C function (pg_column_toast_chunk_id).
I added tests as mentioned above; I put them in
src/test/regress/sql/misc.sql, I hope that's fine.
I placed pg_column_toast_chunk_id in "Table 9.99. Database Object
Location Functions" (below Table 9.98, Database Object Size
Functions).
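
As a usage sketch (table name "val" as in the suggested test; names are
illustrative): the OID this function returns corresponds to the
chunk_id column of the owning table's TOAST relation.

```sql
-- Which TOAST relation backs table "val"?
SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'val';

-- The value returned here matches chunk_id in that TOAST relation:
SELECT pg_column_toast_chunk_id(t) AS chunk_id FROM val;
```
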
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5a47ce4343..c2c3156cd4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27608,6 +27608,20 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset

   
 
+  
+   
+
+ pg_column_toast_chunk_id
+
+pg_column_toast_chunk_id ( "any" )
+oid
+   
+   
+Returns the chunk id of the TOASTed value, or NULL
+if the value is not TOASTed.
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 884bfbc8ce..fe8788c1b1 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5069,6 +5069,47 @@ pg_column_compression(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(result));
 }
 
+/*
+ * Return the chunk id of the TOASTed value.
+ * Return NULL for unTOASTed value.
+ */
+Datum
+pg_column_toast_chunk_id(PG_FUNCTION_ARGS)
+{
+	int			typlen;
+	struct varlena *attr;
+	struct varatt_external toast_pointer;
+
+	/* On first call, get the input type's typlen, and save at *fn_extra */
+	if (fcinfo->flinfo->fn_extra == NULL)
+	{
+		/* Lookup the datatype of the supplied argument */
+		Oid			argtypeid = get_fn_expr_argtype(fcinfo->flinfo, 0);
+
+		typlen = get_typlen(argtypeid);
+		if (typlen == 0)		/* should not happen */
+			elog(ERROR, "cache lookup failed for type %u", argtypeid);
+
+		fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+	  sizeof(int));
+		*((int *) fcinfo->flinfo->fn_extra) = typlen;
+	}
+	else
+		typlen = *((int *) fcinfo->flinfo->fn_extra);
+
+	if (typlen != -1)
+		PG_RETURN_NULL();
+
+	attr = (struct varlena *) DatumGetPointer(PG_GETARG_DATUM(0));
+
+	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+		PG_RETURN_NULL();
+
+	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+
+	PG_RETURN_OID(toast_pointer.va_valueid);
+}
+
 /*
  * string_agg - Concatenates values and returns string.
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 6996073989..0cacd0391d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7403,6 +7403,9 @@
 { oid => '2121', descr => 'compression method for the compressed datum',
   proname => 'pg_column_compression', provolatile => 's', prorettype => 'text',
   proargtypes => 'any', prosrc => 'pg_column_compression' },
+{ oid => '8393', descr => 'chunk ID of TOASTed value',
+  proname => 'pg_column_toast_chunk_id', provolatile => 's', prorettype => 'oid',
+  proargtypes => 'any', prosrc => 'pg_column_toast_chunk_id' },
 { oid => '2322',
   descr => 'total disk space usage for the specified tablespace',
   proname => 'pg_tablespace_size', provolatile => 'v', prorettype => 'int8',


Re: Commitfest manager January 2024

2024-01-01 Thread vignesh C
On Mon, 1 Jan 2024 at 21:01, Magnus Hagander  wrote:
>
> On Mon, Jan 1, 2024 at 4:35 AM vignesh C  wrote:
> >
> > On Sun, 24 Dec 2023 at 18:40, vignesh C  wrote:
> > >
> > > On Sun, 24 Dec 2023 at 07:16, Michael Paquier  wrote:
> > > >
> > > > On Sat, Dec 23, 2023 at 08:52:38AM +0530, vignesh C wrote:
> > > > > I didn't see anyone volunteering for the January Commitfest, so I'll
> > > > > volunteer to be CF manager for January 2024 Commitfest.
> > > >
> > > > (Adding Magnus in CC.)
> > > >
> > > > That would be really helpful.  Thanks for helping!  Do you have the
> > > > admin rights on the CF app?  You are going to require them in order to
> > > > mark the CF as in-process, and you would also need to switch the CF
> > > > after that from "Future" to "Open" so as people can still post
> > > > patches once January one begins.
> > >
> > > I don't have admin rights for the CF app. Please provide admin rights.
> >
> > I have not yet got the admin rights, Kindly provide admin rights for the CF 
> > app.
>
> It's been christmas holidays...
>
>
> What's your community username?

My username is vignesh.postgres

Regards,
Vignesh




Re: Random pg_upgrade test failure on drongo

2024-01-01 Thread Alexander Lakhin

Hello Kuroda-san,

28.12.2023 06:08, Hayato Kuroda (Fujitsu) wrote:

Dear Alexander,


I agree with your analysis and would like to propose a PoC fix (see
attached). With this patch applied, 20 iterations succeeded for me.

There are no reviewers, so I will review it again. Let's move the PoC
to a concrete patch. Note that I only focused on fixing the random failure;
other parts are out of scope.


Thinking about that fix more, I'm not satisfied with the approach proposed.
First, it turns every unlink operation into two write operations
(rename + unlink), not to mention the new risk of having stale .tmp files
(perhaps that's OK for regular files (MoveFileEx can overwrite existing
files), but not for directories).
Second, it does that on any Windows OS version, including modern ones,
which are not affected by the issue, as we know.

So I started to think about another approach: to perform unlink as it's
implemented now, but then wait until the DELETE_PENDING state is gone.
And I was very surprised to see that this state is not transient in our case.
Additional investigation showed that the test fails not because some aside
process opens a file (concretely, {template1_id/postgres_id}/2683), that is
being deleted, but because of an internal process that opens the file and
holds a handle to it indefinitely.
And the internal process is ... background writer (BgBufferSync()).

So, I tried just adding bgwriter_lru_maxpages = 0 to postgresql.conf and
got 20 x 10 tests passing.

Thus, if we just want to get rid of the test failure, maybe it's enough
to add this to the test's config...
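
In TAP-test terms, that would be a one-liner along these lines (a
sketch; $node stands for whichever PostgreSQL::Test::Cluster instance
the test initializes):

```perl
# Keep the background writer from holding handles to relation files
# that pg_upgrade is about to unlink (works around the Windows
# DELETE_PENDING state described above).
$node->append_conf('postgresql.conf', 'bgwriter_lru_maxpages = 0');
```
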

The other way to go is to find out whether the background writer process
should react to a shared-inval message sent from smgrdounlinkall(), and
close that file's handle.

Maybe we could also (after changing the bgwriter's behaviour) add a waiting
loop into pgwin32_open_handle() to completely rule out transient open()
failures due to some other process (such as Windows Explorer) opening a file
being deleted, but I would not complicate things until we have a clear
vision/plan for using modern APIs/relying on modern OS versions' behavior.
I mean proceeding with something like:
https://commitfest.postgresql.org/40/3951/

Best regards,
Alexander




Re: Track in pg_replication_slots the reason why slots conflict?

2024-01-01 Thread Amit Kapila
On Mon, Jan 1, 2024 at 5:24 PM shveta malik  wrote:
>
> Please ignore the previous patch and PFA new v4 (v4_2). The only
> change from the earlier v4 is the subject correction in commit msg.
>

The patch looks good to me. I have slightly changed one of the
descriptions in the docs and also modified the commit message a bit. I
will push this after two days unless there are any more
comments/suggestions.

-- 
With Regards,
Amit Kapila.


v5-0001-Track-conflict_reason-in-pg_replication_slots.patch
Description: Binary data


Re: Minor cleanup for search path cache

2024-01-01 Thread Zhang Mingli
Hi,


Zhang Mingli
www.hashdata.xyz
On Jan 2, 2024 at 05:38 +0800, Tom Lane , wrote:
> I happened to notice that there is a not-quite-theoretical crash
> hazard in spcache_init(). If we see that SPCACHE_RESET_THRESHOLD
> is exceeded and decide to reset the cache, but then nsphash_create
> fails for some reason (perhaps OOM), an error will be thrown
> leaving the SearchPathCache pointer pointing at already-freed
> memory. Next time through, we'll try to dereference that dangling
> pointer, potentially causing SIGSEGV, or worse we might find a
> value less than SPCACHE_RESET_THRESHOLD and decide that the cache
> is okay despite having been freed.
>
> The fix of course is to make sure we reset the pointer variables
> *before* the MemoryContextReset.
>
> I also observed that the code seems to have been run through
> pgindent without fixing typedefs.list, making various places
> uglier than they should be.
>
> The attached proposed cleanup patch fixes those things and in
> passing improves (IMO anyway) some comments. I assume it wasn't
> intentional to leave two copies of the same comment block in
> check_search_path().
>
> regards, tom lane
>
Only me?

zml@localhashdata postgres % git apply minor-search-path-cache-cleanup.patch
error: patch failed: src/backend/catalog/namespace.c:156
error: src/backend/catalog/namespace.c: patch does not apply
error: patch failed: src/tools/pgindent/typedefs.list:2479
error: src/tools/pgindent/typedefs.list: patch does not apply

I’m on commit 9a17be1e24 ("Allow upgrades to preserve the full
subscription's state").


Re: Minor cleanup for search path cache

2024-01-01 Thread Tom Lane
Zhang Mingli  writes:
> Only me?

> zml@localhashdata postgres % git apply minor-search-path-cache-cleanup.patch
> error: patch failed: src/backend/catalog/namespace.c:156
> error: src/backend/catalog/namespace.c: patch does not apply
> error: patch failed: src/tools/pgindent/typedefs.list:2479
> error: src/tools/pgindent/typedefs.list: patch does not apply

Use patch(1).  git-apply is extremely fragile.

regards, tom lane




INFORMATION_SCHEMA node

2024-01-01 Thread Tatsuo Ishii
In the following paragraph in information_schema:

 character encoding form
 
  
   An encoding of some character repertoire.  Most older character
   repertoires only use one encoding form, and so there are no
   separate names for them (e.g., LATIN1 is an
   encoding form applicable to the LATIN1
   repertoire).  But for example Unicode has the encoding forms
   UTF8, UTF16, etc. (not
   all supported by PostgreSQL).  Encoding forms are not exposed
   as an SQL object, but are visible in this view.

This claims that the LATIN1 repertoire only uses one encoding form,
but actually LATIN1 can be encoded in another form: ISO-2022-JP-2 (a 7
bit encoding. See RFC 1554
(https://datatracker.ietf.org/doc/html/rfc1554) for more details).

If we still want to list a use-one-encoding-form example, we could
probably use LATIN2 instead, or another repertoire that is not supported
by ISO-2022-JP-2 (ISO-2022-JP-2 supports LATIN1 and LATIN7).

Attached is the patch that does this.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Revise the Asserts added to bimapset manipulation functions

2024-01-01 Thread Richard Guo
On Sun, Dec 31, 2023 at 6:44 AM David Rowley  wrote:

> On Fri, 29 Dec 2023 at 23:38, Richard Guo  wrote:
> > After applying this process to the first RestrictInfo, the bitmapset now
> > becomes {t2, t3}.  Consequently, the second RestrictInfo also perceives
> > {t2, t3} as its required_relids.  As a result, when attempting to remove
> > it from the joininfo lists, a problem arises, because it is not in t2's
> > joininfo list.
>
> Changing the relids so they reference t2 instead of t1 requires both
> bms_add_member() and bms_del_member().  The bms_add_member() will
> cause the set to be reallocated with my patch so I don't see why this
> case isn't covered.


Hmm, you're right.  This particular case is covered by your patch.  I
wondered if there might be another case where a bitmapset with more than
one reference is modified without potentially being reallocated.  I'm
not sure whether such a case exists in reality (or could be introduced
in the future), but if there is one, I think it would be great if
REALLOCATE_BITMAPSETS could also help handle it.


> > Also, here are some review comments for the v2 patch:
> >
> > * There is no reallocation that happens in bms_add_members(), do we need
> > to call bms_copy_and_free() there?
>
> The spec I put in the comment at the top of bitmapset.c says:
>
> > have the code *always* reallocate the bitmapset when the
> > * set *could* be reallocated as a result of the modification
>
> Looking at bms_add_members(), it seems to me that the set *could* be
> reallocated as a result of the modification, per:
>
> if (a->nwords < b->nwords)
> {
> result = bms_copy(b);
> other = a;
> }
>
> In my view, that meets the spec I outlined.


I think one purpose of introducing REALLOCATE_BITMAPSETS is to help find
dangling pointers to Bitmapsets.  From this point of view, I agree that
we should call bms_copy_and_free() in bms_add_members(), because the
bitmapset 'a' might be recycled (modified in place, or pfreed).

According to this criterion, it seems to me that we should also call
bms_copy_and_free() in bms_join(), since both inputs there might be
recycled.  But maybe I'm not understanding it correctly.
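
To illustrate in isolation the kind of bug this is meant to surface (a
toy sketch in plain C, not PostgreSQL code): if every modifying
operation moves the set to fresh storage and frees the old one, any
caller still holding the old pointer has a dangling reference that
Valgrind/ASan will catch on first use.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a Bitmapset (no overflow checks; illustration only). */
typedef struct IntSet
{
	int			n;
	int			items[8];
} IntSet;

static IntSet *
set_new(void)
{
	return calloc(1, sizeof(IntSet));
}

/*
 * Like a bitmapset modification under REALLOCATE_BITMAPSETS: always copy
 * the set to fresh storage and free the old one, so any second reference
 * to the old storage becomes a detectable dangling pointer.
 */
static IntSet *
set_add(IntSet *s, int v)
{
	IntSet	   *fresh = malloc(sizeof(IntSet));

	memcpy(fresh, s, sizeof(IntSet));
	fresh->items[fresh->n++] = v;
	free(s);					/* the old pointer is now invalid */
	return fresh;
}
```
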


> > * Do you think we can add Assert(bms_is_valid_set(a)) for bms_free()?
>
> I did briefly have the Assert in bms_free(), but I removed it as I
> didn't think it was that useful to validate the set before freeing it.
> Certainly, there'd be an argument to do that, but I ended up not
> putting it there. I wondered if there could be a case where we do
> something like bms_int_members() which results in an empty set and
> after checking for and finding the set is now empty, we call
> bms_free().  If we did that then we'd Assert fail.  In reality, we use
> pfree() instead of bms_free() as we already know the set is not NULL,
> so it wouldn't cause a problem for that particular case. I wondered if
> there might be another one that I didn't spot, so felt it was best not
> to Assert there.
>
> I imagine that if we found some case where the bms_free() Assert
> failed, we'd likely just remove it rather than trying to make the set
> valid before freeing it.  So it seems pretty pointless if we'd opt to
> remove the Assert() at the first sign of trouble.


I think I understand your concern.  In some bitmapset manipulation
functions, like bms_int_members(), the result might be computed as
empty.  In such cases we need to free the input bitmapset, and if we
used bms_free(), the Assert would fail.

It seems to me that this scenario can only occur within the bitmapset
manipulation functions.  Outside of bitmapset.c, I think it should be
quite safe to call bms_free() with this Assert.  So I think it should
not be a problem to have this Assert in bms_free(), as long as we are
careful enough when calling bms_free() inside bitmapset.c.

Thanks
Richard