Re: Out of memory error handling in frontend code

2023-10-06 Thread Frédéric Yhuel

Hi Daniel,

Thank you for your answer.

On 9/28/23 14:02, Daniel Gustafsson wrote:

On 28 Sep 2023, at 10:14, Frédéric Yhuel  wrote:



After some time, we understood that the 20 million large objects were 
responsible for the huge memory usage (more than 10 GB) by pg_dump.


This sounds like a known issue [0] which has been reported several times, and
one we should get around to fixing sometime.



Indeed, I saw some of these reports afterwards :)


I think a more useful error message would help for such cases.


Knowing that this is a case that pops up, I agree that we could do better around
the messaging here.


I haven't tried to get the patch ready for review; I know that the format of the 
messages isn't right. I'd like to know what you think of the idea first.


I don't think adding more details is a bad idea, but it shouldn't require any
knowledge about internals, so I think messages like the one below need to be
reworded to be more helpful.

+   if (loinfo == NULL)
+   {
+   pg_fatal("getLOs: out of memory");
+   }
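
For illustration only, a message in that direction might look like this (a
sketch; the wording and the use of pg_log_error_hint() here are assumptions,
not the committed text):

+   if (loinfo == NULL)
+   {
+       pg_log_error("out of memory while reading large object metadata");
+       pg_log_error_hint("If the database contains many large objects, "
+                         "consider excluding them from the dump.");
+       exit(1);
+   }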



OK, here is a second version of the patch.

I didn't try to fix the path getLOs -> AssignDumpId -> catalogid_insert 
-> [...] -> catalogid_allocate, which is annoying because it amounts 
to 11% of the memory allocations according to valgrind's output.


Best regards,
Frédéric

From 575903c1fd4ccbe49c762d1d6424e2264ba6cfad Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= 
Date: Mon, 18 Sep 2023 08:18:19 +0200
Subject: [PATCH] pg_dump: fix OOM handling

Exit with better error messages when there's not enough memory
to process large objects.
---
 src/bin/pg_dump/pg_backup_archiver.c | 45 +---
 src/bin/pg_dump/pg_backup_archiver.h |  4 +++
 src/bin/pg_dump/pg_dump.c| 33 
 src/common/fe_memutils.c | 14 +++--
 src/include/common/fe_memutils.h |  1 +
 5 files changed, 80 insertions(+), 17 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index ab351e457e..fedbe67a07 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -1101,6 +1101,22 @@ WriteData(Archive *AHX, const void *data, size_t dLen)
 	AH->WriteDataPtr(AH, data, dLen);
 }
 
+#define fill_in_new_toc(a,b) \
+do { \
+	if (opts->b) \
+	{ \
+		newToc->a = pg_strdup_extended(opts->b, MCXT_ALLOC_NO_OOM); \
+		if (newToc->a == NULL) \
+		{ \
+			goto error; \
+		} \
+	} \
+	else \
+	{ \
+		newToc->a = NULL; \
+	} \
+} while(0)
+
 /*
  * Create a new TOC entry. The TOC was designed as a TOC, but is now the
  * repository for all metadata. But the name has stuck.
@@ -1118,7 +1134,11 @@ ArchiveEntry(Archive *AHX, CatalogId catalogId, DumpId dumpId,
 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
 	TocEntry   *newToc;
 
-	newToc = (TocEntry *) pg_malloc0(sizeof(TocEntry));
+	newToc = (TocEntry *) pg_malloc_extended(sizeof(TocEntry), MCXT_ALLOC_NO_OOM|MCXT_ALLOC_ZERO);
+	if (newToc == NULL)
+	{
+		goto error;
+	}
 
 	AH->tocCount++;
 	if (dumpId > AH->maxDumpId)
@@ -1133,15 +1153,15 @@ ArchiveEntry(Archive *AHX, CatalogId catalogId, DumpId dumpId,
 	newToc->dumpId = dumpId;
 	newToc->section = opts->section;
 
-	newToc->tag = pg_strdup(opts->tag);
-	newToc->namespace = opts->namespace ? pg_strdup(opts->namespace) : NULL;
-	newToc->tablespace = opts->tablespace ? pg_strdup(opts->tablespace) : NULL;
-	newToc->tableam = opts->tableam ? pg_strdup(opts->tableam) : NULL;
-	newToc->owner = opts->owner ? pg_strdup(opts->owner) : NULL;
-	newToc->desc = pg_strdup(opts->description);
-	newToc->defn = opts->createStmt ? pg_strdup(opts->createStmt) : NULL;
-	newToc->dropStmt = opts->dropStmt ? pg_strdup(opts->dropStmt) : NULL;
-	newToc->copyStmt = opts->copyStmt ? pg_strdup(opts->copyStmt) : NULL;
+	fill_in_new_toc(tag, tag);
+	fill_in_new_toc(namespace, namespace);
+	fill_in_new_toc(tablespace, tablespace);
+	fill_in_new_toc(tableam, tableam);
+	fill_in_new_toc(owner, owner);
+	fill_in_new_toc(desc, description);
+	fill_in_new_toc(defn, createStmt);
+	fill_in_new_toc(dropStmt, dropStmt);
+	fill_in_new_toc(copyStmt, copyStmt);
 
 	if (opts->nDeps > 0)
 	{
@@ -1166,6 +1186,11 @@ ArchiveEntry(Archive *AHX, CatalogId catalogId, DumpId dumpId,
 		AH->ArchiveEntryPtr(AH, newToc);
 
 	return newToc;
+
+error:
+	pg_log_error("Could not add a new archive entry: %s", strerror(errno));
+	pg_log_error_hint(LO_OOM_HINT);
+	exit(1);
 }
 
 /* Public */
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 18b38c17ab..626fe2cb11 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -97,6 +97,10 @@ typedef struct _archiveHandle ArchiveHandle;
 typedef struct _tocEntry TocEntry;
 struct ParallelState;
 
+#define LO_OOM_HINT "If the database contains a large number of large objects, "\
+		"consider using the \"-B\" option to exclude them from the dump "\
+		"if it makes sense. Otherwise, 

Re: How to update unicode mapping table?

2023-10-06 Thread Peter Eisentraut

On 26.09.23 00:20, Tom Lane wrote:

On a slightly related note, I noticed while preparing 3aff1d3fd
that src/backend/utils/mb/Unicode/Makefile seems a little screwy.
If you go into that directory and type "make distclean", you'll
find it removes some files that are in git:


Since this is only used during update-unicode, whose purpose is to 
overwrite files in git, this might not be totally wrong.


Maybe the target names distclean and maintainer-clean are inappropriate, 
since they suggest some standard semantics.  Maybe the current semantics 
are also not that useful, I'm not sure.  I guess among the things you'd 
want are "delete all intermediate downloaded files", which seemingly 
none of these do.






Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)

2023-10-06 Thread Heikki Linnakangas

On 06/10/2023 09:50, Michael Paquier wrote:

On Fri, Oct 06, 2023 at 02:30:16PM +0900, Michael Paquier wrote:

Okay, the backtrace is not that useful.  I'll see if I can get
something better, still it seems like this has broken the way the
syslogger closes these ports.


And here you go:
Program terminated with signal SIGSEGV, Segmentation fault.
#0  GetMemoryChunkMethodID (pointer=0x0) at mcxt.c:196
196         header = *((const uint64 *) ((const char *) pointer - sizeof(uint64)));
(gdb) bt
#0  GetMemoryChunkMethodID (pointer=0x0) at mcxt.c:196
#1  0x557d04176d59 in pfree (pointer=0x0) at mcxt.c:1463
#2  0x557d03e8eab3 in ClosePostmasterPorts (am_syslogger=true) at postmaster.c:2571
#3  0x557d03e93ac2 in SysLogger_Start () at syslogger.c:686
#4  0x557d03e8c5b7 in PostmasterMain (argc=3, argv=0x557d0471ed00) at postmaster.c:1148
#5  0x557d03d48e34 in main (argc=3, argv=0x557d0471ed00) at main.c:198
(gdb) up 2
#2  0x557d03e8eab3 in ClosePostmasterPorts (am_syslogger=true) at postmaster.c:2571
2571        pfree(ListenSockets);
(gdb) p ListenSockets
$1 = (pgsocket *) 0x0


Fixed, thanks!

I did a quick test with syslogger enabled before committing, but didn't 
notice the segfault. I missed it because syslogger gets restarted and 
then it worked.
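
For reference, the crash is simply a pfree() on a NULL pointer; a minimal
sketch of the kind of guard that avoids it (the committed fix may be shaped
differently):

	/*
	 * In ClosePostmasterPorts(): the syslogger child can reach this
	 * point before ListenSockets was ever allocated, so guard the free.
	 */
	if (ListenSockets != NULL)
	{
		pfree(ListenSockets);
		ListenSockets = NULL;
	}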


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: should frontend tools use syncfs() ?

2023-10-06 Thread Maxim Orlov
Back to the patch v11. I don't quite understand what we should do next:
should we make a separate thread, or put this one on the commitfest?

-- 
Best regards,
Maxim Orlov.


Re: pg_rewind with cascade standby doesn't work well

2023-10-06 Thread Kuwamura Masaki
Thanks for your review!

On Wed, Sep 27, 2023 at 8:33, Michael Paquier wrote:

> On Tue, Sep 26, 2023 at 06:44:50PM +0300, Aleksander Alekseev wrote:
> >> And also, I'm afraid that I'm not sure what kind of tests I have to make
> >> for fix this behavior. Would you mind giving me some advice?
> >
> > Personally I would prefer not to increase the scope of work. Your TAP
> > test added in 0001 seems to be adequate.
>
> Yeah, agreed.  I'm OK with what you are proposing, basically (the
> test could be made a bit cheaper actually).


I guess you meant that it contains an unnecessary insert and wait.
I fixed this and some incorrect comments caused by copy & paste.
Please see the attached patch.


>  /*
> - * For Hot Standby, we could treat this like a Shutdown Checkpoint,
> - * but this case is rarer and harder to test, so the benefit doesn't
> - * outweigh the potential extra cost of maintenance.
> + * For Hot Standby, we could treat this like an end-of-recovery checkpoint
>   */
> +RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT);
>
> I don't understand what you want to change here.  Archive recovery and
> crash recovery are two different things, still this code would be
> triggered even if there is no standby.signal, aka the node is not a
> standby.  Why isn't this stuff conditional?


You are absolutely right. It should only run in standby mode.
Also, according to the documentation [1], a server can be "Hot Standby" even
if it is not in standby mode (i.e. when it is in archive recovery mode).
So I fixed the comment above `RequestCheckpoint()`.

[1]: https://www.postgresql.org/docs/current/hot-standby.html
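
A minimal sketch of the conditional, assuming the standby-mode flag is the
right gate (the attached patch may differ in detail):

	/*
	 * For hot standby, request an immediate checkpoint; plain archive
	 * recovery (no standby.signal) should not take this path.
	 */
	if (StandbyMode)
		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT);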

I hope you will review it again.

Regards,

Masaki Kuwamura


v3-0001-pg_rewind-Fix-bug-using-cascade-standby-as-source.patch
Description: Binary data


Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Peter Eisentraut

On 03.10.23 21:54, Jeff Davis wrote:

Here, Jeff mentions normalization, but I think it's a major issue
with
collation support. If new code points are added, users can put them
into the database before they are known to the collation library, and
then when they become known to the collation library the sort order
changes and indexes break.


The collation version number may reflect the change in understanding
about assigned code points that may affect collation -- though I'd like
to understand whether this is guaranteed or not.


This is correct.  The collation version number produced by ICU contains 
the UCA version, which is effectively the Unicode version (14.0, 15.0, 
etc.).  Since new code point assignments can only come from new Unicode 
versions, a new assigned code point will always result in a different 
collation version.


For example, with ICU 70 / CLDR 40 / Unicode 14:

select collversion from pg_collation where collname = 'unicode';
= 153.112

With ICU 72 / CLDR 42 / Unicode 15:
= 153.120


At minimum I think we need to have some internal functions to check for
unassigned code points. That belongs in core, because we generate the
unicode tables from a specific version.


If you want to be rigid about it, you also need to consider whether the 
Unicode version used by the ICU library in use matches the one used by 
the in-core tables.






Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)

2023-10-06 Thread Michael Paquier
On Fri, Oct 06, 2023 at 10:27:22AM +0300, Heikki Linnakangas wrote:
> I did a quick test with syslogger enabled before committing, but didn't
> notice the segfault. I missed it because syslogger gets restarted and then
> it worked.

Thanks, Heikki.
--
Michael


signature.asc
Description: PGP signature


Re: Remove distprep

2023-10-06 Thread Michael Paquier
On Fri, Oct 06, 2023 at 08:38:31AM +0200, Peter Eisentraut wrote:
> Yes, something like that.  Some people wanted a tarball of just the HTML
> docs for download.  Similar to the PDFs currently, I suppose.

I suspected so.

I've marked the patch as RfC for now.
--
Michael


signature.asc
Description: PGP signature


Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-06 Thread Michael Paquier
On Fri, Oct 06, 2023 at 03:29:28PM +0900, Michael Paquier wrote:
> Andres, are there logs for this TAP test on serinus?  Or perhaps there
> is a core file that could be looked at?  The other animals are not
> showing anything for the moment.

serinus has reported back once again, and just returned with a green
state, twice in a row:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2023-10-06%2007%3A42%3A53
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2023-10-06%2007%3A28%3A02

Well, it looks OK.  Still that's itching a bit.
--
Michael


signature.asc
Description: PGP signature


Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Peter Eisentraut

On 05.10.23 19:30, Jeff Davis wrote:

Agreed, at least until we understand the set of users per-column
encoding is important to. I acknowledge that the presence of per-column
encoding in the standard is some kind of signal there, but not enough
by itself to justify something so invasive.


The per-column encoding support in SQL is clearly a legacy feature from 
before Unicode.  If one were to write something like SQL today, one 
would most likely just specify, "everything is Unicode".






Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs

2023-10-06 Thread Peter Eisentraut

On 26.09.23 16:51, Nazir Bilal Yavuz wrote:

Also note that there are also dependencies in the other direction.  For
example, the psql help is compiled from XML DocBook sources.  So your
other patch would also need to include similar changesInclude() clauses.


If there are more cases like this, it may not be worth it. Instead, we can just:

- Build the docs when the doc related files are changed (This still
creates a dependency like you said).

- Skip CI completely if the README files are changed.

What are your opinions on these?


I don't have a good sense of what you are trying to optimize for.  If 
it's the mainline build-on-every-commit type, then I wonder how many 
commits would really be affected by this.  Like, how many commits touch 
only a README file.  If it's for things like the cfbot, then I think the 
time-triggered builds would be more frequent than new patch versions, so 
I don't know if these kinds of optimizations would affect anything.






Re: Synchronizing slots from primary to standby

2023-10-06 Thread Alvaro Herrera
On 2023-Sep-27, Peter Smith wrote:

> 3. get_local_synced_slot_names
> 
> + for (int i = 0; i < max_replication_slots; i++)
> + {
> + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
> +
> + /* Check if it is logical synchronized slot */
> + if (s->in_use && SlotIsLogical(s) && s->data.synced)
> + {
> + for (int j = 0; j < MySlotSyncWorker->dbcount; j++)
> + {
> 
> Loop variables are not declared in the common PG code way.

Note that since we added C99 as a mandatory requirement for compilers in
commit d9dd406fe281, we've been using declarations in loop initializers
(see 143290efd079).  We have almost 500 occurrences of this already.
Older code, obviously, does not use them, but that's no reason not to
introduce them in new code.  I think they make the code a bit leaner, so
I suggest to use these liberally.
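
For anyone skimming, the two styles side by side (process_slot() is just a
placeholder here):

	/* pre-C99 style, common in older code */
	int			i;

	for (i = 0; i < max_replication_slots; i++)
		process_slot(i);

	/* C99 style, preferred in new code */
	for (int i = 0; i < max_replication_slots; i++)
		process_slot(i);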

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")




Re: remaining sql/json patches

2023-10-06 Thread Amit Langote
On Wed, Oct 4, 2023 at 10:26 PM Amit Langote  wrote:
> On Tue, Oct 3, 2023 at 10:11 PM Amit Langote  wrote:
> > On Mon, Oct 2, 2023 at 2:26 PM Amit Langote  wrote:
> > > On Mon, Oct 2, 2023 at 1:24 PM Amit Langote  
> > > wrote:
> > > > Pushed this 30 min ago (no email on -committers yet!) and am looking
> > > > at the following llvm crash reported by buildfarm animal pogona [1]:
> > > > This seems to me to be complaining about the following addition:
> > > >
> > > > +   {
> > > > +       Oid          ioparam = op->d.iocoerce.typioparam;
> > > > +       LLVMValueRef v_params[6];
> > > > +       LLVMValueRef v_success;
> > > > +
> > > > +       v_params[0] = l_ptr_const(op->d.iocoerce.finfo_in,
> > > > +                                 l_ptr(StructFmgrInfo));
> > > > +       v_params[1] = v_output;
> > > > +       v_params[2] = l_oid_const(lc, ioparam);
> > > > +       v_params[3] = l_int32_const(lc, -1);
> > > > +       v_params[4] = l_ptr_const(op->d.iocoerce.escontext,
> > > > +                                 l_ptr(StructErrorSaveContext));
> > > >
> > > > -       LLVMBuildStore(b, v_retval, v_resvaluep);
> > > > +       /*
> > > > +        * InputFunctionCallSafe() will write directly into
> > > > +        * *op->resvalue.
> > > > +        */
> > > > +       v_params[5] = v_resvaluep;
> > > > +
> > > > +       v_success = LLVMBuildCall(b, llvm_pg_func(mod, "InputFunctionCallSafe"),
> > > > +                                 v_params, lengthof(v_params),
> > > > +                                 "funccall_iocoerce_in_safe");
> > > > +
> > > > +       /*
> > > > +        * Return null if InputFunctionCallSafe() encountered
> > > > +        * an error.
> > > > +        */
> > > > +       v_resnullp = LLVMBuildICmp(b, LLVMIntEQ, v_success,
> > > > +                                  l_sbool_const(0), "");
> > > > +   }
> > >
> ...I haven't yet pinpointed which of LLVM's asserts it is, nor have I
> been able to walk through LLVM source code using gdb to figure out what
> the new code is doing wrong.  Maybe I'm still missing a trick or two...

I finally managed to analyze the crash by getting the correct LLVM build.

So the following bits are the culprits:

1. v_output needed to be converted from being a reference to a Datum to
being a reference to char *, as follows, before passing it to
InputFunctionCallSafe():

-   v_params[1] = v_output;
+   v_params[1] = LLVMBuildIntToPtr(b, v_output,
+                                   l_ptr(LLVMInt8TypeInContext(lc)),
+                                   "");

2. Assignment of op->d.iocoerce.escontext needed to be changed like this:

    v_params[4] = l_ptr_const(op->d.iocoerce.escontext,
-                             l_ptr(StructErrorSaveContext));
+                             l_ptr(StructNode));

3. v_success needed to be "zero-extended" to match in type with
whatever l_sbool_const() produces, as follows:

+   v_success = LLVMBuildZExt(b, v_success, TypeStorageBool, "");
    v_resnullp = LLVMBuildICmp(b, LLVMIntEQ, v_success,
                               l_sbool_const(0), "");

No more crashes with the above fixes.

Attached shows the delta against the patch I reverted.  I'll push the
fixed up version on Monday.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


iocoerce-llvm-fixes.diff
Description: Binary data


Re: Included xid in restoring reorder buffer changes from disk log message

2023-10-06 Thread vignesh C
On Fri, 30 Apr 2021 at 11:53, Dilip Kumar  wrote:
>
> On Thu, Apr 29, 2021 at 9:45 PM vignesh C  wrote:
> >
> > Hi,
> >
> > While debugging one of the logical decoding issues, I found that xid was 
> > not included in restoring reorder buffer changes from disk log messages.  
> > Attached a patch for it. I felt including the XID will be helpful in 
> > debugging. Thoughts?
> >
>
> It makes sense to include xid in the debug message, earlier I thought
> that will it be a good idea to also print the toplevel_xid.  But I
> think it is for debug purposes and only we have the xid we can fetch
> the other parameters so maybe adding xid is good enough.

While having a look at the reorderbuffer code, I noticed that these
changes were still not committed.
Here is a rebased version of the patch.

Regards,
Vignesh
From 5a27c87eb82676c3889d27aa3bddb93744d40ec5 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 6 Oct 2023 14:21:32 +0530
Subject: [PATCH v2] Include xid in restoring reorder buffer changes from disk
 log message.

Include xid in restoring reorder buffer changes from disk log message.
---
 src/backend/replication/logical/reorderbuffer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 12edc5772a..907bb0039f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1442,9 +1442,10 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state)
 dlist_head_element(ReorderBufferChange, node,
    &entry->txn->changes);
 
-			elog(DEBUG2, "restored %u/%u changes from disk",
+			elog(DEBUG2, "restored %u/%u changes of XID %u from disk",
  (uint32) entry->txn->nentries_mem,
- (uint32) entry->txn->nentries);
+ (uint32) entry->txn->nentries,
+ entry->txn->xid);
 
 			Assert(entry->txn->nentries_mem);
 			/* txn stays the same */
-- 
2.34.1



typo in couple of places

2023-10-06 Thread vignesh C
Hi,

I noticed a couple of typos in the code: "the the" should have been "the".
The attached patch has the changes for the same.

Regards,
Vignesh
From 83d3ccf7e0bb7732c70da825bfcfb8d0e453c207 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 6 Oct 2023 15:24:12 +0530
Subject: [PATCH] Fix typo in couple of places.

"the the" should be "the".
---
 .cirrus.star| 2 +-
 src/backend/executor/nodeHashjoin.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/.cirrus.star b/.cirrus.star
index d2d6ceca20..36233872d1 100644
--- a/.cirrus.star
+++ b/.cirrus.star
@@ -46,7 +46,7 @@ def main():
 
 def config_from(config_src):
 """return contents of config file `config_src`, surrounded by markers
-indicating start / end of the the included file
+indicating start / end of the included file
 """
 
 config_contents = fs.read(config_src)
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index aea44a9d56..25a2d78f15 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -1306,7 +1306,7 @@ ExecParallelHashJoinNewBatch(HashJoinState *hjstate)
  * The data recorded in the file for each tuple is its hash value,
  * then the tuple in MinimalTuple format.
  *
- * fileptr points to a batch file in one of the the hashtable arrays.
+ * fileptr points to a batch file in one of the hashtable arrays.
  *
  * The batch files (and their buffers) are allocated in the spill context
  * created for the hashtable.
-- 
2.34.1



Re: remaining sql/json patches

2023-10-06 Thread Alvaro Herrera
On 2023-Oct-06, Amit Langote wrote:

> 2. Assignment of op->d.iocoerce.escontext needed to be changed like this:
> 
> v_params[4] = l_ptr_const(op->d.iocoerce.escontext,
> -
> l_ptr(StructErrorSaveContext));
> + l_ptr(StructNode));

Oh, so you had to go back to using StructNode in order to get this
fixed?  That's weird.  Is it just because InputFunctionCallSafe is
defined to take fmNodePtr?  (I still fail to see that a pointer to
ErrorSaveContext would differ in any material way from a pointer to
Node).


Another thing I thought was weird is that it would only crash in LLVM5
debug and not the other LLVM-enabled animals, but looking closer at the
buildfarm results, I think that may have been only because you reverted
too quickly, and phycodorus and petalura didn't actually run with
7fbc75b26ed8 before you reverted it.  Dragonet did make a run with it,
but it's marked as "LLVM optimized" instead of "LLVM debug".  I suppose
that must be making a difference.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"World domination is proceeding according to plan"(Andrew Morton)




Re: Checks in RegisterBackgroundWorker.()

2023-10-06 Thread Thomas Munro
On Thu, Sep 28, 2023 at 9:46 AM Heikki Linnakangas  wrote:
> Here's a new version of these patches. I fixed one comment and ran
> pgindent, no other changes.

> Subject: [PATCH v2 1/3] Clarify the checks in RegisterBackgroundWorker.

LGTM.  I see it passes on CI, and I also tested locally with
EXEC_BACKEND, with shared_preload_libraries=path/to/pg_prewarm.so
which works fine.

> Subject: [PATCH v2 2/3] Allocate Backend structs in PostmasterContext.

LGTM.  I checked that you preserve the behaviour on OOM (LOG), and you
converted free() to pfree() in code that runs in the postmaster, but
dropped it in the code that runs in the child because all children
should delete PostmasterContext, making per-object pfree redundant.
Good.

> Subject: [PATCH v2 3/3] Fix misleading comment on StartBackgroundWorker().

LGTM.  Hmm, maybe I would have called that function
"BackgroundWorkerMain()" like several other similar things, but that's
not important.

This doesn't quite fix the problem I was complaining about earlier,
but it de-confuses things.  (Namely that if BackgroundWorkerList
weren't a global variable, RegisterWorkerMain() wouldn't be able to
find it, and if it took some kind of context pointer as an argument,
_PG_init() functions wouldn't be able to provide it, unless we changed
_PG_init() to take an argument, which we can't really do.  Oh well.)




Re: Synchronizing slots from primary to standby

2023-10-06 Thread shveta malik
On Fri, Oct 6, 2023 at 2:07 PM Alvaro Herrera  wrote:
>
> On 2023-Sep-27, Peter Smith wrote:
>
> > 3. get_local_synced_slot_names
> >
> > + for (int i = 0; i < max_replication_slots; i++)
> > + {
> > + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
> > +
> > + /* Check if it is logical synchronized slot */
> > + if (s->in_use && SlotIsLogical(s) && s->data.synced)
> > + {
> > + for (int j = 0; j < MySlotSyncWorker->dbcount; j++)
> > + {
> >
> > Loop variables are not declared in the common PG code way.
>
> Note that since we added C99 as a mandatory requirement for compilers in
> commit d9dd406fe281, we've been using declarations in loop initializers
> (see 143290efd079).  We have almost 500 occurrences of this already.
> Older code, obviously, does not use them, but that's no reason not to
> introduce them in new code.  I think they make the code a bit leaner, so
> I suggest to use these liberally.
>

Okay, we will. Thanks for letting us know.

thanks
Shveta




Re: Make --help output fit within 80 columns per line

2023-10-06 Thread Peter Eisentraut

On 25.09.23 08:27, torikoshia wrote:
So in summary, I think 80 is a decent soft limit, but let's not stress 
out about some lines going over that, and make a hard limit of perhaps 
120.


+1. It may be a good compromise.
For enforcing the hard limit, is it better to add a regression test like 
patch 0001?



Agreed. It seems inconsistent with other commands.
Patch 0002 removed environment-variable-based defaults in psql --help.


I have committed 0002 and a different implementation of 0001.  I set the 
maximum line length to 95, which is the current maximum in use.


I'm open to discussing other line lengths, but

1) If we make it longer, then we should also adjust the existing 
wrapping so that we don't have a mix of lines wrapped at 80 and some 
significantly longer lines.


2) There are some general readability guidelines that suggest like 66 or 
72 characters per line.  If you take that and add the option name itself 
and some indentation, then around 90 does seem like a sensible limit.


3) The examples from other tools posted earlier don't convince me.  Some 
of their --help outputs look like junk and poorly taken care of.


So I think nudging people to aim for 80..95 seems like a good target 
right now.  But I'm not against adjustments.






Re: wal recycling problem

2023-10-06 Thread Fabrice Chapuis
Thanks Christoph for your message.
Now I understand why the WAL files are preserved if logical replication is
configured and enabled. The problem is that when a large volume of data is
loaded into a database, for example during a pg_restore, the walsender
process associated with the logical replication slot will have to decode
all of the WAL generated during this operation, which will take a long time,
and the restart_lsn will not be modified.
From a conceptual point of view, I think that subscription-specific WAL
should be used and stored in the pg_replslot folder, in order to avoid
working directly on the WAL of the instance.

What do you think about this proposal?

Regards

Fabrice


On Mon, Oct 2, 2023 at 12:06 PM Christoph Moench-Tegeder 
wrote:

> Hi,
>
> ## Fabrice Chapuis (fabrice636...@gmail.com):
>
> > on the other hand there are 2 slots for logical replication which display
> > status extended. I don't understand why, given that the confirmed_flush_lsn
> > field is up to date. The restart_lsn remains frozen, for what reason?
>
> There you have it - "extended" means "holding wal". And as long as the
> restart_lsn does not advance, checkpointer cannot free any wal beyond
> that lsn. My first idea would be some long-running (or huge) transaction
> which is in process (active or still being streamed). I'd recommend
> looking into what the clients on these slots are doing.
>
> Regards,
> Christoph
>
> --
> Spare Space
>


Re: Request for comment on setting binary format output per session

2023-10-06 Thread Peter Eisentraut

On 04.10.23 18:26, Merlin Moncure wrote:
On Wed, Oct 4, 2023 at 9:17 AM Peter Eisentraut wrote:


I think intuitively, this facility ought to work like client_encoding.
There, the client declares its capabilities, and the server has to
format the output according to the client's capabilities.  That works,
and it also works through connection poolers.  (It is a GUC.)  If we
can
model it like that as closely as possible, then we have a chance of
getting it working reliably.  Notably, the value space for
client_encoding is a globally known fixed list of strings.  We need to
figure out what is the right way to globally identify types, like
either
by fully-qualified name, by base name, some combination, how does it
work with extensions, or do we need a new mechanism like UUIDs.  I
think
that is something we need to work out, no matter which protocol
mechanism we end up using.


  Fantastic write up.

 > globally known fixed list of strings
Are you suggesting that we would have a client/server negotiation such 
as 'jdbc', 'all', etc., where that would identify which types 
are done which way?  If you did that, why would we need to promote 
names/uuid to permanent global space?


No, I don't think I meant anything like that.





Re: Request for comment on setting binary format output per session

2023-10-06 Thread Peter Eisentraut

On 04.10.23 20:30, Dave Cramer wrote:

We need to
figure out what is the right way to globally identify types, like
either
by fully-qualified name, by base name, some combination, how does it
work with extensions, or do we need a new mechanism like UUIDs.  I
think
that is something we need to work out, no matter which protocol
mechanism we end up using.


So how is this different than the GUC that I proposed ?


The last patch I see from you in this thread uses OIDs, which I have 
argued is not the right solution.





Re: Request for comment on setting binary format output per session

2023-10-06 Thread Peter Eisentraut

On 04.10.23 21:10, Robert Haas wrote:

On Wed, Oct 4, 2023 at 10:17 AM Peter Eisentraut  wrote:

I think intuitively, this facility ought to work like client_encoding.


I hadn't really considered client_encoding as a precedent for this
setting. A lot of my discomfort with the proposed mechanism also
applies to client_encoding, namely, suppose you call some function or
procedure or whatever and it changes client_encoding on your behalf
and now your communication with the server is all screwed up. That
seems very unpleasant. Yet it's also existing behavior. I think one
could conclude on these facts either that (a) client_encoding is fine
and the problems with controlling behavior using that kind of
mechanism are mostly theoretical or (b) that we messed up with
client_encoding and shouldn't add any more mistakes of the same ilk or
(c) that we should really be looking at redesigning the way
client_encoding works, too.


Yeah, I agree with all three of these points, but I don't have a strong 
opinion on which is the best one.






Re: remaining sql/json patches

2023-10-06 Thread Amit Langote
On Fri, Oct 6, 2023 at 19:01 Alvaro Herrera  wrote:

> On 2023-Oct-06, Amit Langote wrote:
>
> > 2. Assignment of op->d.iocoerce.escontext needed to be changed like this:
> >
> >     v_params[4] = l_ptr_const(op->d.iocoerce.escontext,
> > -                             l_ptr(StructErrorSaveContext));
> > +                             l_ptr(StructNode));
>
> Oh, so you had to go back to using StructNode in order to get this
> fixed?  That's weird.  Is it just because InputFunctionCallSafe is
> defined to take fmNodePtr?  (I still fail to see that a pointer to
> ErrorSaveContext would differ in any material way from a pointer to
> Node).


The difference matters to LLVM’s type system, which considers Node to be a
type with 1 sub-type (struct member) and ErrorSaveContext with 4 sub-types.
It doesn’t seem to understand that both share the first member.
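
For reference, the two layouts (abridged from nodes/nodes.h and
nodes/miscnodes.h) show why C accepts the pun through the shared first
member while LLVM's structural typing does not:

typedef struct Node
{
	NodeTag		type;
} Node;

typedef struct ErrorSaveContext
{
	NodeTag		type;			/* same first member as Node */
	bool		error_occurred;
	bool		details_wanted;
	ErrorData  *error_data;
} ErrorSaveContext;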


Another thing I thought was weird is that it would only crash in LLVM5
> debug and not the other LLVM-enabled animals, but looking closer at the
> buildfarm results, I think that may have been only because you reverted
> too quickly, and phycodorus and petalura didn't actually run with
> 7fbc75b26ed8 before you reverted it.  Dragonet did make a run with it,
> but it's marked as "LLVM optimized" instead of "LLVM debug".  I suppose
> that must be making a difference.


AFAICS, only assert-enabled LLVM builds crash.

>


Re: More new SQL/JSON item methods

2023-10-06 Thread Peter Eisentraut

On 29.08.23 09:05, Jeevan Chalke wrote:

v1-0001-Implement-jsonpath-.bigint-.integer-and-.number-m.patch

This commit implements jsonpath .bigint(), .integer(), and .number()
methods.  The JSON string or a numeric value is converted to the
bigint, int4, and numeric type representation.


A comment that applies to all of these: These add various keywords, 
switch cases, documentation entries in some order.  Are we happy with 
that?  Should we try to reorder all of that for better maintainability 
or readability?



v1-0002-Implement-.date-.time-.time_tz-.timestamp-and-.ti.patch

This commit implements jsonpath .date(), .time(), .time_tz(),
.timestamp(), .timestamp_tz() methods.  The JSON string representing
a valid date/time is converted to the specific date or time type
representation.

The changes use the infrastructure of the .datetime() method and
perform the datatype conversion as appropriate.  All these methods
accept no argument and use ISO datetime formats.


These should accept an optional precision argument.  Did you plan to add 
that?



v1-0003-Implement-jsonpath-.boolean-and-.string-methods.patch

This commit implements jsonpath .boolean() and .string() methods.


This contains a compiler warning:

../src/backend/utils/adt/jsonpath_exec.c: In function 'executeItemOptUnwrapTarget':
../src/backend/utils/adt/jsonpath_exec.c:1162:86: error: 'tmp' may be used uninitialized [-Werror=maybe-uninitialized]



v1-0004-Implement-jasonpath-.decimal-precision-scale-meth.patch

This commit implements jsonpath .decimal() method with optional
precision and scale.  If precision and scale are provided, then
it is converted to the equivalent numerictypmod and applied to the
numeric number.


This also contains compiler warnings:

../src/backend/utils/adt/jsonpath_exec.c: In function 'executeItemOptUnwrapTarget':
../src/backend/utils/adt/jsonpath_exec.c:1403:53: error: declaration of 'numstr' shadows a previous local [-Werror=shadow=compatible-local]
../src/backend/utils/adt/jsonpath_exec.c:1442:54: error: declaration of 'elem' shadows a previous local [-Werror=shadow=compatible-local]


There is a typo in the commit message: "Implement jasonpath"

Any reason this patch is separate from 0002?  Aren't number() and 
decimal() pretty similar?


You could also update src/backend/catalog/sql_features.txt in each patch 
(features T865 through T878).






Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-10-06 Thread vignesh C
On Wed, 4 Oct 2023 at 17:01, Shlok Kyal  wrote:
>
> On Wed, 4 Oct 2023 at 16:56, Peter Smith  wrote:
> >
> > On Tue, Oct 3, 2023 at 5:42 PM vignesh C  wrote:
> > >
> > > Thanks for the comments, the attached v6 version patch has the changes
> > > for the same.
> > >
> >
> > v6 LGTM.
> >
> I have verified the patch and it is working fine for me.

I have created the commitfest entry for this at:
https://commitfest.postgresql.org/45/4595/

Regards.
Vignesh




Re: FDW pushdown of non-collated functions

2023-10-06 Thread Ashutosh Bapat
Hi Jean-Christophe,

On Fri, Sep 8, 2023 at 11:30 PM Jean-Christophe Arnu  wrote:
>
> Maybe we could add another condition to the first if statement in order to
> allow a “no-collation” function to be pushed down even if they have
> “collatable” parameters. I’m not sure about the possible regressions of
> behaviour of this change, but it seems to work fine with date_trunc() and
> date_part() (which suffers the same problem).

That may not work since the output of the function may be dependent
upon the collation on the inputs.

There were similar discussions earlier. E.g.
https://www.postgresql.org/message-id/flat/CACowWR1ARWyRepRxGfijMcsw%2BH84Dj8x2o9N3kvz%3Dz1p%2B6b45Q%40mail.gmail.com.

Reading Tom's first reply there you may work around this by declaring
the collation explicitly.

Briefly reading Tom's reply, the problem seems to be that we cannot trust
whether the default collations on the local and foreign servers match.
Maybe a simple fix is to add a foreign-server-level option declaring that
the default collation on the foreign server is the same as on the local
server. But given that the problem has remained unsolved for at least 7
years, maybe such a simple fix is not enough.

Another solution would be to attach another attribute to a function
indicating whether the output of that function depends upon the input
collations or not. Doing that just for FDW may not be acceptable
though.

-- 
Best Wishes,
Ashutosh Bapat




Re: ALTER COLUMN ... SET EXPRESSION to alter stored generated column's expression

2023-10-06 Thread Peter Eisentraut

On 28.08.23 11:54, Amul Sul wrote:
Thanks for the review comments, I have fixed those in the attached version.
In addition to that, I extended the syntax to have the STORE keyword as
suggested by Vik.


An additional comment: When you change the generation expression, you 
need to run ON UPDATE triggers on all rows, if there are any triggers 
defined.  That is because someone could have triggers defined on the 
column to either check for valid values or propagate values somewhere 
else, and if the expression changes, that is kind of like an UPDATE.


Similarly, I think we should consider how logical decoding should handle 
this operation.  I'd imagine it should generate UPDATE events on all 
rows.  A test case in test_decoding would be useful.






Re: Two Window aggregate node for logically same over clause

2023-10-06 Thread Ashutosh Bapat
On Thu, Oct 5, 2023 at 8:53 PM "Anitha S"  wrote:
>
>
>
> Hi team,
>
>   We have observed that for a logically identical OVER clause, two
> different window aggregate nodes are created in the plan.
> The query below contains two window functions. Both OVER clauses contain
> the same partition and order clauses, but in one OVER clause the ordering
> option is given explicitly as ascending, while the other omits it and so
> uses the default option, "ascending".
>
>
> Sample Query:
>
> CREATE TEMPORARY TABLE empsalary (
> depname varchar,
> empno bigint,
> salary int,
> enroll_date date
> );
>
> INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 
> 1, 5000, '2006-10-01'),
> ('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, 
> '2007-08-08'),('personnel', 2, 3900, '2006-12-23'),
> ('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, 
> '2008-01-01'),('sales', 3, 4800, '2007-08-01'),
> ('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15');
>
> explain verbose select rank() over (partition by depname order by salary), 
> percent_rank() over(partition by depname order by salary asc) from empsalary;
>                               QUERY PLAN
> ----------------------------------------------------------------------------
>  WindowAgg  (cost=174.54..1000114.66 rows=1070 width=52)
>    Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname
>    ->  WindowAgg  (cost=174.54..195.94 rows=1070 width=44)
>          Output: salary, depname, rank() OVER (?)
>          ->  Sort  (cost=174.54..177.21 rows=1070 width=36)
>                Output: salary, depname
>                Sort Key: empsalary.depname, empsalary.salary
>                ->  Seq Scan on pg_temp_7.empsalary  (cost=0.00..20.70 rows=1070 width=36)
>                      Output: salary, depname
>
>
> The ordering option of a sort is represented by the enum SortByDir (parsenodes.h).
>
> A list of SortBy is present in the WindowDef structure, which holds the
> information of the ORDER BY clause in the OVER clause.
>
> /* Sort ordering options for ORDER BY and CREATE INDEX */
> typedef enum SortByDir
> {
>     SORTBY_DEFAULT,
>     SORTBY_ASC,
>     SORTBY_DESC,
>     SORTBY_USING                /* not allowed in CREATE INDEX ... */
> } SortByDir;
>
> typedef struct SortBy
> {
>     NodeTag     type;
>     Node       *node;           /* expression to sort on */
>     SortByDir   sortby_dir;     /* ASC/DESC/USING/default */
>     SortByNulls sortby_nulls;   /* NULLS FIRST/LAST */
>     List       *useOp;          /* name of op to use, if SORTBY_USING */
>     int         location;       /* operator location, or -1 if none/unknown */
> } SortBy;
>
>
> In the transformWindowFuncCall API, the equality check of the order clause
> in the window definition fails while comparing the SortByDir enums of the
> two OVER clauses, i.e. SORTBY_DEFAULT is not equal to SORTBY_ASC. Hence two
> window clauses are created in the parse tree, resulting in the creation of
> two different window aggregate nodes.
>
> This check can be modified so that a single window aggregate node is formed
> for the above, resulting in faster query execution. Is there any reason for
> creating two different window aggregate nodes?

I don't see any. The description of the ORDER BY clause at
https://www.postgresql.org/docs/16/sql-select.html clearly says that ASC
is assumed when no direction is mentioned. The only place in the code that
creates the node treats DEFAULT and ASC as the same. Maybe we will want to
allow the default to be ASC or DESC based on some setting (read: a GUC) at
some point in the future.
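
A minimal sketch of the kind of normalization that would make the two
specifications compare equal, assuming we always want DEFAULT folded to
ASC at that point (the helper name is illustrative):

/* Fold an unspecified sort direction to ASC before comparing window specs. */
static SortByDir
canonicalize_sortby_dir(SortByDir dir)
{
	return (dir == SORTBY_DEFAULT) ? SORTBY_ASC : dir;
}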

Another angle is to ask: Why would the query add ASC to one window
specification and not the other?

-- 
Best Wishes,
Ashutosh Bapat




Re: Make --help output fit within 80 columns per line

2023-10-06 Thread torikoshia

On 2023-09-25 15:27, torikoshia wrote:
Ugh, regression tests failed and it appears to be due to reasons 
related to meson.

I'm going to investigate it.


ISTM

On 2023-10-06 19:49, Peter Eisentraut wrote:

On 25.09.23 08:27, torikoshia wrote:
So in summary, I think 80 is a decent soft limit, but let's not 
stress out about some lines going over that, and make a hard limit of 
perhaps 120.


+1. It may be a good compromise.
For enforcing the hard limit, is it better to add a regression test 
like patch 0001?



Agreed. It seems inconsistent with other commands.
Patch 0002 removed environment-variable-based defaults in psql --help.


I have committed 0002 and a different implementation of 0001.  I set
the maximum line length to 95, which is the current maximum in use.


Thanks!

BTW as far as I investigated, the original 0002 patch failed because
current meson doesn't accept subtest outputs.

As I commented in the thread below a few days ago, meson was once modified
to accept subtest outputs, but that is no longer the case.
https://github.com/mesonbuild/meson/issues/10032


I'm open to discussing other line lengths, but

1) If we make it longer, then we should also adjust the existing
wrapping so that we don't have a mix of lines wrapped at 80 and some
significantly longer lines.

2) There are some general readability guidelines that suggest like 66
or 72 characters per line.  If you take that and add the option name
itself and some indentation, then around 90 does seem like a sensible
limit.

3) The examples from other tools posted earlier don't convince me.
Some of their --help outputs look like junk and poorly taken care of.

So I think nudging people to aim for 80..95 seems like a good target
right now.  But I'm not against adjustments.


--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation




Re: ALTER COLUMN ... SET EXPRESSION to alter stored generated column's expression

2023-10-06 Thread Ashutosh Bapat
On Fri, Oct 6, 2023 at 6:06 PM Peter Eisentraut  wrote:
>
> On 28.08.23 11:54, Amul Sul wrote:
> > Thanks for the review comments, I have fixed those in the attached
> > version. In
> > addition to that, extended syntax to have the STORE keyword as suggested by
> > Vik.
>
> An additional comment: When you change the generation expression, you
> need to run ON UPDATE triggers on all rows, if there are any triggers
> defined.  That is because someone could have triggers defined on the
> column to either check for valid values or propagate values somewhere
> else, and if the expression changes, that is kind of like an UPDATE.
>
> Similarly, I think we should consider how logical decoding should handle
> this operation.  I'd imagine it should generate UPDATE events on all
> rows.  A test case in test_decoding would be useful.
>

Should we treat it in the same fashion as ALTER COLUMN ... TYPE, which
rewrites the column values? Of course that rewrites the whole table, but
logically they are comparable.

-- 
Best Wishes,
Ashutosh Bapat




Re: RFC: Logging plan of the running query

2023-10-06 Thread torikoshia

On 2023-10-04 03:00, James Coleman wrote:
On Thu, Sep 7, 2023 at 2:09 AM torikoshia  
wrote:


On 2023-09-06 11:17, James Coleman wrote:

>> > I've never been able to reproduce it (haven't tested the new version,
>> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
>> > (first buster and now bullseye).
>> >
>> > I'm attaching several stacktraces in the hope that they provide some
>> > clues. These all match the ps output I sent earlier, though note in
>> > that output there is both the regress instance and my test instance
>> > (pid 3213249) running (different ports, of course, and they are from
>> > the exact same compilation run). I've attached ps output for the
>> > postgres processes under the make check process to simplify cross
>> > referencing.
>> >
>> > A few interesting things:
>> > - There's definitely a lock on a relation that seems to be what's
>> > blocking the processes.
>> > - When I try to connect with psql the process forks but then hangs
>> > (see the ps output with task names stuck in "authentication"). I've
>> > also included a trace from one of these.
>>
>> Thanks for sharing them!
>>
>> Many processes are waiting to acquire the LW lock, including the
>> process
>> trying to output the plan(select1.trace).
>>
>> I suspect that this is due to a lock that was acquired prior to being
>> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able
>> to
>> reproduce the same situation..
>>
>
> I don't have time immediately to dig into this, but perhaps loading up
> the core dumps would allow us to see what query is running in each
> backend process (if it hasn't already been discarded by that point)
> and thereby determine what point in each test process led to the error
> condition?

Thanks for the suggestion.
I am concerned that core dumps may not be readable on different
operating systems or other environments. (Unfortunately, I do not have
Debian on hand)

It seems that we can know what queries were running from the stack
traces you shared.
As described above, I suspect a lock which was acquired prior to
ProcessLogQueryPlanInterrupt() caused the issue.
I will try a little more to see if I can devise a way to create the same
situation.

> Alternatively we might be able to apply the same trick to the test
> client instead...
>
> BTW, for my own easy reference in this thread: relid 1259 is pg_class
> if I'm not mistaken.

Yeah, and I think it's strange that the lock to 1259 appears twice and
must be avoided.

   #10 0x559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at lmgr.c:117
   ..
   #49 0x559d61b4989d in ProcessLogQueryPlanInterrupt () at explain.c:5158
   ..
   #53 0x559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at lmgr.c:117


I chatted with Andres and David about this at PGConf.NYC,

Thanks again for the discussion with hackers!


and I think
what we need to do is explicitly disallow running this code any time
we are inside of lock acquisition code.


Yeah, I think it's a straightforward workaround.
And I'm also concerned that the condition of being in the process
of acquiring some kind of lock is too strict and will make it almost
impossible to output a running plan.

Anyway I'm going to implement it and run pg_log_query_plan()
while the regression test is running to see how successful the plan
output is.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-06 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

Based on comments, I revised my patch. PSA the file.

> 
> > Today, I discussed this problem with Andres at PGConf NYC and he
> > suggested as following. To verify, if there is any pending unexpected
> > WAL after shutdown, we can have an API like
> > pg_logical_replication_slot_advance() which will simply process
> > records without actually sending anything downstream. In this new API,
> > we will start with each slot's restart_lsn location and try to process
> > till the end of WAL, if we encounter any WAL that needs to be
> > processed (like we need to send the decoded WAL downstream) we can
> > return a false indicating that there is an unexpected WAL. The reason
> > to start with restart_lsn is that it is the location that we use to
> > start scanning the WAL anyway.

I implemented this by using the decoding context. The binary upgrade function
processes WAL from the confirmed_flush, and returns false if some meaningful
changes are found.

Internally, I added a new decoding mode - DECODING_MODE_SILENT - and used it.
If the decoding context is in this mode, the output plugin is not loaded, but
all WAL records are decoded without skipping. A new flag, "did_process", is
also added. This flag is set if wrappers for output plugin callbacks are
called during silent mode. The upgrade function checks both the reorder
buffer and the new flag, because both transactional and non-transactional
changes should be detected. If we only checked the reorder buffer, we would
miss the non-transactional ones.

fast_forward was changed to be a variant of the decoding mode.

Currently the function is called for all the valid slots. If the approach
seems good, we can refactor it like Bharath said [1].
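
For illustration, the check could be shaped roughly like this (a sketch only:
DECODING_MODE_SILENT and did_process are names from this patch version,
ReorderBufferHasPendingChanges() is a hypothetical helper, and error handling
is elided):

static bool
slot_has_pending_wal(LogicalDecodingContext *ctx)
{
	ctx->decoding_mode = DECODING_MODE_SILENT;	/* no output plugin loaded */

	/* Decode from the slot's starting point up to the end of WAL. */
	while (ctx->reader->EndRecPtr < GetFlushRecPtr(NULL))
	{
		char	   *errm = NULL;
		XLogRecord *record = XLogReadRecord(ctx->reader, &errm);

		if (record != NULL)
			LogicalDecodingProcessRecord(ctx, ctx->reader);

		/* Did we see a transactional or non-transactional change? */
		if (ctx->did_process || ReorderBufferHasPendingChanges(ctx->reorder))
			return true;
	}

	return false;
}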

> 
> > Then, we should also try to create slots before invoking pg_resetwal.
> > The idea is that we can write a new binary mode function that will do
> > exactly what pg_resetwal does to compute the next segment and use that
> > location as a new location (restart_lsn) to create the slots in a new
> > node. Then, pass it pg_resetwal by using the existing option '-l
> > walfile'. As we don't have any API that takes restart_lsn as input, we
> > can write a new API probably for binary mode to create slots that do
> > take restart_lsn as input. This will ensure that there is no new WAL
> > inserted by background processes between resetwal and the creation of
> > slots.

Based on that, I added another binary function,
binary_upgrade_create_logical_replication_slot().
This function is similar to pg_create_logical_replication_slot(), but the
restart_lsn and confirmed_flush are set to the *next* WAL segment. The name
of the WAL file they point to is returned and passed to the pg_resetwal
command.

One consideration is that pg_log_standby_snapshot() must be executed before
the slots consume changes. The new cluster does not have RUNNING_XACTS
records, so the decoding context on the new cluster cannot build a
consistent snapshot as-is. This may lead to discarding changes during the
upcoming consumption. To prevent this, the function is called after the
final pg_resetwal.

What do you think?

Acknowledgment: I would like to thank Hou for discussing with me.

[1]: 
https://www.postgresql.org/message-id/CALj2ACWAdYxgzOpXrP%3DJMiOaWtAT2VjPiKw7ryGbipkSkocJ%3Dg%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v46-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description:  v46-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch


Re: Opportunistically pruning page before update

2023-10-06 Thread James Coleman
Hi,

Thanks for taking a look!

On Fri, Oct 6, 2023 at 1:18 AM Dilip Kumar  wrote:
>
> On Thu, Oct 5, 2023 at 2:35 AM James Coleman  wrote:
> >
> > I talked to Andres and Peter again today, and out of that conversation
> > I have some observations and ideas for future improvements.
> >
> > 1. The most trivial case where this is useful is INSERT: we have a
> > target page, and it may have dead tuples, so trying to prune may
> > result in us being able to use the target page rather than getting a
> > new page.
> > 2. The next most trivial case is where UPDATE (potentially after
> > failing to find space for a HOT tuple on the source tuple's page);
> > much like the INSERT case our backend's target page may benefit from
> > pruning.
>
> By looking at the patch I believe that v2-0003 is implementing these 2
> ideas.  So my question is: are we planning to prune only the backend's
> current target page, or, if we cannot find space in that one, are we
> aiming to prune the other target pages we get from the FSM as well?
> Because in the patch you have put the check in a loop, it will try to
> prune every page it gets from the FSM, not just the current target page
> of the backend.  Just wanted to understand if this is intentional.

Yes, just like we opportunistically prune on each read during a select, I
think we should at least check when we have a new target page. This seems
particularly worthwhile since we're hoping to write to the page anyway, and
the cost of additionally making pruning changes to that page is low. I
looked at freespace.c, and it doesn't appear that getting the block from
the FSM does this already, so we're not duplicating any existing work.
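
For context, here is a minimal sketch of the idea using the existing
heap_page_prune_opt() helper (targetBlock and len are illustrative names,
and locking details are simplified):

	Buffer		buf = ReadBuffer(relation, targetBlock);

	/*
	 * Opportunistically prune: heap_page_prune_opt() expects a pinned,
	 * unlocked buffer and only does work if it can get a cleanup lock
	 * without waiting.
	 */
	heap_page_prune_opt(relation, buf);

	LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
	if (PageGetHeapFreeSpace(BufferGetPage(buf)) >= len)
	{
		/* enough room after pruning: use this page */
	}
	else
	{
		UnlockReleaseBuffer(buf);
		/* ask the FSM for another block, or extend the relation */
	}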

Regards,
James




Re: RFC: Logging plan of the running query

2023-10-06 Thread James Coleman
On Fri, Oct 6, 2023 at 8:58 AM torikoshia  wrote:
>
> On 2023-10-04 03:00, James Coleman wrote:
> > On Thu, Sep 7, 2023 at 2:09 AM torikoshia 
> > wrote:
> >>
> >> On 2023-09-06 11:17, James Coleman wrote:
> >>
> >> >> > I've never been able to reproduce it (haven't tested the new version,
> >> >> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
> >> >> > (first buster and now bullseye).
> >> >> >
> >> >> > I'm attaching several stacktraces in the hope that they provide some
> >> >> > clues. These all match the ps output I sent earlier, though note in
> >> >> > that output there is both the regress instance and my test instance
> >> >> > (pid 3213249) running (different ports, of course, and they are from
> >> >> > the exact same compilation run). I've attached ps output for the
> >> >> > postgres processes under the make check process to simplify cross
> >> >> > referencing.
> >> >> >
> >> >> > A few interesting things:
> >> >> > - There's definitely a lock on a relation that seems to be what's
> >> >> > blocking the processes.
> >> >> > - When I try to connect with psql the process forks but then hangs
> >> >> > (see the ps output with task names stuck in "authentication"). I've
> >> >> > also included a trace from one of these.
> >> >>
> >> >> Thanks for sharing them!
> >> >>
> >> >> Many processes are waiting to acquire the LW lock, including the
> >> >> process
> >> >> trying to output the plan(select1.trace).
> >> >>
> >> >> I suspect that this is due to a lock that was acquired prior to being
> >> >> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able
> >> >> to
> >> >> reproduce the same situation..
> >> >>
> >> >
> >> > I don't have time immediately to dig into this, but perhaps loading up
> >> > the core dumps would allow us to see what query is running in each
> >> > backend process (if it hasn't already been discarded by that point)
> >> > and thereby determine what point in each test process led to the error
> >> > condition?
> >>
> >> Thanks for the suggestion.
> >> I am concerned that core dumps may not be readable on different
> >> operating systems or other environments. (Unfortunately, I do not have
> >> Debian on hand)
> >>
> >> It seems that we can know what queries were running from the stack
> >> traces you shared.
> >> As described above, I suspect a lock which was acquired prior to
> >> ProcessLogQueryPlanInterrupt() caused the issue.
> >> I will try a little more to see if I can devise a way to create the
> >> same
> >> situation.
> >>
> >> > Alternatively we might be able to apply the same trick to the test
> >> > client instead...
> >> >
> >> > BTW, for my own easy reference in this thread: relid 1259 is pg_class
> >> > if I'm not mistaken.
> >>
> >> Yeah, and I think it's strange that the lock to 1259 appears twice and
> >> must be avoided.
> >>
> >>    #10 0x559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at lmgr.c:117
> >>    ..
> >>    #49 0x559d61b4989d in ProcessLogQueryPlanInterrupt () at explain.c:5158
> >>    ..
> >>    #53 0x559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at lmgr.c:117
> >
> > I chatted with Andres and David about this at PGConf.NYC,
> Thanks again for the discussion with hackers!
>
> > and I think
> > what we need to do is explicitly disallow running this code any time
> > we are inside of lock acquisition code.
>
> Yeah, I think it's a straightforward workaround.
> And I'm also concerned that the condition of being in the process
> of acquiring some kind of lock is too strict and will make it almost
> impossible to output a running plan.

I was concerned about this too, but I was wondering if we might be
able to cheat a bit by making such a case not clear the flag but
instead just skip it for now.

Regards,
James




Add const to values and nulls arguments

2023-10-06 Thread Peter Eisentraut
There are a lot of Datum *values, bool *nulls argument pairs that should 
really be const.  The 0001 patch makes those changes.  Some of these 
hunks depend on each other.


The 0002 patch, which I'm not proposing to commit at this time, makes 
similar changes but in a way that breaks the table and index AM APIs. 
So I'm just including that here in case anyone wonders, why didn't you 
touch those.  And also maybe if we ever change that API incompatibly we 
could throw this one in then.

From ee393652d792b7ac11017ad9ebf0c9331c50c27c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 6 Oct 2023 14:57:56 +0200
Subject: [PATCH 1/2] Add const to values and nulls arguments

This excludes any changes that would change the external AM APIs.
---
 src/backend/access/brin/brin.c |  4 ++--
 src/backend/access/common/heaptuple.c  | 26 +++---
 src/backend/access/common/indextuple.c | 12 +-
 src/backend/access/gist/gistutil.c |  4 ++--
 src/backend/access/hash/hashsort.c |  2 +-
 src/backend/access/index/genam.c   |  2 +-
 src/backend/access/spgist/spgutils.c   |  4 ++--
 src/backend/access/table/toast_helper.c|  2 +-
 src/backend/executor/execIndexing.c| 18 +++
 src/backend/executor/execTuples.c  |  2 +-
 src/backend/partitioning/partbounds.c  |  4 ++--
 src/backend/utils/adt/json.c   |  4 ++--
 src/backend/utils/adt/jsonb.c  | 10 -
 src/backend/utils/sort/tuplesortvariants.c |  4 ++--
 src/backend/utils/sort/tuplestore.c|  2 +-
 src/include/access/genam.h |  2 +-
 src/include/access/gist_private.h  |  4 ++--
 src/include/access/hash.h  |  2 +-
 src/include/access/htup_details.h  | 20 -
 src/include/access/itup.h  |  4 ++--
 src/include/access/spgist_private.h|  4 ++--
 src/include/access/toast_helper.h  |  2 +-
 src/include/executor/executor.h|  4 ++--
 src/include/partitioning/partbounds.h  |  4 ++--
 src/include/utils/json.h   |  8 +++
 src/include/utils/jsonb.h  |  8 +++
 src/include/utils/tuplesort.h  |  2 +-
 src/include/utils/tuplestore.h |  2 +-
 28 files changed, 83 insertions(+), 83 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index a7538f32c2..af392bc032 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -80,7 +80,7 @@ static void union_tuples(BrinDesc *bdesc, BrinMemTuple *a,
 BrinTuple *b);
 static void brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy);
 static bool add_values_to_range(Relation idxRel, BrinDesc *bdesc,
-				BrinMemTuple *dtup, Datum *values, bool *nulls);
+				BrinMemTuple *dtup, const Datum *values, const bool *nulls);
 static bool check_null_keys(BrinValues *bval, ScanKey *nullkeys, int nnullkeys);
 
 /*
@@ -1774,7 +1774,7 @@ brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy)
 
 static bool
 add_values_to_range(Relation idxRel, BrinDesc *bdesc, BrinMemTuple *dtup,
-   Datum *values, bool *nulls)
+   const Datum *values, const bool *nulls)
 {
int keyno;
 
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index ef246c901e..d6a4ddfd51 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -205,8 +205,8 @@ getmissingattr(TupleDesc tupleDesc,
  */
 Size
 heap_compute_data_size(TupleDesc tupleDesc,
-  Datum *values,
-  bool *isnull)
+  const Datum *values,
+  const bool *isnull)
 {
Sizedata_length = 0;
int i;
@@ -390,7 +390,7 @@ fill_val(Form_pg_attribute att,
  */
 void
 heap_fill_tuple(TupleDesc tupleDesc,
-   Datum *values, bool *isnull,
+   const Datum *values, const bool *isnull,
char *data, Size data_size,
uint16 *infomask, bits8 *bit)
 {
@@ -1106,8 +1106,8 @@ heap_copy_tuple_as_datum(HeapTuple tuple, TupleDesc tupleDesc)
  */
 HeapTuple
 heap_form_tuple(TupleDesc tupleDescriptor,
-   Datum *values,
-   bool *isnull)
+   const Datum *values,
+   const bool *isnull)
 {
HeapTuple   tuple;  /* return tuple */
HeapTupleHeader td; 

Re: ALTER COLUMN ... SET EXPRESSION to alter stored generated column's expression

2023-10-06 Thread Peter Eisentraut

On 06.10.23 14:57, Ashutosh Bapat wrote:

On Fri, Oct 6, 2023 at 6:06 PM Peter Eisentraut  wrote:


On 28.08.23 11:54, Amul Sul wrote:

Thanks for the review comments, I have fixed those in the attached
version. In addition to that, I extended the syntax to have the STORE
keyword as suggested by Vik.


An additional comment: When you change the generation expression, you
need to run ON UPDATE triggers on all rows, if there are any triggers
defined.  That is because someone could have triggers defined on the
column to either check for valid values or propagate values somewhere
else, and if the expression changes, that is kind of like an UPDATE.
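
For illustration, a minimal sketch of that hazard (the table, the trigger,
and the commented-out SET EXPRESSION syntax below follow this proposal and
are not final):

create table t (a int, b int generated always as (a * 2) stored);

create function check_b() returns trigger language plpgsql as $$
begin
    if new.b < 0 then raise exception 'invalid b'; end if;
    return new;
end $$;

create trigger t_check after update on t
    for each row execute function check_b();

-- Changing the expression rewrites b in every existing row, so by the
-- argument above t_check would have to fire for each of them:
-- alter table t alter column b set expression (a * -1);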

Similarly, I think we should consider how logical decoding should handle
this operation.  I'd imagine it should generate UPDATE events on all
rows.  A test case in test_decoding would be useful.
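
A sketch of what such a test_decoding check could look like (again using
the proposed, not final, syntax for the ALTER):

select pg_create_logical_replication_slot('s', 'test_decoding');
-- alter table t alter column b set expression (a * 3);
select data from pg_logical_slot_get_changes('s', null, null);
-- expectation per the above: one UPDATE change per existing row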


Should we treat it the same fashion as ALTER COLUMN ... TYPE which
rewrites the column values? Of course that rewrites the whole table,
but logically they are comparable.


I don't know.  What are the semantics of that command with respect to 
triggers and logical decoding?






Re: Allow tests to pass in OpenSSL FIPS mode

2023-10-06 Thread Peter Eisentraut

On 05.10.23 22:04, Tom Lane wrote:

On the way to testing this, I discovered that we have a usability
regression with recent OpenSSL releases.  The Fedora 35 installation
I used to use for testing FIPS-mode behavior would produce errors like



+ERROR:  could not compute MD5 hash: disabled for FIPS



In the shiny new Fedora 38 installation I just set up for the
same purpose, I'm seeing



+ERROR:  could not compute MD5 hash: unsupported


This makes sense, because the older OpenSSL works basically like

if (FIPS_mode()) {
specific_error();
}

while the new one has all crypto methods in modules, and if you load the 
fips module, then some crypto methods just don't exist.
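
Either way, the failure is easy to reproduce, since any MD5 use in SQL
will hit it, for example:

SELECT md5('x');
-- errors out under FIPS, with the message depending on the OpenSSL
-- version as described above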






Re: Allow tests to pass in OpenSSL FIPS mode

2023-10-06 Thread Peter Eisentraut

On 05.10.23 22:55, Tom Lane wrote:

I found another bit of fun we'll need to deal with: on my F38
platform, pgcrypto/3des fails as attached.  Some googling finds
this relevant info:

https://github.com/pyca/cryptography/issues/6875

That is, FIPS deprecation of 3DES is happening even as we speak.
So apparently we'll have little choice but to deal with two
different behaviors for that.


Hmm, interesting, so maybe there should be a new openssl 3.x release at 
the end of the year that addresses this?





Custom tstzrange with importance factored in

2023-10-06 Thread Rares Pop (Treelet)
Hi,

I want to make a custom range (and multirange) type that carries a few useful
details.
Looks like I may have to implement the operators, custom functions and my own 
aggregator.
Where can I find the SQL code for anything that relates to the tstzrange
range type?

Or would you recommend a different approach?
I essentially want to be able to aggregate multiple tstzranges - each range 
with its own importance. The aggregation would be like a join/intersect where 
ranges with higher importance override the ones with lower importance.
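
For illustration, one way to express that kind of override with the
multirange support in v14+ (the table and column names here are made up):

create table slot (importance int, during tstzrange);

select r.importance,
       range_agg(r.during)
         - coalesce((select range_agg(s.during)
                     from slot s
                     where s.importance > r.importance),
                    '{}'::tstzmultirange) as effective
from slot r
group by r.importance;
-- each importance level keeps only the time not covered by higher levels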

Thanks!
Rares



Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs

2023-10-06 Thread Tom Lane
Peter Eisentraut  writes:
> I don't have a good sense of what you are trying to optimize for.  If 
> it's the mainline build-on-every-commit type, then I wonder how many 
> commits would really be affected by this.  Like, how many commits touch 
> only a README file.  If it's for things like the cfbot, then I think the 
> time-triggered builds would be more frequent than new patch versions, so 
> I don't know if these kinds of optimizations would affect anything.

As a quick cross-check, I searched our commit log to see how many
README-only commits there were so far this year.  I found 11 since
January.  (Several were triggered by the latest round of pgindent
code and process changes, so maybe this is more than typical.)

Not sure what that tells us about the value of changing the CI
logic, but it does seem like it could be worth the one-liner
change needed to teach buildfarm animals to ignore READMEs.

-   trigger_exclude => qr[^doc/|\.po$],
+   trigger_exclude => qr[^doc/|/README$|\.po$],

regards, tom lane




Problem, partition pruning for prepared statement with IS NULL clause.

2023-10-06 Thread Sergei Glukhov

Hello postgres hackers,

I noticed that the combination of a prepared statement with a generic plan
and an 'IS NULL' clause could cause partition pruning to crash.

It seems affected versions start from 12.

'How to repeat' is below, and an attempt to fix it is in the attachment.


Data set:
--
create function part_hashint4_noop(value int4, seed int8)
    returns int8 as $$
    select value + seed;
    $$ language sql strict immutable parallel safe;

create operator class part_test_int4_ops for type int4 using hash as
    operator 1 =,
    function 2 part_hashint4_noop(int4, int8);

create function part_hashtext_length(value text, seed int8)
    returns int8 as $$
    select length(coalesce(value, ''))::int8
    $$ language sql strict immutable parallel safe;

create operator class part_test_text_ops for type text using hash as
    operator 1 =,
    function 2 part_hashtext_length(text, int8);


create table hp (a int, b text, c int)
  partition by hash (a part_test_int4_ops, b part_test_text_ops);
create table hp0 partition of hp for values with (modulus 4, remainder 0);
create table hp3 partition of hp for values with (modulus 4, remainder 3);
create table hp1 partition of hp for values with (modulus 4, remainder 1);
create table hp2 partition of hp for values with (modulus 4, remainder 2);

insert into hp values (null, null, 0);
insert into hp values (1, null, 1);
insert into hp values (1, 'xxx', 2);
insert into hp values (null, 'xxx', 3);
insert into hp values (2, 'xxx', 4);
insert into hp values (1, 'abcde', 5);
--

Test case:
--
set plan_cache_mode to force_generic_plan;
prepare stmt AS select * from hp where a is null and b = $1;
explain execute stmt('xxx');
--


Regards,
Gluh
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7179b22a05..0fbc2dba33 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -3383,10 +3383,21 @@ perform_pruning_base_step(PartitionPruneContext *context,
 			bool		isnull;
 			Oid			cmpfn;
 
+			/*
+			 * The IS NULL clause is not added to the array of ExprStates,
+			 * so 'stateidx' computed from 'keyno' cannot be used to index
+			 * into that array. Use an index based on 'nvalues' instead to
+			 * obtain the proper datum from the array.
+			 */
+			int			exprstateidx =
+PruneCxtStateIdx(context->partnatts, opstep->step.step_id,
+ nvalues);
+
 			expr = lfirst(lc1);
 			stateidx = PruneCxtStateIdx(context->partnatts,
 		opstep->step.step_id, keyno);
-			partkey_datum_from_expr(context, expr, stateidx,
+
+			partkey_datum_from_expr(context, expr, exprstateidx,
 	&datum, &isnull);
 
 			/*
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index bb1223e2b1..a16e08a405 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -1921,6 +1921,18 @@ explain (costs off) select * from hp where (a = 1 and b = 'abcde') or (a = 2 and
  Filter: (((a = 1) AND (b = 'abcde'::text)) OR ((a = 2) AND (b = 'xxx'::text)) OR ((a IS NULL) AND (b IS NULL)))
 (7 rows)
 
+-- test partition pruning with prepared statement and IS NULL clause.
+prepare stmt AS select * from hp where a is null and b = $1;
+explain (costs off) execute stmt('xxx');
+ QUERY PLAN 
+
+ Append
+   Subplans Removed: 3
+   ->  Seq Scan on hp2 hp_1
+ Filter: ((a IS NULL) AND (b = $1))
+(4 rows)
+
+deallocate stmt;
 -- test pruning when not all the partitions exist
 drop table hp1;
 drop table hp3;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 83fed54b8c..9cafbdd538 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -374,6 +374,11 @@ explain (costs off) select * from hp where a = 2 and b = 'xxx';
 explain (costs off) select * from hp where a = 1 and b = 'abcde';
 explain (costs off) select * from hp where (a = 1 and b = 'abcde') or (a = 2 and b = 'xxx') or (a is null and b is null);
 
+-- test partition pruning with prepared statement and IS NULL clause.
+prepare stmt AS select * from hp where a is null and b = $1;
+explain (costs off) execute stmt('xxx');
+deallocate stmt;
+
 -- test pruning when not all the partitions exist
 drop table hp1;
 drop table hp3;


Re: Two Window aggregate node for logically same over clause

2023-10-06 Thread Tom Lane
Ashutosh Bapat  writes:
> On Thu, Oct 5, 2023 at 8:53 PM "Anitha S"  wrote:
>> We have observed that for logically same over clause two different window 
>> aggregate nodes are created in plan.
>> The below query contains two window functions. Both Over clause contain the 
>> same partition & order clause in it. But in one over clause ordering option 
>> is mentioned as ascending but not in another over clause which represents 
>> the default option "ascending".

> Another angle is to ask: Why would the query add ASC to one window
> specification and not the other?

Yeah.  I can't get excited about doing anything about this.  We
promise to merge identical window clauses, but these aren't identical.
If you say "let's merge semantically equivalent clauses", that's
opening a fairly large can of worms --- for example, ought we to
recognize that "x + 1.0" and "x + 1.00" are equivalent?  Or even
"x" and "x + 0"?  (I'm pretty sure I've seen query hacks recommended
that depend on our *not* detecting that.)
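
For reference, the reported shape boils down to something like this
(table and column names are made up):

explain (costs off)
select sum(a) over (partition by g order by a),
       avg(a) over (partition by g order by a asc)
from tbl;
-- per the report, the windows are semantically the same, but the explicit
-- ASC makes the clauses compare unequal, so two WindowAgg nodes appear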

Also, it would be an extremely bad idea IMO to change the way
equal() deals with this, which means that transformWindowFuncCall
would have to use bespoke code, not equal(), to check for matches.
That'd be ugly and a permanent maintenance gotcha.

regards, tom lane




Re: Add const to values and nulls arguments

2023-10-06 Thread Aleksander Alekseev
Hi,

> There are a lot of Datum *values, bool *nulls argument pairs that should
> really be const.  The 0001 patch makes those changes.  Some of these
> hunks depend on each other.
>
> The 0002 patch, which I'm not proposing to commit at this time, makes
> similar changes but in a way that breaks the table and index AM APIs.
> So I'm just including that here in case anyone wonders, why didn't you
> touch those.  And also maybe if we ever change that API incompatibly we
> could throw this one in then.

0001 looks good to me and passes the tests in several environments,
not surprisingly. Let's merge it.

-- 
Best regards,
Aleksander Alekseev




Re: pg16: invalid page/page verification failed

2023-10-06 Thread Justin Pryzby
On Fri, Oct 06, 2023 at 09:20:05AM +0900, Michael Paquier wrote:
> On Thu, Oct 05, 2023 at 11:45:18AM -0500, Justin Pryzby wrote:
> > This table is what it sounds like: a partition into which CSV logs are
> > COPY'ed.  It would've been created around 8am.  There's no special
> > params set for the table nor for autovacuum.
> 
> This may be an important bit of information.  31966b151e6a is new as
> of Postgres 16, has changed the way relations are extended and COPY
> was one area touched.  I am adding Andres in CC.

Also, I realized that someone kicked off a process just after 9am which
would've done a lot of INSERT ON CONFLICT DO UPDATE, VACUUM FULL, and
VACUUM, which consumed and dirtied buffers about 100x faster than normal.

log_time | 2023-10-05 10:00:55.794-05
pid  | 31754
left | duration: 51281.001 ms  statement: VACUUM (FULL,FREEZE) 
othertable...

log_time | 2023-10-05 10:01:01.784-05
backend_type | checkpointer
left | checkpoint starting: time

log_time | 2023-10-05 10:01:02.935-05
pid  | 10023
left | page verification failed, calculated checksum 5074 but 
expected 5050
context  | COPY postgres_log, line 947
left | COPY postgres_log FROM 
'/var/log/postgresql/postgresql-2023-10-05_095600.csv' WITH csv

log_time | 2023-10-05 10:01:02.935-05
pid  | 10023
left | invalid page in block 119 of relation base/16409/801594131
context  | COPY postgres_log, line 947
left | COPY postgres_log FROM 
'/var/log/postgresql/postgresql-2023-10-05_095600.csv' WITH csv

log_time | 2023-10-05 10:01:11.636-05
pid  | 31754
left | duration: 15838.374 ms  statement: VACUUM (FREEZE) 
othertable...

I meant to point out that the issue is on the last block.

postgres=# SELECT 
pg_relation_size('"BROKEN_postgres_log_2023_10_05_0900"')/8192;
?column? | 120

It sounds like there may be an issue locking (pinning?) a page, or
rather not locking it, or releasing the lock too early.

-- 
Justin




Re: typo in couple of places

2023-10-06 Thread Dagfinn Ilmari Mannsåker
vignesh C  writes:

> Hi,
>
> I noticed a couple of typos in code. "the the" should have been "the",
> attached patch has the changes for the same.

This made me curious about other duplicate word occurrences, and after a
few minutes of increasingly-elaborate regexing to exclude false
positives, I found a few more (plus a bonus _missing_ "the").  See the
attached patch (which includes your original one, for completeness).

> Regards,
> Vignesh

- ilmari

>From 758e890b3c259f242f3ff6fe4b0bfc48d8d48dcd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 6 Oct 2023 12:55:52 +0100
Subject: [PATCH] Fix duplicate words in docs and code comments

And add a missing "the" in a couple of places.
---
 .cirrus.star  | 2 +-
 contrib/citext/expected/citext_utf8.out   | 2 +-
 contrib/citext/expected/citext_utf8_1.out | 2 +-
 contrib/citext/sql/citext_utf8.sql| 2 +-
 doc/src/sgml/logical-replication.sgml | 2 +-
 doc/src/sgml/ref/create_subscription.sgml | 2 +-
 src/backend/access/nbtree/nbtsearch.c | 8 
 src/backend/access/transam/xlogreader.c   | 2 +-
 src/backend/executor/nodeHashjoin.c   | 2 +-
 src/test/regress/expected/tuplesort.out   | 4 ++--
 src/test/regress/sql/tuplesort.sql| 4 ++--
 11 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/.cirrus.star b/.cirrus.star
index d2d6ceca20..36233872d1 100644
--- a/.cirrus.star
+++ b/.cirrus.star
@@ -46,7 +46,7 @@ def main():
 
 def config_from(config_src):
 """return contents of config file `config_src`, surrounded by markers
-indicating start / end of the the included file
+indicating start / end of the included file
 """
 
 config_contents = fs.read(config_src)
diff --git a/contrib/citext/expected/citext_utf8.out b/contrib/citext/expected/citext_utf8.out
index 6630e09a4d..5d988dcd48 100644
--- a/contrib/citext/expected/citext_utf8.out
+++ b/contrib/citext/expected/citext_utf8.out
@@ -2,7 +2,7 @@
  * This test must be run in a database with UTF-8 encoding
  * and a Unicode-aware locale.
  *
- * Also disable this file for ICU, because the test for the the
+ * Also disable this file for ICU, because the test for the
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
diff --git a/contrib/citext/expected/citext_utf8_1.out b/contrib/citext/expected/citext_utf8_1.out
index 3caa7a00d4..7065a5da19 100644
--- a/contrib/citext/expected/citext_utf8_1.out
+++ b/contrib/citext/expected/citext_utf8_1.out
@@ -2,7 +2,7 @@
  * This test must be run in a database with UTF-8 encoding
  * and a Unicode-aware locale.
  *
- * Also disable this file for ICU, because the test for the the
+ * Also disable this file for ICU, because the test for the
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
diff --git a/contrib/citext/sql/citext_utf8.sql b/contrib/citext/sql/citext_utf8.sql
index 1f51df134b..34b232d64e 100644
--- a/contrib/citext/sql/citext_utf8.sql
+++ b/contrib/citext/sql/citext_utf8.sql
@@ -2,7 +2,7 @@
  * This test must be run in a database with UTF-8 encoding
  * and a Unicode-aware locale.
  *
- * Also disable this file for ICU, because the test for the the
+ * Also disable this file for ICU, because the test for the
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index fbf8ad669e..3b2fa1129e 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1788,7 +1788,7 @@
   
 
   
-   To create a subscription, the user must have the privileges of the
+   To create a subscription, the user must have the privileges of
the pg_create_subscription role, as well as
CREATE privileges on the database.
   
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 71652fd918..c1bafbfa06 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -51,7 +51,7 @@
   
 
   
-   To be able to create a subscription, you must have the privileges of the
+   To be able to create a subscription, you must have the privileges of
the pg_create_subscription role, as well as
CREATE privileges on the current database.
   
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index c47eaed0e9..efc5284e5b 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -1664,8 +1664,8 @@ _bt_readpage(IndexScanDesc scan, ScanDirection di

Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE

2023-10-06 Thread Dmitry Dolgov
> On Wed, Sep 20, 2023 at 03:42:43PM +0200, David Geier wrote:
> Another, not yet discussed, option I can see work is:
>
> 4. Allocate a fixed amount of memory for the instrumentation stats based on
> MAX_PARALLEL_WORKER_LIMIT: MAX_PARALLEL_WORKER_LIMIT is 1024 and used as the
> limit of the max_parallel_workers GUC. This way MAX_PARALLEL_WORKER_LIMIT *
> sizeof(BitmapHeapScanInstrumentation) = 1024 * 8 = 8192 bytes would be
> allocated. To cut this down in half we could additionally change the type of
> lossy_pages and exact_pages from long to uint32. Only possibly needed memory
> would have to get initialized, the remaining unused memory would remain
> untouched to not waste cycles.

I'm not sure that it would be acceptable -- if I understand correctly it
would be 8192 bytes per parallel bitmap heap scan node, and could be
noticeable in the worst case scenario with too many connections and too
many such nodes in every query.

I find the original approach with an offset not that bad, after all
there is something similar going on in other places, e.g. parallel heap
scan also has phs_snapshot_off (although the rest is fixed sized). My
commentary above in the thread was mostly about the cosmetic side.
Giving snapshot_and_stats a decent name, and maybe even ditching the
access functions and using only the offset field in the couple of places
it's needed, would look better to me.




Re: pg16: invalid page/page verification failed

2023-10-06 Thread Andres Freund
Hi,

On 2023-10-06 09:20:05 +0900, Michael Paquier wrote:
> On Thu, Oct 05, 2023 at 11:45:18AM -0500, Justin Pryzby wrote:
> > This table is what it sounds like: a partition into which CSV logs are
> > COPY'ed.  It would've been created around 8am.  There's no special
> > params set for the table nor for autovacuum.
> 
> This may be an important bit of information.  31966b151e6a is new as
> of Postgres 16, has changed the way relations are extended and COPY
> was one area touched.  I am adding Andres in CC.

Hm, is there any chance the COPY targets more than one partition? If so, this
sounds like it might be the issue described here
https://postgr.es/m/20230925213746.fwqauhhifjgefyzk%40alap3.anarazel.de

I think at this stage the easiest fix might be just to copy the approach of
calling ReleaseBulkInsertStatePin(), even though I think that's
architecturally wrong.

Greetings,

Andres Freund




Re: RFC: Logging plan of the running query

2023-10-06 Thread Andres Freund
Hi,

On 2023-10-06 21:58:46 +0900, torikoshia wrote:
> Yeah, I think it's a straightforward workaround.
> And I'm also concerned that the condition of being in the process
> of acquiring some kind of lock is too strict and will make it almost
> impossible to output a running plan.

How so? We shouldn't commonly acquire relevant locks while executing a query?
With a few exceptions, they should instead be acquired at the start of query
processing. We do acquire a lot of lwlocks, obviously, but we don't process
interrupts during the acquisition / holding of lwlocks.

And presumably the interrupt would just be processed the next time interrupt
processing is happening?

Greetings,

Andres Freund




FDW LIMIT pushdown

2023-10-06 Thread Michał Kłeczek
Hello hackers,

First timer here with a question:

I’ve searched through various e-mail threads and documents and could not find 
if pushdown of LIMIT clauses to FDW is implemented.
I saw some conversation about that a couple of years ago, when it was treated
as a feature request, but nothing since then.
Citus does that according to its documentation but for now we are trying to 
base our solution on “plain” Postgres.

If it is not implemented - could you provide me with some pointers to the
source code where I could look and try to implement it?


Thanks,
Michal



Re: document the need to analyze partitioned tables

2023-10-06 Thread Bruce Momjian
On Mon, Oct  2, 2023 at 04:48:20AM +0200, Laurenz Albe wrote:
> On Fri, 2023-09-29 at 22:34 -0400, Bruce Momjian wrote:
> > Very good point!   Updated patch attached.
> 
> Thanks!  Some small corrections:
> 
> > diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
> > index 9cf9d030a8..be1c522575 100644
> > --- a/doc/src/sgml/maintenance.sgml
> > +++ b/doc/src/sgml/maintenance.sgml
> > @@ -861,10 +861,16 @@ analyze threshold = analyze base threshold + analyze 
> > scale factor * number of tu
> > 
> >  
> > 
> > -Partitioned tables are not processed by autovacuum.  Statistics
> > -should be collected by running a manual ANALYZE 
> > when it is
> > -first populated, and again whenever the distribution of data in its
> > -partitions changes significantly.
> > +Partitioned tables do not directly store tuples and consequently
> > +autovacuum does not VACUUM them.  (Autovacuum does
> 
> ... does not VACUUM or ANALYZE them.
> 
> Perhaps it would be shorter to say "does not process them" like the
> original wording.
> 
> > +perform VACUUM on table partitions just like other
> 
> Just like *on* other tables, right?

Good points, updated patch attached.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 9cf9d030a8..3018d8f64c 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -861,10 +861,15 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu

 

-Partitioned tables are not processed by autovacuum.  Statistics
-should be collected by running a manual ANALYZE when it is
-first populated, and again whenever the distribution of data in its
-partitions changes significantly.
+Partitioned tables do not directly store tuples and consequently
+are not processed by autovacuum.  (Autovacuum does process table
+partitions just like other tables.)  Unfortunately, this means that
+autovacuum does not run ANALYZE on partitioned
+tables, and this can cause suboptimal plans for queries that reference
+partitioned table statistics.  You can work around this problem by
+manually running ANALYZE on partitioned tables
+when they are first populated, and again whenever the distribution
+of data in their partitions changes significantly.

 



Re: Synchronizing slots from primary to standby

2023-10-06 Thread Amit Kapila
On Wed, Oct 4, 2023 at 5:34 PM Drouvot, Bertrand
 wrote:
>
> On 10/4/23 1:50 PM, shveta malik wrote:
> > On Wed, Oct 4, 2023 at 5:00 PM Amit Kapila  wrote:
> >>
> >> On Wed, Oct 4, 2023 at 11:55 AM Drouvot, Bertrand
> >>  wrote:
> >>>
> >>> On 10/4/23 6:26 AM, shveta malik wrote:
>  On Wed, Oct 4, 2023 at 5:36 AM Amit Kapila  
>  wrote:
> >
> >
> > How about an alternate scheme where we define sync_slot_names on
> > standby but then store the physical_slot_name in the corresponding
> > logical slot (ReplicationSlotPersistentData) to be synced? So, the
> > standby will send the list of 'sync_slot_names' and the primary will
> > add the physical standby's slot_name in each of the corresponding
> > sync_slot. Now, if we do this then even after restart, we should be
> > able to know for which physical slot each logical slot needs to wait.
> > We can even provide an SQL API to reset the value of
> > standby_slot_names in logical slots as a way to unblock decoding in
> > case of emergency (for example, corresponding when physical standby
> > never comes up).
> >
> 
> 
>  Looks like a better approach to me. It solves most of the pain points 
>  like:
>  1) Avoids the need of multiple GUCs
>  2) Primary and standby need not to worry to be in sync if we maintain
>  sync-slot-names GUC on both
> >>
> >> As per my understanding of this approach, we don't want
> >> 'sync-slot-names' to be set on the primary. Do you have a different
> >> understanding?
> >>
> >
> > Same understanding. We do not need it to be set on primary by user. It
> > will be GUC on standby and standby will convey it to primary.
>
> +1, same understanding here.
>

At PGConf NYC, I had a brief discussion on this topic with Andres
where yet another approach to achieve this came up. Have a parameter
like enable_failover at the slot level (this will be persistent
information). Users can set it during the create/alter subscription or
via pg_create_logical_replication_slot(). Also, on physical standby,
there will be a parameter like enable_syncslot. All the physical
standbys that have set enable_syncslot will receive all the logical
slots that are marked as enable_failover. To me, whether to sync a
particular slot is a slot-level property, so defining it in this new
way seems reasonable.

I think this will simplify the scheme a bit but still, the list of
physical standby's for which logical slots wait during decoding needs
to be maintained as we thought. But, how about with the above two
parameters (enable_failover and enable_syncslot), we have
standby_slot_names defined on the primary. That avoids the need to
store the list of standby_slot_names in logical slots and simplifies
the implementation quite a bit, right? Now, one could ask: if we have a
parameter like 'standby_slot_names', why do we need enable_syncslot on
the physical standby at all? But that will be required to invoke the
sync worker which pulls the logical slots' information. The
advantage of having standby_slot_names defined on primary is that we
can selectively wait on the subset of physical standbys where we are
syncing the slots. I think this will be something similar to
'synchronous_standby_names' in the sense that the physical standbys
mentioned in standby_slot_names will behave as synchronous copies with
respect to slots and after failover user can switch to one of these
physical standby and others can start following new master/publisher.
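
To make that concrete, a rough sketch of the knobs discussed here (all of
the names are provisional; none of this is committed syntax):

-- on the primary/publisher:
--   standby_slot_names = 'standby1_slot'    (GUC)
--   plus a failover-enabled logical slot, e.g. something like
--   CREATE SUBSCRIPTION ... WITH (failover = true);
-- on the physical standby:
--   enable_syncslot = on                    (GUC)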

Thoughts?

-- 
With Regards,
Amit Kapila.




Re: document the need to analyze partitioned tables

2023-10-06 Thread Laurenz Albe
On Fri, 2023-10-06 at 12:20 -0400, Bruce Momjian wrote:
> Good points, updated patch attached.

That patch is good to go, as far as I am concerned.

Yours,
Laurenz Albe




Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Jeff Davis
On Fri, 2023-10-06 at 09:58 +0200, Peter Eisentraut wrote:
> If you want to be rigid about it, you also need to consider whether
> the 
> Unicode version used by the ICU library in use matches the one used
> by 
> the in-core tables.

What problem are you concerned about here? I thought about it and I
didn't see an obvious issue.

If the ICU unicode version is ahead of the Postgres unicode version,
and no unassigned code points are used according to the Postgres
version, then there's no problem.

And in the other direction, there might be some code points that are
assigned according to the postgres unicode version but unassigned
according to the ICU version. But that would be tracked by the
collation version as you pointed out earlier, so upgrading ICU would be
like any other ICU upgrade (with the same risks). Right?

Regards,
Jeff Davis





Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Robert Haas
On Thu, Oct 5, 2023 at 3:15 PM Nico Williams  wrote:
> Text+encoding can be just like bytea with a one- or two-byte prefix
> indicating what codeset+encoding it's in.  That'd be how to encode
> such text values on the wire, though on disk the column's type should
> indicate the codeset+encoding, so no need to add a prefix to the value.

Well, that would be making the encoding a per-value property, rather
than a per-column property like collation as I proposed. I can't see
that working out very nicely, because encodings are
collation-specific. It wouldn't make any sense if the column collation
were en_US.UTF8 or ko_KR.eucKR or en_CA.ISO8859-1 (just to pick a few
values that are legal on my machine) while data stored in the column
was from a whole bunch of different encodings, at most one of which
could be the one to which the column's collation applied. That would
end up meaning, for example, that such a column was very hard to sort.

For that and other reasons, I suspect that the utility of storing data
from a variety of different encodings in the same database column is
quite limited. What I think people really want is a whole column in
some encoding that isn't the normal one for that database. That's not
to say we should add such a feature, but if we do, I think it should
be that, not a different encoding for every individual value.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Nico Williams
On Fri, Oct 06, 2023 at 01:33:06PM -0400, Robert Haas wrote:
> On Thu, Oct 5, 2023 at 3:15 PM Nico Williams  wrote:
> > Text+encoding can be just like bytea with a one- or two-byte prefix
> > indicating what codeset+encoding it's in.  That'd be how to encode
> > such text values on the wire, though on disk the column's type should
> > indicate the codeset+encoding, so no need to add a prefix to the value.
> 
> Well, that would be making the encoding a per-value property, rather
> than a per-column property like collation as I proposed. I can't see

On-disk it would be just a property of the type, not part of the value.

Nico
-- 




Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Jeff Davis
On Thu, 2023-10-05 at 14:52 -0500, Nico Williams wrote:
> This is just how you encode the type of the string.  You have any
> number
> of options.  The point is that already PG can encode binary data, so
> if
> how to encode text of disparate encodings on the wire, building on
> top
> of the encoding of bytea is an option.

There's another significant discussion going on here:

https://www.postgresql.org/message-id/ca+tgmoz8r8xb_73wzkhgb00cv3tphv_u0rhuzzmfkvlepdu...@mail.gmail.com

about how to handle binary formats better, so it's not clear to me that
it's a great precedent to expand upon. At least not yet.

I think it would be interesting to think more generally about these
representational issues in a way that accounts for binary formats,
extra_float_digits, client_encoding, etc. But I see that as more of an
issue with how the client expects to receive the data -- nobody has
presented a reason in this thread that we need per-column encodings on
the server.

Regards,
Jeff Davis





Re: New WAL record to detect the checkpoint redo location

2023-10-06 Thread Robert Haas
On Thu, Oct 5, 2023 at 2:34 PM Andres Freund  wrote:
> If I add an unlikely around if (rechdr->xl_rmid == RM_XLOG_ID), the
> performance does improve. But that "only" brings it up to 322.406. Not sure
> what the rest is.

I don't really think this is worth worrying about. A sub-one-percent
regression on a highly artificial test case doesn't seem like a big
deal. Anybody less determined than you would have been unable to
measure that there even is a regression in the first place, and that's
basically everyone.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: FDW LIMIT pushdown

2023-10-06 Thread Michał Kłeczek
Sorry, I wasn’t precise - my question is about foreign partitions - LIMIT
pushdown for ordinary foreign tables is supported.

Thanks,
Michal

> On 6 Oct 2023, at 18:02, Michał Kłeczek  wrote:
> 
> Hello hackers,
> 
> First timer here with a question:
> 
> I’ve searched through various e-mail threads and documents and could not find 
> if pushdown of LIMIT clauses to FDW is implemented.
> I saw some conversation about that a couple of years ago, when it was treated
> as a feature request, but nothing since then.
> Citus does that according to its documentation but for now we are trying to 
> base our solution on “plain” Postgres.
> 
> If it is not implemented - could you provide me with some pointers to the
> source code where I could look and try to implement it?
> 
> 
> Thanks,
> Michal





Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Robert Haas
On Fri, Oct 6, 2023 at 1:38 PM Nico Williams  wrote:
> On Fri, Oct 06, 2023 at 01:33:06PM -0400, Robert Haas wrote:
> > On Thu, Oct 5, 2023 at 3:15 PM Nico Williams  wrote:
> > > Text+encoding can be just like bytea with a one- or two-byte prefix
> > > indicating what codeset+encoding it's in.  That'd be how to encode
> > > such text values on the wire, though on disk the column's type should
> > > indicate the codeset+encoding, so no need to add a prefix to the value.
> >
> > Well, that would be making the encoding a per-value property, rather
> > than a per-column property like collation as I proposed. I can't see
>
> On-disk it would be just a property of the type, not part of the value.

I mean, that's not how it works.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Nico Williams
On Fri, Oct 06, 2023 at 02:17:32PM -0400, Robert Haas wrote:
> On Fri, Oct 6, 2023 at 1:38 PM Nico Williams  wrote:
> > On Fri, Oct 06, 2023 at 01:33:06PM -0400, Robert Haas wrote:
> > > On Thu, Oct 5, 2023 at 3:15 PM Nico Williams  
> > > wrote:
> > > > Text+encoding can be just like bytea with a one- or two-byte prefix
> > > > indicating what codeset+encoding it's in.  That'd be how to encode
> > > > such text values on the wire, though on disk the column's type should
> > > > indicate the codeset+encoding, so no need to add a prefix to the value.
> > >
> > > Well, that would be making the encoding a per-value property, rather
> > > than a per-column property like collation as I proposed. I can't see
> >
> > On-disk it would be just a property of the type, not part of the value.
> 
> I mean, that's not how it works.

Sure, because TEXT in PG doesn't have codeset+encoding as part of it --
it's whatever the database's encoding is.  Collation can and should be a
property of a column, since for Unicode it wouldn't be reasonable to
make that part of the type.  But codeset+encoding should really be a
property of the type if PG were to support more than one.  IMO.




Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Robert Haas
On Fri, Oct 6, 2023 at 2:25 PM Nico Williams  wrote:
> > > > Well, that would be making the encoding a per-value property, rather
> > > > than a per-column property like collation as I proposed. I can't see
> > >
> > > On-disk it would be just a property of the type, not part of the value.
> >
> > I mean, that's not how it works.
>
> Sure, because TEXT in PG doesn't have codeset+encoding as part of it --
> it's whatever the database's encoding is.  Collation can and should be a
> property of a column, since for Unicode it wouldn't be reasonable to
> make that part of the type.  But codeset+encoding should really be a
> property of the type if PG were to support more than one.  IMO.

No, what I mean is, you can't just be like "oh, the varlena will be
different in memory than on disk" as if that were no big deal.

I agree that, as an alternative to encoding being a column property,
it could instead be completely a type property, meaning that if you
want to store, say, LATIN1 text in your UTF-8 database, you first
create a latin1text data type and then use it, rather than, as in the
model I proposed, creating a text column and then applying a setting
like ENCODING latin1 to it. I think that there might be some problems
with that model, but it could also have some benefits. If someone were
going to make a run at implementing this, they might want to consider
both designs and evaluate the tradeoffs.
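
A compact way to see the two designs side by side (all of this syntax is
hypothetical; nothing like it exists today):

-- (1) encoding as a column property:
--     create table msgs (body text encoding latin1);
-- (2) encoding baked into a dedicated type:
--     create type latin1text ...;   -- implementation elided
--     create table msgs (body latin1text);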

But, even if we were all convinced that this kind of feature was good
to add, I think it would almost certainly be wrong to invent new
varlena features along the way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Index range search optimization

2023-10-06 Thread Konstantin Knizhnik



On 04/10/2023 3:00 am, Alexander Korotkov wrote:

On Wed, Oct 4, 2023 at 12:59 AM Pavel Borisov  wrote:

I've looked through the patch v8. I think it's good enough to be
pushed if Peter has no objections.

Thank you, Pavel.
I'll push this if there are no objections.

--
Regards,
Alexander Korotkov



Sorry, can you please also mention that the original idea of this
optimization belongs to Ilya Anfimov (it was discussed in the @pgsql
Telegram chat).






Re: Remove distprep

2023-10-06 Thread Andres Freund
Hi,

On 2023-10-05 17:46:46 +0200, Peter Eisentraut wrote:
> The attached updated patch is also split up like Andres suggested nearby.

Thanks.


> (Not sure if it would be good to commit it that way, but it's easier to look
> at for now for sure.)

I'd push together, but I think the split is useful when looking later as well.

I played around with these for a bit without finding an issue.


The only thing I wonder is whether we ought to keep a maintainer-clean target
(as an alias to distclean), so that extensions that added things to
maintainer-clean continue to work.

Greetings,

Andres Freund




Re: Index range search optimization

2023-10-06 Thread Pavel Borisov
Hi, Konstantin!

On Fri, 6 Oct 2023 at 22:44, Konstantin Knizhnik  wrote:
>
>
> On 04/10/2023 3:00 am, Alexander Korotkov wrote:
> > On Wed, Oct 4, 2023 at 12:59 AM Pavel Borisov  
> > wrote:
> >> I've looked through the patch v8. I think it's good enough to be
> >> pushed if Peter has no objections.
> > Thank you, Pavel.
> > I'll push this if there are no objections.
> >
> > --
> > Regards,
> > Alexander Korotkov
>
>
> Sorry, can you please also mention that the original idea of this
> optimization belongs to Ilya Anfimov (it was discussed in the @pgsql
> Telegram chat).

While it's no doubt correct to mention all authors of the patch, I
looked through the thread and saw no mentions of Ilya's
contributions/ideas before the patch was pushed. I'm not up on the
current policy for handling such requests, but I suppose it's
complicated to retroactively amend a commit in the main branch, which is
already ahead of patch e0b1ee17dc3a38.

Regards,
Pavel




Re: Annoying build warnings from latest Apple toolchain

2023-10-06 Thread Andres Freund
Hi,

On 2023-10-05 13:37:38 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I think you can just pass c_args directly to executable() here, I think 
> > adding
> > c_args to default_bin_args would be a bad idea.
> 
> Hm.  IIUC that would result in an error if someone did try to
> put some c_args into default_bin_args, and I didn't think it would
> be appropriate for this code to fail in such a case.  Still, I see
> there are a bunch of other ways to inject globally-used compilation
> flags, so maybe you're right that it'd never need to happen.

I think the other ways of injecting c_args compose better (and indeed other
options are either injected globally, or via declare_dependency()).

Greetings,

Andres Freund




Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Jeff Davis
On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:
> What I think people really want is a whole column in
> some encoding that isn't the normal one for that database.

Do people really want that? I'd be curious to know why.

A lot of modern projects are simply declaring UTF-8 to be the "one true
way". I am not suggesting that we do that, but it seems odd to go in
the opposite direction and have greater flexibility for many encodings.

Regards,
Jeff Davis





Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Isaac Morland
On Fri, 6 Oct 2023 at 15:07, Jeff Davis  wrote:

> On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:
> > What I think people really want is a whole column in
> > some encoding that isn't the normal one for that database.
>
> Do people really want that? I'd be curious to know why.
>
> A lot of modern projects are simply declaring UTF-8 to be the "one true
> way". I am not suggesting that we do that, but it seems odd to go in
> the opposite direction and have greater flexibility for many encodings.
>

And even if they want it, we can give it to them when we send/accept the
data from the client; just because they want to store ISO-8859-1 doesn't
mean the actual bytes on the disk need to be that. And by "client" maybe I
mean the client end of the network connection, and maybe I mean the program
that is calling in to libpq.
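
Concretely, the existing per-connection conversion already covers much of
this; for example:

SET client_encoding = 'LATIN1';
-- the server keeps storing its own encoding (say UTF-8) on disk and
-- converts on the wire in both directions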

If they try to submit data that cannot possibly be encoded in the stated
encoding because the bytes they submit don't correspond to any string in
that encoding, then that is unambiguously an error, just as trying to put
February 30 in a date column is an error.

Is there a single other data type where anybody is even discussing letting
the client tell us how to write the data on disk?


Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-06 Thread Nathan Bossart
On Thu, Oct 05, 2023 at 01:09:36PM -0700, Jeff Davis wrote:
> On Thu, 2023-10-05 at 14:04 -0500, Nathan Bossart wrote:
>> That way, we needn't restrict this feature to 2 passwords for
>> everyone.  Perhaps 2 should be the default, but in any case, IMO we
>> shouldn't design to only support 2.
> 
> Are there use cases for lots of passwords, or is it just a matter of
> not introducing an artificial limitation?

I guess it's more of the latter.  Perhaps one potential use case would be
short-lived credentials that are created on demand.  Such a password might
only be valid for something like 15 minutes, and many users might have the
ability to request a password for the database role.  I don't know whether
there is a ton of demand for such a use case, and it might already be
solvable by just creating separate roles.  In any case, if there's general
agreement that we only want to target the rotation use case, that's fine by
me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: should frontend tools use syncfs() ?

2023-10-06 Thread Nathan Bossart
On Fri, Oct 06, 2023 at 10:50:11AM +0300, Maxim Orlov wrote:
> Back to the patch v11. I don't quite understand what we should do next:
> make a separate thread or put this one on commitfest?

From a quick skim, this one looks pretty good to me.  Would you mind adding
it to the commitfest so that it doesn't get lost?  I will aim to take a
closer look at it next week.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Where can I find the doxyfile?

2023-10-06 Thread Bogdan Mart
On 2008-06-02 18:33:42 Stefan Kaltenbrunner wrote:
>
> Zdenek Kotala wrote:
> > Xin Wang napsal(a):
> >> Hi,
> >> I don't know where I can find the doxyfile which generate
> >> "doxygen.postgresql.org" web site. I found that when reading code the
> >> doxygen source code is quite helpful. However, I want to generate an
> >> off-line copy of doxygen docs myself, but I can't find the doxyfile in
> >> the lastest source release.
> >
> > I think it is good idea. Stefan, what's about put it on the wiki?
>
> don't think the wiki is a good place (we would have to maintain it in
> addition to the main file) so I simply added a link on
> http:/doxygen.postgresql.org that contains the current copy of the
> configuration file that was used to generate a particular set of docs.
> However some of the things there are specific to our install so a bit
> (like some of the paths) of manual change will be required anyway.

Hello, sorry for resurrecting this old discussion and necroposting,
but it is still relevant.

It turns out that way back in 2008 it was possible to download Doxyfile
from https://doxygen.postgresql.org/ , which can be seen at:
https://web.archive.org/web/20080915132924/https://doxygen.postgresql.org/

I was able to download this Doxyfile from the web archive, and it actually
worked for me after I changed the path to the postgres sources and removed
the header template. It even produced the same result as on the current
web site.

It would be nice if this doxygen file were placed somewhere
on the current site.

P.S. I tried to run doxygen on the postgres sources without a config,
and it didn't find any source files; this archived thread helped.

Perhaps doxygen file can be just placed into postgres git repository?

Best regards,
Bohdan Mart.




Re: Good News Everyone! + feature proposal

2023-10-06 Thread Bruce Momjian
On Thu, Oct  5, 2023 at 05:10:37AM -0700, Gurjeet Singh wrote:
> I wish there was a directory of IRC identities that pointed to real
> identities (at least for folks who don't mind this mapping available
> in the open), so that when we converse in IRC, we have a face to go
> with the IRC handles. As a human I feel that necessity.

There is:

https://wiki.postgresql.org/wiki/IRC2RWNames

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Restoring default privileges on objects

2023-10-06 Thread Laurenz Albe
On Wed, 2023-08-30 at 12:00 +0200, Peter J. Holzer wrote:
> On 2023-08-29 14:44:48 -0600, Stuart McGraw wrote:
> > On 8/29/23 13:27, Tom Lane wrote:
> > > Fixing \dp to honor "\pset null" for this might be a reasonable
> > > thing to do too.  I'm actually a bit surprised that that doesn't
> > > work already.
> > 
> > That change would still require someone using \dp to realize that
> > the "Access privileges" value could be either '' or NULL (I guess
> > that could be pointed out more obviously in the psql doc), and then
> > do a '\pset null' before doing \dp?  That seems a little inconvenient.
> 
> Or just always do a \pset null. For me printing NULL the same as an
> empty string is just as confusing in normal tables, so that's the first
> line in my ~/.psqlrc. YMMV, of course.
> 
> But I guess the point is that people who do \pset null expect to be able
> to distinguish '' and NULL visually and might be surprised if that
> doesn't work everywhere, while people who don't \pset null know that ''
> and NULL are visually indistinguishable and that they may need some
> other way to distinguish them if the difference matters.
> 
> So +1 for me fixing \dp to honor "\pset null".

+1

Here is a patch that does away with the special handling of NULL values
in psql backslash commands.
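
To see the confusion it fixes, a minimal reproduction:

CREATE TABLE t1 ();                          -- default privileges: ACL is NULL
CREATE TABLE t2 ();
REVOKE ALL ON t2 FROM CURRENT_USER, PUBLIC;  -- zero privileges: ACL is '{}'
\pset null '(null)'
\dp t1
\dp t2
-- without the patch, both "Access privileges" cells print empty; with it,
-- the default-privileges (NULL) case shows '(null)'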

Yours,
Laurenz Albe




Re: Restoring default privileges on objects

2023-10-06 Thread Laurenz Albe
On Fri, 2023-10-06 at 22:16 +0200, Laurenz Albe wrote:
> Here is a patch that does away with the special handling of NULL values
> in psql backslash commands.

Erm, I forgot to attach the patch.

Yours,
Laurenz Albe
From 6c67f15f011ddf1e309cb7e84580b266d674a1e2 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 6 Oct 2023 22:12:33 +0200
Subject: [PATCH] psql: honor "\pset null" in backslash commands

In the output of backslash commands, NULL was always displayed
as empty string, rather than honoring the values set with
"\pset null".

This was surprising, and in some cases it was downright confusing:
for example, default privileges (stored as NULL) were displayed
just like an empty aclitem[], making these cases indistinguishable
in the output.

Discussion: https://postgr.es/m/96d6885a-5e25-9ae8-4a1a-d7e557a5fe9c%40mtneva.com
---
 src/bin/psql/describe.c | 43 -
 1 file changed, 43 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bac94a338c..224aa22575 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -124,7 +124,6 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of aggregate functions");
 	myopt.translate_header = true;
 
@@ -197,7 +196,6 @@ describeAccessMethods(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of access methods");
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -262,7 +260,6 @@ describeTablespaces(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of tablespaces");
 	myopt.translate_header = true;
 
@@ -585,7 +582,6 @@ describeFunctions(const char *functypes, const char *func_pattern,
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of functions");
 	myopt.translate_header = true;
 	if (pset.sversion >= 90600)
@@ -702,7 +698,6 @@ describeTypes(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of data types");
 	myopt.translate_header = true;
 
@@ -893,7 +888,6 @@ describeOperators(const char *oper_pattern,
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of operators");
 	myopt.translate_header = true;
 
@@ -995,7 +989,6 @@ listAllDbs(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of databases");
 	myopt.translate_header = true;
 
@@ -1146,7 +1139,6 @@ permissionsList(const char *pattern, bool showSystem)
 	if (!res)
 		goto error_return;
 
-	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Access privileges"));
 	myopt.title = buf.data;
 	myopt.translate_header = true;
@@ -1218,7 +1210,6 @@ listDefaultACLs(const char *pattern)
 	if (!res)
 		goto error_return;
 
-	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Default access privileges"));
 	myopt.title = buf.data;
 	myopt.translate_header = true;
@@ -1417,7 +1408,6 @@ objectDescription(const char *pattern, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("Object descriptions");
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -3852,7 +3842,6 @@ listDbRoleSettings(const char *pattern, const char *pattern2)
 	}
 	else
 	{
-		myopt.nullPrint = NULL;
 		myopt.title = _("List of settings");
 		myopt.translate_header = true;
 
@@ -3926,7 +3915,6 @@ describeRoleGrants(const char *pattern, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of role grants");
 	myopt.translate_header = true;
 
@@ -4122,7 +4110,6 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	}
 	else
 	{
-		myopt.nullPrint = NULL;
 		myopt.title = _("List of relations");
 		myopt.translate_header = true;
 		myopt.translate_columns = translate_columns;
@@ -4332,7 +4319,6 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 	initPQExpBuffer(&title);
 	appendPQExpBufferStr(&title, tabletitle);
 
-	myopt.nullPrint = NULL;
 	myopt.title = title.data;
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -4412,7 +4398,6 @@ listLanguages(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of languages");
 	myopt.translate_header = true;
 
@@ -4497,7 +4482,6 @@ listDomains(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of domains");
 	myopt.translate_header = true;
 
@@ -4576,7 +4560,6 @@ listConversions(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.ti

Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-06 Thread Jeff Davis
On Fri, 2023-10-06 at 14:26 -0500, Nathan Bossart wrote:
> I guess it's more of the latter.  Perhaps one potential use case
> would be
> short-lived credentials that are created on demand.  Such a password
> might
> only be valid for something like 15 minutes, and many users might
> have the
> ability to request a password for the database role.  I don't know
> whether
> there is a ton of demand for such a use case, and it might already be
> solvable by just creating separate roles.  In any case, if there's
> general
> agreement that we only want to target the rotation use case, that's
> fine by
> me.

The basic problem, as I see it, is: how do we keep users from
accidentally dropping the wrong password? Generated unique names or
numbers don't solve that problem. Auto-incrementing or a created-at
timestamp solves it in the sense that you can at least look at a system
view and see if there's a newer one, but it's a little awkward. A
validity period is a good fit if all passwords have a validity period
and we don't change it, but it gets awkward otherwise.

I'm also worried about two kinds of clutter:

* old passwords not being garbage-collected
* the identifier of the current password always changing (perhaps fine
if it's a "created at" ID?)

Regards,
Jeff Davis





Re: Restoring default privileges on objects

2023-10-06 Thread Laurenz Albe
On Fri, 2023-10-06 at 22:18 +0200, Laurenz Albe wrote:
> On Fri, 2023-10-06 at 22:16 +0200, Laurenz Albe wrote:
> > Here is a patch that does away with the special handling of NULL values
> > in psql backslash commands.
> 
> Erm, I forgot to attach the patch.

I just realized that there is a conflicting proposal.  I'll reply to that thread.
Sorry for the noise.

Yours,
Laurenz Albe




Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-06 Thread Jeff Davis
On Thu, 2023-10-05 at 14:28 -0700, Gurjeet Singh wrote:

> This way there's a notion of 'new' and 'old' passwords.

IIUC, you are proposing that there are exactly two slots, NEW and OLD.
When adding a password, OLD must be unset and it moves NEW to OLD, and
adds the new password in NEW. DROP only works on OLD. Is that right?

It's close to the idea of deprecation, except that adding a new
password implicitly deprecates the existing one. I'm not sure about
that -- it could be confusing.

We could also try using a verb like "expire" that could be coupled with
a date, and that way all old passwords would always have some validity
period. That might make it a bit easier to manage if we do need more
than two passwords.
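
To make that concrete, a rotation flow under such a model might look
something like this (purely hypothetical syntax -- neither ADD PASSWORD
nor EXPIRE PASSWORD exists today; they are only meant to illustrate the
idea):

    -- hypothetical: add a second password, then schedule the old one to expire
    ALTER ROLE app_user ADD PASSWORD 'new-secret' VALID UNTIL '2024-01-01';
    ALTER ROLE app_user EXPIRE PASSWORD 'old-secret' AT '2023-11-01';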

Regards,
Jeff Davis





Re: Fix output of zero privileges in psql

2023-10-06 Thread Laurenz Albe
On Sun, 2023-09-17 at 21:31 +0200, Erik Wienhold wrote:
> I wrote a patch to change psql's display of zero privileges after a user's
> reported confusion with the psql output for zero vs. default privileges [1].
> Admittedly, zero privileges is a rare use case [2] but I think psql should not
> confuse the user in the off chance that this happens.
> 
> With this change psql now prints "(none)" for zero privileges instead of
> nothing.  This affects the following meta commands:
> 
>     \db+ \dD+ \df+ \dL+ \dl+ \dn+ \dp \dT+ \l
> 
> Default privileges start as NULL::aclitem[] in various catalog columns but
> revoking the default privileges leaves an empty aclitem array.  Using
> \pset null '(null)' as a workaround to spot default privileges does not work
> because the meta commands ignore this setting.
> 
> The privileges shown by \dconfig+ and \ddp as well as the column privileges
> shown by \dp are not affected by this change because those privileges are 
> reset
> to NULL instead of leaving empty arrays.
> 
> Commands \des+ and \dew+ are not covered in src/test/regress because no 
> foreign
> data wrapper is available at this point to create a foreign server.
> 
> [1] 
> https://www.postgresql.org/message-id/efdd465d-a795-6188-7f71-7cdb4b2be031%40mtneva.com
> [2] https://www.postgresql.org/message-id/31246.1693337238%40sss.pgh.pa.us

Reading that thread, I had the impression that there was more support for
honoring "\pset null" rather than unconditionally displaying "(none)".

The simple attached patch does it like that.  What do you think?
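
To make the underlying distinction concrete, here is a minimal
illustration (assuming a freshly created table owned by the current
user):

    CREATE TABLE t (a int);
    SELECT relacl FROM pg_class WHERE relname = 't';  -- NULL: default privileges
    REVOKE ALL ON t FROM CURRENT_USER;
    SELECT relacl FROM pg_class WHERE relname = 't';  -- {}: zero privileges

With the patch, the NULL case is rendered according to "\pset null" in
the backslash commands, while the empty array still shows as an empty
cell.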

Yours,
Laurenz Albe
From 6c67f15f011ddf1e309cb7e84580b266d674a1e2 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 6 Oct 2023 22:12:33 +0200
Subject: [PATCH] psql: honor "\pset null" in backslash commands

In the output of backslash commands, NULL was always displayed
as an empty string, rather than honoring the value set with
"\pset null".

This was surprising, and in some cases it was downright confusing:
for example, default privileges (stored as NULL) were displayed
just like an empty aclitem[], making these cases indistinguishable
in the output.

Discussion: https://postgr.es/m/96d6885a-5e25-9ae8-4a1a-d7e557a5fe9c%40mtneva.com
---
 src/bin/psql/describe.c | 43 -
 1 file changed, 43 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bac94a338c..224aa22575 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -124,7 +124,6 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of aggregate functions");
 	myopt.translate_header = true;
 
@@ -197,7 +196,6 @@ describeAccessMethods(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of access methods");
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -262,7 +260,6 @@ describeTablespaces(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of tablespaces");
 	myopt.translate_header = true;
 
@@ -585,7 +582,6 @@ describeFunctions(const char *functypes, const char *func_pattern,
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of functions");
 	myopt.translate_header = true;
 	if (pset.sversion >= 90600)
@@ -702,7 +698,6 @@ describeTypes(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of data types");
 	myopt.translate_header = true;
 
@@ -893,7 +888,6 @@ describeOperators(const char *oper_pattern,
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of operators");
 	myopt.translate_header = true;
 
@@ -995,7 +989,6 @@ listAllDbs(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of databases");
 	myopt.translate_header = true;
 
@@ -1146,7 +1139,6 @@ permissionsList(const char *pattern, bool showSystem)
 	if (!res)
 		goto error_return;
 
-	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Access privileges"));
 	myopt.title = buf.data;
 	myopt.translate_header = true;
@@ -1218,7 +1210,6 @@ listDefaultACLs(const char *pattern)
 	if (!res)
 		goto error_return;
 
-	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Default access privileges"));
 	myopt.title = buf.data;
 	myopt.translate_header = true;
@@ -1417,7 +1408,6 @@ objectDescription(const char *pattern, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("Object descriptions");
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -3852,7 +3842,6 @@ listDbRoleSettings(const char *pattern, const char *pattern2)
 	}
 	else
 	{
-		myopt.nullPrint = NULL;
 		myopt.title = _("List of settings");
 		myopt.translate_header = true;
 
@@ -3926,

Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-06 Thread Bruce Momjian
On Fri, Oct  6, 2023 at 01:20:03PM -0700, Jeff Davis wrote:
> The basic problem, as I see it, is: how do we keep users from
> accidentally dropping the wrong password? Generated unique names or

I thought we could auto-remove the old password if the valid-until date is
in the past.  You would need a separate ALTER command to set its date
in the past without that.  Also, defining a new password could require
setting the expiration date of the old password to make future additions
easier.

For pg_authid, I was thinking of columns:

ADD rolpassword_old
ADD rolvaliduntil_old
EXISTS  rolpassword
EXISTS  rolvaliduntil
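
With those columns, the auto-removal check could boil down to something
like this sketch (rolpassword_old and rolvaliduntil_old are only
proposed above; they do not exist yet):

    SELECT rolname
    FROM pg_authid
    WHERE rolpassword_old IS NOT NULL
      AND rolvaliduntil_old < now();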

I did blog about the password rotation problem and suggested
certificates:

https://momjian.us/main/blogs/pgblog/2020.html#July_17_2020

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Problem, partition pruning for prepared statement with IS NULL clause.

2023-10-06 Thread Bruce Momjian
On Fri, Oct  6, 2023 at 06:09:45PM +0400, Sergei Glukhov wrote:
> Test case:
> --
> set plan_cache_mode to force_generic_plan;
> prepare stmt AS select * from hp where a is null and b = $1;
> explain execute stmt('xxx');
> --

I can confirm the crash in git master.
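
The original table definition is not quoted above, but a plausible shape
matching the test case (assuming "hp" is hash-partitioned on a
two-column key) would be:

    create table hp (a text, b text) partition by hash (a, b);
    create table hp0 partition of hp for values with (modulus 2, remainder 0);
    create table hp1 partition of hp for values with (modulus 2, remainder 1);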

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Problem, partition pruning for prepared statement with IS NULL clause.

2023-10-06 Thread Bruce Momjian
On Fri, Oct  6, 2023 at 05:00:54PM -0400, Bruce Momjian wrote:
> On Fri, Oct  6, 2023 at 06:09:45PM +0400, Sergei Glukhov wrote:
> > Test case:
> > --
> > set plan_cache_mode to force_generic_plan;
> > prepare stmt AS select * from hp where a is null and b = $1;
> > explain execute stmt('xxx');
> > --
> 
> I can confirm the crash in git master.

There were some UTF8 non-space whitespace characters in the email, so
attached is a clean reproducible SQL crash file.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.


crash.sql
Description: application/sql


Re: Where can I find the doxyfile?

2023-10-06 Thread Bruce Momjian
On Fri, Oct  6, 2023 at 10:50:25PM +0300, Bogdan Mart wrote:
> On 2008-06-02 18:33:42 Stefan Kaltenbrunner wrote:
> >
> > Zdenek Kotala wrote:
> > > Xin Wang napsal(a):
> > >> Hi,
> > >> I don't know where I can find the doxyfile which generates the
> > >> "doxygen.postgresql.org" web site. I found that when reading code the
> > >> doxygen source code is quite helpful. However, I want to generate an
> > >> off-line copy of doxygen docs myself, but I can't find the doxyfile in
> > >> the latest source release.
> > >
> > > I think it is a good idea. Stefan, what about putting it on the wiki?
> >
> > don't think the wiki is a good place (we would have to maintain it in
> > addition to the main file) so I simply added a link on
> > http://doxygen.postgresql.org that contains the current copy of the
> > configuration file that was used to generate a particular set of docs.
> > However some of the things there are specific to our install so a bit
> > (like some of the paths) of manual change will be required anyway.
> 
> Hello, sorry for resurrecting this old discussion, and necroposting,
> but this was relevant.
> 
> It turns out that way back in 2008 it was possible to download Doxyfile
> from https://doxygen.postgresql.org/ , which can be seen at:
> https://web.archive.org/web/20080915132924/https://doxygen.postgresql.org/
> 
> I was able to download this Doxyfile from the web archive, and it actually
> worked for me, after I changed the path to postgres sources and removed
> the header template. It even produced the same result as on the current website.
> 
> It would be nice if this doxygen file would be placed somewhere
> on the current site.
> 
> P.S. I've tried to run doxygen on postgres sources without config,
> and it didn't find any source file, this archived thread helped.
> 
> Perhaps doxygen file can be just placed into postgres git repository?

We do link to a Doxygen build from our website:

https://www.postgresql.org/developer/coding/

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Where can I find the doxyfile?

2023-10-06 Thread mart . bogdan




On 07.10.23 00:06, Bruce Momjian wrote:

On Fri, Oct  6, 2023 at 10:50:25PM +0300, Bogdan Mart wrote:
> On 2008-06-02 18:33:42 Stefan Kaltenbrunner wrote:
>>
>> Zdenek Kotala wrote:
>>> Xin Wang napsal(a):
>>>> Hi,
>>>> I don't know where I can find the doxyfile which generates the
>>>> "doxygen.postgresql.org" web site. I found that when reading code the
>>>> doxygen source code is quite helpful. However, I want to generate an
>>>> off-line copy of doxygen docs myself, but I can't find the doxyfile in
>>>> the latest source release.
>>>
>>> I think it is a good idea. Stefan, what about putting it on the wiki?
>>
>> don't think the wiki is a good place (we would have to maintain it in
>> addition to the main file) so I simply added a link on
>> http://doxygen.postgresql.org that contains the current copy of the
>> configuration file that was used to generate a particular set of docs.
>> However some of the things there are specific to our install so a bit
>> (like some of the paths) of manual change will be required anyway.
>
> Hello, sorry for resurrecting this old discussion, and necroposting,
> but this was relevant.
>
> It turns out that way back in 2008 it was possible to download Doxyfile
> from https://doxygen.postgresql.org/ , which can be seen at:
> https://web.archive.org/web/20080915132924/https://doxygen.postgresql.org/
>
> I was able to download this Doxyfile from the web archive, and it actually
> worked for me, after I changed the path to postgres sources and removed
> the header template. It even produced the same result as on the current website.
>
> It would be nice if this doxygen file would be placed somewhere
> on the current site.
>
> P.S. I've tried to run doxygen on postgres sources without config,
> and it didn't find any source file, this archived thread helped.
>
> Perhaps doxygen file can be just placed into postgres git repository?

We do link to a Doxygen build from our website:

https://www.postgresql.org/developer/coding/


Yes, there is a link to the deployed documentation generated by Doxygen
(unless I'm missing something), but I am talking about the config file
for Doxygen to generate the same documentation locally. There are no
entries on the wiki regarding Doxygen, and none in the build
instructions.

I was able to locate the needed config file in the Web Archive:
https://web.archive.org/web/20081210081808if_/http://doxygen.postgresql.org:80/pgsql.conf

But it would be nice if it were readily available for anybody else who
would like to build the docs locally.





RE: Where can I find the doxyfile?

2023-10-06 Thread postgres
Sometime earlier, I created a filter to annotate regular C comments as doxy
comments.  I'll attach it along with a sample doxyfile for running it.  Just
in case it looks useful.

I've never been a big fan of Doxygen, but it seems to have gotten better
over time. Now that some IDEs display doxy comments on hover, I'm beginning
to appreciate it.




DoxygenFilter.l
Description: Binary data


Doxyfile
Description: Binary data


Re: pg16: invalid page/page verification failed

2023-10-06 Thread Justin Pryzby
On Fri, Oct 06, 2023 at 08:47:39AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2023-10-06 09:20:05 +0900, Michael Paquier wrote:
> > On Thu, Oct 05, 2023 at 11:45:18AM -0500, Justin Pryzby wrote:
> > > This table is what it sounds like: a partition into which CSV logs are
> > > COPY'ed.  It would've been created around 8am.  There's no special
> > > params set for the table nor for autovacuum.
> > 
> > This may be an important bit of information.  31966b151e6a is new as
> > of Postgres 16, has changed the way relations are extended and COPY
> > was one area touched.  I am adding Andres in CC.
> 
> Hm, is there any chance the COPY targets more than one partition? If so, this
> sounds like it might be the issue described here
> https://postgr.es/m/20230925213746.fwqauhhifjgefyzk%40alap3.anarazel.de

The first error was from:
log_time | 2023-10-05 09:57:01.939-05
left | COPY postgres_log FROM 
'/var/log/postgresql/postgresql-2023-10-05_095200.csv' WITH csv

Unfortunately, I no longer have the CSV files which caused errors.
After I moved the broken table out of the way and created a new
partition, they would've been imported successfully, and then removed.

Also, it's sad, but the original 2023_10_05_0900 partition I created was
itself rotated out of existence a few hours ago (I still have the most
interesting lines, though).

I've seen that it's possible for a CSV to include some data that ideally
would've gone into the "next" CSV:  2023-01-01_18.csv might include a line
of data after 6pm.  For example, with log_rotation_age=2min,
postgresql-2023-10-06_120800.csv had a row after 12:10:
2023-10-06 12:10:00.101 
CDT,"pryzbyj","pryzbyj",5581,"[local]",65203f66.15cd,2,...

But I'm not sure how that can explain this issue, because this was
095600.csv, and not 095800.csv.  My script knows to create the "next"
partition, to handle the case that the file includes some data that
should've gone to the next logfile.  I'm handling that case with the
anticipation that there might be a few tenths of a second or even a few
seconds of logs in the wrong file - typically 0 lines and sometimes 1
line.  I don't know if it's even possible to have multiple lines in the
"wrong" file.  In any case, I'm not not expecting log rotation to be 2
minutes behind.

Also, not only was the data in the CSV earlier than 10am, but the error
*itself* was also earlier.  The error importing the CSV was at 9:57, so
the CSV couldn't have had data after 10:00.  Not that it matters, but my
script doesn't import the most recent logfile, and also avoids importing
files written within the last minute.

I don't see how a CSV with a 2 minute interval of data beginning at 9:56
could straddle hourly partitions.

log_time | 2023-10-05 09:57:01.939-05
left | invalid page in block 119 of relation base/16409/801594131
left | COPY postgres_log FROM 
'/var/log/postgresql/postgresql-2023-10-05_095200.csv' WITH csv

log_time | 2023-10-05 09:57:01.939-05
left | page verification failed, calculated checksum 5074 but expected 50
left | COPY postgres_log FROM 
'/var/log/postgresql/postgresql-2023-10-05_095200.csv' WITH csv

-- 
Justin




RE: Where can I find the doxyfile?

2023-10-06 Thread john.morris
Sometime earlier, I created a filter to annotate regular C comments as doxy 
comments. 
I'll attach it along with a sample doxyfile for running it.  Just in case it 
looks useful.

I've never been a big fan of Doxygen, but it seems to have gotten better over 
time.
Now that some editors are displaying doxygen comments on hover, I'm beginning 
to appreciate it.






DoxygenFilter.l
Description: Binary data


Doxyfile
Description: Binary data


Re: remaining sql/json patches

2023-10-06 Thread Andres Freund
Hi,

On 2023-09-29 13:57:46 +0900, Amit Langote wrote:
> Thanks.  I will push the attached 0001 shortly.

Sorry for not looking at this earlier.

Have you done benchmarking to verify that 0001 does not cause performance
regressions? I'd not be surprised if it did. I'd split the soft-error path into
a separate opcode. For JIT it can largely be implemented using the same code,
eliding the check if it's the non-soft path. Or you can just put it into an
out-of-line function.

I don't like adding more stuff to ExprState. This one seems particularly
awkward, because it might be used by more than one level in an expression
subtree, which means you really need to save/restore old values when
recursing.


> @@ -1579,25 +1582,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
>
>   /* lookup the result type's input function */
>   scratch.d.iocoerce.finfo_in = 
> palloc0(sizeof(FmgrInfo));
> - scratch.d.iocoerce.fcinfo_data_in = 
> palloc0(SizeForFunctionCallInfo(3));
> -
>   getTypeInputInfo(iocoerce->resulttype,
> -  &iofunc, 
> &typioparam);
> +  &iofunc, 
> &scratch.d.iocoerce.typioparam);
>   fmgr_info(iofunc, scratch.d.iocoerce.finfo_in);
>   fmgr_info_set_expr((Node *) node, 
> scratch.d.iocoerce.finfo_in);
> - 
> InitFunctionCallInfoData(*scratch.d.iocoerce.fcinfo_data_in,
> - 
>  scratch.d.iocoerce.finfo_in,
> - 
>  3, InvalidOid, NULL, NULL);
>
> - /*
> -  * We can preload the second and third 
> arguments for the input
> -  * function, since they're constants.
> -  */
> - fcinfo_in = scratch.d.iocoerce.fcinfo_data_in;
> - fcinfo_in->args[1].value = 
> ObjectIdGetDatum(typioparam);
> - fcinfo_in->args[1].isnull = false;
> - fcinfo_in->args[2].value = Int32GetDatum(-1);
> - fcinfo_in->args[2].isnull = false;
> + /* Use the ErrorSaveContext passed by the 
> caller. */
> + scratch.d.iocoerce.escontext = state->escontext;
>
>   ExprEvalPushStep(state, &scratch);
>   break;

I think it's likely that removing the optimization of not needing to set these
arguments ahead of time will result in a performance regression. Not to speak
of initializing the fcinfo from scratch on every evaluation of the expression.

I don't think this should be merged as is.



>  src/backend/parser/gram.y   |  348 +-

This causes a nontrivial increase in the size of the parser (~5% in an
optimized build here); I wonder if we can do better.


> +/*
> + * Push steps to evaluate a JsonExpr and its various subsidiary expressions.
> + */
> +static void
> +ExecInitJsonExpr(JsonExpr *jexpr, ExprState *state,
> +  Datum *resv, bool *resnull,
> +  ExprEvalStep *scratch)
> +{
> + JsonExprState *jsestate = palloc0(sizeof(JsonExprState));
> + JsonExprPreEvalState *pre_eval = &jsestate->pre_eval;
> + ListCell   *argexprlc;
> + ListCell   *argnamelc;
> + int skip_step_off = -1;
> + int passing_args_step_off = -1;
> + int coercion_step_off = -1;
> + int coercion_finish_step_off = -1;
> + int behavior_step_off = -1;
> + int onempty_expr_step_off = -1;
> + int onempty_jump_step_off = -1;
> + int onerror_expr_step_off = -1;
> + int onerror_jump_step_off = -1;
> + int result_coercion_jump_step_off = -1;
> + List   *adjust_jumps = NIL;
> + ListCell   *lc;
> + ExprEvalStep *as;
> +
> + jsestate->jsexpr = jexpr;
> +
> + /*
> +  * Add steps to compute formatted_expr, pathspec, and PASSING arg
> +  * expressions as things that must be evaluated *before* the actual JSON
> +  * path expression.
> +  */
> + ExecInitExprRec((Expr *) jexpr->formatted_expr, state,
> + &pre_eval->formatted_expr.value,
> + &pre_eval->formatted_expr.isnull);
> + ExecInitExprRec((Expr *) jexpr->path_spec, state,
> + &pre_eval->pathspec.value,
> + &pre_eva

Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-06 Thread Amit Kapila
On Fri, Oct 6, 2023 at 6:30 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Based on comments, I revised my patch. PSA the file.
>
> >
> > > Today, I discussed this problem with Andres at PGConf NYC and he
> > > suggested as following. To verify, if there is any pending unexpected
> > > WAL after shutdown, we can have an API like
> > > pg_logical_replication_slot_advance() which will simply process
> > > records without actually sending anything downstream. In this new API,
> > > we will start with each slot's restart_lsn location and try to process
> > > till the end of WAL, if we encounter any WAL that needs to be
> > > processed (like we need to send the decoded WAL downstream) we can
> > > return a false indicating that there is an unexpected WAL. The reason
> > > to start with restart_lsn is that it is the location that we use to
> > > start scanning the WAL anyway.
>
> I implemented this by using decoding context. The binary upgrade function
> processes WALs from the confirmed_flush, and returns false if some meaningful
> changes are found.
>
> Internally, I added a new decoding mode - DECODING_MODE_SILENT - and used it.
> If the decoding context is in the mode, the output plugin is not loaded, but
> any WALs are decoded without skipping.
>

I think it may be okay not to load the output plugin as we are not
going to process any record in this case, but is that the only reason,
or do you have something else in mind as well?

> Also, a new flag "did_process" is also
> added. This flag is set if wrappers for output plugin callbacks are called 
> during
> the silent mode.
>

Isn't it sufficient to add a test for silent mode in
begin/stream_start/begin_prepare kind of APIs and set
ctx->did_process? In all other APIs, we can assert that did_process
shouldn't be set and we never reach there when decoding mode is
silent.

> The upgrading function checks both reorder buffer and the new
> flag because both (non-)transactional changes should be detected. If we only
> check reorder buffer, we miss the non-transactional one.
>

+ /* Check whether the meaningful change was found */
+ found = (ctx->reorder->by_txn_last_xid != InvalidTransactionId ||
+ ctx->did_process);

Are you talking about this check in the patch? If so, can you please
explain when the first check helps?

> fast_forward was changed as a variant of decoding mode.
>
> Currently the function is called for all the valid slots. If the approach seems
> good, we can refactor like Bharath said [1].
>
> >
> > > Then, we should also try to create slots before invoking pg_resetwal.
> > > The idea is that we can write a new binary mode function that will do
> > > exactly what pg_resetwal does to compute the next segment and use that
> > > location as a new location (restart_lsn) to create the slots in a new
> > > node. Then, pass it pg_resetwal by using the existing option '-l
> > > walfile'. As we don't have any API that takes restart_lsn as input, we
> > > can write a new API probably for binary mode to create slots that do
> > > take restart_lsn as input. This will ensure that there is no new WAL
> > > inserted by background processes between resetwal and the creation of
> > > slots.
>
> Based on that, I added another binary function 
> binary_upgrade_create_logical_replication_slot().
> This function is similar to pg_create_logical_replication_slot(), but the
> restart_lsn and confirmed_flush are set to *next* WAL segment. The pointed
> filename is returned and it is passed to pg_resetwal command.
>

I am not sure if it is a good idea that a
binary_upgrade_create_logical_replication_slot() API does the logfile
name calculation.

> One consideration is that pg_log_standby_snapshot() must be executed before
> slots consume changes. The new cluster does not have RUNNING_XACTS records,
> so the decoding context on the new cluster cannot create a consistent
> snapshot as-is.
> This may lead to discard changes during the upcoming consuming event. To
> prevent it the function is called after the final pg_resetwal.
>
> How do you think?
>

+ /*
+ * Also, we must execute pg_log_standby_snapshot() when logical replication
+ * slots are migrated, because a RUNNING_XACTS record is required to create
+ * a consistent snapshot.
+ */
+ if (count_old_cluster_logical_slots())
+ create_consistent_snapshot();

We shouldn't do this separately. Instead
binary_upgrade_create_logical_replication_slot() should ensure that
corresponding WAL is reserved similar to what we do in
ReplicationSlotReserveWal() and then similarly invoke
LogStandbySnapshot() to ensure that we have enough information to
start.

Few minor comments:
==
1. The commit message and other comments like atop
get_old_cluster_logical_slot_infos() needs to be adjusted as per
recent changes.
2.
@@ -1268,7 +1346,11 @@ stream_start_cb_wrapper(ReorderBuffer *cache,
ReorderBufferTXN *txn,
  LogicalErrorCallbackState state;
  ErrorContextCallback errcallback;

- Assert(!ctx->fast_forward);
+ /*
+ * In silent mode all the

Re: Where can I find the doxyfile?

2023-10-06 Thread Bohdan Mart



On 07.10.23 00:29, postg...@coyotebush.net wrote:

Sometime earlier, I created a filter to annotate regular C comments as doxy
comments.  I'll attach it along with a sample doxyfile for running it.  Just
in case it looks useful.

I've never been a big fan of Doxygen, but it seems to have gotten better
over time. Now that some IDEs display doxy comments on hover, I'm beginning
to appreciate it.


Thank you for providing this `flex` filter! It is actually useful. I've
tested it on one file from the Postgres source base, and it actually
converted regular comments to doc-comments! If this filter could be used
by the official Doxygen generation of Postgres, it would highly increase
the quality of the online documentation of Postgres!

My initial idea was to create patches upstream changing regular comments
to documented comments, because the current online doc is lacking
comments, but with this filter that would be unneeded, as the
documentation could be generated from the current sources!


To illustrate the point, this function has a Doxy-style comment:

```
src/interfaces/ecpg/compatlib/informix.c
672:/**
673- * initialize the struct, which holds the different forms
674- * of the long value
675- */
676-static int
677-initValue(long lng_val)
```

and it is rendered properly:

https://doxygen.postgresql.org/informix_8c.html#a0dad4c2ee52a831d3aa1bf1133e0e17d

But a function like this

```
src/backend/foreign/foreign.c
 /*
  * get_foreign_server_oid - given a server name, look up the OID
  *
  * If missing_ok is false, throw an error if name not found.  If true, just
  * return InvalidOid.
  */
 Oid
 get_foreign_server_oid(const char *servername, bool missing_ok)
```

https://doxygen.postgresql.org/foreign_8c.html#a7959c56969be440c25b62c3c98ce2a78

doesn't have rendered documentation.


P.S. Was this original message from postg...@coyotebush.net intended to 
be sent on John Morris's behalf?






Re: Pre-proposal: unicode normalized text

2023-10-06 Thread Matthias van de Meent
On Fri, 6 Oct 2023, 21:08 Jeff Davis,  wrote:

> On Fri, 2023-10-06 at 13:33 -0400, Robert Haas wrote:
> > What I think people really want is a whole column in
> > some encoding that isn't the normal one for that database.
>
> Do people really want that? I'd be curious to know why.
>

One reason someone would like this is because a database cluster may have
been initialized with something like --no-locale (thus getting defaulted to
LC_COLLATE=C, which is desired behaviour and gets fast strcmp operations
for indexing, and LC_CTYPE=SQL_ASCII, which is not exactly expected but can
be sufficient for some workloads), but now that the data has grown they
want to use utf8.EN_US collations in some of their new and modern tables'
fields?
Or, a user wants to maintain literal translation tables, where different
encodings would need to be used for different languages to cover the full
script when Unicode might not cover the full character set yet.
Additionally, I'd imagine specialized encodings like Shift_JIS could be
more space efficient than UTF-8 for e.g. japanese text, which might be
useful for someone who wants to be a bit more frugal with storage when they
know text is guaranteed to be in some encoding's native language:
compression can do the same work, but also adds significant overhead.

I've certainly experienced situations where I forgot to explicitly include
the encoding in initdb --no-locale and then only much later noticed that my
big data load is useless due to an inability to create UTF-8 collated
indexes.
I often use --no-locale to make string indexing fast (locales/collation are
not often important to my workload) and to block any environment variables
from being carried over into the installation. An ability to set or update
the encoding of columns would help reduce the pain: I would no longer have
to re-initialize the database or cluster from 0.
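
As an illustration of that failure mode, in a database created with
encoding SQL_ASCII (the exact error wording may vary by version):

    CREATE TABLE docs (body text);
    CREATE INDEX ON docs (body COLLATE "en_US.utf8");
    -- ERROR:  collation "en_US.utf8" for encoding "SQL_ASCII" does not exist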

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-06 Thread Amit Kapila
On Thu, Oct 5, 2023 at 6:43 PM Bharath Rupireddy
 wrote:
>
> On Thu, Oct 5, 2023 at 1:48 AM Amit Kapila  wrote:
> >
>
> > The other potential problem Andres pointed out is that during shutdown
> > if due to some reason, the walreceiver goes down, we won't be able to
> > send the required WAL and users won't be able to ensure that because
> > even after restart the same situation can happen. The ideal way is to
> > have something that puts the system in READ ONLY state during shutdown
> > and then we can probably allow walreceivers to reconnect and receive
> > the required WALs. As we don't have such functionality available and
> > it won't be easy to achieve the same, we can leave this for now.
> >
> > Thoughts?
>
> You mean walreceiver for streaming replication? Or the apply workers
> going down for logical replication?
>

Apply workers.

>
> If there's yet-to-be-sent-out WAL,
> pg_upgrade will fail no? How does the above scenario a problem for
> pg_upgrade of a cluster with just logical replication slots?
>

Even if there is WAL yet to be sent, the walsender will simply exit
as it will receive PqMsg_Terminate ('X') from standby. See
ProcessRepliesIfAny(). After that shutdown checkpoint will finish. So,
in this case upgrade can fail due to slots. But, I think the server
should be able to succeed in consecutive runs. Does this make sense?
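
As a rough manual cross-check (not what the patch itself does), one can
compare each logical slot's confirmed position against the current WAL
insert position once the subscribers have caught up:

    SELECT slot_name,
           confirmed_flush_lsn,
           pg_current_wal_insert_lsn() AS current_lsn
    FROM pg_replication_slots
    WHERE slot_type = 'logical';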

-- 
With Regards,
Amit Kapila.




Re: typo in couple of places

2023-10-06 Thread Amit Kapila
On Fri, Oct 6, 2023 at 8:52 PM Dagfinn Ilmari Mannsåker
 wrote:
>
> vignesh C  writes:
>
> > Hi,
> >
> > I noticed a couple of typos in code. "the the" should have been "the",
> > attached patch has the changes for the same.
>
> This made me curious about other duplicate word occurrences, and after a
> few minutes of increasingly-elaborate regexing to exclude false
> postives, I found a few more (plus a bonus a _missing_ "the").  See the
> attached patch (which includes your originl one, for completeness).
>

LGTM.

-- 
With Regards,
Amit Kapila.




Re: Add support for AT LOCAL

2023-10-06 Thread Vik Fearing

On 10/6/23 07:05, Michael Paquier wrote:


I haven't yet finished my review of the patch, still looking at it.


I realized that, after this change, my regression tests are not exercising 
what I originally intended them to.  They are supposed to show that 
calling the function explicitly or using AT LOCAL is correctly 
reproduced by ruleutils.c.


The attached v4 changes the regression tests (and nothing else).
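
For reference, a quick usage sketch of what the patch implements (with
the patch applied, and using the America/Los_Angeles example setting
from the docs):

    SET TimeZone = 'America/Los_Angeles';
    SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT LOCAL;
    -- same result as:
    SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05'
           AT TIME ZONE current_setting('TimeZone');
    -- both yield 2001-02-16 17:38:40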
--
Vik Fearing
From 03273214f0320e347a0b012763dc82cd91ae6775 Mon Sep 17 00:00:00 2001
From: Vik Fearing 
Date: Wed, 4 Oct 2023 15:46:38 +0100
Subject: [PATCH v4] Add support for AT LOCAL

When converting a timestamp to/from with/without time zone, the SQL
Standard specifies an AT LOCAL variant of AT TIME ZONE which uses the
session's time zone.
---
 doc/src/sgml/func.sgml| 13 +++
 src/backend/parser/gram.y |  7 
 src/backend/utils/adt/ruleutils.c |  9 +
 src/backend/utils/adt/timestamp.c | 20 ++
 src/include/catalog/pg_proc.dat   |  6 +++
 src/test/regress/expected/timestamptz.out | 47 +++
 src/test/regress/sql/timestamptz.sql  | 21 ++
 7 files changed, 123 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f1ad64c3d6..6bc61705a9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10619,12 +10619,16 @@ SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-0

 

 AT TIME ZONE

 
+   
+AT LOCAL
+   
+

 The AT TIME ZONE operator converts time
 stamp without time zone to/from
 time stamp with time zone, and
 time with time zone values to different time
 zones.  shows its
@@ -10707,24 +10711,33 @@ SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-0
 In the text case, a time zone name can be specified in any of the ways
 described in .
 The interval case is only useful for zones that have fixed offsets from
 UTC, so it is not very common in practice.

 
+   
+The syntax AT LOCAL may be used as shorthand for AT TIME ZONE
+local, where local is the
+session's TimeZone value.
+   
+

 Examples (assuming the current  setting
 is America/Los_Angeles):
 
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
 Result: 2001-02-16 19:38:40-08
 
 SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/Denver';
 Result: 2001-02-16 18:38:40
 
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'America/Chicago';
 Result: 2001-02-16 05:38:40
+
+SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT LOCAL;
+Result: 2001-02-16 17:38:40
 
 The first example adds a time zone to a value that lacks it, and
 displays the value using the current TimeZone
 setting.  The second example shifts the time stamp with time zone value
 to the specified time zone, and returns the value without a time zone.
 This allows storage and display of values different from the current
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e56cbe77cb..50ed504e5a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -14505,12 +14505,19 @@ a_expr:		c_expr	{ $$ = $1; }
 {
 	$$ = (Node *) makeFuncCall(SystemFuncName("timezone"),
 			   list_make2($5, $1),
 			   COERCE_SQL_SYNTAX,
 			   @2);
 }
+			| a_expr AT LOCAL		%prec AT
+{
+	$$ = (Node *) makeFuncCall(SystemFuncName("timezone"),
+			   list_make1($1),
+			   COERCE_SQL_SYNTAX,
+			   -1);
+}
 		/*
 		 * These operators must be called out explicitly in order to make use
 		 * of bison's automatic operator-precedence handling.  All other
 		 * operator names are handled by the generic productions using "Op",
 		 * below; and all those operators will have the same precedence.
 		 *
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 442205382e..9d1b0b13b1 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -10344,12 +10344,21 @@ get_func_sql_syntax(FuncExpr *expr, deparse_context *context)
 			appendStringInfoString(buf, " AT TIME ZONE ");
 			get_rule_expr_paren((Node *) linitial(expr->args), context, false,
 (Node *) expr);
 			appendStringInfoChar(buf, ')');
 			return true;
 
+		case F_TIMEZONE_TIMESTAMP:
+		case F_TIMEZONE_TIMESTAMPTZ:
+			/* AT LOCAL */
+			appendStringInfoChar(buf, '(');
+			get_rule_expr_paren((Node *) linitial(expr->args), context, false,
+(Node *) expr);
+			appendStringInfoString(buf, " AT LOCAL)");
+			return true;
+
 		case F_OVERLAPS_TIMESTAMPTZ_INTERVAL_TIMESTAMPTZ_INTERVAL:
 		case F_OVERLAPS_TIMESTAMPTZ_INTERVAL_TIMESTAMPTZ_TIMESTAMPTZ:
 		case F_OVERLAPS_TIMESTAMPTZ_TIMESTAMPTZ_TIMESTAMPTZ_INTERVAL:
 		case F_OVERLAPS_TIMESTAMPTZ_TIMESTAMPTZ_TIMESTAMPTZ_TIMESTAMPTZ:

Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-10-06 Thread Amit Kapila
On Tue, Oct 3, 2023 at 12:12 PM vignesh C  wrote:
>
> Thanks for the comments, the attached v6 version patch has the changes
> for the same.
>

Few comments:
=
1.
/* Is the use of a password mandatory? */
  must_use_password = MySubscription->passwordrequired &&
- !superuser_arg(MySubscription->owner);
+ !MySubscription->ownersuperuser;

- /* Note that the superuser_arg call can access the DB */
  CommitTransactionCommand();

We can call CommitTransactionCommand() before the above check now. It
was done afterward to invoke superuser_arg(), so, if that requirement
is changed, we no longer need to keep the transaction open for a
longer time. Please check other places for similar changes.

2.
+ ereport(LOG,
+ errmsg("logical replication worker for subscription \"%s\" will
restart because the subscription owner has become a non-superuser",

How about something on the below lines?
logical replication worker for subscription \"%s\" will restart
because superuser privileges have been revoked for the subscription
owner
OR
logical replication worker for subscription \"%s\" will restart
because the subscription owner's superuser privileges have been
revoked

3.
- /* Keep us informed about subscription changes. */
+ /*
+ * Keep us informed about subscription changes or pg_authid rows.
+ * (superuser can become non-superuser.)
+ */

Let's slightly change the comment to: "Keep us informed about
subscription or role changes. Note that role's superuser privilege can
be revoked."
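
For context, the event that the new invalidation has to catch is simply
something like:

    ALTER ROLE subscription_owner NOSUPERUSER;

after which the apply worker must restart so that the subscription's
password_required setting is re-evaluated against the owner's new,
non-superuser status.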

-- 
With Regards,
Amit Kapila.




Re: typo in couple of places

2023-10-06 Thread vignesh C
On Fri, 6 Oct 2023 at 20:50, Dagfinn Ilmari Mannsåker  wrote:
>
> vignesh C  writes:
>
> > Hi,
> >
> > I noticed a couple of typos in code. "the the" should have been "the",
> > attached patch has the changes for the same.
>
> This made me curious about other duplicate word occurrences, and after a
> few minutes of increasingly-elaborate regexing to exclude false
> positives, I found a few more (plus a bonus: a _missing_ "the").  See the
> attached patch (which includes your original one, for completeness).

Thanks, Looks good.

Regards,
Vignesh




Re: Fix output of zero privileges in psql

2023-10-06 Thread Erik Wienhold
On 2023-10-06 22:32 +0200, Laurenz Albe wrote:
> On Sun, 2023-09-17 at 21:31 +0200, Erik Wienhold wrote:
> > I wrote a patch to change psql's display of zero privileges after a user's
> > reported confusion with the psql output for zero vs. default privileges [1].
> > Admittedly, zero privileges is a rare use case [2] but I think psql should 
> > not
> > confuse the user in the off chance that this happens.
> > 
> > With this change psql now prints "(none)" for zero privileges instead of
> > nothing.  This affects the following meta commands:
> > 
> >     \db+ \dD+ \df+ \dL+ \dl+ \dn+ \dp \dT+ \l
> > 
> > Default privileges start as NULL::aclitem[] in various catalog columns but
> > revoking the default privileges leaves an empty aclitem array.  Using
> > \pset null '(null)' as a workaround to spot default privileges does not work
> > because the meta commands ignore this setting.
> > 
> > The privileges shown by \dconfig+ and \ddp as well as the column privileges
> > shown by \dp are not affected by this change because those privileges are 
> > reset
> > to NULL instead of leaving empty arrays.
> > 
> > Commands \des+ and \dew+ are not covered in src/test/regress because no 
> > foreign
> > data wrapper is available at this point to create a foreign server.
> > 
> > [1] 
> > https://www.postgresql.org/message-id/efdd465d-a795-6188-7f71-7cdb4b2be031%40mtneva.com
> > [2] https://www.postgresql.org/message-id/31246.1693337238%40sss.pgh.pa.us
> 
> Reading that thread, I had the impression that there was more support for
> honoring "\pset null" rather than unconditionally displaying "(none)".

I took Tom's response in the -general thread to mean that we could fix
\pset null also as a "nice to have" but not as a solution to the display
of zero privileges.

Only fixing \pset null has one drawback IMO because it only affects how
default privileges (more common) are printed.  The edge case of zero
privileges (less common) gets lost in a bunch of NULL output.  And I
assume most users change the default \pset null to some non-empty string
in their psqlrc (I do).

For example with your patch applied:

create table t1 (a int);
create table t2 (a int);
create table t3 (a int);

revoke all on t2 from :USER;

\pset null '<null>'
\dp t1|t2|t3
                             Access privileges
 Schema | Name | Type  | Access privileges | Column privileges | Policies
--------+------+-------+-------------------+-------------------+----------
 public | t1   | table | <null>            | <null>            | <null>
 public | t2   | table |                   | <null>            | <null>
 public | t3   | table | <null>            | <null>            | <null>
(3 rows)

Contrast that with my patch, which displays only the zero privileges,
using the default \pset null:

\pset null ''
\dp t1|t2|t3
                             Access privileges
 Schema | Name | Type  | Access privileges | Column privileges | Policies
--------+------+-------+-------------------+-------------------+----------
 public | t1   | table |                   |                   |
 public | t2   | table | (none)            |                   |
 public | t3   | table |                   |                   |
(3 rows)

I guess if most tables have any non-default privileges then both
solutions are equally good.

> The simple attached patch does it like that.  What do you think?

LGTM.

-- 
Erik




CREATE DATABASE with filesystem cloning

2023-10-06 Thread Thomas Munro
Hello hackers,

Here is an experimental POC of fast/cheap database cloning.  For
clones from little template databases, no one cares much, but it might
be useful to be able to create a snapshot or fork of a very large
database for testing/experimentation like this:

  create database foodb_snapshot20231007 template=foodb strategy=file_clone

It should be a lot faster, and use less physical disk, than the two
existing strategies on recent-ish XFS, BTRFS, very recent OpenZFS,
APFS (= macOS), and it could in theory be extended to other systems
that invented different system calls for this with more work (Solaris,
Windows).  Then extra physical disk space will be consumed only as the
two clones diverge.

It's just like the old strategy=file_copy, except it asks the OS to do
its best copying trick.  If you try it on a system that doesn't
support copy-on-write, then copy_file_range() should fall back to
plain old copy, but it might still be better than we could do, as it
can push copy commands to network storage or physical storage.

Therefore, the usual caveats from strategy=file_copy also apply here.
Namely, it has to perform checkpoints, which could be very
expensive, and there are some quirks/brokenness around concurrent
backups and PITR.  That makes me wonder if it's worth pursuing this
idea.  Thoughts?

I tested on bleeding edge FreeBSD/ZFS, where you need to set sysctl
vfs.zfs.bclone_enabled=1 to enable the optimisation, as it's still a
very new feature that is still being rolled out.  The system call
succeeds either way, but that controls whether the new database
initially shares blocks on disk, or get new copies.  I also tested on
a Mac.  In both cases I could clone large databases in a fraction of a
second.
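
For comparison, the full set of strategies with the patch applied would
be (wal_log is the current default):

    create database foodb_test1 template=foodb strategy=wal_log;
    create database foodb_test2 template=foodb strategy=file_copy;
    create database foodb_test3 template=foodb strategy=file_clone;  -- new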
From 341c0287237d59b0723bc88244dc1c48ed06c5a5 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sat, 2 Sep 2023 22:21:49 +1200
Subject: [PATCH] WIP: CREATE DATABASE ... STRATEGY=FILE_CLONE.

Similar to STRATEGY=FILE_COPY, but using facilities that tell the OS
explicitly that we're copying, so that it has the opportunity to share
block ranges in copy-on-write file systems, or maybe push down the copy
to network file systems and storage devices.

Currently works on Linux, FreeBSD and macOS.  More systems could be
supported.

XXX need docs
XXX need a test
XXX need redo -- what to do if unsupported during redo, plain copy?

diff --git a/configure b/configure
index d47e0f8b26..2076b19a1b 100755
--- a/configure
+++ b/configure
@@ -15578,7 +15578,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in backtrace_symbols copyfile getifaddrs getpeerucred inet_pton kqueue mbstowcs_l memset_s posix_fallocate ppoll pthread_is_threaded_np setproctitle setproctitle_fast strchrnul strsignal syncfs sync_file_range uselocale wcstombs_l
+for ac_func in backtrace_symbols copyfile copy_file_range getifaddrs getpeerucred inet_pton kqueue mbstowcs_l memset_s posix_fallocate ppoll pthread_is_threaded_np setproctitle setproctitle_fast strchrnul strsignal syncfs sync_file_range uselocale wcstombs_l
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.ac b/configure.ac
index 440b08d113..d0d31dd91e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1767,6 +1767,7 @@ LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 AC_CHECK_FUNCS(m4_normalize([
 	backtrace_symbols
 	copyfile
+	copy_file_range
 	getifaddrs
 	getpeerucred
 	inet_pton
diff --git a/meson.build b/meson.build
index 862c955453..20e7327e9e 100644
--- a/meson.build
+++ b/meson.build
@@ -2415,6 +2415,7 @@ func_checks = [
   ['backtrace_symbols', {'dependencies': [execinfo_dep]}],
   ['clock_gettime', {'dependencies': [rt_dep], 'define': false}],
   ['copyfile'],
+  ['copy_file_range'],
   # gcc/clang's sanitizer helper library provides dlopen but not dlsym, thus
   # when enabling asan the dlopen check doesn't notice that -ldl is actually
   # required. Just checking for dlsym() ought to suffice.
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 307729ab7e..9bbcabbb6f 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -79,11 +79,14 @@
  * CREATEDB_FILE_COPY will simply perform a file system level copy of the
  * database and log a single record for each tablespace copied. To make this
  * safe, it also triggers checkpoints before and after the operation.
+ *
+ * CREATEDB_FILE_CLONE is the same, but uses faster file cloning system calls.
  */
 typedef enum CreateDBStrategy
 {
 	CREATEDB_WAL_LOG,
-	CREATEDB_FILE_COPY
+	CREATEDB_FILE_COPY,
+	CREATEDB_FILE_CLONE
 } CreateDBStrategy;
 
 typedef struct
@@ -137,7 +140,8 @@ static CreateDBRelInfo *ScanSourceDatabasePgClassTuple(HeapTupleData *tuple,
 static void CreateDirAndVersionFile(char *dbpath, Oid dbid, Oid tsid,
 	bool isRedo);
 static void CreateDatabaseUsingFileCopy(