Do we need to pass down nonnullable_vars when reducing outer joins?

2022-09-13 Thread Richard Guo
AFAICS, the Vars forced nonnullable by the given clause are only used to
check whether we can reduce JOIN_LEFT to JOIN_ANTI, and that check looks at
the join's own quals. It seems to me we do not need to pass down the
nonnullable_vars from upper quals to the children of a join.

Attached is a patch to remove the pass-down of nonnullable_vars.

Thanks
Richard


v1-0001-Don-t-pass-down-nonnullable_vars-when-reducing-ou.patch
Description: Binary data


Re: Error "initial slot snapshot too large" in create replication slot

2022-09-13 Thread Kyotaro Horiguchi
At Tue, 13 Sep 2022 12:08:18 +0530, Dilip Kumar wrote:
> On Tue, Sep 13, 2022 at 11:52 AM Kyotaro Horiguchi wrote:
> > That function is called after the SnapBuild reaches the
> > SNAPBUILD_CONSISTENT state, or SnapBuildInitialSnapshot() rejects any
> > other state. That is, IIUC, the top-sub relationship of all the
> > currently running transactions is fully known to the reorder buffer.
> > We need a comment about that.
> 
> I don't think this assumption is true: any xid started after switching
> to SNAPBUILD_FULL_SNAPSHOT and before switching to SNAPBUILD_CONSISTENT
> might still be in progress, so we cannot tell from the reorder buffer
> whether it is a subxact or not.

Yeah, I had misunderstood that the relationship is recorded earlier
(how would it be?).  So it is not reliable in the first place.

I agree that the best way is an oversized xip array.


By the way, I feel that "is >= than" is redundant, or plain wrong.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Michael Paquier
On Sun, Sep 04, 2022 at 02:20:52PM +0900, Michael Paquier wrote:
> ffd5365 has missed that wal_compress_level should be set to
> Z_DEFAULT_COMPRESSION if there is nothing set in the compression
> spec for a zlib build.  pg_receivewal.c enforces that already.

So, I have looked at this one.  And it seems to me that the confusion
comes down to the existence of PG_COMPRESSION_OPTION_LEVEL.  I have
considered a couple of approaches here, like introducing an extra
routine in compression.c to assign a default compression level, but
my conclusion is in the end simpler: we always finish by setting up a
level even if the caller wants nothing, in which case we can just use
each library's default.  And lz4, zstd and zlib are able to handle the
case where a default is passed down to their internal routines just
fine.
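
To make that concrete, here is a rough sketch (mine, for illustration; not
necessarily how the attached patch spells it) of assigning each library's
default in the compression specification when the user gives no level:

	switch (spec->algorithm)
	{
		case PG_COMPRESSION_NONE:
			spec->level = 0;						/* unused */
			break;
		case PG_COMPRESSION_GZIP:
			spec->level = Z_DEFAULT_COMPRESSION;	/* zlib's -1 */
			break;
		case PG_COMPRESSION_LZ4:
			spec->level = 0;						/* lz4's default */
			break;
		case PG_COMPRESSION_ZSTD:
			spec->level = ZSTD_CLEVEL_DEFAULT;		/* 3 in current zstd */
			break;
	}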

Attached is the patch I am finishing with, consisting of:
- the removal of PG_COMPRESSION_OPTION_LEVEL.
- assigning a default compression level when nothing is specified in
the spec.
- a couple of simplifications in pg_receivewal, pg_basebackup and the
backend code, as there is no longer a need to worry about an unset
compression level.

A nice effect of this approach is that we can centralize the checks on
lz4, zstd and zlib when a build does not support any of these
options, as well as centralize the place where the default compression
levels are set.  This passes all the regression tests, and it fixes
the issue reported.  (Note that I have yet to run tests with all the
libraries disabled in ./configure.)

Thoughts?
--
Michael
From 8752cb283d65e40fc4096f84d99bcb93d97c59bc Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 13 Sep 2022 16:06:29 +0900
Subject: [PATCH] Simplify compression level handling in compression
 specifications

---
 src/include/common/compression.h |   3 +-
 src/backend/backup/basebackup_gzip.c |  10 +-
 src/backend/backup/basebackup_lz4.c  |   9 +-
 src/backend/backup/basebackup_zstd.c |  11 +-
 src/common/compression.c | 105 +++
 src/bin/pg_basebackup/bbstreamer_gzip.c  |   4 +-
 src/bin/pg_basebackup/bbstreamer_lz4.c   |   3 +-
 src/bin/pg_basebackup/bbstreamer_zstd.c  |  13 +--
 src/bin/pg_basebackup/pg_basebackup.c|  19 +---
 src/bin/pg_basebackup/pg_receivewal.c|  35 +--
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |   3 -
 11 files changed, 107 insertions(+), 108 deletions(-)

diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 436b48104e..5d680058ed 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -22,8 +22,7 @@ typedef enum pg_compress_algorithm
 	PG_COMPRESSION_ZSTD
 } pg_compress_algorithm;
 
-#define PG_COMPRESSION_OPTION_LEVEL			(1 << 0)
-#define PG_COMPRESSION_OPTION_WORKERS		(1 << 1)
+#define PG_COMPRESSION_OPTION_WORKERS		(1 << 0)
 
 typedef struct pg_compress_specification
 {
diff --git a/src/backend/backup/basebackup_gzip.c b/src/backend/backup/basebackup_gzip.c
index a965866ff2..1ec42791f1 100644
--- a/src/backend/backup/basebackup_gzip.c
+++ b/src/backend/backup/basebackup_gzip.c
@@ -72,13 +72,9 @@ bbsink_gzip_new(bbsink *next, pg_compress_specification *compress)
 
 	Assert(next != NULL);
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-		compresslevel = Z_DEFAULT_COMPRESSION;
-	else
-	{
-		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 9);
-	}
+	compresslevel = compress->level;
+	Assert((compresslevel >= 1 && compresslevel <= 9) ||
+		   compresslevel == Z_DEFAULT_COMPRESSION);
 
 	sink = palloc0(sizeof(bbsink_gzip));
 	*((const bbsink_ops **) &sink->base.bbs_ops) = &bbsink_gzip_ops;
diff --git a/src/backend/backup/basebackup_lz4.c b/src/backend/backup/basebackup_lz4.c
index d919e3dec7..986272ab9a 100644
--- a/src/backend/backup/basebackup_lz4.c
+++ b/src/backend/backup/basebackup_lz4.c
@@ -72,13 +72,8 @@ bbsink_lz4_new(bbsink *next, pg_compress_specification *compress)
 
 	Assert(next != NULL);
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-		compresslevel = 0;
-	else
-	{
-		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 12);
-	}
+	compresslevel = compress->level;
+	Assert(compresslevel >= 0 && compresslevel <= 12);
 
 	sink = palloc0(sizeof(bbsink_lz4));
 	*((const bbsink_ops **) &sink->base.bbs_ops) = &bbsink_lz4_ops;
diff --git a/src/backend/backup/basebackup_zstd.c b/src/backend/backup/basebackup_zstd.c
index 865067f8dc..409bf88101 100644
--- a/src/backend/backup/basebackup_zstd.c
+++ b/src/backend/backup/basebackup_zstd.c
@@ -96,14 +96,11 @@ bbsink_zstd_begin_backup(bbsink *sink)
 	if (!mysink->cctx)
 		elog(ERROR, "could not create zstd compression context");
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
-	{
-		ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
+	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
 	 compress-

Re: Error "initial slot snapshot too large" in create replication slot

2022-09-13 Thread Kyotaro Horiguchi
At Tue, 13 Sep 2022 15:45:07 +0900 (JST), Kyotaro Horiguchi wrote:
> At Mon, 12 Sep 2022 14:51:56 -0700, Andres Freund wrote:
> > This seems a tad misleading - the previous snapshot wasn't broken, right?
> 
> I saw it as kind of broken that ->xip contains subtransactions.  But I
> didn't mean "broken" by the word "correct". Is "proper" suitable there?

No. It's not broken if it is takenDuringRecovery.  So this flag can be
used to signal that xip can be oversized.

I realized that rbtxn_is_known_subxact is not reliable. I'm switching
to the oversized-xip approach. Please wait for a while.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Removed unused param isSlice of function transformAssignmentSubscripts

2022-09-13 Thread Richard Guo
On Tue, Sep 13, 2022 at 11:35 AM Zhang Mingli  wrote:

> Param isSlice was once used to identify targetTypeId for
> transformAssignmentIndirection.
>
> In commit c7aba7c14e, the evaluation was pushed down to
> transformContainerSubscripts.
>
> No need to keep isSlice around in transformAssignmentSubscripts.
>
> Attach a patch to remove it.
>

+1. Good catch.

Thanks
Richard


Re: Error "initial slot snapshot too large" in create replication slot

2022-09-13 Thread Kyotaro Horiguchi
At Tue, 13 Sep 2022 16:10:59 +0900 (JST), Kyotaro Horiguchi wrote:
> At Tue, 13 Sep 2022 12:08:18 +0530, Dilip Kumar wrote:
> > On Tue, Sep 13, 2022 at 11:52 AM Kyotaro Horiguchi wrote:
> > > That function is called after the SnapBuild reaches the
> > > SNAPBUILD_CONSISTENT state, or SnapBuildInitialSnapshot() rejects any
> > > other state. That is, IIUC, the top-sub relationship of all the
> > > currently running transactions is fully known to the reorder buffer.
> > > We need a comment about that.
> > 
> > I don't think this assumption is true: any xid started after switching
> > to SNAPBUILD_FULL_SNAPSHOT and before switching to SNAPBUILD_CONSISTENT
> > might still be in progress, so we cannot tell from the reorder buffer
> > whether it is a subxact or not.
> 
> Yeah, I had misunderstood that the relationship is recorded earlier
> (how would it be?).  So it is not reliable in the first place.
> 
> I agree that the best way is an oversized xip array.
> 
> 
> By the way, I feel that "is >= than" is redundant, or plain wrong.

By the way, GetSnapshotData() does this:

>   snapshot->subxip = (TransactionId *)
>       malloc(GetMaxSnapshotSubxidCount() * sizeof(TransactionId));
...
>   if (!snapshot->takenDuringRecovery)
...
>   else
>   {
>       subcount = KnownAssignedXidsGetAndSetXmin(snapshot->subxip, &xmin,
>                                                 xmax);

It is possible for subxip to be overrun. We need to expand the array
somehow, or allocate an array of size (GetMaxSnapshotXidCount() +
GetMaxSnapshotSubxidCount()) for takenDuringRecovery snapshots.

(I feel deja vu..)
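
For illustration, that second option would amount to something like this
(a sketch only; as the follow-up message below notes, core effectively
does this already):

	snapshot->subxip = (TransactionId *)
		malloc((GetMaxSnapshotXidCount() + GetMaxSnapshotSubxidCount()) *
			   sizeof(TransactionId));
	if (snapshot->subxip == NULL)
		ereport(ERROR,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("out of memory")));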

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Error "initial slot snapshot too large" in create replication slot

2022-09-13 Thread Kyotaro Horiguchi
Sigh..

At Tue, 13 Sep 2022 16:30:06 +0900 (JST), Kyotaro Horiguchi wrote:
> It is possible for subxip to be overrun. We need to expand the array
> somehow, or allocate an array of size (GetMaxSnapshotXidCount() +
> GetMaxSnapshotSubxidCount()) for takenDuringRecovery snapshots.

And I found that this is already done.  What we should do is the same
thing in snapbuild.

Sorry for the noise..

regards.
-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Tuples inserted and deleted by the same transaction

2022-09-13 Thread Laurenz Albe
Shouldn't such tuples be considered dead right away, even if the inserting
transaction is still active?  That would allow cleaning them up even before
the transaction is done.

There is this code in HeapTupleSatisfiesVacuumHorizon:

        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
        {
            [...]
            /* inserted and then deleted by same xact */
            if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tuple)))
                return HEAPTUPLE_DELETE_IN_PROGRESS;

Why HEAPTUPLE_DELETE_IN_PROGRESS and not HEAPTUPLE_DEAD?

Yours,
Laurenz Albe




Re: Reducing the chunk header sizes on all memory context types

2022-09-13 Thread David Rowley
On Fri, 9 Sept 2022 at 11:33, David Rowley  wrote:
> I really think the Assert only form of MemoryContextContains() is the
> best move, and if it's doing Assert only, then we can do the
> loop-over-the-blocks idea as you described and I drafted in [1].
>
> If the need comes up that we're certain we always have a pointer to
> some allocated chunk, but need to know if it's in some memory context,
> then the proper form of expressing that, I think, should be:
>
> if (GetMemoryChunkContext(pointer) == somecontext)
>
> If we're worried about getting that wrong, we can beef up the
> MemoryChunk struct with a magic_number field in
> MEMORY_CONTEXT_CHECKING builds to ensure we catch any code which
> passes invalid pointers.

I've attached a patch series which is my proposal on what we should do
about MemoryContextContains. In summary, this basically changes
MemoryContextContains() so it's only available in
MEMORY_CONTEXT_CHECKING builds and removes 4 current usages of the
function.

0001: Makes MemoryContextContains work again with the
loop-over-the-blocks method mentioned by Tom.
0002: Adds a new "chunk_magic" field to MemoryChunk.  My thoughts are
that it might be a good idea to do this so that we get Assert failures
if we use functions like GetMemoryChunkContext() on a pointer that's
not a MemoryChunk (a rough sketch follows this list).
0003: This adjusts aggregate final functions and window functions so
that any byref Datum they return is allocated in CurrentMemoryContext
0004: Makes MemoryContextContains only available in
MEMORY_CONTEXT_CHECKING builds and adjusts our usages of the function
to use GetMemoryChunkContext() instead.
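
As a rough sketch of the 0002 idea (the constant value, struct and field
names below are invented for illustration, not taken from the patch):

	#define MEMORYCHUNK_MAGIC	0x4D434B21		/* arbitrary magic value */

	typedef struct MemoryChunkChecked
	{
		uint32		chunk_magic;	/* must equal MEMORYCHUNK_MAGIC */
		uint64		hdrmask;		/* the existing encoded header word */
	} MemoryChunkChecked;

	/* GetMemoryChunkContext() could then begin with: */
	Assert(((MemoryChunkChecked *) chunk)->chunk_magic == MEMORYCHUNK_MAGIC);

(Only compiled in MEMORY_CONTEXT_CHECKING builds, as described above.)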

An alternative to 0004, would be more along the lines of what was
mentioned by Andres and just Assert that the returned value is in the
memory context that we expect. I don't think we need to do anything
special with aggregates that take an internal state.  I think the rule
is just as simple as;  all final functions and window functions must
return any byref values in CurrentMemoryContext.  For aggregates
without a finalfn, we can just datumCopy() the returned byref value.
There's no chance for those to be in CurrentMemoryContext anyway. The
return value must be in the aggregate state's context.  The attached
assert.patch shows that this holds true in make check after applying
each of the other patches.
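
To make that rule concrete, here is a sketch of the datumCopy() step for an
aggregate without a finalfn (the field names are assumptions based on
nodeAgg's per-trans data, not necessarily what the patch uses):

	/*
	 * The byref transition value lives in the aggregate state's context;
	 * copy it into CurrentMemoryContext before returning it.
	 */
	if (!pertrans->transtypeByVal && !*isnull)
		*resultVal = datumCopy(*resultVal, false, pertrans->transtypeLen);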

I see that one of the drawbacks from not using MemoryContextContains()
is that window functions such as lead(), lag(), first_value(),
last_value() and nth_value() may now do the datumCopy() when it's not
needed. For example, with a window function call such as
lead(byref_col ), the expression evaluation code in
WinGetFuncArgInPartition() will just return the address in the
tuplestore tuple for "byref_col".  The datumCopy() is needed for that.
However, if the function call was lead(byref_col || 'something') then
we'd have ended up with a new allocation in CurrentMemoryContext to
concatenate the two values.  We'll now do a datumCopy() where we
previously wouldn't have. I don't really see any way around that
without doing some highly invasive surgery to the expression
evaluation code.

None of the attached patches are polished. I can do that once we agree
on the best way to fix the issue.

David
From a9fcf078bc8a87742860abee527199f772f39f5b Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Thu, 8 Sep 2022 23:32:03 +1200
Subject: [PATCH v1 1/4] Reinstate MemoryContextContains to work with arbitrary
 pointers

---
 src/backend/utils/mmgr/aset.c| 42 ++
 src/backend/utils/mmgr/generation.c  | 45 +++
 src/backend/utils/mmgr/mcxt.c| 56 ++--
 src/backend/utils/mmgr/slab.c| 44 +++
 src/include/nodes/memnodes.h |  4 +-
 src/include/utils/memutils_internal.h|  3 ++
 src/include/utils/memutils_memorychunk.h | 21 +
 7 files changed, 191 insertions(+), 24 deletions(-)

diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index ec423375ae..24ab7399a3 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -1333,6 +1333,48 @@ AllocSetGetChunkContext(void *pointer)
return &set->header;
 }
 
+/*
+ * AllocSetContains
+ * Attempt to determine if 'pointer' is memory which was allocated by
+ * 'context'.  Return true if it is, otherwise false.
+ */
+bool
+AllocSetContains(MemoryContext context, void *pointer)
+{
+   MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+   AllocBlock  block;
+   AllocSetset = (AllocSet) context;
+
+   /*
+* We must use MemoryChunkIsExternalUnSafePointer() instead of
+* MemoryChunkIsExternal() here as 'chunk' may not be a MemoryChunk if
+* 'pointer' is not pointing to palloced memory.  Below we're careful
+* never to dereference 'block' as it could point to memory not owned by
+* this process.
+*/
+   if (MemoryChunkIsEx

Re: is_superuser is not documented

2022-09-13 Thread bt22kawamotok



Thanks for the patch!


+ 
+  is_superuser (boolean)

You need to add this entry just after that of "in_hot_standby" because
the descriptions of preset parameters should be placed in alphabetical
order in the docs.


+   
+Reports whether the user is superuser or not.

Isn't it better to add "current" before "user", e.g.,
"Reports whether the current user is a superuser"?


/* Not for general use --- used by SET SESSION AUTHORIZATION */
-   {"is_superuser", PGC_INTERNAL, UNGROUPED,
+   {"is_superuser", PGC_INTERNAL, PRESET_OPTIONS,

This comment should be rewritten or removed because the "Not for general
use" part is no longer true.


-   GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+   GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE


As Tom commented upthread, the GUC_NO_RESET_ALL flag should be removed
because it's not necessary when the PGC_INTERNAL context is used (i.e., in
this context, RESET ALL is prohibited by default).


With the patch, make check fails. You need to update
src/test/regress/expected/guc.out.


   
IS_SUPERUSER

 
  True if the current role has superuser privileges.
 

I found that the docs of the SET command have the above description of
is_superuser. This description seems useless if we document the
is_superuser GUC itself, so isn't it better to remove it?


Thank you for your review.
I created a new patch, add_document_is_superuser_v2.
Please check it.

Regards,
Kotaro Kawamoto

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a5cd4e44c7..6358f5e887 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -10584,6 +10584,19 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
+ 
+  is_superuser (boolean)
+  
+   is_superuser configuration parameter
+  
+  
+  
+   
+Shows whether the current user is a superuser or not.
+   
+  
+ 
+
  
   lc_collate (string)
   
diff --git a/doc/src/sgml/ref/show.sgml b/doc/src/sgml/ref/show.sgml
index b3747b119f..d8721be805 100644
--- a/doc/src/sgml/ref/show.sgml
+++ b/doc/src/sgml/ref/show.sgml
@@ -100,15 +100,6 @@ SHOW ALL
  
 

-
-   
-IS_SUPERUSER
-
- 
-  True if the current role has superuser privileges.
- 
-
-   
   
 

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c336698ad5..e49e6f9646 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1235,11 +1235,10 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
-		/* Not for general use --- used by SET SESSION AUTHORIZATION */
-		{"is_superuser", PGC_INTERNAL, UNGROUPED,
+		{"is_superuser", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether the current user is a superuser."),
 			NULL,
-			GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
 		},
 		&session_auth_is_superuser,
 		false,
@@ -4344,7 +4343,6 @@ static struct config_string ConfigureNamesString[] =
 	},
 
 	{
-		/* Not for general use --- used by SET SESSION AUTHORIZATION */
 		{"session_authorization", PGC_USERSET, UNGROUPED,
 			gettext_noop("Sets the session user name."),
 			NULL,


Re: Error "initial slot snapshot too large" in create replication slot

2022-09-13 Thread Kyotaro Horiguchi
At Tue, 13 Sep 2022 16:15:34 +0900 (JST), Kyotaro Horiguchi wrote:
> At Tue, 13 Sep 2022 15:45:07 +0900 (JST), Kyotaro Horiguchi wrote:
> > At Mon, 12 Sep 2022 14:51:56 -0700, Andres Freund wrote:
> > > This seems a tad misleading - the previous snapshot wasn't broken, right?
> > 
> > I saw it as kind of broken that ->xip contains subtransactions.  But I
> > didn't mean "broken" by the word "correct". Is "proper" suitable there?
> 
> No. It's not broken if it is takenDuringRecovery.  So this flag can be
> used to signal that xip can be oversized.
> 
> I realized that rbtxn_is_known_subxact is not reliable. I'm switching
> to the oversized-xip approach. Please wait for a while.

However, the reader of saved snapshots (ImportSnapshot) has the
restrictions that

>   if (xcnt < 0 || xcnt > GetMaxSnapshotXidCount())
>       ereport(ERROR,

and

>   if (xcnt < 0 || xcnt > GetMaxSnapshotSubxidCount())
>       ereport(ERROR,
    (the variable in this second check is actually subxcnt)

And ExportSnapshot() replaces an oversized subxip with the overflowed flag.

So even when GetSnapshotData() returns a snapshot with an oversized
subxip, it is saved as merely "overflowed" on export. I don't think this
is the expected behavior, since such a snapshot (no xip and overflowed)
no longer works.

On the other hand, it seems to me that snapbuild doesn't like
takenDuringRecovery snapshots.

So the snapshot needs an additional flag signalling that xip is oversized
and that all XIDs are held there. We also need to let takenDuringRecovery
suggest that subxip may be oversized.
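
Roughly, those two signals might look like this as SnapshotData fields
(the names are invented here for illustration, not from any patch):

	bool	xip_holds_all_xids;	/* xip contains top-level and sub XIDs and
								 * may exceed GetMaxSnapshotXidCount() */
	bool	subxip_oversized;	/* takenDuringRecovery: subxip was sized
								 * for XIDs plus subxids */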


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Fix broken link to FreeBSD DocProj in docs

2022-09-13 Thread Daniel Gustafsson
> On 12 Sep 2022, at 20:46, Daniel Gustafsson  wrote:
> 
>> On 12 Sep 2022, at 18:13, James Coleman  wrote:
> 
>> See attached simple patch to fix $SUBJECT; the old link generates a Not 
>> Found.
> 
> According to archive.org the freebsd.org site changed sometime in early 2021
> with a 301 redirect to docproj/docproj which then ends up with a 404.  I'll
> apply this back through v10 to get a working link and will report it to the
> FreeBSD web team. Thanks for the fix!

Committed and redirect bug reported to FreeBSD [0], thanks!

--
Daniel Gustafsson   https://vmware.com/

[0] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266393





Re: proposal: possibility to read dumped table's name from file

2022-09-13 Thread John Naylor
On Mon, Sep 12, 2022 at 8:10 PM Pavel Stehule  wrote:
>
> On Mon 12. 9. 2022 at 9:59, Daniel Gustafsson wrote:
>> I don't think the capabilities of the tool are all that interesting
>> compared to the long-term maintainability and readability of the source code.

With make distprep and maintainer-clean, plus separate makefile and MSVC
build logic, a short time before converting to Meson, I'm not sure that
even the short-term maintainability here is a good trade-off for what
we're getting.

> The parser in bison/flex does the same work, it is true, and the code is
> more readable. Although for this case, a hand-written parser was trivial
> too.

If the hand-written version is trivial, then we should prefer it.
--
John Naylor
EDB: http://www.enterprisedb.com




Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Nikita Malakhov
Hi!

Please correct me if I'm wrong: despite tuples being inserted and deleted
by the same transaction, they are visible inside the transaction and
usable by it, so considering them dead and cleaning them up during
execution is a bad idea until the transaction has ended.

On Tue, Sep 13, 2022 at 11:06 AM Laurenz Albe 
wrote:

> Shouldn't such tuples be considered dead right away, even if the inserting
> transaction is still active?  That would allow cleaning them up even before
> the transaction is done.
>
> There is this code in HeapTupleSatisfiesVacuumHorizon:
>
>         else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
>         {
>             [...]
>             /* inserted and then deleted by same xact */
>             if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tuple)))
>                 return HEAPTUPLE_DELETE_IN_PROGRESS;
>
> Why HEAPTUPLE_DELETE_IN_PROGRESS and not HEAPTUPLE_DEAD?
>
> Yours,
> Laurenz Albe
>
>
> --
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: Splitting up guc.c

2022-09-13 Thread Alvaro Herrera
On 2022-Sep-12, Tom Lane wrote:

> Yeah, I suspect people will have to manually reapply any changes in
> the GUC tables to guc_tables.c.  That'll be the same amount of work
> for them whenever we commit this patch (unless theirs lands first,
> in which case I have to deal with it).  The issue I think is
> whether it's politer to make that happen during a CF or between
> CFs.

Personally I would prefer that this kind of thing is done quickly rather
than delay it to some uncertain future.  That way I can deal with it
straight ahead rather than living with the anxiety that it will land
later and I will have to deal with it then.  I see no benefit in
waiting.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: broken table formatting in psql

2022-09-13 Thread John Naylor
On Thu, Sep 8, 2022 at 12:39 PM John Naylor wrote:
>
> On Fri, Sep 2, 2022 at 3:19 PM Kyotaro Horiguchi wrote:
> >
> > At Fri, 2 Sep 2022 13:43:50 +0700, John Naylor wrote:
> > > If there is any doubt about including all of Cf, we could also just
> > > add a branch in wchar.c to hard-code the 200B-200F range.
> >
> > If every way has defects to a similar extent, I think we will choose
> > to use the authoritative data, at least for the first step. We might
> > want additional filtering on top of it, but that would be another
> > issue, maybe.
> >
> > Attached is the first cut of that. (The commit message is not great,
> > though.)
>
> Okay, the patch looks good to me overall.

As discussed, I pushed to master only, with only one additional
comment in the perl script to describe Me/Mn/Cf.
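
For reference, the hard-coded branch mentioned in the quote above would
have amounted to something like this (function name and placement are
illustrative only, not actual wchar.c code):

	static int
	ucs_width_hardcoded(pg_wchar ucs)
	{
		if (ucs >= 0x200B && ucs <= 0x200F)
			return 0;	/* ZWSP, ZWNJ, ZWJ, LRM, RLM occupy no columns */

		return 1;		/* otherwise fall back to the normal lookup */
	}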

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: proposal: possibility to read dumped table's name from file

2022-09-13 Thread Pavel Stehule
On Tue 13. 9. 2022 at 10:46, John Naylor wrote:

> On Mon, Sep 12, 2022 at 8:10 PM Pavel Stehule wrote:
> >
> > On Mon 12. 9. 2022 at 9:59, Daniel Gustafsson wrote:
> >> I don't think the capabilities of the tool are all that interesting
> >> compared to the long-term maintainability and readability of the
> >> source code.
>
> With make distprep and maintainer-clean, plus separate makefile and MSVC
> build logic, a short time before converting to Meson, I'm not sure that
> even the short-term maintainability here is a good trade-off for what
> we're getting.
>
> > The parser in bison/flex does the same work, it is true, and the code is
> > more readable. Although for this case, a hand-written parser was trivial
> > too.
>
> If the hand-written version is trivial, then we should prefer it.
>

Please, can you check and compare both versions? My view is subjective.

Regards

Pavel




> --
> John Naylor
> EDB: http://www.enterprisedb.com
>


Re: Perform streaming logical transactions by background workers and parallel apply

2022-09-13 Thread Amit Kapila
On Fri, Sep 9, 2022 at 2:31 PM houzj.f...@fujitsu.com wrote:
>
> On Friday, September 9, 2022 3:02 PM Peter Smith wrote:
> >
>
> > 3.
> >
> > max_logical_replication_workers (integer)
> > Specifies maximum number of logical replication workers. This
> > includes apply leader workers, parallel apply workers, and table
> > synchronization workers.
> > Logical replication workers are taken from the pool defined by
> > max_worker_processes.
> > The default value is 4. This parameter can only be set at server start.
> >
> > ~
> >
> > I did not really understand why the default is 4. The default number of
> > tablesync workers is 2, and the default number of parallel workers is 2,
> > but what about accounting for the apply worker? Therefore, shouldn't
> > max_logical_replication_workers default to 5 instead of 4?
>
> Parallel apply is disabled by default, so it's not a must to increase
> this global default value, as discussed in [1].
>
> [1] 
> https://www.postgresql.org/message-id/CAD21AoCwaU8SqjmC7UkKWNjDg3Uz4FDGurMpis3zw5SEC%2B27jQ%40mail.gmail.com
>

Okay, but can we document that this value should be increased when
parallel apply is enabled?

-- 
With Regards,
Amit Kapila.




Re: START_REPLICATION SLOT causing a crash in an assert build

2022-09-13 Thread Kyotaro Horiguchi
Nice finding.

At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova wrote:
> and the problem seems to be that zeroing the stats also zeroes the name
> of the replication slot. This simple patch fixes it... not sure if it's
> the right fix though...

That doesn't work, since what that function clears is not the name in
the slot struct but the one in the stats entry.

The cause is that what pg_stat_reset_replslot() wants to do does not match
what the pgstat facility regards as a reset.

Unfortunately, we don't have a function that leaves a portion of the entry
alone after a reset. On the other hand, fetching the stats entry in
pgstat_reset_replslot() is ugly..

I'm not sure this is any less ugly, but it works if
pgstat_report_replslot() restores the name when it is found to have been
wiped out... But that halfway nullifies the protection of the assertion
just after.


--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -83,9 +83,11 @@ pgstat_report_replslot(ReplicationSlot *slot, const PgStat_StatReplSlotEntry *re
 	statent = &shstatent->stats;
 
 	/*
-	 * Any mismatch should have been fixed in pgstat_create_replslot() or
-	 * pgstat_acquire_replslot().
+	 * pgstat_create_replslot() and pgstat_acquire_replslot() enter the name,
+	 * but pgstat_reset_replslot() may clear it.
 	 */
+	if (statent->slotname.data[0] == 0)
+		namestrcpy(&shstatent->stats.slotname, NameStr(slot->data.name));
 	Assert(namestrcmp(&statent->slotname, NameStr(slot->data.name)) == 0);


Another measure would be to add, to PgStat_KindInfo, the region to wipe
out on reset, but that seems like too much.. (attached)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 6224c498c2..ab88ebbc64 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -263,6 +263,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Database),
.shared_data_off = offsetof(PgStatShared_Database, stats),
.shared_data_len = sizeof(((PgStatShared_Database *) 0)->stats),
+   .reset_off = offsetof(PgStatShared_Database, stats),
+   .reset_len = sizeof(((PgStatShared_Database *) 0)->stats),
.pending_size = sizeof(PgStat_StatDBEntry),
 
.flush_pending_cb = pgstat_database_flush_cb,
@@ -277,6 +279,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
.shared_data_len = sizeof(((PgStatShared_Relation *) 0)->stats),
+   .reset_off = offsetof(PgStatShared_Relation, stats),
+   .reset_len = sizeof(((PgStatShared_Relation *) 0)->stats),
.pending_size = sizeof(PgStat_TableStatus),
 
.flush_pending_cb = pgstat_relation_flush_cb,
@@ -291,6 +295,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
.shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats),
+   .reset_off = offsetof(PgStatShared_Function, stats),
+   .reset_len = sizeof(((PgStatShared_Function *) 0)->stats),
.pending_size = sizeof(PgStat_BackendFunctionEntry),
 
.flush_pending_cb = pgstat_function_flush_cb,
@@ -307,6 +313,9 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_ReplSlot),
.shared_data_off = offsetof(PgStatShared_ReplSlot, stats),
.shared_data_len = sizeof(((PgStatShared_ReplSlot *) 0)->stats),
+   .reset_off = offsetof(PgStatShared_ReplSlot, stats.spill_txns),
+   .reset_len = sizeof(PgStatShared_ReplSlot) -
+   offsetof(PgStatShared_ReplSlot, stats.spill_txns),
 
.reset_timestamp_cb = pgstat_replslot_reset_timestamp_cb,
.to_serialized_name = pgstat_replslot_to_serialized_name_cb,
@@ -323,6 +332,8 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Subscription),
.shared_data_off = offsetof(PgStatShared_Subscription, stats),
.shared_data_len = sizeof(((PgStatShared_Subscription *) 
0)->stats),
+   .reset_off = offsetof(PgStatShared_Subscription, stats),
+   .reset_len = sizeof(((PgStatShared_Subscription *) 0)->stats),
.pending_size = sizeof(PgStat_BackendSubEntry),
 
.flush_pending_cb = pgstat_subscription_flush_cb,
diff --git a/src/backend/utils/activity/pgstat_shmem

Re: [PATCH v1] fix potential memory leak in untransformRelOptions

2022-09-13 Thread Alvaro Herrera
On 2022-Sep-09, Alvaro Herrera wrote:

> Keeping 's' and removing the pstrdups better uses memory, because we
> have a single palloc'ed chunk per option rather than two.

Pushed.  This is pretty much cosmetic, so no backpatch.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"The Gord often wonders why people threaten never to come back after they've
been told never to return" (www.actsofgord.com)




Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Laurenz Albe
On Tue, 2022-09-13 at 11:47 +0300, Nikita Malakhov wrote:
> On Tue, Sep 13, 2022 at 11:06 AM Laurenz Albe  
> wrote:
> > Shouldn't such tuples be considered dead right away, even if the inserting
> > transaction is still active?  That would allow cleaning them up even before
> > the transaction is done.
> > 
> > There is this code in HeapTupleSatisfiesVacuumHorizon:
> > 
> >         else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
> >         {
> >             [...]
> >             /* inserted and then deleted by same xact */
> >             if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tuple)))
> >                 return HEAPTUPLE_DELETE_IN_PROGRESS;
> > 
> > Why HEAPTUPLE_DELETE_IN_PROGRESS and not HEAPTUPLE_DEAD?
>
> Please correct me if I'm wrong: despite tuples being inserted and deleted
> by the same transaction, they are visible inside the transaction and
> usable by it, so considering them dead and cleaning them up during
> execution is a bad idea until the transaction has ended.

But once they are deleted or updated, even the transaction that created
them cannot see them any more, right?

Yours,
Laurenz Albe




Re: pgsql: Prefetch data referenced by the WAL, take II.

2022-09-13 Thread Alvaro Herrera
On 2022-Sep-13, Thomas Munro wrote:

> I'd go for your second suggestion.  'therein' doesn't convey much (we
> already said 'in the WAL', and therein is just a slightly formal and
> somehow more Germanic way of saying 'in it' which is kinda
> duplicative; maybe 'by it' is what we want but that's still kinda
> implied already), whereas adding 'data' makes it slightly clearer that
> it's blocks from relations and not, say, the WAL itself.
> 
> Also, hmm, it's not really the 'Maximum buffer size', it's the 'Buffer size'.

Sounds reasonable.  Pushed these changes.  Thank you!

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La libertad es como el dinero; el que no la sabe emplear la pierde" (Alvarez)




Re: Perform streaming logical transactions by background workers and parallel apply

2022-09-13 Thread Amit Kapila
On Fri, Sep 9, 2022 at 12:32 PM Peter Smith  wrote:
>
> 29. src/backend/replication/logical/worker.c - TransactionApplyAction
>
> /*
>  * What action to take for the transaction.
>  *
>  * TA_APPLY_IN_LEADER_WORKER means that we are in the leader apply worker and
>  * changes of the transaction are applied directly in the worker.
>  *
>  * TA_SERIALIZE_TO_FILE means that we are in leader apply worker and changes
>  * are written to temporary files and then applied when the final commit
>  * arrives.
>  *
>  * TA_APPLY_IN_PARALLEL_WORKER means that we are in the parallel apply worker
>  * and changes of the transaction are applied directly in the worker.
>  *
>  * TA_SEND_TO_PARALLEL_WORKER means that we are in the leader apply worker and
>  * need to send the changes to the parallel apply worker.
>  */
> typedef enum
> {
> /* The action for non-streaming transactions. */
> TA_APPLY_IN_LEADER_WORKER,
>
> /* Actions for streaming transactions. */
> TA_SERIALIZE_TO_FILE,
> TA_APPLY_IN_PARALLEL_WORKER,
> TA_SEND_TO_PARALLEL_WORKER
> } TransactionApplyAction;
>
> ~
>
> 29a.
> I think if you change all those enum names slightly (e.g. like below)
> then they can be more self-explanatory:
>
> TA_NOT_STREAMING_LEADER_APPLY
> TA_STREAMING_LEADER_SERIALIZE
> TA_STREAMING_LEADER_SEND_TO_PARALLEL
> TA_STREAMING_PARALLEL_APPLY
>
> ~
>

I also think we can improve the naming, but adding "streaming" to the
names makes them slightly difficult to read. As you have suggested, it
will be better to add comments for the streaming and non-streaming cases.
How about naming them as below?

typedef enum
{
	TRANS_LEADER_APPLY,
	TRANS_LEADER_SERIALIZE,
	TRANS_LEADER_SEND_TO_PARALLEL,
	TRANS_PARALLEL_APPLY
} TransApplyAction;

-- 
With Regards,
Amit Kapila.




Re: Assertion failure in WaitForWALToBecomeAvailable state machine

2022-09-13 Thread Kyotaro Horiguchi
At Tue, 13 Sep 2022 11:56:16 +0530, Bharath Rupireddy wrote:
> On Tue, Sep 13, 2022 at 8:52 AM Noah Misch  wrote:
> >
> > > > > [1] - 
> > > > > https://www.postgresql.org/message-id/flat/20220909.172949.2223165886970819060.horikyota.ntt%40gmail.com
> >
> > I plan to use that message's patch, because it guarantees WALRCV_STOPPED at
> > the code location being changed.  Today, in the unlikely event of
> > !WalRcvStreaming() due to WALRCV_WAITING or WALRCV_STOPPING, that code
> > proceeds without waiting for WALRCV_STOPPED.
> 
> Hm. That was the original fix [2] proposed and it works. The concern

(Mmm. sorry for omitting that.)

> is that XLogShutdownWalRcv() does a bunch of work via ShutdownWalRcv()
> - it calls ConditionVariablePrepareToSleep(),

Anyway, that code path is executed in almost all cases, because otherwise
the same assertion fires. So I don't see a problem with doing the bunch of
synchronization things in that rare case as well.  I'm not sure we want
to do [3].
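
In other words (a sketch of the invariant only, mirroring the referenced
patch in spirit rather than quoting it):

	/*
	 * WalRcvStreaming() being false does not imply WALRCV_STOPPED: the
	 * walreceiver may still be in WALRCV_WAITING or WALRCV_STOPPING, so
	 * wait for the final state instead of proceeding.
	 */
	if (!WalRcvStreaming())
		XLogShutdownWalRcv();	/* returns once the state is WALRCV_STOPPED */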

> > If WALRCV_WAITING or WALRCV_STOPPING can happen at that patch's code site, I
> > perhaps should back-patch the change to released versions.  Does anyone know
> > whether one or both can happen?
> 
> IMO, we must back-patch to the version where
> cc2c7d65fc27e877c9f407587b0b92d46cd6dd16  got introduced irrespective
> of any of the above happening.

That is, PG15? I agree to that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Estimating bloat for very large tables: what is the state of art?

2022-09-13 Thread Dmitry Astapov
Hi!

I am trying to solve the problem of estimating the table bloat (and index
bloat, though I am mostly focusing on tables at the moment).

After searching far and wide, it seems that the choice is to be made
between two methods:
1. The slow, but very precise, pgstattuple.
2. The fast, but somewhat imprecise, "bloat query", which is attributed to
the check_postgres project, though numerous variations of it are in
existence.

pgstattuple is beautiful and accurate but rather slow. If tables are large,
pgstattuple_approx can easily take 5-10 minutes, and in that case you can
easily see pgstattuple take 30-60 minutes on the same table.

"Bloat query", on the other hand, is wonderfully fast, but rather
imprecise. It tries to estimate the table data size as pg_class.reltuples *
row_width, where row_width is taken, roughly, to be (24 bytes for the
header + size of NULL map + (sum( (1 - null_frac)*avg_width ) for all
columns in the table, as reported by pg_statistics)).

This, of course, completely ignores the question of padding, and so on
tables with a large number of columns the query tends to underestimate the
size of the live data by some 10-20% (unless the schema was explicitly
created to minimize padding).
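
To illustrate how alignment enters the picture, here is a toy backend-style
sketch (the column list is invented) that accumulates width the way the
tuple-forming code aligns attributes:

	#include "postgres.h"
	#include "access/htup_details.h"	/* SizeofHeapTupleHeader */
	#include "access/tupmacs.h"			/* att_align_nominal */

	/* Estimate the padded width of a row of (int4, int8, int2). */
	static Size
	toy_row_width(void)
	{
		struct { int16 attlen; char attalign; } cols[] = {
			{4, 'i'},	/* int4 */
			{8, 'd'},	/* int8: needs 4 padding bytes after the int4 */
			{2, 's'},	/* int2 */
		};
		Size		width = MAXALIGN(SizeofHeapTupleHeader);

		for (int i = 0; i < lengthof(cols); i++)
		{
			width = att_align_nominal(width, cols[i].attalign);
			width += cols[i].attlen;
		}
		return MAXALIGN(width);
	}

Summing the raw attlen values gives roughly 38 bytes per row including the
header, while the aligned walk above gives 48, which is exactly the kind of
gap described in the previous paragraph.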

I'd like to ask you:
1. Are these indeed two approaches the only options on the table, or am I
missing something?

2. I am considering my own approach where, after looking at pg_attribute
and pg_stats, I construct "an example row from this table with no
NULLs" (so, maximum amount of data + maximum amount of padding) and "an
example row from the table with all NULLs" (so, as little padding as
possible), call pg_column_size() on both rows (so that pg_column_size()
computes size + padding for me), and then take an average between them,
perhaps weighted somehow by examining the null_frac of the table's columns.
Quick experiments show that this yields a more accurate estimate of row
size for tables with large numbers of columns than what the "bloat query"
does.  Question: can I do anything better/easier here without sacrificing
speed?

-- 
D. Astapov


Permissions denied for the database file system on Windows during restore

2022-09-13 Thread Joel Mariadasan (jomariad)
Hi,

When I try to restore an old backup on Windows, the restore always fails
with a message saying permission was denied for "base/19513/21359".

Note that the file name reported by Postgres varies. When I check the
properties of such a file, it doesn't have any owner at all; Windows says
that no one owns the file. This only happens to certain files in the base
directory.

For these reasons, the restore operation always fails.

Queries:

  1.  How does Postgres, in conjunction with Windows, assign ownership of
the database files after a restore operation?
  2.  Are there any guesses as to why the issue might have occurred?

Regards,
Joel


Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Pantelis Theodosiou
On Tue, Sep 13, 2022 at 11:04 AM Laurenz Albe  wrote:
>
> On Tue, 2022-09-13 at 11:47 +0300, Nikita Malakhov wrote:
> > On Tue, Sep 13, 2022 at 11:06 AM Laurenz Albe  
> > wrote:
> > > Shouldn't such tuples be considered dead right away, even if the inserting
> > > transaction is still active?  That would allow cleaning them up even 
> > > before
> > > the transaction is done.
> > >
> > > There is this code in HeapTupleSatisfiesVacuumHorizon:
> > >
> > >         else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
> > >         {
> > >             [...]
> > >             /* inserted and then deleted by same xact */
> > >             if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tuple)))
> > >                 return HEAPTUPLE_DELETE_IN_PROGRESS;
> > >
> > > Why HEAPTUPLE_DELETE_IN_PROGRESS and not HEAPTUPLE_DEAD?
> >
> > Please correct me if I'm wrong: despite tuples being inserted and deleted
> > by the same transaction, they are visible inside the transaction and
> > usable by it, so considering them dead and cleaning them up during
> > execution is a bad idea until the transaction has ended.
>
> But once they are deleted or updated, even the transaction that created
> them cannot see them any more, right?

Forgive me if this is not related, but if there is a savepoint between
the insertion and the deletion, wouldn't it be possible for the
transaction to recover the deleted tuples?

Best regards
Pantelis Theodosiou




Re: Perform streaming logical transactions by background workers and parallel apply

2022-09-13 Thread Amit Kapila
On Mon, Sep 12, 2022 at 4:27 PM kuroda.hay...@fujitsu.com wrote:
>
> Dear Hou-san,
>
> Thank you for updating the patch! The following are comments for v28-0001.
> I will dig into your patch more, but I am sending this partial review to
> keep the thread active.
>
> ===
> For applyparallelworker.c
>
> 01. filename
> The word ordering of the filename seems not good,
> because you defined the new worker as a "parallel apply worker".
>

I think in the future we may have more files for apply work (like
applyddl.c for DDL apply work), so it seems okay to name all apply
related files in a similar way.

>
> ===
> For worker.c
>
> 07. general
>
> In many lines an if-else statement is used for apply_action, but I think
> they should be rewritten as a switch-case statement.
>

Sounds reasonable to me.
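
For illustration, the suggested switch-case shape, using the enum names
proposed in the naming discussion elsewhere in this digest (the case
bodies are placeholders, not patch code):

	switch (apply_action)
	{
		case TRANS_LEADER_APPLY:
			/* non-streamed: apply the change directly in the leader */
			break;
		case TRANS_LEADER_SERIALIZE:
			/* streamed: write the change to a temporary file */
			break;
		case TRANS_LEADER_SEND_TO_PARALLEL:
			/* streamed: ship the change to a parallel apply worker */
			break;
		case TRANS_PARALLEL_APPLY:
			/* we are the parallel apply worker: apply directly */
			break;
		default:
			elog(ERROR, "unexpected apply action %d", (int) apply_action);
			break;
	}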

> 08. global variable
>
> ```
> -static bool in_streamed_transaction = false;
> +bool in_streamed_transaction = false;
> ```
>
> a.
>
> It seems that in_streamed_transaction is used only in worker.c, so we can
> change it to a static variable.
>

Yeah, I don't know why it has been changed in the first place.

> b.
>
> That flag is set only when an apply worker spills the transaction to disk.
> How about "in_streamed_transaction" -> "in_spilled_transaction"?
>

Isn't this an existing variable? If so, it doesn't seem like a good
idea to change the name unless we are changing its meaning.

-- 
With Regards,
Amit Kapila.




Re: minimum perl version

2022-09-13 Thread John Naylor
On Sat, Sep 3, 2022 at 2:11 AM Tom Lane  wrote:
>
> Andres Freund  writes:
> > 5.14 would be a trivial lift as far as the buildfarm is concerned.
>
> Yeah, that seems like a reasonable new minimum for Perl.  I might
> see about setting up an animal running 5.14.0, just so we can say
> "5.14" in the docs without fine print.

Until such time as that happens, here is a draft to require 5.14.2.

-- 
John Naylor
EDB: http://www.enterprisedb.com
 config/perl.m4   |  4 ++--
 configure|  6 +++---
 doc/src/sgml/install-windows.sgml|  2 +-
 doc/src/sgml/installation.sgml   |  4 ++--
 src/pl/plperl/plc_perlboot.pl|  1 -
 src/test/perl/PostgreSQL/Test/Cluster.pm |  2 +-
 src/test/perl/README | 10 +++---
 src/tools/msvc/gendef.pl |  1 -
 src/tools/pgindent/pgindent  |  1 -
 9 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/config/perl.m4 b/config/perl.m4
index c9fd91397c..29f54bbb79 100644
--- a/config/perl.m4
+++ b/config/perl.m4
@@ -11,11 +11,11 @@ if test "$PERL"; then
   pgac_perl_version=`$PERL -v 2>/dev/null | sed -n ['s/This is perl.*v[a-z ]*\([0-9]\.[0-9][0-9.]*\).*$/\1/p']`
   AC_MSG_NOTICE([using perl $pgac_perl_version])
   if echo "$pgac_perl_version" | sed ['s/[.a-z_]/ /g'] | \
-$AWK '{ if ([$]1 == 5 && ([$]2 > 8 || ($[2] == 8 && [$]3 >= 3))) exit 1; else exit 0;}'
+$AWK '{ if ([$]1 == 5 && ([$]2 > 14 || ($[2] == 14 && [$]3 >= 2))) exit 1; else exit 0;}'
   then
 AC_MSG_WARN([
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
-*** Perl version 5.8.3 or later is required, but this is $pgac_perl_version.])
+*** Perl version 5.14.2 or later is required, but this is $pgac_perl_version.])
 PERL=""
   fi
 fi
diff --git a/configure b/configure
index fd2a782454..160c181441 100755
--- a/configure
+++ b/configure
@@ -10491,14 +10491,14 @@ if test "$PERL"; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: using perl $pgac_perl_version" >&5
 $as_echo "$as_me: using perl $pgac_perl_version" >&6;}
   if echo "$pgac_perl_version" | sed 's/[.a-z_]/ /g' | \
-$AWK '{ if ($1 == 5 && ($2 > 8 || ($2 == 8 && $3 >= 3))) exit 1; else exit 0;}'
+$AWK '{ if ($1 == 5 && ($2 > 14 || ($2 == 14 && $3 >= 2))) exit 1; else exit 0;}'
   then
 { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
-*** Perl version 5.8.3 or later is required, but this is $pgac_perl_version." >&5
+*** Perl version 5.14.2 or later is required, but this is $pgac_perl_version." >&5
 $as_echo "$as_me: WARNING:
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
-*** Perl version 5.8.3 or later is required, but this is $pgac_perl_version." >&2;}
+*** Perl version 5.14.2 or later is required, but this is $pgac_perl_version." >&2;}
 PERL=""
   fi
 fi
diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index c00ab2b4b2..2e41d75db0 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -190,7 +190,7 @@ $ENV{MSBFLAGS}="/m";
   or Cygwin Perl will not work. It must also be present in the PATH.
   Binaries can be downloaded from
   https://www.activestate.com";>
-  (Note: version 5.8.3 or later is required,
+  (Note: version 5.14.2 or later is required,
   the free Standard Distribution is sufficient).
  
 
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 7c79608e55..5d7c573729 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -165,7 +165,7 @@ su - postgres
   PL/Perl you need a full
   Perl installation, including the
   libperl library and the header files.
-  The minimum required version is Perl 5.8.3.
+  The minimum required version is Perl 5.14.2.
   Since PL/Perl will be a shared
   library, the libperl
   libperl library must be a shared library
@@ -325,7 +325,7 @@ su - postgres
perl
   
 
-  Perl 5.8.3 or later is needed to build from a Git checkout,
+  Perl 5.14.2 or later is needed to build from a Git checkout,
   or if you changed the input files for any of the build steps that
   use Perl scripts.  If building on Windows you will need
   Perl in any case.  Perl is
diff --git a/src/pl/plperl/plc_perlboot.pl b/src/pl/plperl/plc_perlboot.pl
index 8fd7f998bc..72cb53f6e3 100644
--- a/src/pl/plperl/plc_perlboot.pl
+++ b/src/pl/plperl/plc_perlboot.pl
@@ -6,7 +6,6 @@
 use strict;
 use warnings;
 
-use 5.008001;
 use vars qw(%_SHARED $_TD);
 
 PostgreSQL::InServer::Util::bootstrap();
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 27fa607da4..d678f65d88 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2780,7 +2780,7 @@ all value

Re: SUBTRANS: Minimizing calls to SubTransSetParent()

2022-09-13 Thread Simon Riggs
On Tue, 6 Sept 2022 at 13:14, Simon Riggs  wrote:

> I will update the patch, thanks for your scrutiny.

I attach a diff showing what has changed between v8 and v9, and will
reattach a full set of new patches in the next post, so patchtester
doesn't squeal.

-- 
Simon Riggs    http://www.EnterpriseDB.com/


subx.v8--v9.diff
Description: Binary data


Re: SUBTRANS: Minimizing calls to SubTransSetParent()

2022-09-13 Thread Simon Riggs
On Tue, 13 Sept 2022 at 11:56, Simon Riggs  wrote:
>
> On Tue, 6 Sept 2022 at 13:14, Simon Riggs wrote:
>
> > I will update the patch, thanks for your scrutiny.
>
> I attach a diff showing what has changed between v8 and v9, and will
> reattach a full set of new patches in the next post, so patchtester
> doesn't squeal.

Full set of v9 patches

-- 
Simon Riggs    http://www.EnterpriseDB.com/


001_new_isolation_tests_for_subxids.v3.patch
Description: Binary data


002_minimize_calls_to_SubTransSetParent.v9.patch
Description: Binary data


Re: minimum perl version

2022-09-13 Thread John Naylor
On Tue, Sep 13, 2022 at 5:53 PM John Naylor wrote:
>
> Until such time as that happens, here is a draft to require 5.14.2.

As soon as I hit send, it occurred to me that we don't check the perl
version on Windows, since (I seem to recall) 5.8.3 was too old to be
an option on that platform. We'll have to add a new check somewhere.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: First draft of the PG 15 release notes

2022-09-13 Thread Alvaro Herrera
On 2022-Sep-12, Jonathan S. Katz wrote:

> +
> + 
> +  Column-level and row-level filtering on
> +  logical replication
> +  publications.
> + 
> +

-column-level filtering
+the ability to specify column lists

> +  Row-level filtering and the ability to specify column lists on
> +  logical replication
> +  publications.


-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No tengo por qué estar de acuerdo con lo que pienso"
 (Carlos Caszeli)




Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Matthias van de Meent
On Tue, 13 Sep 2022, 12:04 Laurenz Albe,  wrote:
>
> On Tue, 2022-09-13 at 11:47 +0300, Nikita Malakhov wrote:
>> Please correct me if I'm wrong: despite tuples being inserted and deleted
>> by the same transaction, they are visible inside the transaction and
>> usable by it, so considering them dead and cleaning them up during
>> execution is a bad idea until the transaction has ended.
>
> But once they are deleted or updated, even the transaction that created
> them cannot see them any more, right?


Not quite. The command that is deleting the tuple might still be running,
and because deletions only become "visible" to statements at the end of
the delete operation, that command may still need to see the deleted tuple
(example: DELETE FROM tab t WHERE t.randnum > (SELECT count(*) FROM tab);
that count(*) must not change during the delete operation).

So in order to mark such a tuple all_dead, you need proof that the
deleting statement has finished executing. I can think of two ways to do
that: either the commit/abort of that transaction (this would be
similarly expensive to the normal commit lookup), or (e.g.) the
existence of another tuple with the same XID but a newer CID.
That last one would not be impossible, but is probably not worth the
extra cost of command-id tracking.

Kind regards,

Matthias van de Meent




Re: allowing for control over SET ROLE

2022-09-13 Thread Robert Haas
On Mon, Sep 12, 2022 at 11:41 AM Peter Eisentraut wrote:
> I think this is because we have (erroneously) made SET ROLE be the
> same as SET SESSION AUTHORIZATION.  If those two were separate (i.e.,
> there were a current user and a separate current role, as in the SQL
> standard), then this would be more straightforward.
>
> I don't know if it's possible to untangle that at this point.

I think that it already works as you describe:

rhaas=# create role foo;
CREATE ROLE
rhaas=# create role bar;
CREATE ROLE
rhaas=# grant bar to foo;
GRANT ROLE
rhaas=# set session authorization foo;
SET
rhaas=> set role bar;
SET
rhaas=> select current_user;
 current_user
--
 bar
(1 row)

rhaas=> select session_user;
 session_user
--
 foo
(1 row)

There may well be problems here, but this example shows that the
current_user and session_user concepts are different in PostgreSQL.
It's also true that the privileges required to execute the commands
are different: SET SESSION AUTHORIZATION requires that the session
user is a superuser, and SET ROLE requires that the identity
established via SET SESSION AUTHORIZATION has the target role granted
to it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix broken link to FreeBSD DocProj in docs

2022-09-13 Thread James Coleman
On Tue, Sep 13, 2022 at 4:43 AM Daniel Gustafsson  wrote:
>
> > On 12 Sep 2022, at 20:46, Daniel Gustafsson  wrote:
> >
> >> On 12 Sep 2022, at 18:13, James Coleman  wrote:
> >
> >> See attached simple patch to fix $SUBJECT; the old link generates a Not 
> >> Found.
> >
> > According to archive.org the freebsd.org site changed sometime in early 2021
> > with a 301 redirect to docproj/docproj which then ends up with a 404.  I'll
> > apply this back through v10 to get a working link and will report it to the
> > FreeBSD web team. Thanks for the fix!
>
> Committed and redirect bug reported to FreeBSD [0], thanks!
>
> --
> Daniel Gustafsson   https://vmware.com/
>
> [0] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266393

Thanks for getting this so quickly!

James Coleman




Re: [POC] Allow flattening of subquery with a link to upper query

2022-09-13 Thread Andrey Lepikhov

On 5/9/2022 12:22, Richard Guo wrote:


On Fri, Sep 2, 2022 at 7:09 PM Andrey Lepikhov wrote:

 > Hmm, I'm not sure this patch works correctly in all cases. It seems to
 > me this patch pulls up the subquery without checking the constraints
 > imposed by lateral references. If its quals contain any lateral
 > references to rels outside a higher outer join, we would need to
 > postpone quals from below an outer join to above it, which is probably
 > incorrect.

Yeah, it's not an easy problem to solve. If I understand the code
correctly, to fix this problem we must implement the same logic as
pull_up_subqueries() (lowest_outer_join/safe_upper_varnos).


Yeah, I think we'd have to consider the restrictions from lateral
references to guarantee correctness when we pull up subqueries. We need
to avoid the situation where quals need to be postponed past outer join.

However, even if we have taken care of that, there may be other issues
with flattening direct-correlated ANY SubLink. The constraints imposed
by LATERAL references may make it impossible for us to find any legal
join orders, as discussed in [1].
To resolve both issues, the lowest outer join is passed through the
pull_sublinks_* machinery into the flattening routine (see attachment).

I've added these cases to subselect.sql.

--
regards,
Andrey Lepikhov
Postgres Professional
From 883f9586acb5482bb88d5ca5123195c0efdd9cc7 Mon Sep 17 00:00:00 2001
From: Andrey Lepikhov 
Date: Mon, 5 Sep 2022 16:40:48 +0500
Subject: [PATCH] Pass a lower outer join through the pullup_sublink recursive
 procedure to find out some situations when qual references outer side of an
 outer join.

---
 src/backend/optimizer/plan/subselect.c| 19 --
 src/backend/optimizer/prep/prepjointree.c | 71 ++-
 src/include/optimizer/subselect.h |  3 +-
 src/test/regress/expected/subselect.out   | 52 +
 src/test/regress/sql/subselect.sql| 18 ++
 5 files changed, 131 insertions(+), 32 deletions(-)

diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 0e7ddc4a4e..29da43bb4d 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1442,7 +1442,8 @@ pull_subquery_clauses_mutator(Node *node, correlated_t *ctx)
 }
 
 static List *
-pull_correlated_clauses(PlannerInfo *root, SubLink *sublink)
+pull_correlated_clauses(PlannerInfo *root, SubLink *sublink,
+   JoinExpr *lowest_outer_join)
 {
Query  *parse = root->parse;
Query  *subselect = (Query *) sublink->subselect;
@@ -1451,6 +1452,7 @@ pull_correlated_clauses(PlannerInfo *root, SubLink 
*sublink)
   .newvarno = 
list_length(parse->rtable) + 1, /* Looks like a hack */
   .pulling_quals = NIL,
   .varlevel_up = false};
+   Relids  safe_upper_varnos = NULL;
 
Assert(IsA(subselect, Query));
 
@@ -1489,7 +1491,16 @@ pull_correlated_clauses(PlannerInfo *root, SubLink 
*sublink)
f = subselect->jointree;
 
if (!f || !f->quals || !quals_is_pullable(f->quals))
-   /* Degenerate case */
+   return NULL;
+
+   if (lowest_outer_join)
+   safe_upper_varnos = get_relids_in_jointree(
+   (lowest_outer_join->jointype == JOIN_RIGHT) ?
+   lowest_outer_join->larg : 
lowest_outer_join->rarg, true);
+
+   if (safe_upper_varnos &&
+   !bms_is_subset(pull_varnos_of_level(root, f->quals, 1),
+  safe_upper_varnos))
return NULL;
 
/*
@@ -1540,7 +1551,7 @@ pull_correlated_clauses(PlannerInfo *root, SubLink 
*sublink)
  */
 JoinExpr *
 convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,
-   Relids available_rels)
+   Relids available_rels, 
JoinExpr *lowest_outer_join)
 {
JoinExpr   *result;
Query  *parse = root->parse;
@@ -1586,7 +1597,7 @@ convert_ANY_sublink_to_join(PlannerInfo *root, SubLink 
*sublink,
 * pulled_clauses list, their places in the quals replaces by "true" 
value.
 */
if (contain_vars_of_level((Node *) subselect, 1) &&
-   (pclauses = pull_correlated_clauses(root, sublink)) == NIL)
+   (pclauses = pull_correlated_clauses(root, sublink, 
lowest_outer_join)) == NIL)
return NULL;
 
/* Create a dummy ParseState for addRangeTableEntryForSubquery */
diff --git a/src/backend/optimizer/prep/prepjointree.c 
b/src/backend/optimizer/prep/prepjointree.c
index 41c7066d90..5458230c81 100644
--- a/src/backend/optimizer/pre

RE: Perform streaming logical transactions by background workers and parallel apply

2022-09-13 Thread kuroda.hay...@fujitsu.com
Dear Hou-san,

> I will dig into your patch more, but I am sending a partial review now to
> keep the thread active.

More minor comments about v28.

===
About 0002 

For 015_stream.pl

14. check_parallel_log

```
+# Check the log that the streamed transaction was completed successfully
+# reported by parallel apply worker.
+sub check_parallel_log
+{
+   my ($node_subscriber, $offset, $is_parallel)= @_;
+   my $parallel_message = 'finished processing the transaction finish 
command';
+
+   if ($is_parallel)
+   {
+   $node_subscriber->wait_for_log(qr/$parallel_message/, $offset);
+   }
+}
```

I think check_parallel_log() should be called only when streaming = 'parallel',
so the if-statement inside it is not needed.
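
A sketch of the suggested simplification, with a hypothetical call site
(the $streaming_mode variable is illustrative, not from the patch):

# Wait for the parallel apply worker's log line directly at the call
# site, only when the subscription uses streaming = 'parallel'.
if ($streaming_mode eq 'parallel')
{
	$node_subscriber->wait_for_log(
		qr/finished processing the transaction finish command/, $offset);
}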

===
For 016_stream_subxact.pl

15. test_streaming

```
+   INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 
 500) s(i);
```

"3" should be "3".

===
About 0003

For applyparallelworker.c

16. parallel_apply_relation_check()

```
+   if (rel->parallel_apply_safe == PARALLEL_APPLY_SAFETY_UNKNOWN)
+   logicalrep_rel_mark_parallel_apply(rel);
```

It was not clear to me when logicalrep_rel_mark_parallel_apply() is called here.
IIUC parallel_apply_relation_check() is called when the parallel apply worker
handles changes, but before that the relation is opened via
logicalrep_rel_open() and parallel_apply_safe is set there.
If this only guards against a protocol violation, we could use Assert() instead.

===
For create_subscription.sgml

17.
The restriction about foreign keys does not seem to be documented.

===
About 0004

For 015_stream.pl

18. check_parallel_log

I heard that the removal has been reverted, but in the patch
check_parallel_log() is removed again... :-(


===
About throughout

I checked the test coverage via `make coverage`. For applyparallelworker.c
and worker.c, function coverage is 100% for both, and line coverage is
86.2% and 94.5% respectively. Generally that's good.
But reading the report, the following parts seem not to be tested.

In parallel_apply_start_worker():

```
if (tmp_winfo->error_mq_handle == NULL)
{
/*
 * Release the worker information and try next one if 
the parallel
 * apply worker exited cleanly.
 */
ParallelApplyWorkersList = 
foreach_delete_current(ParallelApplyWorkersList, lc);
shm_mq_detach(tmp_winfo->mq_handle);
dsm_detach(tmp_winfo->dsm_seg);
pfree(tmp_winfo);
}
```

In HandleParallelApplyMessage():

```
case 'X':   /* Terminate, 
indicating clean exit */
{
shm_mq_detach(winfo->error_mq_handle);
winfo->error_mq_handle = NULL;
break;
}
```

Does this mean that we do not test the termination of the parallel apply
worker? If so, I think it should be tested.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Perform streaming logical transactions by background workers and parallel apply

2022-09-13 Thread kuroda.hay...@fujitsu.com
Hi,

> > 01. filename
> > The word-ordering of filename seems not good
> > because you defined the new worker as "parallel apply worker".
> >
> 
> I think in the future we may have more files for apply work (like
> applyddl.c for DDL apply work), so it seems okay to name all apply
> related files in a similar way.

> > That flag is set only when an apply worker spills the transaction to
> > disk.
> > How about "in_streamed_transaction" -> "in_spilled_transaction"?
> >
> 
> Isn't this an existing variable? If so, it doesn't seem like a good
> idea to change the name unless we are changing its meaning.

What both of you said is reasonable. They do not have to be modified.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: pgsql: Doc: Explain about Column List feature.

2022-09-13 Thread Alvaro Herrera
On 2022-Sep-07, Amit Kapila wrote:

> Doc: Explain about Column List feature.
> 
> Add a new logical replication section for "Column Lists" (analogous to the
> Row Filters page). This explains how the feature can be used and the
> caveats in it.
> 
> Author: Peter Smith
> Reviewed-by: Shi yu, Vignesh C, Erik Rijkers, Amit Kapila
> Backpatch-through: 15, where it was introduced
> Discussion: 
> https://postgr.es/m/CAHut+PvOuc9=_4TbASc5=vuqh16uwtfo3gzckqk_5m1hrw3...@mail.gmail.com

Hi

I just read these docs and noticed that they mention that column lists
can be used for security.  As far as I remember, this is wrong: it is
the subscriber that builds the query to read column data during initial
sync, and the publisher doesn't forbid reading columns not in the list,
so it is entirely possible for a malicious subscriber to read columns
other than those published.  I'm pretty sure we discussed this at some
point during development of the feature.

So I suggest to mention this point explicitly in its own paragraph, to
avoid giving a false sense of security.

While going over this text I found some additional things that could
--in my opinion-- stand some improvement:

* It feels better to start the section saying that a list can be
  specified; subscriber must have all those columns; omitting list
  means to publish everything. That leads to shorter text (no need to
  say "you need to have them all, oh wait you might only have a few").

* there's no reason to explain the syntax in vague terms and refer the
  reader to the reference page.

* The first few  seem to give no useful structure, and instead
  cause the text to become disorganized. I propose to remove them, and
  instead mix the paragraphs in them so that we explain the rules and
  the behavior, and lastly the effect on specific commands.

The attached patch effects those changes.


One more thing. There's a sect2 about combining column lists.  Part of it
seems pretty judgmental and I see no reason to have it in there; I
propose to remove it (it's not in this patch).  I think we should just
say it doesn't work at present, here's how to work around it, and
perhaps even say that we may lift the restriction in the future.  The
paragraph that starts with "Background:" is IMO out of place, and it
repeats the mistake that column lists are for security.


Lastly: In the create-publication reference page I think it would be
better to reword the Parameters section a bit.  It mentions
FOR TABLE as a parameter, but the parameter is actually
table_name; and the row-filter and
column-list explanations are also put in there when they should be in
their own expression and column_name
varlistentries.  I think splitting things that way would result in a
clearer explanation.
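
For reference, a column list is attached per table in the publication
definition; a minimal illustrative example (the table and column names
here are made up):

-- replicate only columns a and b of table t through this publication
CREATE PUBLICATION pub_t FOR TABLE t (a, b);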

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 0ab191e402..3ac6336be2 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1093,70 +1093,55 @@ test_sub=# SELECT * FROM child ORDER BY a;
   Column Lists
 
   
-   By default, all columns of a published table will be replicated to the
-   appropriate subscribers. The subscriber table must have at least all the
-   columns of the published table. However, if a
-   column list is specified then only the columns named
-   in the list will be replicated. This means the subscriber-side table only
-   needs to have those columns named by the column list. A user might choose to
-   use column lists for behavioral, security or performance reasons.
+   Each publication can optionally specify which columns of each table are
+   replicated to subscribers. The table on the subscriber side must have at
+   least all the columns that are published. If no column list is specified,
+   then all columns in the publisher are replicated.
+   See  for details on the syntax.
   
 
-  
-   Column List Rules
+  
+   The choice of columns can be based on behavioral or performance reasons.
+   However, do not rely on this feature for security: a malicious subscriber
+   is able to obtain data from columns that are not specifically
+   published.  If security is a consideration, protections can be applied
+   at the publisher side.
+  
 
-   
-A column list is specified per table following the table name, and enclosed
-by parentheses. See  for details.
-   
+  
+   If no column list is specified, all columns of the table are replicated
+   through this publication, including any columns added later. This means
+   that having a column list which names all columns is not the same as having
+   no column list at all: if columns are added to the table afterwards, those
+   will not be replicated.
+  
 
-   
-When specifying a column list, the order of columns is not important. If no
-column list is specified, all columns of the table are replicated through
-this publication, including any colum

Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Matthias van de Meent
On Tue, 13 Sept 2022 at 12:40, Pantelis Theodosiou  wrote:
>
> Forgive me if this is not related but if there is a savepoint between
> the insertion and deletion, wouldn't it be possible for the
> transaction to recover the deleted tuples?

Savepoints result in changed TransactionIds (well, subtransactions
with their own ids), so if a tuple was created before a savepoint and
deleted after, the values in xmin and xmax would be different, as you
can see in the following:

matthias=> CREATE TABLE tst(i int);
matthias=> BEGIN; INSERT INTO tst VALUES (1); SAVEPOINT s1; DELETE
FROM tst; ROLLBACK TO SAVEPOINT s1;
CREATE TABLE
BEGIN
INSERT 0 1
SAVEPOINT
DELETE 1
ROLLBACK
matthias=*> SELECT xmin, xmax FROM tst;
 xmin  | xmax
---+---
 62468 | 62469
(1 row)

Note that this row has different xmin/xmax from being created and
deleted in different subtransactions. This means that this needs no
specific handling in the HTSVH code that Laurenz asked about.


Kind regards,

Matthias van de Meent




Re: ICU for global collation

2022-09-13 Thread Peter Eisentraut

On 13.09.22 07:34, Marina Polyakova wrote:
I agree with you that it is more comfortable and more similar to what 
has already been done in initdb. IMO it would be easier to do it like this:


diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index 
e523e58b2189275dc603a06324a2f28b0f49d8b7..a1482df3d981a680dd3322052e7c03ddacc8dc26 100644

--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -161,12 +161,10 @@ main(int argc, char *argv[])

  if (locale)
  {
-    if (lc_ctype)
-    pg_fatal("only one of --locale and --lc-ctype can be 
specified");

-    if (lc_collate)
-    pg_fatal("only one of --locale and --lc-collate can be 
specified");

-    lc_ctype = locale;
-    lc_collate = locale;
+    if (!lc_ctype)
+    lc_ctype = locale;
+    if (!lc_collate)
+    lc_collate = locale;
  }

  if (encoding)


done that way

Should we change the behaviour of createdb and CREATE DATABASE in 
previous major versions?..


I don't see a need for that.


BTW it's somewhat crummy that it uses a string comparison, so if you
write "UTF8" without a dash, it says this; it took me a few minutes to
see the difference...

postgres=# create database a LC_COLLATE "en_US.UTF8" LC_CTYPE
"en_US.UTF8" LOCALE "en_US.UTF8";
ERROR:  new collation (en_US.UTF8) is incompatible with the collation
of the template database (en_US.UTF-8)


Perhaps we could check the locale itself with the function 
normalize_libc_locale_name (collationcmds.c). But ISTM that the current 
check is a safety net in case the function pg_get_encoding_from_locale 
(chklocale.c) returns -1 or PG_SQL_ASCII...


This is not new behavior in PG15, is it?





Avoid redundant initialization and possible memory leak (src/backend/parser/parse_relation.c)

2022-09-13 Thread Ranier Vilela
Hi,

The commit: Revert SQL/JSON features
https://github.com/postgres/postgres/commit/2f2b18bd3f554e96a8cc885b177211be12288e4a

It left a little oversight behind.
I believe it will be properly corrected when the feature is applied again.
However, for Postgres 15 this may cause a small memory leak.

Attached a fix patch.

regards,
Ranier Vilela


avoid-redundant-initialization-and-possible-memory-leak.patch
Description: Binary data


Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Tom Lane
Laurenz Albe  writes:
> But once they are deleted or updated, even the transaction that created them 
> cannot
> see them any more, right?

I would not trust that claim very far.  The transaction might have active
snapshots with a command ID between the times of insertion and deletion.
(Consider a query that is firing triggers as it goes, and the triggers
are performing new actions that cause the command counter to advance.
The outer query should not see the results of those actions.)

regards, tom lane




Re: is_superuser is not documented

2022-09-13 Thread Fujii Masao




On 2022/09/13 17:25, bt22kawamotok wrote:



Thanks for the patch!


+ 
+  is_superuser (boolean)

You need to add this entry just after that of "in_hot_standby" because
the descriptions of preset parameters should be placed in alphabetical
order in the docs.


+   
+    Reports whether the user is superuser or not.

Isn't it better to add "current" before "user", e.g.,
"Reports whether the current user is a superuser"?


 /* Not for general use --- used by SET SESSION AUTHORIZATION */
-    {"is_superuser", PGC_INTERNAL, UNGROUPED,
+    {"is_superuser", PGC_INTERNAL, PRESET_OPTIONS,

This comment should be rewritten or removed because the "Not for general
use" part is no longer true.


-    GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL |
GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+    GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | 
GUC_DISALLOW_IN_FILE

As Tom commented upthread, the GUC_NO_RESET_ALL flag should be removed
because it's not necessary when the PGC_INTERNAL context is used (in this
context, RESET ALL is prohibited by default).


With the patch, make check failed. You need to update
src/test/regress/expected/guc.out.


   
    IS_SUPERUSER
    
 
  True if the current role has superuser privileges.
 

I found that the docs for the SET command have the above description of
is_superuser. This description seems redundant if we document the
is_superuser GUC itself, so isn't it better to remove it?


Thank you for your review.
I created a new patch, add_document_is_superuser_v2.
Please check it.


Thanks for updating the patch!

The patch looks good to me.

-   /* Not for general use --- used by SET SESSION AUTHORIZATION */
{"session_authorization", PGC_USERSET, UNGROUPED,

If we don't document session_authorization and do expect that
it's usually used by SET/SHOW SESSION AUTHORIZATION,
this comment doesn't need to be removed, I think.

Could you register this patch into the next Commit Fest
so that we don't forget it?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: New docs chapter on Transaction Management and related changes

2022-09-13 Thread Simon Riggs
On Sun, 11 Sept 2022 at 21:35, Robert Treat  wrote:
>
> On Wed, Sep 7, 2022 at 8:05 AM Simon Riggs  
> wrote:
> >
> > On Tue, 6 Sept 2022 at 21:33, Justin Pryzby  wrote:
> > >
> > > On Tue, Sep 06, 2022 at 04:16:02PM +0100, Simon Riggs wrote:
> > > > New chapter on transaction management, plus a few related changes.
> > > >
> > > > Markup and links are not polished yet, so please comment initially on
> > > > the topics, descriptions and wording.
> > >
> > I've also added further notes about prepared transactions.
> >
> > I attach a diff against the original patch, plus a new clean copy.
> >
>
> Some feedback on the v4 patch, hopefully useful.
>
> +
> +Transactions may be started explicitly using BEGIN and COMMIT
> commands, known as
> +a transaction block. A transaction will also be started and ended
> implicitly for
> +each request when outside of a transaction block.
> +
>
> Transactions may be managed explicitly using BEGIN and COMMIT commands, known 
> as
> a transaction block.
>
>
> +Committed subtransactions are recorded as committed if the main transaction
> +commits.  The word subtransaction is often abbreviated to "subxact".
> +
>
> Committed subtransactions are only recorded as committed if the main
> transaction commits,
> otherwise any work done in a subtransaction will be rolled back or
> aborted. The word subtransaction is
> often abbreviated as "subxact".
>
> +
> +Subtransactions may be started explicitly by using the SAVEPOINT
> command, but may
> +also be started in other ways, such as PL/pgSQL's EXCEPTION clause.
> PL/Python and
> +PL/TCL also support explicit subtransactions. Working with the C API, users 
> may
> +also call BeginInternalSubTransaction().
> +
>
> I think this paragraph (or something similar) should be the opening
> paragraph for this section, so that readers are immediately given
> context for what PostgreSQL considers to be a subtransaction.
>
>
> +prepared transactions that were prepared before the last checkpoint.
> In the typical
> +case, shorter-lived prepared transactions are stored only in shared
> memory and WAL.
> +Currently-prepared transactions can be inspected using the
> pg_prepared_xacts view.
> +
>
> Transactions that are currently prepared can be inspected using the
> pg_prepared_xacts view.
>
> * I thought the hyphenated wording looked odd, though I understand why
> you used it. We don't use it elsewhere though (just `currently
> prepared` sans hyphen) so re-worded to match the other wording.

Thanks Robert. I've tried to accommodate all of your thoughts, plus Alvaro's.

New v5 attached.

Happy to receive further comments.

-- 
Simon Riggshttp://www.EnterpriseDB.com/


xact_docs.v5.patch
Description: Binary data


Re: Splitting up guc.c

2022-09-13 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Sep-12, Tom Lane wrote:
>> Yeah, I suspect people will have to manually reapply any changes in
>> the GUC tables to guc_tables.c.  That'll be the same amount of work
>> for them whenever we commit this patch (unless theirs lands first,
>> in which case I have to deal with it).  The issue I think is
>> whether it's politer to make that happen during a CF or between
>> CFs.

> Personally I would prefer that this kind of thing is done quickly rather
> than delay it to some uncertain future.  That way I can deal with it
> straight ahead rather than living with the anxiety that it will land
> later and I will have to deal with it then.  I see no benefit in
> waiting.

Fair enough.  I'm also not looking forward to having to rebase my
patch over anybody else's GUC changes -- even just a new GUC would
invalidate a thousand-line diff hunk, and I doubt that "git apply"
would deal with that very helpfully.  I'll go ahead and get this
pushed.

regards, tom lane




Re: Avoid redundant initialization and possible memory leak (src/backend/parser/parse_relation.c)

2022-09-13 Thread Julien Rouhaud
On Tue, Sep 13, 2022 at 10:21:22AM -0300, Ranier Vilela wrote:
> Hi,
>
> The commit: Revert SQL/JSON features
> https://github.com/postgres/postgres/commit/2f2b18bd3f554e96a8cc885b177211be12288e4a
>
> Left a little oversight.
> I believe it will be properly corrected when it is applied again.
> However, for Postgres 15 this may can cause a small memory leak.

It's not a memory leak, the chunk will be freed eventually when the owning
memory context is reset, but I agree that one of the two identical
initializations should be removed.




Re: Avoid redundant initialization and possible memory leak (src/backend/parser/parse_relation.c)

2022-09-13 Thread Alvaro Herrera
On 2022-Sep-13, Ranier Vilela wrote:

> However, for Postgres 15 this may can cause a small memory leak.

What memory leak?  There's no leak here.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree.  (Don Knuth)




Re: [PATCH] Log details for client certificate failures

2022-09-13 Thread Peter Eisentraut

On 09.09.22 00:32, Jacob Champion wrote:

On Thu, Jul 28, 2022 at 9:19 AM Jacob Champion  wrote:

On Thu, Jul 21, 2022 at 4:29 PM Jacob Champion  wrote:

v4 attempts to fix this by letting the check hooks pass
MCXT_ALLOC_NO_OOM to pg_clean_ascii(). (It's ignored in the frontend,
which just mallocs.)


Ping -- should I add an open item somewhere so this isn't lost?


Trying again. Peter, is this approach acceptable? Should I try something else?


This looks fine to me.  Committed.




Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Matthias van de Meent
On Tue, 13 Sept 2022 at 15:45, Tom Lane  wrote:
>
> Laurenz Albe  writes:
> > But once they are deleted or updated, even the transaction that created 
> > them cannot
> > see them any more, right?
>
> I would not trust that claim very far.  The transaction might have active
> snapshots with a command ID between the times of insertion and deletion.
> (Consider a query that is firing triggers as it goes, and the triggers
> are performing new actions that cause the command counter to advance.
> The outer query should not see the results of those actions.)

I hadn't realized that triggers indeed consume command ids but might
not be visible to the outer query (that might still be running). That
invalidates the "or (e.g.) the existence of another tuple with the
same XID but with a newer CID" claim I made earlier, so thanks for
clarifying.

Kind regards,

Matthias van de Meent




Re: Can we avoid chdir'ing in resolve_symlinks() ?

2022-09-13 Thread Peter Eisentraut

On 13.09.22 01:26, Tom Lane wrote:

Andrew Dunstan  writes:

I think the discussion here is a bit tangential to the original topic.


Indeed, because I just wanted to reimplement *how* we resolve the
executable path to absolute, not question whether we should do it at all.


Well, if we decided not to do it, then we could just delete the code and 
not have to think about how to change it.



I'm not familiar with how homebrew sets up the installation
layout, but I'm suspicious that the situation Peter refers to
has a similar problem, only with a symlink for the bin directory
not the individual executable.


I think the two contradicting use cases are:

1) You configure and install with prefix=/usr/local/pgsql, and then 
symlink ~/bin/pg_ctl -> /usr/local/pgsql/bin/pg_ctl; hoping that that 
will allow pg_ctl to find the other programs it needs in 
/usr/local/pgsql/bin.  This is what we currently support.


2) You configure and install with prefix=/usr/local/pgsql-14, and then 
symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can 
then use /usr/local/pgsql as if that's where it actually is.  We don't 
currently support that.  (Note that it would work if you made a copy of 
the tree instead of using the symlink.)


I don't know if anyone uses #1 or what the details of such use are.

#2 is how Homebrew and some other packaging systems work.
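
To make the two cases concrete, here is a sketch of each setup (the paths
are illustrative):

# Case 1: symlink a single executable; pg_ctl must still find its
# sibling programs in the real installation directory.
ln -s /usr/local/pgsql/bin/pg_ctl ~/bin/pg_ctl

# Case 2: symlink the whole installation prefix; the programs are
# expected to behave as if installed under /usr/local/pgsql.
ln -s /usr/local/pgsql-14 /usr/local/pgsql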





Re: Can we avoid chdir'ing in resolve_symlinks() ?

2022-09-13 Thread Tom Lane
Peter Eisentraut  writes:
> 2) You configure and install with prefix=/usr/local/pgsql-14, and then 
> symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can 
> then use /usr/local/pgsql as if that's where it actually is.  We don't 
> currently support that.  (Note that it would work if you made a copy of 
> the tree instead of using the symlink.)

What about it does not work?

regards, tom lane




Re: Avoid redundant initialization and possible memory leak (src/backend/parser/parse_relation.c)

2022-09-13 Thread Ranier Vilela
Em ter., 13 de set. de 2022 às 11:09, Alvaro Herrera <
alvhe...@alvh.no-ip.org> escreveu:

> On 2022-Sep-13, Ranier Vilela wrote:
>
> > However, for Postgres 15 this may can cause a small memory leak.
>
> What memory leak?  There's no leak here.
>
Yeah, as per Julien's answer, there is really no memory leak, just an
unnecessary double execution of pstrdup.
But for Postgres 15, I believe it's worth avoiding, because it wastes cycles.
For Postgres 16, I believe this will be fixed as well, but for robustness
it's better to fix it soon, IMO.

regards,
Ranier Vilela


Re: First draft of the PG 15 release notes

2022-09-13 Thread Jonathan S. Katz

On 9/13/22 7:13 AM, Alvaro Herrera wrote:

On 2022-Sep-12, Jonathan S. Katz wrote:


+
+ 
+  Column-level and row-level filtering on
+  logical replication
+  publications.
+ 
+


-column-level filtering
+the ability to specify column lists


Adjusted to be similar to your suggestion. Updated patch attached.

Thanks,

Jonathan

diff --git a/doc/src/sgml/release-15.sgml b/doc/src/sgml/release-15.sgml
index 14440be77f..76fd071332 100644
--- a/doc/src/sgml/release-15.sgml
+++ b/doc/src/sgml/release-15.sgml
@@ -3,7 +3,7 @@
 
   
Release date:
-   AS OF 2022-06-11
+   AS OF 2022-09-13
   
 
   
@@ -15,7 +15,38 @@

 

-
+
+ 
+  Performance improvements for in-memory and on-disk sorting.
+ 
+
+
+ 
+  Support for the SQL MERGE
+  command.
+ 
+
+
+ 
+  The ability to specify column-lists and row-level filters on
+  logical replication
+  publications.
+ 
+
+
+ 
+  More options for compression, including support for Zstandard (zstd)
+  compression. This includes support for server-side compression for
+  
+pg_basebackup.
+ 
+
+
+ 
+  Structured log output using
+  the JSON format.
+ 
+

 



OpenPGP_signature
Description: OpenPGP digital signature


Re: ICU for global collation

2022-09-13 Thread Marina Polyakova

On 2022-09-13 15:41, Peter Eisentraut wrote:

On 13.09.22 07:34, Marina Polyakova wrote:
I agree with you that it is more comfortable and more similar to what 
has already been done in initdb. IMO it would be easier to do it like 
this:


diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c
index 
e523e58b2189275dc603a06324a2f28b0f49d8b7..a1482df3d981a680dd3322052e7c03ddacc8dc26 
100644

--- a/src/bin/scripts/createdb.c
+++ b/src/bin/scripts/createdb.c
@@ -161,12 +161,10 @@ main(int argc, char *argv[])

  if (locale)
  {
-    if (lc_ctype)
-    pg_fatal("only one of --locale and --lc-ctype can be 
specified");

-    if (lc_collate)
-    pg_fatal("only one of --locale and --lc-collate can be 
specified");

-    lc_ctype = locale;
-    lc_collate = locale;
+    if (!lc_ctype)
+    lc_ctype = locale;
+    if (!lc_collate)
+    lc_collate = locale;
  }

  if (encoding)


done that way


Thank you!


BTW it's somewhat crummy that it uses a string comparison, so if you
write "UTF8" without a dash, it says this; it took me a few minutes 
to

see the difference...

postgres=# create database a LC_COLLATE "en_US.UTF8" LC_CTYPE
"en_US.UTF8" LOCALE "en_US.UTF8";
ERROR:  new collation (en_US.UTF8) is incompatible with the collation
of the template database (en_US.UTF-8)


Perhaps we could check the locale itself with the function 
normalize_libc_locale_name (collationcmds.c). But ISTM that the 
current check is a safety net in case the function 
pg_get_encoding_from_locale (chklocale.c) returns -1 or 
PG_SQL_ASCII...


This is not new behavior in PG15, is it?


No, it has always existed [1] AFAICS..

[1] 
https://github.com/postgres/postgres/commit/61d967498802ab86d8897cb3c61740d7e9d712f6


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Modernizing our GUC infrastructure

2022-09-13 Thread Tom Lane
I wrote:
> Attached is a patch series that attempts to modernize our GUC
> infrastructure, in particular removing the performance bottlenecks
> it has when there are lots of GUC variables.

Rebased over 0a20ff54f.

regards, tom lane

commit d8b8742e80ef6f1e6546ddb9767bd846e8ff761c
Author: Tom Lane 
Date:   Tue Sep 13 11:36:56 2022 -0400

Preliminary improvements in memory-context infrastructure.

We lack a version of repalloc() that supports MCXT_ALLOC_NO_OOM
semantics, so invent repalloc_extended().  repalloc_huge()
becomes a legacy wrapper for that.

Also, fix dynahash.c so that it can support HASH_ENTER_NULL
requests when using the default palloc-based allocator.
The only reason it was like that was the lack of the
MCXT_ALLOC_NO_OOM option when that code was written, ages ago.

While here, simplify a few overcomplicated tests in mcxt.c.

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 3babde8d70..4f62958883 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -289,7 +289,8 @@ static void *
 DynaHashAlloc(Size size)
 {
 	Assert(MemoryContextIsValid(CurrentDynaHashCxt));
-	return MemoryContextAlloc(CurrentDynaHashCxt, size);
+	return MemoryContextAllocExtended(CurrentDynaHashCxt, size,
+	  MCXT_ALLOC_NO_OOM);
 }
 
 
@@ -939,9 +940,7 @@ calc_bucket(HASHHDR *hctl, uint32 hash_val)
  *
  * HASH_ENTER will normally ereport a generic "out of memory" error if
  * it is unable to create a new entry.  The HASH_ENTER_NULL operation is
- * the same except it will return NULL if out of memory.  Note that
- * HASH_ENTER_NULL cannot be used with the default palloc-based allocator,
- * since palloc internally ereports on out-of-memory.
+ * the same except it will return NULL if out of memory.
  *
  * If foundPtr isn't NULL, then *foundPtr is set true if we found an
  * existing entry in the table, false otherwise.  This is needed in the
@@ -1084,12 +1083,8 @@ hash_search_with_hash_value(HTAB *hashp,
 			}
 			return NULL;
 
-		case HASH_ENTER_NULL:
-			/* ENTER_NULL does not work with palloc-based allocator */
-			Assert(hashp->alloc != DynaHashAlloc);
-			/* FALL THRU */
-
 		case HASH_ENTER:
+		case HASH_ENTER_NULL:
 			/* Return existing element if found, else create one */
 			if (currBucket != NULL)
 return (void *) ELEMENTKEY(currBucket);
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 115a64cfe4..80f99d51fc 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -1060,8 +1060,8 @@ MemoryContextAllocExtended(MemoryContext context, Size size, int flags)
 	AssertArg(MemoryContextIsValid(context));
 	AssertNotInCriticalSection(context);
 
-	if (((flags & MCXT_ALLOC_HUGE) != 0 && !AllocHugeSizeIsValid(size)) ||
-		((flags & MCXT_ALLOC_HUGE) == 0 && !AllocSizeIsValid(size)))
+	if (!((flags & MCXT_ALLOC_HUGE) != 0 ? AllocHugeSizeIsValid(size) :
+		  AllocSizeIsValid(size)))
 		elog(ERROR, "invalid memory alloc request size %zu", size);
 
 	context->isReset = false;
@@ -1215,8 +1215,8 @@ palloc_extended(Size size, int flags)
 	AssertArg(MemoryContextIsValid(context));
 	AssertNotInCriticalSection(context);
 
-	if (((flags & MCXT_ALLOC_HUGE) != 0 && !AllocHugeSizeIsValid(size)) ||
-		((flags & MCXT_ALLOC_HUGE) == 0 && !AllocSizeIsValid(size)))
+	if (!((flags & MCXT_ALLOC_HUGE) != 0 ? AllocHugeSizeIsValid(size) :
+		  AllocSizeIsValid(size)))
 		elog(ERROR, "invalid memory alloc request size %zu", size);
 
 	context->isReset = false;
@@ -1297,6 +1297,50 @@ repalloc(void *pointer, Size size)
 	return ret;
 }
 
+/*
+ * repalloc_extended
+ *		Adjust the size of a previously allocated chunk,
+ *		with HUGE and NO_OOM options.
+ */
+void *
+repalloc_extended(void *pointer, Size size, int flags)
+{
+#if defined(USE_ASSERT_CHECKING) || defined(USE_VALGRIND)
+	MemoryContext context = GetMemoryChunkContext(pointer);
+#endif
+	void	   *ret;
+
+	if (!((flags & MCXT_ALLOC_HUGE) != 0 ? AllocHugeSizeIsValid(size) :
+		  AllocSizeIsValid(size)))
+		elog(ERROR, "invalid memory alloc request size %zu", size);
+
+	AssertNotInCriticalSection(context);
+
+	/* isReset must be false already */
+	Assert(!context->isReset);
+
+	ret = MCXT_METHOD(pointer, realloc) (pointer, size);
+	if (unlikely(ret == NULL))
+	{
+		if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		{
+			MemoryContext cxt = GetMemoryChunkContext(pointer);
+
+			MemoryContextStats(TopMemoryContext);
+			ereport(ERROR,
+	(errcode(ERRCODE_OUT_OF_MEMORY),
+	 errmsg("out of memory"),
+	 errdetail("Failed on request of size %zu in memory context \"%s\".",
+			   size, cxt->name)));
+		}
+		return NULL;
+	}
+
+	VALGRIND_MEMPOOL_CHANGE(context, pointer, ret, size);
+
+	return ret;
+}
+
 /*
  * MemoryContextAllocHuge
  *		Allocate (possibly-expansive) space within the specified context.
@@ -1340,35 +1384,8 @@ MemoryContextAllocHuge(MemoryContext context, Size size)
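
As an illustration of how a caller might use the new API (this sketch is
not part of the patch; the variable names are made up):

	/*
	 * Try to enlarge a buffer without erroring out on OOM: with
	 * MCXT_ALLOC_NO_OOM, repalloc_extended() returns NULL on failure
	 * instead of throwing ERROR, and the old chunk remains valid.
	 */
	char	   *buf = palloc(1024);
	char	   *newbuf;

	newbuf = repalloc_extended(buf, 1024 * 1024, MCXT_ALLOC_NO_OOM);
	if (newbuf == NULL)
		elog(LOG, "could not enlarge buffer, keeping the smaller one");
	else
		buf = newbuf;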

Re: Pluggable toaster

2022-09-13 Thread Jacob Champion
On Mon, Sep 12, 2022 at 11:45 PM Nikita Malakhov  wrote:
> It would be more clear for complex data types like JSONB, where developers 
> could
> need some additional functionality to work with internal representation of 
> data type,
> and its full potential is revealed in our JSONB toaster extension. The JSONB 
> toaster
> is still in development but we plan to make it available soon.

Okay. It'll be good to have that, because as it is now it's hard to
see the whole picture.

> On installing the dummy_toaster contrib: I've just checked it by making a
> patch from the commit and applying it onto my clone of master, plus the 2
> patches provided in the previous email, without any errors, and all checks
> passed - applying with git am, configure with debug, cassert, depend and
> enable-tap-tests flags, and running the checks.
> Please advise what would cause such a behavior?

I don't think the default pg_upgrade tests will upgrade contrib
objects (there are instructions in src/bin/pg_upgrade/TESTING that
cover manual dumps, if you prefer that method). My manual steps were
roughly

=# CREATE EXTENSION dummy_toaster;
=# CREATE TABLE test (t TEXT
STORAGE external
TOASTER dummy_toaster_handler);
=# \q
$ initdb -D newdb
$ pg_ctl -D olddb stop
$ pg_upgrade -b <install>/bin -B <install>/bin -d ./olddb -D ./newdb

(where <install>/bin is on the PATH, so we're using the right binaries).

Thanks,
--Jacob




Re: [PATCH] Log details for client certificate failures

2022-09-13 Thread Jacob Champion
On Tue, Sep 13, 2022 at 7:11 AM Peter Eisentraut
 wrote:
> This looks fine to me.  Committed.

Thanks!

--Jacob




Re: Tuples inserted and deleted by the same transaction

2022-09-13 Thread Laurenz Albe
On Tue, 2022-09-13 at 16:13 +0200, Matthias van de Meent wrote:
> On Tue, 13 Sept 2022 at 15:45, Tom Lane  wrote:
> > Laurenz Albe  writes:
> > > But once they are deleted or updated, even the transaction that created 
> > > them cannot
> > > see them any more, right?
> > 
> > I would not trust that claim very far.  The transaction might have active
> > snapshots with a command ID between the times of insertion and deletion.
> > (Consider a query that is firing triggers as it goes, and the triggers
> > are performing new actions that cause the command counter to advance.
> > The outer query should not see the results of those actions.)
> 
> I hadn't realized that triggers indeed consume command ids but might
> not be visible to the outer query (that might still be running). That
> invalidates the "or (e.g.) the existence of another tuple with the
> same XID but with a newer CID" claim I made earlier, so thanks for
> clarifying.

Yes, that makes sense.  Thanks.

Yours,
Laurenz Albe




Re: Making pg_rewind faster

2022-09-13 Thread Alexander Korotkov
Hi, Justin!

On Fri, Jul 29, 2022 at 1:05 PM Justin Kwan  wrote:
> Not sure if this email went through previously but thank you for your 
> feedback, I've incorporated your suggestions by scanning the logs produced 
> from pg_rewind when asserting that certain WAL segment files were skipped 
> from being copied over to the target server.
>
> I've also updated the pg_rewind patch file to target the Postgres master 
> branch (version 16 to be). Please see attached.

Thank you for the revision.

I've taken a look at this patch. Overall it looks good to me. I also
don't see any design objections in the thread.

A couple of points from me:
1) I would prefer to avoid hard-coded names for WAL segments in the
TAP tests. Could we calculate the names in the tests based on the
divergence point, etc.?
2) The patch contains some indentation with spaces, which should be done
with tabs. Please consider either manually fixing this or running
pgindent over the modified files.
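
A hypothetical helper along the lines of point 1, assuming the default
16MB WAL segment size (this is a sketch, not part of the patch):

# Derive a WAL segment file name from a timeline ID and an LSN string
# such as the output of pg_current_wal_lsn(), e.g. "0/3000000".
sub wal_segment_name
{
	my ($tli, $lsn) = @_;
	my ($hi, $lo) = map { hex($_) } split(qr{/}, $lsn);
	my $seg_size = 16 * 1024 * 1024;
	my $segno = int(($hi * 2**32 + $lo) / $seg_size);
	my $segs_per_id = int(2**32 / $seg_size);    # 256 for 16MB segments
	return sprintf("%08X%08X%08X",
		$tli, int($segno / $segs_per_id), $segno % $segs_per_id);
}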

--
Regards,
Alexander Korotkov




wrong shell trap

2022-09-13 Thread Alvaro Herrera
While messing with the new guc.h stuff I happened to run headerscheck and
wanted to abort it right away, and in doing so I realized that its
'trap' line is incorrect: it only removes its temp dir, but it doesn't
exit the program; so after you C-c it, it will spew a ton of complaints
about its temp dir not existing.

AFAICT almost all of our shell scripts contain the same mistake.  I
propose to fix them all as in the attached demo patch, which makes
headerscheck exit properly (no silly noise) when interrupted.

(I confess to not fully understanding why every other trap does
"rm && exit $ret" rather than "rm ; exit", but I guess rm -fr should not
fail anyway, thus this should be OK.)

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 6f6f0b8bda..c2053b4ccd 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -46,7 +46,7 @@ CPPFLAGS=`echo "$CPPFLAGS" | sed "s|\\\$(PG_SYSROOT)|$PG_SYSROOT|g"`
 # Create temp directory.
 tmp=`mktemp -d /tmp/$me.XX`
 
-trap 'rm -rf $tmp' 0 1 2 3 15
+trap 'ret=$?; rm -rf $tmp && exit $ret' 0 1 2 3 15
 
 exit_status=0
 


Re: postgres_fdw hint messages

2022-09-13 Thread Nathan Bossart
On Tue, Sep 13, 2022 at 08:32:43AM +0200, Peter Eisentraut wrote:
> I notice that for column misspellings, the hint is phrased "Perhaps you
> meant X." whereas here we have "Did you mean X?".  Let's make that uniform.

Good point.  I attached a new version of the patch that uses the column
phrasing.  I wasn't sure whether "reference" was the right word to use in
this context, but I used it for now for consistency with the column hints.
I think "specify" or "use" would work as well.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f9e5d136214e2b55c0bcf0b7b8ec2485aab0e7ba Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 2 Sep 2022 14:03:26 -0700
Subject: [PATCH v4 1/1] Adjust assorted hint messages that list all valid
 options.

Instead of listing all valid options, we now try to provide one
that looks similar.  Since this may be useful elsewhere, this
change introduces a new set of functions that can be reused for
similar purposes.
---
 contrib/dblink/dblink.c   | 27 +++---
 contrib/dblink/expected/dblink.out|  1 -
 contrib/file_fdw/expected/file_fdw.out|  2 -
 contrib/file_fdw/file_fdw.c   | 24 --
 .../postgres_fdw/expected/postgres_fdw.out|  3 +-
 contrib/postgres_fdw/option.c | 23 --
 src/backend/foreign/foreign.c | 26 --
 src/backend/utils/adt/varlena.c   | 82 +++
 src/include/utils/varlena.h   | 12 +++
 src/test/regress/expected/foreign_data.out|  6 +-
 10 files changed, 159 insertions(+), 47 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 3df3f9bbe9..8c26adb56c 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2008,27 +2008,32 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
 		{
 			/*
 			 * Unknown option, or invalid option for the context specified, so
-			 * complain about it.  Provide a hint with list of valid options
-			 * for the context.
+			 * complain about it.  Provide a hint with a valid option that
+			 * looks similar, if there is one.
 			 */
-			StringInfoData buf;
 			const PQconninfoOption *opt;
+			const char *closest_match;
+			ClosestMatchState match_state;
+			bool		has_valid_options = false;
 
-			initStringInfo(&buf);
+			initClosestMatch(&match_state, def->defname, 4);
 			for (opt = options; opt->keyword; opt++)
 			{
 if (is_valid_dblink_option(options, opt->keyword, context))
-	appendStringInfo(&buf, "%s%s",
-	 (buf.len > 0) ? ", " : "",
-	 opt->keyword);
+{
+	has_valid_options = true;
+	updateClosestMatch(&match_state, opt->keyword);
+}
 			}
+
+			closest_match = getClosestMatch(&match_state);
 			ereport(ERROR,
 	(errcode(ERRCODE_FDW_OPTION_NAME_NOT_FOUND),
 	 errmsg("invalid option \"%s\"", def->defname),
-	 buf.len > 0
-	 ? errhint("Valid options in this context are: %s",
-			   buf.data)
-	 : errhint("There are no valid options in this context.")));
+	 has_valid_options ? closest_match ?
+	 errhint("Perhaps you meant to reference the option \"%s\".",
+			 closest_match) : 0 :
+	 errhint("There are no valid options in this context.")));
 		}
 	}
 
diff --git a/contrib/dblink/expected/dblink.out b/contrib/dblink/expected/dblink.out
index c7bde6ad07..14d015e4d5 100644
--- a/contrib/dblink/expected/dblink.out
+++ b/contrib/dblink/expected/dblink.out
@@ -897,7 +897,6 @@ $d$;
 CREATE USER MAPPING FOR public SERVER fdtest
   OPTIONS (server 'localhost');  -- fail, can't specify server here
 ERROR:  invalid option "server"
-HINT:  Valid options in this context are: user, password, sslpassword
 CREATE USER MAPPING FOR public SERVER fdtest OPTIONS (user :'USER');
 GRANT USAGE ON FOREIGN SERVER fdtest TO regress_dblink_user;
 GRANT EXECUTE ON FUNCTION dblink_connect_u(text, text) TO regress_dblink_user;
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 261af1a8b5..36d76ba26c 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -166,7 +166,6 @@ ERROR:  invalid option "force_not_null"
 HINT:  There are no valid options in this context.
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_not_null '*'); -- ERROR
 ERROR:  invalid option "force_not_null"
-HINT:  Valid options in this context are: filename, program, format, header, delimiter, quote, escape, null, encoding
 -- force_null is not allowed to be specified at any foreign object level:
 ALTER FOREIGN DATA WRAPPER file_fdw OPTIONS (ADD force_null '*'); -- ERROR
 ERROR:  invalid option "force_null"
@@ -179,7 +178,6 @@ ERROR:  invalid option "force_null"
 HINT:  There are no valid options in this context.
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (force_null '*'); -- ERROR
 ERROR:  invalid option "force_null"
-HINT:  Valid options in this context are: filename, program, format, header, delimiter, quote,
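
For context, the shape of the new helper API introduced by the patch, as
used in the dblink hunk above (the opt/valid_options loop here is
illustrative):

	ClosestMatchState match_state;
	const char *closest;

	/* the third argument caps how different a suggestion may be */
	initClosestMatch(&match_state, input_name, 4);
	for (opt = valid_options; opt->keyword; opt++)
		updateClosestMatch(&match_state, opt->keyword);

	/* NULL if nothing was close enough to suggest */
	closest = getClosestMatch(&match_state);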

Re: wrong shell trap

2022-09-13 Thread Tom Lane
Alvaro Herrera  writes:
> While messing with the new guc.h stuff I happened run headerscheck and
> wanted to abort it right away, and in doing so I realized that its
> 'trap' line is incorrect: it only removes its temp dir, but it doesn't
> exit the program; so after you C-c it, it will spew a ton of complaints
> about its temp dir not existing.

Ugh.

> AFAICT almost all of our shell scripts contain the same mistake.  I
> propose to fix them all as in the attached demo patch, which makes
> headerscheck exit properly (no silly noise) when interrupted.

Sounds like a good idea.

> (I confess to not fully understanding why every other trap does
> "rm && exit $ret" rather than "rm ; exit", but I guess rm -fr should not
> fail anyway thus this should OK.)

I didn't write these, but I think the idea might be "if rm -rf
fails, report its error status rather than whatever we had before".
However, if that is the idea then it's wrong, because as you've
discovered we won't exit at all unless the trap string does so.
So ISTM that 'ret=$?; rm -rf $tmp; exit $ret' is the correct coding,
or at least less incorrect than either of these alternatives.

regards, tom lane
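
A minimal standalone demonstration of that coding (illustrative only):

#!/bin/sh
# On normal exit or on SIGHUP/SIGINT/SIGQUIT/SIGTERM, remove the temp
# dir and terminate while preserving the original exit status.
tmp=`mktemp -d /tmp/trapdemo.XXXXXX`
trap 'ret=$?; rm -rf "$tmp"; exit $ret' 0 1 2 3 15

echo "working in $tmp; press Ctrl-C to test the trap"
sleep 30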




Re: wrong shell trap

2022-09-13 Thread Peter Geoghegan
On Tue, Sep 13, 2022 at 2:01 PM Tom Lane  wrote:
> > AFAICT almost all of our shell scripts contain the same mistake.  I
> > propose to fix them all as in the attached demo patch, which makes
> > headerscheck exit properly (no silly noise) when interrupted.
>
> Sounds like a good idea.

Might not be a bad idea to run shellcheck against the scripts, to see
if that highlights anything.

I've found that shellcheck makes working with shell scripts less
terrible, especially when portability is a concern. It can be used to
enforce consistent coding standards that seem pretty well thought out.
It will sometimes produce dubious warnings, of course, but it tends to
mostly have the right idea, most of the time.

-- 
Peter Geoghegan




Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Tom Lane
Michael Paquier  writes:
> Attached is the patch I am finishing with, consisting of:
> - the removal of PG_COMPRESSION_OPTION_LEVEL.
> - assigning a default compression level when nothing is specified in
> the spec.
> - a couple of simplifications in pg_receivewal, pg_basebackup and the
> backend code, as there is no need to worry about the compression
> level.

This looks good to me.  It seems simpler, and I concur that it
fixes the described problem.  I now see

tmp_check/backup_gzip:
total 3668
-rw-r-. 1 postgres postgres  137756 Sep 13 17:29 backup_manifest
-rw-r-. 1 postgres postgres 3537499 Sep 13 17:29 base.tar.gz
-rw-r-. 1 postgres postgres   73989 Sep 13 17:29 pg_wal.tar.gz

tmp_check/backup_gzip2:
total 3168
-rw-r-. 1 postgres postgres  137756 Sep 13 17:29 backup_manifest
-rw-r-. 1 postgres postgres 3083516 Sep 13 17:29 base.tar.gz
-rw-r-. 1 postgres postgres   17069 Sep 13 17:29 pg_wal.tar.gz

tmp_check/backup_gzip3:
total 3668
-rw-r-. 1 postgres postgres  137756 Sep 13 17:29 backup_manifest
-rw-r-. 1 postgres postgres 3537517 Sep 13 17:29 base.tar.gz
-rw-r-. 1 postgres postgres   73988 Sep 13 17:29 pg_wal.tar.gz

which looks sane: the gzip2 case should, and does, have better
compression than the other two.

BTW, this bit:

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl 
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 3d1a4ddd5c..40f1d3f7e2 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -860,9 +860,6 @@ SKIP:
my $gzip_is_valid =
  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-   rmtree("$tempdir/backup_gzip");
-   rmtree("$tempdir/backup_gzip2");
-   rmtree("$tempdir/backup_gzip3");
 }
 
 # Test background stream process terminating before the basebackup has

is something I tried along the way to diagnosing the problem, and
it turns out to have exactly zero effect.  The $tempdir is some
temporary subdirectory of tmp_check that will get nuked at the end
of the TAP test no matter what.  So these rmtrees are merely making
the evidence disappear a bit faster; it will disappear anyway.

What I did to diagnose the problem was this:

@@ -860,9 +860,9 @@ SKIP:
my $gzip_is_valid =
  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-   rmtree("$tempdir/backup_gzip");
-   rmtree("$tempdir/backup_gzip2");
-   rmtree("$tempdir/backup_gzip3");
+   system_log('mv', "$tempdir/backup_gzip", "tmp_check");
+   system_log('mv', "$tempdir/backup_gzip2", "tmp_check");
+   system_log('mv', "$tempdir/backup_gzip3", "tmp_check");
 }
 
 # Test background stream process terminating before the basebackup has

which is not real clean, since then the files get left behind even
on success, which I doubt we want either.

Anyway, I have no objection to dropping the rmtrees, since they're
pretty useless as the code stands.  Just wanted to mention this
issue for the archives.

regards, tom lane




Re: failing to build preproc.c on solaris with sun studio

2022-09-13 Thread Thomas Munro
On Tue, Sep 6, 2022 at 9:34 AM Tom Lane  wrote:
> Peter Eisentraut  writes:
> > Why is this being proposed?
>
> Andres is annoyed by the long build time of ecpg, which he has to
> wait for whether he wants to test it or not.  I could imagine that
> I might disable ecpg testing on my slowest buildfarm animals, too.

This message triggered me to try to teach ccache how to cache
preproc.y -> preproc.{c,h}, and I got that basically working[1], but
upstream doesn't want it (yet).  I'll try again if the proposed
refactoring to allow more kinds of compiler-like-things goes
somewhere.  I think that started with people's struggles with GCC vs
MSVC.  Given the simplicity of this case, though, I suppose we could
have a little not-very-general shell/python/whatever wrapper script --
just compute a checksum of the input and keep the output files around.

[1] https://github.com/ccache/ccache/pull/1156
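
A minimal sketch of such a wrapper (the names are hypothetical, error
handling is omitted, and a real version would also want to hash the bison
version and flags into the cache key):

#!/bin/sh
# cache-bison.sh input.y output.c -- cache bison output by input checksum
in="$1"; out="$2"; hdr="${out%.c}.h"
cache="${BISON_CACHE_DIR:-$HOME/.bison-cache}"
mkdir -p "$cache"
sum=`sha256sum "$in" | cut -d' ' -f1`
if [ -f "$cache/$sum.c" ] && [ -f "$cache/$sum.h" ]; then
	cp "$cache/$sum.c" "$out"
	cp "$cache/$sum.h" "$hdr"
else
	bison -d -o "$out" "$in"
	cp "$out" "$cache/$sum.c"
	cp "$hdr" "$cache/$sum.h"
fi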




Re: Background writer and checkpointer in crash recovery

2022-09-13 Thread Justin Pryzby
On Tue, Sep 13, 2022 at 01:32:11PM +0900, Michael Paquier wrote:
> On Mon, Sep 12, 2022 at 09:13:11PM -0500, Justin Pryzby wrote:
> > Like this, maybe.
> > 
> > It's similar to what I suggested to consider backpatching here:
> > https://www.postgresql.org/message-id/20201214032224.GA30237%40telsasoft.com
> 
> At the same time, df9274adf was done because the end-of-recovery
> checkpoint, potentially done by the startup process if the checkpointer
> was not up at the end of recovery, could take a long time.  Any of the
> actions taken in the startup process when finishing recovery are not
> that costly, on the contrary, as far as I know.

Your interpretation may be a bit different from mine.

The central problem for me was that the startup process said "recovering
NNN" after it was done recovering NNN.

To fix that, df9274adf took the approach of overwriting the existing PS
with "end-of-recovery checkpoint", which also adds some extra value...

> I am not opposed to clearing the ps display when we are at this state
> of the game for the startup process, but rather than clearing it we
> could switch provide something more useful that shows what's
> happening.  Say a simple "performing end-of-recovery actions", for
> example..

...but the minimal solution (which 2 years ago I suggested could've been
backpatched) is to *clear* the PS (in master branch at the time, that
would've been before also changing it to say stuff about the
checkpoint).

If we'd done that, I probably wouldn't be griping about it now, so my
preference is to *clear* the display as soon as the WAL processing is
done; further overwriting it with additional stuff can be left as a
future enhancement (like "syncing FS" that I brought up last year, and
maybe everything else that calls ereport_startup_progress()).




Re: build remaining Flex files standalone

2022-09-13 Thread Andres Freund
On 2022-09-12 14:49:50 +0700, John Naylor wrote:
> CI builds fine. For v2 I only adjusted the commit message. I'll push
> in a couple days unless there are objections.

Makes sense to me. Thanks for working on it!




Re: minimum perl version

2022-09-13 Thread Tom Lane
John Naylor  writes:
> On Sat, Sep 3, 2022 at 2:11 AM Tom Lane  wrote:
>> Yeah, that seems like a reasonable new minimum for Perl.  I might
>> see about setting up an animal running 5.14.0, just so we can say
>> "5.14" in the docs without fine print.

> Until such time as that happens, here is a draft to require 5.14.2.

I've just switched longfin to use built-from-source perl 5.14.0.

regards, tom lane




Re: Fix possible bogus array out of bonds (src/backend/access/brin/brin_minmax_multi.c)

2022-09-13 Thread Ranier Vilela
Em seg., 5 de set. de 2022 às 23:02, David Rowley 
escreveu:

> On Tue, 6 Sept 2022 at 13:52, Ranier Vilela  wrote:
> >
> > Em seg., 5 de set. de 2022 às 22:29, David Rowley 
> escreveu:
> >> It feels like it would be good if we had a way to detect a few of
> >> these issues much earlier than we are currently.  There's been a long
> >> series of commits fixing up this sort of thing.  If we had a tool to
> >> parse the .c files and look for things like a function call to
> >> appendPQExpBuffer() and appendStringInfo() with only 2 parameters (i.e
> >> no va_arg arguments).
> >
> > StaticAssert could check va_arg no?
>
> I'm not sure exactly what you have in mind. If you think you have a
> way to make that work, it would be good to see a patch with it.
>
About this:

1. StaticAssertStmt cannot help.
Although some posts on the web show that it is possible to calculate the
number of arguments,
I didn't get anything useful.
So I dropped this option.

2. Compiler support
The best solution.
But compilers currently do not allow suggesting the use of another function.

3. Our own tool
A temporary solution.
It can help until the compilers add support for this.

So I made a very simple tool that can do the basics here.
It is not meant to be a universal lint.
It only processes a hard-coded list of functions.

pg_check test1.c
line (1): should be appendPQExpBufferStr?
line (2): should be appendPQExpBufferChar?
line (4): should be appendPQExpBufferStr?
line (5): should be appendPQExpBufferStr?

I don't think it's anywhere near the quality needed to be considered for
Postgres, but it could be a start.
If it helps, great; if not, fine.

regards,
Ranier Vilela
/*
 * pg_check.c
 *
 * A simple check syntax program for PostgreSQL
 * Originally written by Ranier Vilela and enhanced by many contributors.
 *
 * src/bin/pg_check/pg_check.c
 * Copyright (c) 2022, PostgreSQL Global Development Group
 * ALL RIGHTS RESERVED;
 *
 * Permission to use, copy, modify, and distribute this software and its
 * documentation for any purpose, without fee, and without a written agreement
 * is hereby granted, provided that the above copyright notice and this
 * paragraph and the following two paragraphs appear in all copies.
 *
 * IN NO EVENT SHALL THE AUTHOR OR DISTRIBUTORS BE LIABLE TO ANY PARTY FOR
 * DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
 * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
 * DOCUMENTATION, EVEN IF THE AUTHOR OR DISTRIBUTORS HAVE BEEN ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 *
 * THE AUTHOR AND DISTRIBUTORS SPECIFICALLY DISCLAIMS ANY WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
 * AND FITNESS FOR A PARTICULAR PURPOSE.  THE SOFTWARE PROVIDED HEREUNDER IS
 * ON AN "AS IS" BASIS, AND THE AUTHOR AND DISTRIBUTORS HAS NO OBLIGATIONS TO
 * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>


static const char * progname = "pg_check";


struct S_CTX
{
int nline;
int nargs;
int nchars;
int nbrackets;
int nkeys;
int nperc;
int nsfmt;
bool quote;
bool finish;
};

typedef struct S_CTX s_ctx;

struct S_FUNC
{
char * function;
char * should_string;
char * should_char;
};

typedef struct S_FUNC s_func;


static const s_func pg_functions[] = {
{ "appendPQExpBuffer(", "should be appendPQExpBufferStr?", "should be 
appendPQExpBufferChar?" },
{ "appendStringInfo(", "should be appendStringInfoString?", "should be 
appendStringInfoChar?" },
{ NULL, NULL, NULL }
};


void parser_function(s_ctx * ctx, const s_func * pg_function, const char * line)
{
	const char *ptr;
	size_t		len;
	size_t		i;
	char		ch;

	ctx->finish = false;
	ptr = line;
	len = strlen(line);
	for (i = 0; i < len; i++)
	{
		ch = ptr[i];
		if (!ctx->quote)
		{
			if (ch == '(')
				ctx->nbrackets++;
			else if (ch == ')')
				ctx->nbrackets--;
			else if (ch == '{')
				ctx->nkeys++;
			else if (ch == '}')
				ctx->nkeys--;
			else if (ch == ';')
			{
				ctx->finish = true;
				break;
			}
			else if (ch == ',')
				ctx->nargs++;
		}
		if (ctx->nargs > 0)
			if (ch == '"')
				ctx->quote = !ctx->quote;
		if (ctx->quote)
		{
			if (ch != '"')
				ctx->nchars++;
			if (ch == '%')
				ctx->nperc++;
 

A question about wording in messages

2022-09-13 Thread Kyotaro Horiguchi
I saw the following message recently modified.

> This controls the maximum distance we can read ahead in the WAL to prefetch 
> referenced data blocks.

Maybe the "we" means "PostgreSQL program and you" but I see it
somewhat out of place.

I found other three uses of "we" in backend.

> client sent proto_version=%d but we only support protocol %d or lower
> client sent proto_version=%d but we only support protocol %d or higher
> System allows %d, we need at least %d.

This is a little different from the first one. In the three above,
"we" suggests "the developers, and maybe the PostgreSQL program".

Is that the right word choice for error messages?  I'm not confident
about the precise wording, but I think something like the following
would be appropriate here.

> This controls the maximum distance to read ahead in the WAL to prefetch 
> referenced data blocks.
> client sent proto_version=%d but only protocols %d or lower are supported
> client sent proto_version=%d but only protocols %d or higher are supported
> System allows %d, at least %d needed.

Thoughts?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: A question about wording in messages

2022-09-13 Thread Tom Lane
Kyotaro Horiguchi  writes:
> I saw the following message recently modified.
>> This controls the maximum distance we can read ahead in the WAL to prefetch 
>> referenced data blocks.
> Maybe the "we" means "PostgreSQL program and you" but I see it
> somewhat out of place.

+1, I saw that today and thought it was outside our usual style.
The whole thing is awfully verbose for a GUC description, too.
Maybe

"Maximum distance to read ahead in WAL to prefetch data blocks."

regards, tom lane




A small typo

2022-09-13 Thread Kyotaro Horiguchi
I found a small typo in a comment in pgbench.c of 15/master.

- * Return the number fo failed transactions.
+ * Return the number of failed transactions.

While at it, I found "* lot fo unnecessary work." in pg13's
procsignal.c. It has been fixed by 2a093355aa in PG14 but PG13 was
left alone at the time.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 098fb43b3c..4d5d6560bf 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -4415,7 +4415,7 @@ executeMetaCommand(CState *st, pg_time_usec_t *now)
 }
 
 /*
- * Return the number fo failed transactions.
+ * Return the number of failed transactions.
  */
 static int64
 getFailures(const StatsData *stats)


Re: A small typo

2022-09-13 Thread Richard Guo
On Wed, Sep 14, 2022 at 10:46 AM Kyotaro Horiguchi 
wrote:

> I found a small typo in a comment in pgbench.c of 15/master.
>
> - * Return the number fo failed transactions.
> + * Return the number of failed transactions.
>
> While at it, I found "* lot fo unnecessary work." in pg13's
> procsignal.c. It has been fixed by 2a093355aa in PG14 but PG13 was
> left alone at the time.


+1. And grep shows no more typos of this kind in the source code on master.

$ find . -name "*.[ch]" | xargs grep ' fo '
./src/bin/pgbench/pgbench.c: * Return the number fo failed transactions.

Thanks
Richard


Re: START_REPLICATION SLOT causing a crash in an assert build

2022-09-13 Thread Jaime Casanova
On Tue, Sep 13, 2022 at 06:48:45PM +0900, Kyotaro Horiguchi wrote:
> Nice finding.
> 
> At Tue, 13 Sep 2022 00:39:45 -0500, Jaime Casanova 
>  wrote in 
> > and the problem seems to be that after zero'ing the stats that includes
> > the name of the replication slot, this simple patch fixes it... not sure
> > if it's the right fix though...
> 
> That doesn't work, since what that function clears is not the name in
> the slot struct but the one in the stats entry.
> 

you're right... the curious thing is that I tested it and it worked, but
now it doesn't... maybe it was too late for me...

> The cause is that what pg_stat_reset_replslot wants to do does not match
> what the pgstat facility regards as a reset.
> 
[...]
> 
> Another measure would be to add the region to wipe-out on reset to
> PgStat_KindInfo, but it seems too much.. (attached)
> 

This patch solves the problem.  I didn't like the other solution because,
as you say, it partly nullifies the protection of the assertion.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: failing to build preproc.c on solaris with sun studio

2022-09-13 Thread Thomas Munro
On Wed, Sep 14, 2022 at 10:23 AM Thomas Munro  wrote:
> Given the simplicity of this case, though, I suppose we could
> have a little not-very-general shell/python/whatever wrapper script --
> just compute a checksum of the input and keep the output files around.

Something as dumb as this perhaps...


simple-bison-cache.sh
Description: application/shellscript


Re: A small typo

2022-09-13 Thread Amit Kapila
On Wed, Sep 14, 2022 at 8:16 AM Kyotaro Horiguchi
 wrote:
>
> I found a small typo in a comment in pgbench.c of 15/master.
>
> - * Return the number fo failed transactions.
> + * Return the number of failed transactions.
>

LGTM.

> While at it, I found "* lot fo unnecessary work." in pg13's
> procsignal.c. It has been fixed by 2a093355aa in PG14 but PG13 was
> left alone at the time.
>

I think sometimes we fix typos only in HEAD. I am not sure if we have
a clear policy to backpatch such things.

-- 
With Regards,
Amit Kapila.




Re: minimum perl version

2022-09-13 Thread John Naylor
On Wed, Sep 14, 2022 at 6:47 AM Tom Lane  wrote:
>
> I've just switched longfin to use built-from-source perl 5.14.0.

In that case, here is a quick update with commit message. Not yet any
change for MSVC, but I can put together something later.

Since we're much less willing to support older Windows and Visual
Studio versions, maybe it's low-enough risk to defer the check to the
Meson conversion? I understand our MSVC process will then go away much
more quickly than autoconf...

-- 
John Naylor
EDB: http://www.enterprisedb.com
From d8e241968395c552d667c1eb393cbaea12f0df6c Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Wed, 14 Sep 2022 09:58:13 +0700
Subject: [PATCH v2] Bump minimum Perl version to 5.14

The oldest vendor-shipped Perl in the buildfarm is 5.14.2, which is the
last version that Debian Wheezy shipped. That OS is EOL, but we keep
it running because there is no other convenient way to test certain
non-mainstream 32-bit platforms. There is no bugfix in the .2 release
that is required, and yet it's also not the latest minor release --
that would be 5.14.4. To clarify the situation, we also have arranged the
buildfarm to test 5.14.0. That allows configure scripts and documentation
to state 5.14 without fine print.

Discussion: https://www.postgresql.org/message-id/20220902181553.ev4pgzhubhdkg...@awork3.anarazel.de
---
 config/perl.m4   |  4 ++--
 configure|  6 +++---
 doc/src/sgml/install-windows.sgml|  2 +-
 doc/src/sgml/installation.sgml   |  4 ++--
 src/pl/plperl/plc_perlboot.pl|  1 -
 src/test/perl/PostgreSQL/Test/Cluster.pm |  2 +-
 src/test/perl/README | 10 +++---
 src/tools/msvc/gendef.pl |  1 -
 src/tools/pgindent/pgindent  |  1 -
 9 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/config/perl.m4 b/config/perl.m4
index c9fd91397c..8126e79f67 100644
--- a/config/perl.m4
+++ b/config/perl.m4
@@ -11,11 +11,11 @@ if test "$PERL"; then
   pgac_perl_version=`$PERL -v 2>/dev/null | sed -n ['s/This is perl.*v[a-z ]*\([0-9]\.[0-9][0-9.]*\).*$/\1/p']`
   AC_MSG_NOTICE([using perl $pgac_perl_version])
   if echo "$pgac_perl_version" | sed ['s/[.a-z_]/ /g'] | \
-$AWK '{ if ([$]1 == 5 && ([$]2 > 8 || ($[2] == 8 && [$]3 >= 3))) exit 1; else exit 0;}'
+$AWK '{ if ([$]1 == 5 && ([$]2 >= 14)) exit 1; else exit 0;}'
   then
 AC_MSG_WARN([
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
-*** Perl version 5.8.3 or later is required, but this is $pgac_perl_version.])
+*** Perl version 5.14 or later is required, but this is $pgac_perl_version.])
 PERL=""
   fi
 fi
diff --git a/configure b/configure
index fd2a782454..f325bd85b8 100755
--- a/configure
+++ b/configure
@@ -10491,14 +10491,14 @@ if test "$PERL"; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: using perl $pgac_perl_version" >&5
 $as_echo "$as_me: using perl $pgac_perl_version" >&6;}
   if echo "$pgac_perl_version" | sed 's/[.a-z_]/ /g' | \
-$AWK '{ if ($1 == 5 && ($2 > 8 || ($2 == 8 && $3 >= 3))) exit 1; else exit 0;}'
+$AWK '{ if ($1 == 5 && ($2 >= 14)) exit 1; else exit 0;}'
   then
 { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
-*** Perl version 5.8.3 or later is required, but this is $pgac_perl_version." >&5
+*** Perl version 5.14 or later is required, but this is $pgac_perl_version." >&5
 $as_echo "$as_me: WARNING:
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
-*** Perl version 5.8.3 or later is required, but this is $pgac_perl_version." >&2;}
+*** Perl version 5.14 or later is required, but this is $pgac_perl_version." >&2;}
 PERL=""
   fi
 fi
diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index c00ab2b4b2..29d3294dc8 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -190,7 +190,7 @@ $ENV{MSBFLAGS}="/m";
   or Cygwin Perl will not work. It must also be present in the PATH.
   Binaries can be downloaded from
   https://www.activestate.com";>
-  (Note: version 5.8.3 or later is required,
+  (Note: version 5.14 or later is required,
   the free Standard Distribution is sufficient).
  
 
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 7c79608e55..319c7e6966 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -165,7 +165,7 @@ su - postgres
   PL/Perl you need a full
   Perl installation, including the
   libperl library and the header files.
-  The minimum required version is Perl 5.8.3.
+  The minimum required version is Perl 5.14.
   Since PL/Perl will be a shared
   library, the libperl
   libperl library must be a shared library
@@ -325,7 +325,7 @@ su - postgres
perl
   
 
-  Perl 5.8.3 or later is needed 

Re: A small typo

2022-09-13 Thread Tom Lane
Amit Kapila  writes:
> On Wed, Sep 14, 2022 at 8:16 AM Kyotaro Horiguchi
>  wrote:
>> I found a small typo in a comment in pgbench.c of 15/master.
>> - * Return the number fo failed transactions.
>> + * Return the number of failed transactions.

> LGTM.

+1

>> While at it, I found "* lot fo unnecessary work." in pg13's
>> procsignal.c. It has been fixed by 2a093355aa in PG14 but PG13 was
>> left alone at the time.

> I think sometimes we fix typos only in HEAD. I am not sure if we have
> a clear policy to backpatch such things.

I would not go back and change v13 at this point.  You're right
that this is fuzzy, but overriding the contemporaneous decision
not to backpatch seems well outside our usual habits.

There are basically two good reasons to back-patch comment changes:

* fear that the comment is wrong enough to mislead people looking
at the older branch;

* fear that leaving it alone will create a merge hazard for future
back-patches.

It doesn't seem to me that either of those is a strong concern
in this case.  In the absence of these concerns, back-patching
seems like make-work (and useless expenditure of buildfarm
cycles).

regards, tom lane




Re: minimum perl version

2022-09-13 Thread Tom Lane
John Naylor  writes:
> On Wed, Sep 14, 2022 at 6:47 AM Tom Lane  wrote:
>> I've just switched longfin to use built-from-source perl 5.14.0.

> In that case, here is a quick update with commit message. Not yet any
> change for MSVC, but I can put together something later.

Looks reasonable just by eyeball, did not test.

> Since we're much less willing to support older Windows and Visual
> Studio versions, maybe it's low-enough risk to defer the check to the
> Meson conversion? I understand our MSVC process will then go away much
> more quickly than autoconf...

Agreed --- the MSVC scripts are on a pretty short leash now.
Not clear it's worth fixing them just for this.  If we've
failed to get rid of them by the time v16 release approaches,
maybe it'd be worth doing something then.

regards, tom lane




Re: A small typo

2022-09-13 Thread Amit Kapila
On Wed, Sep 14, 2022 at 9:10 AM Tom Lane  wrote:
>
> Amit Kapila  writes:
>
> There are basically two good reasons to back-patch comment changes:
>
> * fear that the comment is wrong enough to mislead people looking
> at the older branch;
>
> * fear that leaving it alone will create a merge hazard for future
> back-patches.
>
> It doesn't seem to me that either of those is a strong concern
> in this case.  In the absence of these concerns, back-patching
> seems like make-work (and useless expenditure of buildfarm
> cycles).
>

Agreed. I'll push this to HEAD after some time.

-- 
With Regards,
Amit Kapila.




Re: A question about wording in messages

2022-09-13 Thread Kyotaro Horiguchi
At Tue, 13 Sep 2022 22:38:46 -0400, Tom Lane  wrote in 
> Kyotaro Horiguchi  writes:
> > I saw the following message recently modified.
> >> This controls the maximum distance we can read ahead in the WAL to 
> >> prefetch referenced data blocks.
> > Maybe the "we" means "PostgreSQL program and you" but I see it
> > somewhat out of place.
> 
> +1, I saw that today and thought it was outside our usual style.
> The whole thing is awfully verbose for a GUC description, too.
> Maybe
> 
> "Maximum distance to read ahead in WAL to prefetch data blocks."

It seems to work well enough for average users and is rather easy to
read, but it reads like a short description.

wal_decode_buffer_size has the following descriptions.

Short: Buffer size for reading ahead in the WAL during recovery.
Extra: This controls the maximum distance we can read ahead in the WAL to 
prefetch referenced data blocks.

So, splitting the difference between the two, how about the following?

Short: Buffer size for reading ahead in the WAL during recovery.
Extra: This controls the maximum distance to read ahead in WAL to prefetch data blocks.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: A small typo

2022-09-13 Thread Kyotaro Horiguchi
At Wed, 14 Sep 2022 09:19:22 +0530, Amit Kapila  wrote 
in 
> On Wed, Sep 14, 2022 at 9:10 AM Tom Lane  wrote:
> >
> > Amit Kapila  writes:
> >
> > There are basically two good reasons to back-patch comment changes:
> >
> > * fear that the comment is wrong enough to mislead people looking
> > at the older branch;
> >
> > * fear that leaving it alone will create a merge hazard for future
> > back-patches.
> >
> > It doesn't seem to me that either of those is a strong concern
> > in this case.  In the absence of these concerns, back-patching
> > seems like make-work (and useless expenditure of buildfarm
> > cycles).
> >
> 
> Agreed. I'll push this to HEAD after some time.

Thanks for committing, and for the clarification about back-patching
policy!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Expand palloc/pg_malloc API

2022-09-13 Thread Peter Eisentraut

On 12.09.22 15:49, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 09.09.22 22:13, Tom Lane wrote:
>>> I think serious consideration should be given to back-patching the
>>> 0001 part (that is, addition of the macros).  Otherwise we'll have
>>> to remember not to use these macros in code intended for back-patch,
>>> and that'll be mighty annoying once we are used to them.
>
>> Yes, the 0001 patch is kept separate so that we can do that when we feel
>> the time is right.
>
> I think the right time is now, or at least as soon as you're
> satisfied that the buildfarm is happy.

This has been done.





Re: Expand palloc/pg_malloc API

2022-09-13 Thread Peter Eisentraut
I have another little idea that fits well here: repalloc0 and 
repalloc0_array.  These zero out the space added by repalloc.  This is a 
common pattern in the backend code that is quite hairy to code by hand. 
See attached patch.
From fa611ecf7c8a7b99d37c77da66b421d2f9ebfec3 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 14 Sep 2022 06:05:46 +0200
Subject: [PATCH] Add repalloc0 and repalloc0_array

These zero out the space added by repalloc.  This is a common pattern
that is quite hairy to code by hand.
---
 src/backend/executor/nodeHash.c  |  8 ++-
 src/backend/libpq/be-fsstubs.c   |  9 +++-
 src/backend/optimizer/util/placeholder.c | 13 ---
 src/backend/optimizer/util/relnode.c | 29 +++-
 src/backend/parser/parse_param.c | 10 +++-
 src/backend/storage/lmgr/lwlock.c|  9 ++--
 src/backend/utils/adt/ruleutils.c|  9 ++--
 src/backend/utils/cache/typcache.c   | 10 ++--
 src/backend/utils/mmgr/mcxt.c| 15 
 src/include/utils/palloc.h   |  2 ++
 10 files changed, 43 insertions(+), 71 deletions(-)

diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 77dd1dae8b..674802a4ff 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -940,12 +940,8 @@ ExecHashIncreaseNumBatches(HashJoinTable hashtable)
else
{
/* enlarge arrays and zero out added entries */
-   hashtable->innerBatchFile = 
repalloc_array(hashtable->innerBatchFile, BufFile *, nbatch);
-   hashtable->outerBatchFile = 
repalloc_array(hashtable->outerBatchFile, BufFile *, nbatch);
-   MemSet(hashtable->innerBatchFile + oldnbatch, 0,
-  (nbatch - oldnbatch) * sizeof(BufFile *));
-   MemSet(hashtable->outerBatchFile + oldnbatch, 0,
-  (nbatch - oldnbatch) * sizeof(BufFile *));
+   hashtable->innerBatchFile = 
repalloc0_array(hashtable->innerBatchFile, BufFile *, nbatch, oldnbatch);
+   hashtable->outerBatchFile = 
repalloc0_array(hashtable->outerBatchFile, BufFile *, nbatch, oldnbatch);
}
 
MemoryContextSwitchTo(oldcxt);
diff --git a/src/backend/libpq/be-fsstubs.c b/src/backend/libpq/be-fsstubs.c
index 3e5cada7eb..bf901a9cdc 100644
--- a/src/backend/libpq/be-fsstubs.c
+++ b/src/backend/libpq/be-fsstubs.c
@@ -696,19 +696,16 @@ newLOfd(void)
newsize = 64;
cookies = (LargeObjectDesc **)
MemoryContextAllocZero(fscxt, newsize * 
sizeof(LargeObjectDesc *));
-   cookies_size = newsize;
}
else
{
/* Double size of array */
i = cookies_size;
newsize = cookies_size * 2;
-   cookies = (LargeObjectDesc **)
-   repalloc(cookies, newsize * sizeof(LargeObjectDesc *));
-   MemSet(cookies + cookies_size, 0,
-  (newsize - cookies_size) * sizeof(LargeObjectDesc 
*));
-   cookies_size = newsize;
+   cookies =
+   repalloc0_array(cookies, LargeObjectDesc *, newsize, 
cookies_size);
}
+   cookies_size = newsize;
 
return i;
 }
diff --git a/src/backend/optimizer/util/placeholder.c 
b/src/backend/optimizer/util/placeholder.c
index c7bfa293c9..e40b3a12fd 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -133,16 +133,11 @@ find_placeholder_info(PlannerInfo *root, PlaceHolderVar 
*phv)
while (phinfo->phid >= new_size)
new_size *= 2;
if (root->placeholder_array)
-   {
-   root->placeholder_array = (PlaceHolderInfo **)
-   repalloc(root->placeholder_array,
-sizeof(PlaceHolderInfo *) * 
new_size);
-   MemSet(root->placeholder_array + 
root->placeholder_array_size, 0,
-  sizeof(PlaceHolderInfo *) * (new_size - 
root->placeholder_array_size));
-   }
+   root->placeholder_array =
+   repalloc0_array(root->placeholder_array, 
PlaceHolderInfo *, new_size, root->placeholder_array_size);
else
-   root->placeholder_array = (PlaceHolderInfo **)
-   palloc0(new_size * sizeof(PlaceHolderInfo *));
+   root->placeholder_array =
+   palloc0_array(PlaceHolderInfo *, new_size);
root->placeholder_array_size = new_size;
}
root->placeholder_array[phinfo->phid] = phinfo;
diff --git a/src/backend/optimizer/util/relnode.c 
b/src/backend/optimizer/util/relnode.c
index edcdd0a360..3ea40ff1b6 100644
--- a/src/b

Re: build remaining Flex files standalone

2022-09-13 Thread John Naylor
On Wed, Sep 14, 2022 at 6:10 AM Andres Freund  wrote:
>
> On 2022-09-12 14:49:50 +0700, John Naylor wrote:
> > CI builds fine. For v2 I only adjusted the commit message. I'll push
> > in a couple days unless there are objections.
>
> Makes sense to me. Thanks for working on it!

This is done.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: failing to build preproc.c on solaris with sun studio

2022-09-13 Thread Justin Pryzby
On Wed, Sep 14, 2022 at 03:08:06PM +1200, Thomas Munro wrote:
> On Wed, Sep 14, 2022 at 10:23 AM Thomas Munro  wrote:
> > Given the simplicity of this case, though, I suppose we could
> > have a little not-very-general shell/python/whatever wrapper script --
> > just compute a checksum of the input and keep the output files around.
> 
> Something as dumb as this perhaps...

> if [ -z "$c_file" ] ; then
>   c_file="(echo "$y_file" | sed 's/\.y/.tab.c/')"
> fi

This looks wrong.  I guess you meant to use $() and are missing a "$"?

It could be:
[ -z "$c_file" ] &&
c_file=${y_file%.y}.tab.c

> if [ -z "$SIMPLE_BISON_CACHE_PATH" ] ; then
>   SIMPLE_BISON_CACHE_PATH="/tmp/simple-bison-cache"
> fi

Should this default to CCACHE_DIR?  Then it would work under cirrusci...

> h_file="$(echo $c_file | sed 's/\.c/.h/')"

Could be ${c_file%.c}.h

> if [ ! -e "$cached_c_file" -o ! -e "$cached_h_file" ] ; then

You could write the easy case first (I forget whether it's considered to
be more portable to write && outside of []).

> if [ -e "$cached_c_file" ] && [ -e "$cached_h_file" ] ; then

I can't see what part of this would fail to handle filenames with spaces
(?)
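
Putting those together, a corrected fragment might look like this (a
sketch, assuming the script's variable names):

# parameter expansion instead of echo | sed
[ -z "$c_file" ] && c_file=${y_file%.y}.tab.c
h_file=${c_file%.c}.h
# default to CCACHE_DIR when it is set, else /tmp
: "${SIMPLE_BISON_CACHE_PATH:=${CCACHE_DIR:-/tmp/simple-bison-cache}}"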

-- 
Justin




Re: Expand palloc/pg_malloc API

2022-09-13 Thread Tom Lane
Peter Eisentraut  writes:
> I have another little idea that fits well here: repalloc0 and 
> repalloc0_array.  These zero out the space added by repalloc.  This is a 
> common pattern in the backend code that is quite hairy to code by hand. 
> See attached patch.

+1 in general --- you've put your finger on something I felt was
missing, but couldn't quite identify.

However, I'm a bit bothered by the proposed API:

+extern pg_nodiscard void *repalloc0(void *pointer, Size size, Size oldsize);

It kind of feels that the argument order should be pointer, oldsize, size.
It feels even more strongly that people will get the ordering wrong,
whichever we choose.  Is there a way to make that more bulletproof?

The only thought that comes to mind offhand is that the only plausible
use-case is with size >= oldsize, so maybe an assertion or even a
runtime check would help to catch getting it backwards.  (I notice
that your proposed coding will fail rather catastrophically if the
caller gets it backwards.  An assertion failure would be better.)

regards, tom lane




Re: archive modules

2022-09-13 Thread Peter Eisentraut

I noticed that this patch has gone around and mostly purged mentions of
archive_command from the documentation and replaced them with
archive_library.  I don't think this is helpful, since people still use
archive_command and will want to see what the documentation has to say
about it.  I suggest we rewind that a bit and for example replace things
like

removed or recycled until they are archived. If WAL archiving cannot keep up
-   with the pace that WAL is generated, or if archive_command
+   with the pace that WAL is generated, or if archive_library
fails repeatedly, old WAL files will accumulate in pg_wal

with

removed or recycled until they are archived. If WAL archiving cannot keep up
with the pace that WAL is generated, or if archive_command
or archive_library
fail repeatedly, old WAL files will accumulate in pg_wal





Re: failing to build preproc.c on solaris with sun studio

2022-09-13 Thread John Naylor
On Wed, Sep 14, 2022 at 5:24 AM Thomas Munro  wrote:
>
> On Tue, Sep 6, 2022 at 9:34 AM Tom Lane  wrote:
> > Peter Eisentraut  writes:
> > > Why is this being proposed?
> >
> > Andres is annoyed by the long build time of ecpg, which he has to
> > wait for whether he wants to test it or not.  I could imagine that
> > I might disable ecpg testing on my slowest buildfarm animals, too.
>
> This message triggered me to try to teach ccache how to cache
> preproc.y -> preproc.{c,h}, and I got that basically working[1], but
> upstream doesn't want it (yet).  I'll try again if the proposed
> refactoring to allow more kinds of compiler-like-things goes
> somewhere.  I think that started with people's struggles with GCC vs
> MSVC.  Given the simplicity of this case, though, I suppose we could
> have a little not-very-general shell/python/whatever wrapper script --
> just compute a checksum of the input and keep the output files around.

If we're going to go to this length, it seems more straightforward to
just check the .c/.h files into version control, like every other
project I know of that does this.

To be fair, our grammar changes much more often. One other possible
deal-breaker of that is that it makes it more painful for forks to
maintain additional syntax.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: Expand palloc/pg_malloc API

2022-09-13 Thread Tom Lane
I wrote:
> It kind of feels that the argument order should be pointer, oldsize, size.
> It feels even more strongly that people will get the ordering wrong,
> whichever we choose.  Is there a way to make that more bulletproof?

Actually ... an even-more-terrifyingly-plausible misuse is that the
supplied oldsize is different from the actual previous allocation.
We should try to check that.  In MEMORY_CONTEXT_CHECKING builds
it should be possible to assert that oldsize == requested_size.
We don't have that data if !MEMORY_CONTEXT_CHECKING, but we could
at least assert that oldsize <= allocated chunk size.
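
(For illustration, a minimal sketch of such a guarded helper, assuming
the v1 argument order and the existing GetMemoryChunkSpace() accessor;
the actual patch may differ:)

void *
repalloc0(void *pointer, Size size, Size oldsize)
{
	void	   *ret;

	/* catch swapped or inconsistent size arguments as early as possible */
	Assert(size >= oldsize);
	Assert(oldsize <= GetMemoryChunkSpace(pointer));

	ret = repalloc(pointer, size);
	/* zero only the newly added tail */
	memset((char *) ret + oldsize, 0, size - oldsize);
	return ret;
}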

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Michael Paquier
On Tue, Sep 13, 2022 at 05:38:47PM -0400, Tom Lane wrote:
> is something I tried along the way to diagnosing the problem, and
> it turns out to have exactly zero effect.  The $tempdir is some
> temporary subdirectory of tmp_check that will get nuked at the end
> of the TAP test no matter what.  So these rmtrees are merely making
> the evidence disappear a bit faster; it will anyway.

FWIW, I just stick a die() in the middle of the code path when I want
to look at specific results.  Similar method, same result.

>  # Test background stream process terminating before the basebackup has
> 
> which is not real clean, since then the files get left behind even
> on success, which I doubt we want either.

Another thing that could be done here is to use the same base location
as the cluster nodes, aka $PostgreSQL::Test::Utils::tmp_check.  That
would mean keeping more data associated with the base backups around
after a fresh run, though.  I am not sure that small machines would
like this accumulation in a single run, even if disk space is cheap
these days.

> Anyway, I have no objection to dropping the rmtrees, since they're
> pretty useless as the code stands.  Just wanted to mention this
> issue for the archives.

I see more ways to change the existing behavior, so for now I have
left that untouched.

And so, I have spent a couple of hours torturing the patch, applying
it after a few tweaks and CI runs:
- --without-zlib was causing a failure in the pg_basebackup tests as
we have a few tests that parse and validate a set of invalid specs for
the client-side and server-side compression.  With zlib around, the
tests and their expected results are unchanged; that's just a
consequence of moving the assignment of a default level much earlier.
- pg_basebackup was triggering an assertion when using client-lz4 or
client-zstd as we use the directory method of walmethods.c.  In this
case, we just support zlib as compression and enforce no compression
when we are under lz4 or zstd.  This came from an over-simplification
of the code.  There is a gap in the testing of pg_basebackup,
actually, because we have zero tests for LZ4 and zstd there.
- The documentation of the replication protocol needed some
adjustments for the default compression levels.

The buildfarm is green so I think that we are good.  I have closed the
open item.

You have mentioned upthread an extra thing about the fact that we pass
down 0 to deflateParams().  This is indeed wrong, and we are lucky that
Z_DEFAULT_STRATEGY maps to 0.  Better to fix and backpatch this one
down to where gzip compression was added to walmethods.c.  I'll
just do that in a bit after double-checking the area and the other
routines.
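
(For reference, a sketch of the shape of the suspect call, not the
actual walmethods.c code; zp stands in for the zlib stream:)

/* passing 0 only works because Z_DEFAULT_STRATEGY happens to be 0 */
deflateParams(zp, compression_level, 0);
/* what was meant */
deflateParams(zp, compression_level, Z_DEFAULT_STRATEGY);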
--
Michael


signature.asc
Description: PGP signature


Re: failing to build preproc.c on solaris with sun studio

2022-09-13 Thread Tom Lane
John Naylor  writes:
> If we're going to go to this length, it seems more straightforward to
> just check the .c/.h files into version control, like every other
> project that I have such knowledge of.

Strong -1 on that, because then we'd have to mandate that every
committer use exactly the same version of bison.  It's been
painful enough to require that for autoconf (and I'm pleased that
it looks like meson will let us drop that nonsense).

regards, tom lane




Re: [PATCH]Feature improvement for MERGE tab completion

2022-09-13 Thread bt22kawamotok
When I tried to apply this patch, I got the following warnings; please
fix them.

Other than that, I think everything is fine.

$ git apply fix_tab_completion_merge_v4.patch
fix_tab_completion_merge_v4.patch:38: trailing whitespace.
else if (TailMatches("USING", MatchAny, "ON", MatchAny) ||
fix_tab_completion_merge_v4.patch:39: indent with spaces.
 TailMatches("USING", MatchAny, "AS", MatchAny, "ON",
MatchAny) ||
fix_tab_completion_merge_v4.patch:40: indent with spaces.
 TailMatches("USING", MatchAny, MatchAny, "ON", 
MatchAny))

fix_tab_completion_merge_v4.patch:53: trailing whitespace.
else if (TailMatches("WHEN", "MATCHED") ||
warning: 4 lines add whitespace errors.


Thanks for reviewing.

I fixed the problem and made patch v5.
Please check it.

Regards,

Kotaro Kawamoto

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 62a39779b9..a0332c76a4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -4063,23 +4063,28 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_mergetargets);
 	else if (TailMatches("MERGE", "INTO", MatchAny))
 		COMPLETE_WITH("USING", "AS");
-	else if (TailMatches("MERGE", "INTO", MatchAny, "USING"))
+	else if (TailMatches("MERGE", "INTO", MatchAny, "USING") ||
+			 TailMatches("MERGE", "INTO", MatchAny, MatchAny, "USING") ||
+			 TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
 	/* with [AS] alias */
-	else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny))
+	else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny) ||
+			 TailMatches("MERGE", "INTO", MatchAny, MatchAnyExcept("AS")))
 		COMPLETE_WITH("USING");
-	else if (TailMatches("MERGE", "INTO", MatchAny, MatchAny))
-		COMPLETE_WITH("USING");
-	else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
-	else if (TailMatches("MERGE", "INTO", MatchAny, MatchAny, "USING"))
+	else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING") ||
+			 TailMatches("MERGE", "INTO", MatchAny, MatchAny, "USING"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
+	else if (TailMatches("MERGE", "INTO", MatchAny, "USING", MatchAny) ||
+			 TailMatches("MERGE", "INTO", MatchAny, MatchAny, "USING", MatchAny) ||
+			 TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING", MatchAny))
+		COMPLETE_WITH("AS", "ON");
 	/* ON */
-	else if (TailMatches("MERGE", "INTO", MatchAny, "USING", MatchAny))
-		COMPLETE_WITH("ON");
-	else if (TailMatches("INTO", MatchAny, "AS", MatchAny, "USING", MatchAny, "AS", MatchAny))
-		COMPLETE_WITH("ON");
-	else if (TailMatches("INTO", MatchAny, MatchAny, "USING", MatchAny, MatchAny))
+	else if (TailMatches("MERGE", "INTO", MatchAny, "USING", MatchAny, MatchAnyExcept("ON|AS")) ||
+			 TailMatches("MERGE", "INTO", MatchAny, "USING", MatchAny, "AS", MatchAny) ||
+			 TailMatches("MERGE", "INTO", MatchAny, MatchAny, "USING", MatchAny, MatchAnyExcept("ON|AS")) ||
+			 TailMatches("MERGE", "INTO", MatchAny, MatchAny, "USING", MatchAny, "AS", MatchAny) ||
+			 TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING", MatchAny, MatchAnyExcept("ON|AS")) ||
+			 TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING", MatchAny, "AS", MatchAny))
 		COMPLETE_WITH("ON");
 	/* ON condition */
 	else if (TailMatches("INTO", MatchAny, "USING", MatchAny, "ON"))
@@ -4089,18 +4094,25 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches("INTO", MatchAny, MatchAny, "USING", MatchAny, MatchAny, "ON"))
 		COMPLETE_WITH_ATTR(prev6_wd);
 	/* WHEN [NOT] MATCHED */
-	else if (TailMatches("USING", MatchAny, "ON", MatchAny))
+	else if (TailMatches("USING", MatchAny, "ON", MatchAny) ||
+			 TailMatches("USING", MatchAny, "ON", MatchAny, MatchAnyExcept("When"), MatchAnyExcept("When")) ||
+			 TailMatches("USING", MatchAny, "AS", MatchAny, "ON", MatchAny) ||
+			 TailMatches("USING", MatchAny, "AS", MatchAny, "ON", MatchAny, MatchAnyExcept("When"), MatchAnyExcept("When")) ||
+			 TailMatches("USING", MatchAny, MatchAny, "ON", MatchAny) ||
+			 TailMatches("USING", MatchAny, MatchAny, "ON", MatchAny, MatchAnyExcept("When"), MatchAnyExcept("When")))
 		COMPLETE_WITH("WHEN MATCHED", "WHEN NOT MATCHED");
-	else if (TailMatches("USING", MatchAny, "AS", MatchAny, "ON", MatchAny))
-		COMPLETE_WITH("WHEN MATCHED", "WHEN NOT MATCHED");
-	else if (TailMatches("USING", MatchAny, MatchAny, "ON", MatchAny))
-		COMPLETE_WITH("WHEN MATCHED", "WHEN NOT MATCHED");
-	else if (TailMatches("WHEN", "MATCHED"))
-		COMPLETE_WITH("THEN", "AND");
-	else if (TailMatches("WHEN", "NOT", "MATCHED"))
+	else if (TailMatches("USING", MatchAny, "ON", MatchAny, "WHEN") ||
+			 TailMatches("USING", MatchAny, "ON", MatchAny, MatchAny, MatchAny, "WHEN") ||
+			 TailMatches("USING", MatchAny, "AS", MatchAny, "ON", Match

RE: why can't a table be part of the same publication as its schema

2022-09-13 Thread houzj.f...@fujitsu.com
On Monday, September 12, 2022 10:14 PM vignesh C  wrote:
> On Sat, 10 Sept 2022 at 07:32, Amit Kapila  wrote:
> >
> > On Fri, Sep 9, 2022 at 8:48 PM Robert Haas 
> wrote:
> > >
> > > On Fri, Sep 9, 2022 at 10:29 AM houzj.f...@fujitsu.com
> > >  wrote:
> > > > IIRC, the feature currently works almost the same as you
> > > > described. It doesn't create entry for tables that are published
> > > > via its schema level, it only record the published schema and check 
> > > > which
> tables are part of it.
> > >
> > > Oh, well if that's the case, that is great news.
> > >
> >
> > Yes, the feature works as you and Hou-San have mentioned.
> >
> > > But then I don't
> > > understand Amit's comment from before:
> > >
> > > > Yes, because otherwise, there was confusion while dropping the
> > > > objects from publication. Consider in the above case, if we would
> > > > have allowed it and then the user performs ALTER PUBLICATION p1
> > > > DROP ALL TABLES IN SCHEMA s1, then (a) shall we remove both schema
> > > > s1 and a table that is separately added (s1.t1) from that schema,
> > > > or (b) just remove schema s1?
> > >
> > > I believe that (b) is the correct behavior, so I assumed that this
> > > issue must be some difficulty in implementing it, like a funny
> > > catalog representation.
> > >
> >
> > No, it was because of syntax. IIRC, during development, Greg Nancarrow
> > raised a point [1] that a user can expect the individually added
> > tables for a schema which is also part of the publication to also get
> > dropped when she specifies DROP ALL TABLES IN SCHEMA. IIRC,
> > originally, the patch had a behavior (b) but then changed due to
> > discussion around this point. But now that it seems you and others
> > don't feel that was right, we can change back to (b) as I think that
> > shouldn't be difficult to achieve.
> 
> I have made the changes to allow creation of publication with a schema and
> table of the same schema. The attached patch has the changes for the same.
> I'm planning to review and test the patch further.

Thanks for the patch. While reviewing it, I found that the column list behavior
might need to be changed or confirmed after allowing the above case.

After applying the patch, we support adding a table with a column list along
with the table's schema[1], and logical replication will then apply that
column list directly.

[1]--
CREATE PUBLICATION pub FOR TABLE public.test(a), FOR ALL TABLES IN SCHEMA public;
-

From the point of view of consistency, we could report an ERROR for the
column list, because we currently don't allow using different column lists
for a table. Maybe an ERROR like:

"ERROR: cannot use column list for table x when the table's schema is also in
the publication"

But if we want to report an ERROR for the column list in the above case, we
might need to restrict ALTER TABLE ... SET SCHEMA as well, because a user
could move a table that is published with a column list into a schema that is
also published in the publication.  So we might need to add a similar check
(like the one removed in Vignesh's patch) to tablecmds.c to disallow this
case.
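
For example, something like the following sequence (hypothetical
names) would leave t1 published both with its column list and via its
new schema:

CREATE PUBLICATION pub2 FOR TABLE s1.t1 (a), FOR ALL TABLES IN SCHEMA s2;
ALTER TABLE s1.t1 SET SCHEMA s2;  -- t1 now also published via schema s2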

Another option could be to just ignore the column list if the table's schema
is also part of the publication, but that seems slightly inconsistent with
the rule that we disallow using different column lists for a table.

Best regards,
Hou zj


Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Tom Lane
Michael Paquier  writes:
> And so, I have spent a couple of hours torturing the patch, applying
> it after a few tweaks and CI runs:
> ...
> The buildfarm is green so I think that we are good.  I have closed the
> open item.

+1, thanks for taking care of that.

As far as my original complaint about mamba goes, I've not quite
been able to run it to ground.  However, I found that NetBSD
seems to be shipping unmodified zlib 1.2.11, which contains a
number of known bugs in deflate_stored() --- that is, the code
path implementing compression level 0.  Red Hat for one is
carrying several back-patched fixes in that part of zlib.
So for the moment I'm willing to write it off as "not our bug".
We aren't intentionally testing compression level 0, and hardly
anybody would intentionally use it in the field, so it's not
really a thing worth worrying about IMO.  But if mamba continues
to show failures in that test then it will be worth looking closer.

regards, tom lane




Re: archive modules

2022-09-13 Thread Michael Paquier
On Wed, Sep 14, 2022 at 06:37:38AM +0200, Peter Eisentraut wrote:
> I noticed that this patch has gone around and mostly purged mentions of
> archive_command from the documentation and replaced them with
> archive_library.  I don't think this is helpful, since people still use
> archive_command and will want to see what the documentation has to say
> about it.  I suggest we rewind that a bit and for example replace things
> like
> 
> removed or recycled until they are archived. If WAL archiving cannot keep up
> -   with the pace that WAL is generated, or if archive_command
> +   with the pace that WAL is generated, or if archive_library
> fails repeatedly, old WAL files will accumulate in pg_wal
> 
> with
> 
> removed or recycled until they are archived. If WAL archiving cannot keep up
> with the pace that WAL is generated, or if archive_command
> or archive_library
> fail repeatedly, old WAL files will accumulate in pg_wal

Yep.  Some references to archive_library have been changed by 31e121
to do exactly that.  There seem to be more spots in need of an
update.
--
Michael


signature.asc
Description: PGP signature


Re: is_superuser is not documented

2022-09-13 Thread bt22kawamotok

Thanks for updating the patch!

The patch looks good to me.

-   /* Not for general use --- used by SET SESSION AUTHORIZATION */
{"session_authorization", PGC_USERSET, UNGROUPED,

If we don't document session_authorization and do expect that
it's usually used by SET/SHOW SESSION AUTHORIZATION,
this comment doesn't need to be removed, I think.


Thanks for reviewing.

I updated the patch to reflect the changes in master.


Could you register this patch into the next Commit Fest
so that we don't forget it?


OK, I will register it in the next Commit Fest. Thank you.

Regards,

Kotaro Kawamoto

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a5cd4e44c7..6358f5e887 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -10584,6 +10584,19 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
+ 
+  is_superuser (boolean)
+  
+   is_superuser configuration parameter
+  
+  
+  
+   
+Shows whether the current user is a superuser or not.
+   
+  
+ 
+
  
   lc_collate (string)
   
diff --git a/doc/src/sgml/ref/show.sgml b/doc/src/sgml/ref/show.sgml
index b3747b119f..d8721be805 100644
--- a/doc/src/sgml/ref/show.sgml
+++ b/doc/src/sgml/ref/show.sgml
@@ -100,15 +100,6 @@ SHOW ALL
  
 

-
-   
-IS_SUPERUSER
-
- 
-  True if the current role has superuser privileges.
- 
-
-   
   
 

diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 87e625aa7a..6b6e02e162 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -989,11 +989,10 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
-		/* Not for general use --- used by SET SESSION AUTHORIZATION */
-		{"is_superuser", PGC_INTERNAL, UNGROUPED,
+		{"is_superuser", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether the current user is a superuser."),
 			NULL,
-			GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
 		},
 		&session_auth_is_superuser,
 		false,


Re: pgsql: Doc: Explain about Column List feature.

2022-09-13 Thread Peter Smith
On Tue, Sep 13, 2022 at 10:11 PM Alvaro Herrera  wrote:
>
> On 2022-Sep-07, Amit Kapila wrote:
>
> > Doc: Explain about Column List feature.
> >
> > Add a new logical replication section for "Column Lists" (analogous to the
> > Row Filters page). This explains how the feature can be used and the
> > caveats in it.
> >
> > Author: Peter Smith
> > Reviewed-by: Shi yu, Vignesh C, Erik Rijkers, Amit Kapila
> > Backpatch-through: 15, where it was introduced
> > Discussion: 
> > https://postgr.es/m/CAHut+PvOuc9=_4TbASc5=vuqh16uwtfo3gzckqk_5m1hrw3...@mail.gmail.com
>
> Hi
>
> I just read these docs and noticed that it mentions that column lists
> can be used for security.  As far as I remember, this is wrong: it is
> the subscriber that builds the query to read column data during initial
> sync, and the publisher doesn't forbid to read columns not in it, so it
> is entirely possible for a malicious subscriber to read columns other
> than those published.  I'm pretty sure we discussed this at some point
> during development of the feature.
>
> So I suggest to mention this point explicitly in its own paragraph, to
> avoid giving a false sense of security.
>

Thanks for the feedback.

The mention of 'security' in the page (as previously written) just
means to say that publications can prevent sensitive columns from
being replicated/subscribed by default. It was not intended to imply
those columns are immune from a malicious attack. Indeed, just having
another publication without any column lists could expose the same
sensitive columns.

I am fine with your rewording of the security part.

> While going over this text I found some additional things that could
> --in my opinion-- stand some improvement:

In general (because they have a lot of similarities), the wording and
the section structure of the "Column Lists" page tried to be
consistent with the "Row Filters" page. Perhaps this made everything
unnecessarily complex. Anyway, your suggested re-wording and removal
of those sections look OK to me - the content is the same AFAICT.

>
> * It feels better to start the section saying that a list can be
>   specified; subscriber must have all those columns; omitting list
>   means to publish everything. That leads to shorter text (no need to
>   say "you need to have them all, oh wait you might only have a few").
>
> * there's no reason to explain the syntax in vague terms and refer the
>   reader to the reference page.
>
> * The first few  seem to give no useful structure, and instead
>   cause the text to become disorganized. I propose to remove them, and
>   instead mix the paragraphs in them so that we explain the rules and
>   the behavior, and lastly the effect on specific commands.
>
> The attached patch effects those changes.
>

For some reason I was unable to apply your supplied patch to master:

[postgres@CentOS7-x64 oss_postgres_misc]$ git apply column-list-wording.patch
column-list-wording.patch:16: trailing whitespace.
   Each publication can optionally specify which columns of each table are
column-list-wording.patch:17: trailing whitespace.
   replicated to subscribers. The table on the subscriber side must have at
column-list-wording.patch:18: trailing whitespace.
   least all the columns that are published. If no column list is specified,
column-list-wording.patch:19: trailing whitespace.
   then all columns in the publisher are replicated.
column-list-wording.patch:20: trailing whitespace.
   See  for details on the syntax.
error: patch failed: doc/src/sgml/logical-replication.sgml:1093
error: doc/src/sgml/logical-replication.sgml: patch does not apply
error: patch failed: doc/src/sgml/ref/create_publication.sgml:94
error: doc/src/sgml/ref/create_publication.sgml: patch does not apply

>
> One more thing. There's a sect2 about combining column list.  Part of it
> seems pretty judgmental and I see no reason to have it in there; I
> propose to remove it (it's not in this patch).  I think we should just
> say it doesn't work at present, here's how to work around it, and
> perhaps even say that we may lift the restriction in the future.  The
> paragraph that starts with "Background:" is IMO out of place, and it
> repeats the mistake that column lists are for security.
>

It wasn't clear which part you felt was judgemental. I have removed
the "Background" paragraph but I have otherwise left that section and
Warning as-is because it still seemed useful for the user to know. You
can change/remove it if you disagree.

>
> Lastly: In the create-publication reference page I think it would be
> better to reword the Parameters section a bit.  It mentions
> FOR TABLE as a parameter, but the parameter is actually
> table_name; and the row-filter and
> column-list explanations are also put in there when they should be in
> their own expression and column_name
> varlistentries.  I think splitting things that way would result in a
> clearer explanation.
>

IMO this should be proposed as a separate patch. Some of those 
