Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
Hi, On 10/5/22 9:24 AM, Michael Paquier wrote: On Tue, Sep 20, 2022 at 01:33:09PM +0200, Drouvot, Bertrand wrote: Anyway, I have looked at the patch. + List *roles_re; + List *databases_re; + regex_t *hostname_re; I am surprised by the approach of using separate lists for the regular expressions and the raw names. Wouldn't it be better to store everything in a single list but assign an entry type? In this case it would be either regex or plain string. This would minimize the footprint of the changes (no extra arguments *_re in the routines checking for a match on the roles, databases or hosts). And it seems to me that this would make unnecessary the use of re_num here and there. Please find attached v5 addressing this. I started with a union but it turns out that we still need the plain string when a regex is used. This is not needed for the authentication per se but for fill_hba_line(). So I ended up creating a new struct without a union in v5. The hostname is different, of course, requiring only an extra field for its type, or something like that. I'm using the same new struct as described above for the hostname. Perhaps the documentation would gain in clarity if there were more examples, like a set of comma-separated examples (mix of regex and raw strings for example, for all the field types that gain support for regexes)? Right, I added more examples in v5. -$node->append_conf('postgresql.conf', "log_connections = on\n"); +$node->append_conf( +'postgresql.conf', qq{ +listen_addresses = '127.0.0.1' +log_connections = on +}); Hmm. I think that we may need to reconsider the location of the tests for the regexes with the host name, as the "safe" regression tests should not switch listen_addresses. One location where we already do that is src/test/ssl/, so these could be moved there. Good point, I moved the hostname-related tests to src/test/ssl. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index c6f1b70fd3..406628ef35 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -235,8 +235,9 @@ hostnogssenc database userPostgreSQL database. - Multiple database names can be supplied by separating them with - commas. A separate file containing database names can be specified by + Multiple database names and/or regular expressions preceded by / + can be supplied by separating them with commas. + A separate file containing database names can be specified by preceding the file name with @. @@ -249,7 +250,8 @@ hostnogssenc database userall specifies that it matches all users. Otherwise, this is either the name of a specific - database user, or a group name preceded by +. + database user, a regular expression preceded by / + or a group name preceded by +. (Recall that there is no real distinction between users and groups in PostgreSQL; a + mark really means match any of the roles that are directly or indirectly members @@ -258,7 +260,8 @@ hostnogssenc database user/ + can be supplied by separating them with commas. A separate file containing user names can be specified by preceding the file name with @. @@ -270,8 +273,9 @@ hostnogssenc database user Specifies the client machine address(es) that this record - matches.
This field can contain either a host name, a regular expression + preceded by / representing host names, an IP address range, + or one of the special key words mentioned below. @@ -739,6 +743,24 @@ host all all ::1/128 trust # TYPE DATABASE USER ADDRESS METHOD host all all localhost trust +# The same using a regular expression for host name, which allows connection for +# host name ending with "test". +# +# TYPE DATABASE USER ADDRESS METHOD +host all all /^.*test$ trust + +# The same using regular expression for DATABASE, which allows connection to the +# db1 and testdb databases and any database with a name ending with "test". +# +# TYPE DATABASE USER ADDRESS METHOD +local db1,/^.*test$,testdb all /^.*test$ trust + +# The same using regular expression for USER, which allows connection to the +# user1 and testuser users and any user with a name ending with "test". +# +# TYPE DATABASE USER ADDRESS METHOD
Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
Hi, On 10/6/22 2:53 AM, Michael Paquier wrote: On Wed, Oct 05, 2022 at 03:32:20PM +0200, Drouvot, Bertrand wrote: On 10/5/22 9:24 AM, Michael Paquier wrote: - test_role() -> test_conn() to be able to pass down a database name. - reset_pg_hba() to control the host, db and user parts. The host part does not really apply after moving the hosts checks to a more secure location, so I guess that this had better be extended just for the user and database, keeping host=local all the time. I am planning to apply 0001 attached independently, 0001 looks good to me. Thanks. I have applied this refactoring, leaving the host part out of the equation as we should rely only on local connections for this part of the test. Thanks! Thanks! I'll look at it and the comments you just made up-thread. Cool, thanks. One thing that matters a lot IMO (that I forgot to mention previously) is to preserve the order of the items parsed from the configuration files. Fully agree, all the versions that have been submitted in this thread preserve the ordering. Also, I am wondering whether we'd better have some regression tests where a regex includes a comma and a role name itself has a comma, actually, just to stress more the parsing of individual items in the HBA file. Good idea, it has been added in v5 just shared up-thread. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Adding Support for Copy callback functionality on COPY TO api
On Sun, Oct 9, 2022 at 2:44 AM Nathan Bossart wrote: > > Sorry for the noise. There was an extra #include in v4 that I've removed > in v5. IIUC, the COPY TO callback helps move a table's data out of the Postgres server. Just wondering, how is it different from existing solutions like COPY TO ... PROGRAM/FILE, logical replication, pg_dump etc. that can move a table's data out? I understand that the COPY FROM callback was needed for logical replication 7c4f52409. Mentioning a concrete use-case helps here. I'm not quite sure if we need a separate module to just tell how to use this new callback. I strongly feel that it's not necessary. It unnecessarily creates extra code (actual code is 25 LOC with v1 patch but 150 LOC with v5 patch) and can cause a maintenance burden. These callback APIs are simple enough to understand for those who know BeginCopyTo() or BeginCopyFrom() and especially for those who know how to write extensions. These are not APIs that an end-user uses. The best would be to document both COPY FROM and COPY TO callbacks, perhaps with pseudocode specifying just the essence [1], and their possible usages somewhere here https://www.postgresql.org/docs/devel/sql-copy.html. The order of the NOTICE messages below isn't guaranteed and it can change depending on platforms. Previously, we've had to suppress such messages in the test output 6adc5376d. +SELECT test_copy_to_callback('public.test'::pg_catalog.regclass); +NOTICE: COPY TO callback called with data "123" and length 5 +NOTICE: COPY TO callback called with data "123456" and length 8 +NOTICE: COPY TO callback called with data "123456789" and length 11 + test_copy_to_callback [1] +Relation rel = table_open(PG_GETARG_OID(0), AccessShareLock); +CopyToState cstate; + +cstate = BeginCopyTo(NULL, rel, NULL, RelationGetRelid(rel), NULL, NULL, + to_cb, NIL, NIL); +(void) DoCopyTo(cstate); +EndCopyTo(cstate); + +table_close(rel, AccessShareLock); -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
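For reference, the callback side of the pseudocode above is tiny. Here is a minimal sketch of a TO-side callback, assuming the (void *data, int len) signature used by this patch set; the function name and the NOTICE wording simply mirror the test output quoted above (which also suggests the buffer is NUL-terminated in the test module), and a real consumer would hand the bytes to an external sink instead:

#include "postgres.h"

/*
 * Hypothetical COPY TO data callback: called once per formatted row
 * with a pointer to the row text and its length.
 */
static void
to_cb(void *data, int len)
{
	ereport(NOTICE,
			(errmsg("COPY TO callback called with data \"%s\" and length %d",
					(char *) data, len)));
}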
Re: ps command does not show walsender's connected db
On Mon, Oct 10, 2022 at 12:20 PM bt22nakamorit wrote: > > 2022-10-10 12:27 Bharath Rupireddy wrote: > > if (port->database_name != NULL && port->database_name[0] != '\0') > > appendStringInfo(&ps_data, "%s ", port->database_name); > > > > The above works better. > > Thank you for the suggestion. > I have edited the patch to reflect your idea. Thanks. LGTM. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: archive modules
On 05.10.22 20:57, Nathan Bossart wrote: On Wed, Oct 05, 2022 at 07:55:58PM +0200, Peter Eisentraut wrote: Leaving archive_command empty is an intentional configuration choice. What we are talking about here is, arguably, a misconfiguration, so it should result in an error. Okay. What do you think about something like the attached? That looks like the right solution to me. Let's put that into PG 16, and maybe we can consider backpatching it.
Re: Remove unnecessary commas for goto labels
On Sun, Oct 9, 2022 at 10:08 AM Julien Rouhaud wrote: > > Hi, > > On Sun, Oct 09, 2022 at 07:42:58AM +0800, Japin Li wrote: > > > > Hi hackers, > > > > I find there are some unnecessary commas for goto labels, > > attached a patch to remove them. > > You mean semi-colon? +1, and a quick regex later I don't see any other > occurrence. Interestingly, I did find that in C, some statement is needed after a label, even an empty one, otherwise it won't compile. That's not the case here, though, so pushed. -- John Naylor EDB: http://www.enterprisedb.com
Re: Remove unnecessary commas for goto labels
On Mon, 10 Oct 2022 at 16:18, John Naylor wrote: > Interestingly, I did find that in C, some statement is needed after a > label, even an empty one, otherwise it won't compile. Yeah, this is required by ISO C [1]. > That's not the case here, though, so pushed. Thank you! [1] https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html#Labels -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
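For readers following along, the rule is easy to demonstrate with a minimal C function; without the empty statement after the label, pre-C23 compilers reject it:

void
cleanup_example(void)
{
	goto done;
done:
	;	/* ISO C requires a statement after a label; an empty statement suffices */
}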
clean up pid_t printing and get rid of pgpid_t
On 04.10.22 10:15, Peter Eisentraut wrote: I wanted to propose the attached patch to get rid of the custom pgpid_t typedef in pg_ctl. Since we liberally use pid_t elsewhere, this seemed plausible. However, this patch fails the CompilerWarnings job on Cirrus, because apparently under mingw, pid_t is "volatile long long int", so all the printf placeholders mismatch. However, we print pid_t as %d in a lot of other places, so I'm confused why this fails here. I figured out that in most places we actually store PIDs in int, and in the cases where we use pid_t, casts before printing are indeed used and necessary. So nevermind that. In any case, I took this opportunity to standardize the printing of PIDs as %d. There were a few stragglers. And then the original patch to get rid of pgpid_t in pg_ctl, now updated with the correct casts for printing. I confirmed that this now passes the CompilerWarnings job. From 72b0620f3824ef91bf9f64b4814e5874f8152322 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 10 Oct 2022 10:04:45 +0200 Subject: [PATCH v2 1/3] doc: Correct type of bgw_notify_pid This has apparently been wrong since the beginning (090d0f2050647958865cb495dff74af7257d2bb4). --- doc/src/sgml/bgworker.sgml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml index 3d4e4afcf919..7ba5da27e50a 100644 --- a/doc/src/sgml/bgworker.sgml +++ b/doc/src/sgml/bgworker.sgml @@ -63,7 +63,7 @@ Background Worker Processes char bgw_function_name[BGW_MAXLEN]; Datum bgw_main_arg; char bgw_extra[BGW_EXTRALEN]; -int bgw_notify_pid; +pid_t bgw_notify_pid; } BackgroundWorker; base-commit: 06dbd619bfbfe03fefa7223838690d4012f874ad -- 2.37.3 From c405c4a942c861a720662035983dca237fb92527 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 10 Oct 2022 10:07:14 +0200 Subject: [PATCH v2 2/3] Standardize format for printing PIDs Most code prints PIDs as %d, but some code tried to print them as long or unsigned long. While this is in theory allowed, the fact that PIDs fit into int is deeply baked into all PostgreSQL code, so these random deviations don't accomplish anything except confusion. Note that we still need casts from pid_t to int, because on 64-bit MinGW, pid_t is long long int. (But per above, actually supporting that range in PostgreSQL code would be major surgery and probably not useful.)
--- contrib/pg_prewarm/autoprewarm.c | 20 ++-- src/backend/postmaster/bgworker.c| 4 ++-- src/backend/storage/ipc/procsignal.c | 4 ++-- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c index c8d673a20e36..1843b1862e57 100644 --- a/contrib/pg_prewarm/autoprewarm.c +++ b/contrib/pg_prewarm/autoprewarm.c @@ -193,8 +193,8 @@ autoprewarm_main(Datum main_arg) { LWLockRelease(&apw_state->lock); ereport(LOG, - (errmsg("autoprewarm worker is already running under PID %lu", - (unsigned long) apw_state->bgworker_pid))); + (errmsg("autoprewarm worker is already running under PID %d", + (int) apw_state->bgworker_pid))); return; } apw_state->bgworker_pid = MyProcPid; @@ -303,8 +303,8 @@ apw_load_buffers(void) { LWLockRelease(&apw_state->lock); ereport(LOG, - (errmsg("skipping prewarm because block dump file is being written by PID %lu", - (unsigned long) apw_state->pid_using_dumpfile))); + (errmsg("skipping prewarm because block dump file is being written by PID %d", + (int) apw_state->pid_using_dumpfile))); return; } LWLockRelease(&apw_state->lock); @@ -599,12 +599,12 @@ apw_dump_now(bool is_bgworker, bool dump_unlogged) { if (!is_bgworker) ereport(ERROR, - (errmsg("could not perform block dump because dump file is being used by PID %lu", - (unsigned long) apw_state->pid_using_dumpfile))); + (errmsg("could not perform block dump because dump file is being used by PID %d", + (int) apw_state->pid_using_dumpfile))); ereport(LOG, - (errmsg("skipping block dump because it is already being performed by PID %lu", - (unsigned long) apw_state->pid_using_dumpfile))); + (
Re: problems with making relfilenodes 56-bits
On Wed, Oct 5, 2022 at 5:19 AM Andres Freund wrote: > > I lightly dusted off my old varint implementation from [1] and converted the > RelFileLocator and BlockNumber from fixed width integers to varint ones. This > isn't meant as a serious patch, but an experiment to see if this is a path > worth pursuing. > > A run of installcheck in a cluster with autovacuum=off, full_page_writes=off > (for increased reproducibility) shows a decent saving: > > master: 241106544 - 230 MB > varint: 227858640 - 217 MB > > The average record size goes from 102.7 to 95.7 bytes excluding the remaining > FPIs, 118.1 to 111.0 including FPIs. > I have also executed my original test after applying these patches on top of the 56 bit relfilenode patch. So earlier we saw the WAL size increased by 11% (66199.09375 kB to 73906.984375 kB) and after this patch the WAL generated is now 58179.2265625 kB. That means in this particular example this patch is reducing the WAL size by 12% even with the 56 bit relfilenode patch. [1] https://www.postgresql.org/message-id/CAFiTN-uut%2B04AdwvBY_oK_jLvMkwXUpDJj5mXg--nek%2BucApPQ%40mail.gmail.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
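For readers who have not seen the technique: a variable-width integer stores small values in fewer bytes by spending one bit per byte as a continuation marker. The sketch below is a generic LEB128-style encoder given purely for illustration; it is not Andres's actual pg_varint implementation from [1], which lays out the length information differently:

#include <stdint.h>

static int
varint_encode_uint64(uint64_t value, uint8_t *buf)
{
	int		len = 0;

	do
	{
		uint8_t byte = value & 0x7F;	/* low 7 payload bits */

		value >>= 7;
		if (value != 0)
			byte |= 0x80;		/* continuation bit: more bytes follow */
		buf[len++] = byte;
	} while (value != 0);

	return len;					/* 1..10 bytes for a 64-bit value */
}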
Re: create subscription - improved warning message
On Mon, Oct 10, 2022 at 4:04 PM Amit Kapila wrote: > > On Mon, Oct 10, 2022 at 10:10 AM Tom Lane wrote: > > > > Amit Kapila writes: ... > The docs say [1]: "When creating a subscription, the remote host is > not reachable or in an unclear state. In that case, the subscription > can be created using the connect = false option. The remote host will > then not be contacted at all. This is what pg_dump uses. The remote > replication slot will then have to be created manually before the > subscription can be activated." > > I think the below gives accurate information: > WARNING: subscription was created, but is not connected > HINT: You should create the slot manually, enable the subscription, > and run %s to initiate replication. > +1 PSA patch v2 worded like that. -- Kind Regards, Peter Smith. Fujitsu Australia v2-0001-create-subscription-warning-message.patch Description: Binary data
src/test/perl/PostgreSQL/Test/*.pm not installed
I noticed that the new Perl test modules are not installed, so if you try to use PostgreSQL/Test/Cluster.pm in an external test from pgxs, it fails with the modules not being found. I see no reason for this other than having overlooked it in b235d41d9646, so I propose the attached (for all branches, naturally.) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "La gente vulgar sólo piensa en pasar el tiempo; el que tiene talento, en aprovecharlo" diff --git a/src/test/perl/Makefile b/src/test/perl/Makefile index 3d3a95b52f..79f790772f 100644 --- a/src/test/perl/Makefile +++ b/src/test/perl/Makefile @@ -17,6 +17,7 @@ ifeq ($(enable_tap_tests),yes) installdirs: $(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)' + $(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test' install: all installdirs $(INSTALL_DATA) $(srcdir)/TestLib.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/TestLib.pm' @@ -24,6 +25,8 @@ install: all installdirs $(INSTALL_DATA) $(srcdir)/RecursiveCopy.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/RecursiveCopy.pm' $(INSTALL_DATA) $(srcdir)/PostgresNode.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresNode.pm' $(INSTALL_DATA) $(srcdir)/PostgresVersion.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresVersion.pm' + $(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/Cluster.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm' + $(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/Utils.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Utils.pm' uninstall: rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/TestLib.pm' @@ -31,5 +34,7 @@ uninstall: rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/RecursiveCopy.pm' rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresNode.pm' rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgresVersion.pm' + rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm' + rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Utils.pm' endif
Re: Suppressing useless wakeups in walreceiver
On Wed, Oct 5, 2022 at 11:38 PM Nathan Bossart wrote: > > I moved as much as I could to 0001 in v4. Some comments on v4-0002: 1. You might want to use PG_INT64_MAX instead of INT64_MAX for portability? 2. With the below change, the time walreceiver spends in XLogWalRcvSendReply() is also included for XLogWalRcvSendHSFeedback right? I think it's a problem given that XLogWalRcvSendReply() can take a while. Earlier this wasn't the case, with each function calculating 'now' separately. Any reason for changing this behaviour? I know that GetCurrentTimestamp() isn't cheap, but here it might result in a change in the behaviour. -XLogWalRcvSendReply(false, false); -XLogWalRcvSendHSFeedback(false); +TimestampTz now = GetCurrentTimestamp(); + +XLogWalRcvSendReply(state, now, false, false); +XLogWalRcvSendHSFeedback(state, now, false); 3. I understand that the TimestampTz type is treated as microseconds. Would you mind having a comment in the places below to say why we're multiplying by 1000 or 1000000 here? Also, do we need 1000L or 1000000L, or a cast to TimestampTz? +state->wakeup[reason] = now + (wal_receiver_timeout / 2) * 1000; +state->wakeup[reason] = now + wal_receiver_timeout * 1000; +state->wakeup[reason] = now + wal_receiver_status_interval * 1000000; 4. How about simplifying WalRcvComputeNextWakeup() to something like below? static void WalRcvComputeNextWakeup(WalRcvInfo *state, WalRcvWakeupReason reason, TimestampTz now) { TimestampTz ts = INT64_MAX; switch (reason) { case WALRCV_WAKEUP_TERMINATE: if (wal_receiver_timeout > 0) ts = now + (TimestampTz) (wal_receiver_timeout * 1000L); break; case WALRCV_WAKEUP_PING: if (wal_receiver_timeout > 0) ts = now + (TimestampTz) ((wal_receiver_timeout / 2) * 1000L); break; case WALRCV_WAKEUP_HSFEEDBACK: if (hot_standby_feedback && wal_receiver_status_interval > 0) ts = now + (TimestampTz) (wal_receiver_status_interval * 1000000L); break; case WALRCV_WAKEUP_REPLY: if (wal_receiver_status_interval > 0) ts = now + (TimestampTz) (wal_receiver_status_interval * 1000000); break; default: /* An error is better here. */ } state->wakeup[reason] = ts; } 5. Can we move the code snippets below into respective static functions for better readability and code reuse? This: +/* Find the soonest wakeup time, to limit our nap. */ +nextWakeup = INT64_MAX; +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) +nextWakeup = Min(state.wakeup[i], nextWakeup); +nap = Max(0, (nextWakeup - now + 999) / 1000); And this: +now = GetCurrentTimestamp(); +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) +WalRcvComputeNextWakeup(&state, i, now); -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re[2]: Possible solution for masking chosen columns when using pg_dump
Hi, I applied most of the suggestions: used separate files for most of the added code, fixed typos/mistakes, got rid of that pesky TODO that was already implemented, just not deleted. Added tests (and functionality) for cases when you need to mask columns in tables with the same name in different schemas. If a schema is not specified, then columns in all tables with the specified name are masked (Example - pg_dump -t 't0' --mask-columns id --mask-function mask_int will mask all ids in all tables with names 't0' in all existing schemas). Wrote comments for all 'magic numbers'. About that >- Also it can be hard to use a lot of different functions for different >fields, maybe it would be better to set up functions in a file. I agree with that, but I know about at least 2 other patches (both are WIP, but still) that are interacting with reading command-line options from a file. And if everyone writes their own version of reading command-line options from a file, it will quickly get confusing. A solution to that problem is another patch that will put all options from a file (one file for any possible options, from existing to future ones) into **argv in main, so that pg_dump can process them as if they came from the command line. >Friday, 7 October 2022, 8:01 +07:00 from Виктория Шепард : > >Hi, >I took a look, here are several suggestions for improvement: > >- Masking is not a main functionality of pg_dump and it is better to write >most of the connected things in a separate file like parallel.c or >dumputils.c. This will help slow down the growth of an already huge pg_dump >file. > >- Also it can be hard to use a lot of different functions for different >fields, maybe it would be better to set up functions in a file. > >- How will it work for the same field and tables in the different schemas? Can >we set up the exact schema for the field? > >- misspelling in a word >>/* >>* Add all columns and funcions to list of MaskColumnInfo structures, >>*/ > >- Why did you use 256 here? >> char* table = (char*) pg_malloc(256 * sizeof(char)); >Also for malloc you need to allocate one symbol more because you have to store >the '\0' symbol. > >- Instead of addFuncToDatabase you can run your query using something already >defined from fe_utils/query_utils.c. And it will be better to set up a >connection only once and create all functions. Establishing a connection is a >resource-intensive procedure. There are a lot of magic numbers, better to >leave some comments explaining why there are 64 or 512. > >- It seems that you are not using temp_string >> char *temp_string = (char*)malloc(256 * sizeof(char)); > >- Grammar issues >>/* >>* mask_column_info_list contains info about every to-be-masked column: >>* its name, a name its table (if nothing is specified - mask all columns with >>this name), >>* name of masking function and name of schema containing this function >>(public if not specified) >>*/ >the name of its table > >Mon, 3 Oct 2022 at 20:45, Julien Rouhaud < rjuju...@gmail.com >: >>Hi, >> >>On Mon, Oct 03, 2022 at 06:30:17PM +0300, Олег Целебровский wrote: >>> >>> Hello, here's my take on masking data when using pg_dump >>> >>> The main idea is using PostgreSQL functions to replace data during a SELECT. >>> When table data is dumped SELECT a,b,c,d ... from ... query is generated, >>> the columns that are marked for masking are replaced with the result of >>> functions on those columns >>> Example: columns name, count are to be masked, so the query will look as >>> such: SELECT id, mask_text(name), mask_int(count), date from ...
>>> >>> So about the interface: I added 2 more command-line options: >>> >>> --mask-columns, which specifies what columns from what tables will be >>> masked >>> usage example: >>> --mask-columns " t1.name , t2.description" - both columns will >>> be masked with the same corresponding function >>> or --mask-columns name - ALL columns with name "name" from all >>> dumped tables will be masked with the corresponding function >>> >>> --mask-function, which specifies what functions will mask data >>> usage example: >>> --mask-function mask_int - corresponding columns will be masked >>> with function named "mask_int" from default schema (public) >>> or --mask-function my_schema.mask_varchar - same as above but >>> with the specified schema where the function is stored >>> or --mask-function somedir/filename - the function is "defined" >>> here - more on the structure below >> >>FTR I wrote an extension POC [1] last weekend that does that but on the >>backend >>side. The main advantage is that it works with any existing version of >>pg_dump (or any client relying on COPY or even plain interactive SQL >>statements), and that the DBA can force a dedicated role to only get a masked >>dump, even if they forgot to ask for it. >> >>I only had a quick look at your patch but it seems that you left some todo i
Re: make_ctags: use -I option to ignore pg_node_attr macro
Hello On 2022-Oct-07, Yugo NAGATA wrote: > I found that tag files generated by src/tools/make_ctags > don't include some keywords that are field names of node > structs, for example norm_selec in RestrictInfo. Such fields > are defined with the pg_node_attr macro that was introduced > recently, like: > > Selectivity norm_selec pg_node_attr(equal_ignore); > > In this case, pg_node_attr is mistakenly interpreted to be > the name of the field. So, I propose to use the -I option of ctags > to ignore the macro name. Attached is a patch to do it. I've wondered if there's anybody that uses this script. I suppose if you're reporting this problem then it has at least one user, and is therefore worth fixing. If we do patch it, how about doing some more invasive surgery and bringing it into the 21st century? I think the `find` line is a bit lame: > # this is outputting the tags into the file 'tags', and appending > find `pwd`/ -type f -name '*.[chyl]' -print | > - xargs ctags -a -f tags "$FLAGS" > + xargs ctags -a -f tags "$FLAGS" -I "$IGNORE_LIST" especially because it includes everything in tmp_install, which pollutes the tag list. In my own tags script I just call "ctags -R", and I feed cscope with these find lines (find $SRCDIR \( -name tmp_install -prune -o -name tmp_check -prune \) -o \( -name "*.[chly]" -o -iname "*makefile*" -o -name "*.mk" -o -name "*.in" -o -name "*.sh" -o -name "*.sgml" -o -name "*.sql" -o -name "*.p[lm]" \) -type f -print ; \ find $BUILDDIR \( -name tmp_install -prune \) -o \( -name \*.h -a -type f \) -print ) which seems to give decent results. (Nowadays I wonder if it'd be better to exclude the "*_d.h" files from the builddir.) (I wonder why I don't have a prune for .git ...) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ ¡Ay, ay, ay! Con lo mucho que yo lo quería (bis) se fue de mi vera ... se fue para siempre, pa toíta ... pa toíta la vida ¡Ay Camarón! ¡Ay Camarón!(Paco de Lucía)
Re: [BUG] Logical replica crash if there was an error in a function.
On 2022-Sep-24, Tom Lane wrote: > "Anton A. Melnikov" writes: > > [ v4-0001-Fix-logical-replica-assert-on-func-error.patch ] > > I took a quick look at this. I think you're solving the > problem in the wrong place. The real issue is why are > we not setting up ActivePortal correctly when running > user-defined code in a logrep worker? There is other code > that expects that to be set, eg EnsurePortalSnapshotExists. Right ... mostly, the logical replication *does* attempt to set up a transaction and active snapshot when replaying actions (c.f. begin_replication_step()). Is this firing at an inappropriate time, perhaps? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: problems with making relfilenodes 56-bits
On Mon, Oct 10, 2022 at 5:16 AM Dilip Kumar wrote: > I have also executed my original test after applying these patches on > top of the 56 bit relfilenode patch. So earlier we saw the WAL size > increased by 11% (66199.09375 kB to 73906.984375 kB) and after this > patch now the WAL generated is 58179.2265625. That means in this > particular example this patch is reducing the WAL size by 12% even > with the 56 bit relfilenode patch. That's a very promising result, but the question in my mind is how much work would be required to bring this patch to a committable state? -- Robert Haas EDB: http://www.enterprisedb.com
Re: Startup process on a hot standby crashes with an error "invalid memory alloc request size 1073741824" while replaying "Standby/LOCK" records
On Wed, 5 Oct 2022 at 16:30, Tom Lane wrote: > > Kyotaro Horiguchi writes: > > At Tue, 4 Oct 2022 17:15:31 -0700, Nathan Bossart > > wrote in > >> On Tue, Oct 04, 2022 at 07:53:11PM -0400, Tom Lane wrote: > >>> After further thought, maybe it'd be better to do it as attached, > >>> with one long-lived hash table for all the locks. > > > The first one is a straightforward outcome of the current implementation, but I > > like the new one. I agree that it is natural and that the expected > > overhead per (typical) transaction is lower than both the first one > > and doing the same operation on a list. I don't think that space > > inefficiency of that extent matters, since it is the startup > > process. > > To get some hard numbers about this, I made a quick hack to collect > getrusage() numbers for the startup process (patch attached for > documentation purposes). I then ran the recovery/t/027_stream_regress.pl > test a few times and collected the stats (also attached). This seems > like a reasonably decent baseline test, since the core regression tests > certainly take lots of AccessExclusiveLocks what with all the DDL > involved, though they shouldn't ever take large numbers at once. Also > they don't run long enough for any lock list bloat to occur, so these > numbers don't reflect a case where the patches would provide benefit. > > If you look hard, there's maybe about a 1% user-CPU penalty for patch 2, > although that's well below the run-to-run variation so it's hard to be > sure that it's real. The same comments apply to the max resident size > stats. So I'm comforted that there's not a significant penalty here. > > I'll go ahead with patch 2 if there's no objection. Happy to see this change. > One other point to discuss: should we consider back-patching? I've > got mixed feelings about that myself. I don't think that cases where > this helps significantly are at all mainstream, so I'm kind of leaning > to "patch HEAD only". It looks fine to eventually backpatch, since StandbyReleaseLockTree() was optimized to only be called when the transaction had actually done some AccessExclusiveLocks. So the performance loss is minor and isolated to the users of such locks, so I see no problems with it. -- Simon Riggs http://www.EnterpriseDB.com/
Re: create subscription - improved warning message
On Mon, Oct 10, 2022 at 12:41 AM Tom Lane wrote: > I'm beginning to question the entire premise here. That is, > rather than tweaking this message until it's within hailing > distance of sanity, why do we allow the no-connect case at all? That sounds pretty nuts to me, because of the pg_dump use case if nothing else. I don't think it's reasonable to say "oh, if you execute this DDL on your system, it will instantaneously and automatically begin to create outbound network connections, and there's no way to turn that off." It ought to be possible to set up a configuration first and then only later turn it on. And it definitely ought to be possible, if things aren't working out, to turn it back off, too. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Query Jumbling for CALL and SET utility statements
Hi, On 10/7/22 6:13 AM, Michael Paquier wrote: On Thu, Oct 06, 2022 at 11:51:52PM -0400, Tom Lane wrote: I've been thinking since the beginning of this thread that there was no coherent, defensible rationale being offered for jumbling some utility statements and not others. Yeah. The potential performance impact of all the TransactionStmts worries me a bit, though. I wonder if the answer is to jumble them all. We avoided that up to now because it would imply a ton of manual effort and future code maintenance ... but now that the backend/nodes/ infrastructure is largely auto-generated, could we auto-generate the jumbling code? I think that's a good idea. Probably. One part that may be tricky though is the location of the constants we'd like to make generic, but perhaps this could be handled by using a dedicated variable type that just maps to int? It looks to me like we'd also need to teach the Perl parser which fields per statement struct need to be jumbled (or more probably which ones to exclude from the jumbling, for example funccall in CallStmt). Not sure yet how to teach the Perl parser, but I'll build this list first to help decide on the right approach, unless you already have some thoughts about it? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Query Jumbling for CALL and SET utility statements
Hi, On Mon, Oct 10, 2022 at 03:04:57PM +0200, Drouvot, Bertrand wrote: > > On 10/7/22 6:13 AM, Michael Paquier wrote: > > > > Probably. One part that may be tricky though is the location of the > > constants we'd like to make generic, but perhaps this could be handled > > by using a dedicated variable type that just maps to int? > > It looks to me like we'd also need to teach the Perl parser which fields per > statement struct need to be jumbled (or more probably which ones to exclude > from the jumbling, for example funccall in CallStmt). Not sure yet how to > teach the Perl parser, but I'll build this list first to help decide on the > right approach, unless you already have some thoughts about it? Unless I'm missing something, both can be handled using the pg_node_attr() annotations that already exist?
doc: add entry for pg_get_partkeydef()
Hi Small patch for $subject, as the other pg_get_XXXdef() functions are documented and I was looking for this one but couldn't remember what it was called. Regards Ian Barwick commit 1ab855e72e077abad247cc40619b6fc7f7758983 Author: Ian Barwick Date: Mon Oct 10 22:32:31 2022 +0900 doc: add entry for pg_get_partkeydef() Other pg_get_XXXdef() functions are documented, so it seems reasonable to include this as well. diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index b2bdbc7d1c..d59b8ab77e 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -23718,6 +23718,23 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); + + + + pg_get_partkeydef + +pg_get_partkeydef ( table oid ) +text + + +Reconstructs the definition of a partition key, in the form it would +need to appear in the PARTITION BY clause in +CREATE TABLE. +(This is a decompiled reconstruction, not the original text +of the command.) + + +
Re: Add common function ReplicationOriginName.
On Tue, Sep 27, 2022 at 5:04 PM Aleksander Alekseev wrote: > > Hi Peter, > > > PSA patch v3 to combine the different replication origin name > > formatting in a single function ReplicationOriginNameForLogicalRep as > > suggested. > > LGTM except for minor issues with the formatting; fixed. > LGTM as well. I'll push this tomorrow unless there are any more comments. -- With Regards, Amit Kapila.
PostgreSQL 15 GA - Oct 13, 2022
Hi, The PostgreSQL 15 GA will be Oct 13, 2022. Thanks, Jonathan OpenPGP_signature Description: OpenPGP digital signature
Re: Non-decimal integer literals
On 16.02.22 11:11, Peter Eisentraut wrote: The remaining patches are material for PG16 at this point, and I will set the commit fest item to returned with feedback in the meantime. Time to continue with this. Attached is a rebased and cleaned up patch for non-decimal integer literals. (I don't include the underscores-in-numeric literals patch. I'm keeping that for later.) Two open issues from my notes: Technically, numeric_in() should be made aware of this, but that seems relatively complicated and maybe not necessary for the first iteration. Taking another look around ecpg to see how this interacts with C-syntax integer literals. I'm not aware of any particular issues, but it's understandably tricky. Other than that, this seems pretty complete as a start. From d0bc72fa4c339ba2ea0bb8d1e5a3923d76ee8105 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 10 Oct 2022 16:03:15 +0200 Subject: [PATCH v9] Non-decimal integer literals Add support for hexadecimal, octal, and binary integer literals: 0x42F 0o273 0b100101 per SQL:202x draft. This adds support in the lexer as well as in the integer type input functions. Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb...@enterprisedb.com --- doc/src/sgml/syntax.sgml | 26 src/backend/catalog/information_schema.sql | 6 +- src/backend/catalog/sql_features.txt | 1 + src/backend/parser/scan.l | 99 +++ src/backend/utils/adt/numutils.c | 140 + src/fe_utils/psqlscan.l| 78 +--- src/interfaces/ecpg/preproc/pgc.l | 108 +--- src/test/regress/expected/int2.out | 19 +++ src/test/regress/expected/int4.out | 19 +++ src/test/regress/expected/int8.out | 19 +++ src/test/regress/expected/numerology.out | 59 - src/test/regress/sql/int2.sql | 7 ++ src/test/regress/sql/int4.sql | 7 ++ src/test/regress/sql/int8.sql | 7 ++ src/test/regress/sql/numerology.sql| 21 +++- 15 files changed, 523 insertions(+), 93 deletions(-) diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index 93ad71737f51..bba78c22f1a9 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -694,6 +694,32 @@ Numeric Constants + + Additionally, non-decimal integer constants can be used in these forms: + +0xhexdigits +0ooctdigits +0bbindigits + + hexdigits is one or more hexadecimal digits + (0-9, A-F), octdigits is one or more octal + digits (0-7), bindigits is one or more binary + digits (0 or 1). Hexadecimal digits and the radix prefixes can be in + upper or lower case. Note that only integers can have non-decimal forms, + not numbers with fractional parts. 
+ + + + These are some examples of this: +0b100101 +0B10011001 +0o273 +0O755 +0x42f +0XFFFF + + + integer bigint diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index 18725a02d1fb..95c27a625e7e 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -119,7 +119,7 @@ CREATE FUNCTION _pg_numeric_precision(typid oid, typmod int4) RETURNS integer WHEN 1700 /*numeric*/ THEN CASE WHEN $2 = -1 THEN null - ELSE (($2 - 4) >> 16) & 65535 + ELSE (($2 - 4) >> 16) & 0xFFFF END WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/ WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/ @@ -147,7 +147,7 @@ CREATE FUNCTION _pg_numeric_scale(typid oid, typmod int4) RETURNS integer WHEN $1 IN (1700) THEN CASE WHEN $2 = -1 THEN null - ELSE ($2 - 4) & 65535 + ELSE ($2 - 4) & 0xFFFF END ELSE null END; @@ -163,7 +163,7 @@ CREATE FUNCTION _pg_datetime_precision(typid oid, typmod int4) RETURNS integer WHEN $1 IN (1083, 1114, 1184, 1266) /* time, timestamp, same + tz */ THEN CASE WHEN $2 < 0 THEN 6 ELSE $2 END WHEN $1 IN (1186) /* interval */ - THEN CASE WHEN $2 < 0 OR $2 & 65535 = 65535 THEN 6 ELSE $2 & 65535 END + THEN CASE WHEN $2 < 0 OR $2 & 0xFFFF = 0xFFFF THEN 6 ELSE $2 & 0xFFFF END ELSE null END; diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt index da7c9c772e09..e897e28ed148 100644 --- a/src/backend/catalog/sql_features.txt +++ b/src/backend/catalog/sql_features.txt @@ -527,6 +527,7 @@ T652 SQL-dynamic statements in SQL routines NO T653 SQL-schema statements in external routines YES T654 SQL-dynamic statements in external routines NO T655 Cyclically dependent r
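To illustrate the input-function side in miniature: the sketch below shows only the radix-prefix dispatch, under the assumption that the real patch folds this into the intN input routines with proper overflow and error reporting (which strtoll alone does not give you); it is not the patch's actual numutils.c code:

#include <stdint.h>
#include <stdlib.h>

static int64_t
parse_prefixed_int(const char *s)
{
	/* toy dispatch on the 0x/0o/0b radix prefixes added by the patch */
	if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
		return strtoll(s + 2, NULL, 16);
	if (s[0] == '0' && (s[1] == 'o' || s[1] == 'O'))
		return strtoll(s + 2, NULL, 8);
	if (s[0] == '0' && (s[1] == 'b' || s[1] == 'B'))
		return strtoll(s + 2, NULL, 2);
	return strtoll(s, NULL, 10);
}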
RE: Fix some newly modified tab-complete changes
On Mon, Oct 10, 2022 2:12 PM shiy.f...@fujitsu.com wrote: > > On Tue, Oct 4, 2022 4:17 PM Peter Smith wrote: > > > > But, while testing I noticed another different quirk > > > > It seems that neither the GRANT nor the REVOKE auto-complete > > recognises the optional PRIVILEGES keyword > > > > e.g. GRANT ALL --> ON (but not PRIVILEGES) > > e.g. GRANT ALL PRIV --> ??? > > > > e.g. REVOKE ALL --> ON (but not PRIVILEGES).. > > e.g. REVOKE ALL PRIV --> ??? > > > > I tried to add tab-completion for it. Please see the attached updated patch. > Sorry for attaching the wrong patch. Here is the right one. Regards, Shi yu v4-0001-Fix-tab-completion-for-GRANT-REVOKE.patch Description: v4-0001-Fix-tab-completion-for-GRANT-REVOKE.patch
Re: kerberos/001_auth test fails on arm CPU darwin
Hi, Thanks for the review! On 10/1/22 14:12, Peter Eisentraut wrote: This patch could use some more in-code comments. For example, this +# get prefix for kerberos executables and try to find them at this path +sub test_krb5_paths is not helpful. What does it "get", where does it put it, how does it "try", and what does it do if it fails? What are the inputs and outputs of this function? + # remove '\n' since 'krb5-config --prefix' returns path ends with '\n' + $krb5_path =~ s/\n//g; use chomp I updated patch regarding these comments. I have a question about my logic: + elsif ($^O eq 'linux') + { + test_krb5_paths('/usr/'); + } } Before that, test could use krb5kdc, kadmin and kdb5_util from '/usr/sbin/'; krb5_config and kinit from $PATH. However, now it will try to use all of them from $PATH or from '/usr/sbin/' and '/usr/bin/'. Does that cause a problem? Ci run after fix is applied: https://cirrus-ci.com/build/5359971746447360 Regards, Nazir Bilal Yavuz From 5aa95409db8146a076255e00be399a0d9ce1efe8 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Mon, 26 Sep 2022 10:56:51 +0300 Subject: [PATCH 1/2] ci: Add arm CPU for darwin . ci-os-only: darwin. --- .cirrus.yml | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/.cirrus.yml b/.cirrus.yml index 9f2282471a..c23be7363c 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -293,7 +293,6 @@ task: name: macOS - Monterey - Meson env: -CPUS: 12 # always get that much for cirrusci macOS instances BUILD_JOBS: $CPUS # Test performance regresses noticably when using all cores. 8 seems to # work OK. See @@ -313,15 +312,26 @@ task: only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*(macos|darwin|osx).*' - osx_instance: -image: monterey-base + matrix: +- name: macOS - Monterey - Meson - ARM CPU + macos_instance: +image: ghcr.io/cirruslabs/macos-monterey-base:latest + env: +BREW_PATH: /opt/homebrew +CPUS: 4 + +- name: macOS - Monterey - Meson - Intel CPU + osx_instance: +image: monterey-base + env: +BREW_PATH: /usr/local +CPUS: 12 # always get that much for cirrusci macOS instances sysinfo_script: | id uname -a ulimit -a -H && ulimit -a -S export - setup_core_files_script: - mkdir ${HOME}/cores - sudo sysctl kern.corefile="${HOME}/cores/core.%P" @@ -360,7 +370,7 @@ task: ccache_cache: folder: $CCACHE_DIR configure_script: | -brewpath="/usr/local" +brewpath=${BREW_PATH} PKG_CONFIG_PATH="${brewpath}/lib/pkgconfig:${PKG_CONFIG_PATH}" for pkg in icu4c krb5 openldap openssl zstd ; do -- 2.37.3 From 5ad2357ff7f3ddcb24754847f542952143ccafa7 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Fri, 23 Sep 2022 14:30:06 +0300 Subject: [PATCH 2/2] fix: darwin: ARM CPU darwin krb5 path fix --- src/test/kerberos/t/001_auth.pl | 103 1 file changed, 79 insertions(+), 24 deletions(-) diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl index a2bc8a5351..9587698df3 100644 --- a/src/test/kerberos/t/001_auth.pl +++ b/src/test/kerberos/t/001_auth.pl @@ -30,39 +30,94 @@ elsif ($ENV{PG_TEST_EXTRA} !~ /\bkerberos\b/) plan skip_all => 'Potentially unsafe test GSSAPI/Kerberos not enabled in PG_TEST_EXTRA'; } -my ($krb5_bin_dir, $krb5_sbin_dir); - -if ($^O eq 'darwin') -{ - $krb5_bin_dir = '/usr/local/opt/krb5/bin'; - $krb5_sbin_dir = '/usr/local/opt/krb5/sbin'; -} -elsif ($^O eq 'freebsd') -{ - $krb5_bin_dir = '/usr/local/bin'; - $krb5_sbin_dir = '/usr/local/sbin'; -} -elsif ($^O eq 'linux') -{ - $krb5_sbin_dir = '/usr/sbin'; -} - my $krb5_config = 'krb5-config'; my $kinit= 
'kinit'; my $kdb5_util= 'kdb5_util'; my $kadmin_local = 'kadmin.local'; my $krb5kdc = 'krb5kdc'; -if ($krb5_bin_dir && -d $krb5_bin_dir) +# control variable for checking if all executables are found +my $is_krb5_found = 0; + +# Given $krb5_path_prefix (possible paths of kerberos executables), +# first check value of $is_krb5_found: +# if it is truthy value this means executables are already found so +# don't try to locate further, else add '/bin/' and '/sbin/' +# to the $krb5_path_prefix and try to locate all required kerberos executables +# ($krb5_config, $kinit, $kdb5_util, $kadmin_local, $krb5kdc). +# If all executables are found, set $is_krb5_found to 1 and use executables +# from this path; even if one of the executables is not found, skip this path. +# If all paths are searched and executables are still not found, +# test will try to use kerberos executables from $PATH. +sub test_krb5_paths { - $krb5_config = $krb5_bin_dir . '/' . $krb5_config; - $kinit = $krb5_bin_dir . '/' . $kinit; + my ($krb5_path_prefix) = (@_); + + # if executables are already found, return + if ($is_krb5_found) + {
Turn TransactionIdRetreat/Advance into inline functions
Hi! This patch is inspired by [0] and many others. I've noticed recent activity to convert macros into inline functions. We should make TransactionIdRetreat/Advance functions instead of macros, shouldn't we? I also thought about NormalTransactionIdPrecedes and NormalTransactionIdFollows, but maybe they should be addressed separately: the comment says that "this is a macro for speed". Any thoughts? [0]: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517%40enterprisedb.com -- Best regards, Maxim Orlov. v1-0001-Convert-macros-to-static-inline-functions-transam.patch Description: Binary data
Re: src/test/perl/PostgreSQL/Test/*.pm not installed
Alvaro Herrera writes: > I noticed that the new Perl test modules are not installed, so if you > try to use PostgreSQL/Test/Cluster.pm in an external test from pgxs, it > fails with the modules not being found. > I see no reason for this other than having overseen it in b235d41d9646, > so I propose the attached (for all branches, naturally.) +1, but I suppose you need some adjustment in the meson.build files now too. (Also, please wait for the v15 release freeze to lift.) regards, tom lane
Re: create subscription - improved warning message
Peter Smith writes: > On Mon, Oct 10, 2022 at 4:04 PM Amit Kapila wrote: >> I think the below gives accurate information: >> WARNING: subscription was created, but is not connected >> HINT: You should create the slot manually, enable the subscription, >> and run %s to initiate replication. > +1 It feels a little strange to me that we describe two steps rather vaguely and then give exact SQL for the third step. How about, say, HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription. Another idea is HINT: To initiate replication, create the replication slot on the publisher, then run ALTER SUBSCRIPTION ... ENABLE and ALTER SUBSCRIPTION ... REFRESH PUBLICATION. regards, tom lane
Re: Turn TransactionIdRetreat/Advance into inline functions
Maxim Orlov writes: > I've notice recent activity to convert macros into inline functions. We > should make TransactionIdRetreat/Advance functions > Instead of a macro, should we? -1. Having to touch all the call sites like this outweighs any claimed advantage: it makes them uglier and it will greatly complicate any back-patching we might have to do in those areas. regards, tom lane
Re: Turn TransactionIdRetreat/Advance into inline functions
> -1. Having to touch all the call sites like this outweighs > any claimed advantage: it makes them uglier and it will greatly > complicate any back-patching we might have to do in those areas. > > regards, tom lane > OK, got it. But what if we change the semantics of these calls to xid = TransactionIdAdvance(xid)? -- Best regards, Maxim Orlov.
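Concretely, with the value-returning semantics proposed above, the transam.h macro would become something like this sketch (the same skip-the-reserved-XIDs logic as the existing macro, just returning the result instead of updating in place):

static inline TransactionId
TransactionIdAdvance(TransactionId xid)
{
	xid++;
	/* skip over the special XIDs below FirstNormalTransactionId */
	if (xid < FirstNormalTransactionId)
		xid = FirstNormalTransactionId;
	return xid;
}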
Re: Turn TransactionIdRetreat/Advance into inline functions
Maxim Orlov writes: >> -1. Having to touch all the call sites like this outweighs >> any claimed advantage: it makes them uglier and it will greatly >> complicate any back-patching we might have to do in those areas. > Ok, got it. But what if we change the semantics of these calls to > xid = TransactionIdAdvance(xid) ? Uh ... you'd still have to touch all the call sites. regards, tom lane
Re: shadow variables - pg15 edition
David Rowley writes: > On Fri, 7 Oct 2022 at 13:24, David Rowley wrote: >> Since I just committed the patch to fix the final warnings, I think we >> should go ahead and commit the patch you wrote to add >> -Wshadow=compatible-local to the standard build flags. I don't mind >> doing this. > Pushed. The buildfarm's showing a few instances of this warning, which seem to indicate that not all versions of the Perl headers are clean: fairywren | 2022-10-10 09:03:50 | C:/Perl64/lib/CORE/cop.h:612:13: warning: declaration of 'av' shadows a previous local [-Wshadow=compatible-local] fairywren | 2022-10-10 09:03:50 | C:/Perl64/lib/CORE/cop.h:612:13: warning: declaration of 'av' shadows a previous local [-Wshadow=compatible-local] fairywren | 2022-10-10 09:03:50 | C:/Perl64/lib/CORE/cop.h:612:13: warning: declaration of 'av' shadows a previous local [-Wshadow=compatible-local] fairywren | 2022-10-10 09:03:50 | C:/Perl64/lib/CORE/cop.h:612:13: warning: declaration of 'av' shadows a previous local [-Wshadow=compatible-local] fairywren | 2022-10-10 09:03:50 | C:/Perl64/lib/CORE/cop.h:612:13: warning: declaration of 'av' shadows a previous local [-Wshadow=compatible-local] fairywren | 2022-10-10 09:03:50 | C:/Perl64/lib/CORE/cop.h:612:13: warning: declaration of 'av' shadows a previous local [-Wshadow=compatible-local] snakefly | 2022-10-10 08:21:05 | Util.c:457:14: warning: declaration of 'cv' shadows a parameter [-Wshadow=compatible-local] Before you ask: fairywren: perl 5.24.3 snakefly: perl 5.16.3 which are a little old, but not *that* old. Scraping the configure logs also shows that only half of the buildfarm (exactly 50 out of 100 reporting animals) knows -Wshadow=compatible-local, which suggests that we might see more of these if they all did. On the other hand, animals with newer compilers probably also have newer Perl installations, so assuming that the Perl crew have kept this clean recently, maybe not. Not sure if this is problematic enough to justify removing the switch. A plausible alternative is to have a few animals with known-clean Perl installations add the switch manually (and use -Werror), so that we find out about violations without having warnings in the face of developers who can't fix them. I'm willing to wait to see if anyone complains of such warnings, though. regards, tom lane
Re: shadow variables - pg15 edition
Hi, On 2022-10-10 12:06:22 -0400, Tom Lane wrote: > Scraping the configure logs also shows that only half of the buildfarm > (exactly 50 out of 100 reporting animals) knows -Wshadow=compatible-local, > which suggests that we might see more of these if they all did. I think it's not just newness - only gcc has compatible-local, even very new clang doesn't. This was fixed ~6 years ago in perl: commit f2b9631d5d19d2b71c1776e1193173d13f3620bf Author: David Mitchell Date: 2016-05-23 14:43:56 +0100 CX_POP_SAVEARRAY(): use more distinctive var name Under -Wshadow, CX_POP_SAVEARRAY's local var 'av' can generate this warning: warning: declaration shadows a local variable [-Wshadow] So rename it to cx_pop_savearay_av to reduce the risk of a clash. (See http://nntp.perl.org/group/perl.perl5.porters/236444) > Not sure if this is problematic enough to justify removing the switch. > A plausible alternative is to have a few animals with known-clean Perl > installations add the switch manually (and use -Werror), so that we find > out about violations without having warnings in the face of developers > who can't fix them. I'm willing to wait to see if anyone complains of > such warnings, though. Given the age of affected perl instances I suspect there'll not be a lot of developers affected, and the number of warnings is reasonably small too. It'd likely hurt more developers to not see the warnings locally, given that such shadowing often causes bugs. Greetings, Andres Freund
Re: src/test/perl/PostgreSQL/Test/*.pm not installed
On 2022-Oct-10, Tom Lane wrote: > +1, but I suppose you need some adjustment in the meson.build files > now too. Oh, right, I forgot ... > (Also, please wait for the v15 release freeze to lift.) ... and now that I look, it turns out that 15 and master need no changes: both the Makefile and the meson files are correct already. Only 14 and back have this problem. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: shadow variables - pg15 edition
On 2022-Oct-10, Andres Freund wrote: > Given the age of affected perl instances I suspect there'll not be a lot of > developers affected, and the number of warnings is reasonably small too. It'd > likely hurt more developers to not see the warnings locally, given that such > shadowing often causes bugs. Maybe we can install a filter-out in src/pl/plperl's Makefile for the time being. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Por suerte hoy explotó el califont porque si no me habría muerto de aburrido" (Papelucho)
Re: shadow variables - pg15 edition
Hi, On 2022-10-10 18:33:11 +0200, Alvaro Herrera wrote: > On 2022-Oct-10, Andres Freund wrote: > > > Given the age of affected perl instances I suspect there'll not be a lot of > > developers affected, and the number of warnings is reasonably small too. > > It'd > > likely hurt more developers to not see the warnings locally, given that such > > shadowing often causes bugs. > > Maybe we can install a filter-out in src/pl/plperl's Makefile for the > time being. We could, but is it really a useful thing for something fixed 6 years ago? Greetings, Andres Freund
Re: Adding Support for Copy callback functionality on COPY TO api
On Mon, Oct 10, 2022 at 12:41:40PM +0530, Bharath Rupireddy wrote: > IIUC, COPY TO callback helps move a table's data out of postgres > server. Just wondering, how is it different from existing solutions > like COPY TO ... PROGRAM/FILE, logical replication, pg_dump etc. that > can move a table's data out? I understandb that the COPY FROM callback > was needed for logical replication 7c4f52409. Mentioning a concrete > use-case helps here. This new callback allows the use of COPY TO's machinery in extensions. A couple of generic use-cases are listed upthread [0], and one concrete use-case is the aws_s3 extension [1]. > I'm not quite sure if we need a separate module to just tell how to > use this new callback. I strongly feel that it's not necessary. It > unnecessarily creates extra code (actual code is 25 LOC with v1 patch > but 150 LOC with v5 patch) and can cause maintenance burden. These > callback APIs are simple enough to understand for those who know > BeginCopyTo() or BeginCopyFrom() and especially for those who know how > to write extensions. These are not APIs that an end-user uses. The > best would be to document both COPY FROM and COPY TO callbacks, > perhaps with a pseudo code specifying just the essence [1], and their > possible usages somewhere here > https://www.postgresql.org/docs/devel/sql-copy.html. > > The order of below NOTICE messages isn't guaranteed and it can change > depending on platforms. Previously, we've had to suppress such > messages in the test output 6adc5376d. I really doubt that this small test case is going to cause anything approaching undue maintenance burden. I think it's important to ensure this functionality continues to work as expected long into the future. [0] https://postgr.es/m/253C21D1-FCEB-41D9-A2AF-E6517015B7D7%40amazon.com [1] https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/postgresql-s3-export.html#aws_s3.export_query_to_s3 -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Add index scan progress to pg_stat_progress_vacuum
> One idea would be to add a flag, say report_parallel_vacuum_progress, > to IndexVacuumInfo struct and expect index AM to check and update the > parallel index vacuum progress, say every 1GB blocks processed. The > flag is true only when the leader process is vacuuming an index. Sorry for the long delay on this. I have taken the approach as suggested by Sawada-san and Robert and attached is v12. 1. The patch introduces a new counter in the shared memory already used by the parallel leader and workers to keep track of the number of indexes completed. This way there is no reason to loop through the index status every time we want to get the status of indexes completed. 2. A new function in vacuumparallel.c will be used to update the progress of indexes completed by reading from the counter created in point #1. 3. The function is called during the vacuum_delay_point as a matter of convenience, since it's called in all major vacuum loops. The function will only do anything if the caller sets a boolean to report progress. Doing so will also ensure progress is being reported in case the parallel workers completed before the leader. 4. Rather than adding any complexity to WaitForParallelWorkersToFinish and introducing a new callback, vacuumparallel.c will wait until the number of vacuum workers is 0 and then proceed to call WaitForParallelWorkersToFinish as it does now. 5. Went back to the idea of adding a new view called pg_stat_progress_vacuum_index which is accomplished by adding a new type called VACUUM_PARALLEL in progress.h Thanks, Sami Imseih Amazon Web Services (AWS) Attachment: v12-0001--Show-progress-for-index-vacuums.patch
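A minimal sketch of the shared-counter pattern in points 1-3 (the struct, field, and progress-parameter names here are illustrative assumptions, not the patch's actual symbols):

/* Sketch only: names are assumptions for illustration. */
typedef struct PVShared
{
    /* point 1: counter in the existing parallel-vacuum shared memory */
    pg_atomic_uint32 nindexes_completed;
} PVShared;

/* leader or worker: bump the counter when one index vacuum finishes */
static void
parallel_vacuum_index_done(PVShared *shared)
{
    pg_atomic_add_fetch_u32(&shared->nindexes_completed, 1);
}

/* points 2 and 3: called via vacuum_delay_point() when the caller has
 * opted in to progress reporting */
static void
parallel_vacuum_update_progress(PVShared *shared)
{
    pgstat_progress_update_param(PROGRESS_VACUUM_INDEXES_COMPLETED,
                                 pg_atomic_read_u32(&shared->nindexes_completed));
}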
Re: shadow variables - pg15 edition
On 2022-10-10 09:37:38 -0700, Andres Freund wrote: > On 2022-10-10 18:33:11 +0200, Alvaro Herrera wrote: > > On 2022-Oct-10, Andres Freund wrote: > > > > > Given the age of affected perl instances I suspect there'll not be a lot > > > of > > > developers affected, and the number of warnings is reasonably small too. > > > It'd > > > likely hurt more developers to not see the warnings locally, given that > > > such > > > shadowing often causes bugs. > > > > Maybe we can install a filter-out in src/pl/plperl's Makefile for the > > time being. > > We could, but is it really a useful thing for something fixed 6 years ago? As an out, a hypothetical dev could add -Wno-shadow=compatible-local to their CFLAGS.
Re: shadow variables - pg15 edition
On 2022-Oct-10, Andres Freund wrote: > On 2022-10-10 09:37:38 -0700, Andres Freund wrote: > > On 2022-10-10 18:33:11 +0200, Alvaro Herrera wrote: > > > On 2022-Oct-10, Andres Freund wrote: > > > > > > > Given the age of affected perl instances I suspect there'll not be a > > > > lot of > > > > developers affected, and the number of warnings is reasonably small > > > > too. It'd > > > > likely hurt more developers to not see the warnings locally, given that > > > > such > > > > shadowing often causes bugs. > > > > > > Maybe we can install a filter-out in src/pl/plperl's Makefile for the > > > time being. > > > > We could, but is it really a useful thing for something fixed 6 years ago? Well, for people purposefully installing using older installs of Perl (not me, admittedly), it does seem useful, because you get the benefit of checking shadow vars for the rest of the tree and still get no warnings if everything is clean. > As an out, a hypothetical dev could add -Wno-shadow=compatible-local to their > CFLAGS. But that disables it for the tree as a whole, which is not better. We can remove the filter-out when we decide to move the Perl version requirement up, say 4 years from now. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "A man never knows what he is capable of until he tries" (C. Dickens)
Re: Error-safe user functions
> > > The idea is simple -- introduce a new "error-safe" calling mode of user > functions by passing a special node through FunctionCallInfo.context, in > which the function should write error info and return instead of throwing > it. Also such functions should manually free resources before > returning an error. This gives the ability to avoid PG_TRY/PG_CATCH and > subtransactions. > > I tried something similar when trying to implement TRY_CAST ( https://learn.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver16) late last year. I also considered having a default datum rather than just returning NULL. I had not considered a new node type. I had considered having every function have a "safe" version, which would be a big duplication of logic requiring a lot of regression tests and possibly fuzzing tests. Instead, I extended every core input function to have an extra boolean parameter to indicate if failures were allowed, and then an extra Datum parameter for the default value. The input function wouldn't need to check the value of the new parameters until it was already in a situation where it found invalid data, but the extra overhead still remained, and it meant that basically every third party type extension would need to be changed. Then I considered whether the cast failure should be completely silent, or if the previous error message should instead be emitted as a LOG/INFO/WARN, and if we'd want that to be configurable, so then the boolean parameter became an integer enum: * regular fail (0) * use default silently (1) * use default emit LOG/NOTICE/WARNING (2,3,4) At the time, all of this seemed like too big of a change for a function that isn't even in the SQL Standard, but maybe SQL/JSON changes that. If so, it would allow for a can-cast-to test that users would find very useful. Something like: SELECT CASE WHEN 'abc' CAN BE integer THEN 'Integer' ELSE 'Nope' END There's obviously no standard syntax to support that, but the data cleansing possibilities would be great.
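For illustration, the failure-mode parameter described above might have looked something like this (a sketch of the abandoned design, not committed code; the function name is hypothetical):

/* Sketch of the abandoned design described above -- not committed code. */
typedef enum CastFailureMode
{
    CAST_FAIL_ERROR = 0,        /* regular failure: throw the error */
    CAST_FAIL_DEFAULT_SILENT,   /* substitute the default silently */
    CAST_FAIL_DEFAULT_LOG,      /* substitute the default, emit LOG */
    CAST_FAIL_DEFAULT_NOTICE,   /* ... emit NOTICE */
    CAST_FAIL_DEFAULT_WARNING   /* ... emit WARNING */
} CastFailureMode;

/* hypothetical extended input function: only consults the extra
 * parameters once it has already found invalid input */
extern Datum int4in_extended(char *str, CastFailureMode mode,
                             Datum default_datum);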
Re: src/test/perl/PostgreSQL/Test/*.pm not installed
Alvaro Herrera writes: > Only 14 and back have this problem. Ah, cool. There's no freeze on those branches ... regards, tom lane
Re: shadow variables - pg15 edition
Alvaro Herrera writes: > On 2022-Oct-10, Andres Freund wrote: >> We could, but is it really a useful thing for something fixed 6 years ago? > Well, for people purposefully installing using older installs of Perl > (not me, admittedly), it does seem useful, because you get the benefit > of checking shadow vars for the rest of the tree and still get no > warnings if everything is clean. Meh --- people purposefully using old Perls are likely using old compilers too. Let's wait and see if any devs actually complain. regards, tom lane
Re: Suppressing useless wakeups in walreceiver
On Mon, Oct 10, 2022 at 03:22:15PM +0530, Bharath Rupireddy wrote: > Some comments on v4-0002: Thanks for taking a look. > 1. You might want to use PG_INT64_MAX instead of INT64_MAX for portability? Yes, I used PG_INT64_MAX in v5. > 2. With the below change, the time walreceiver spends in > XLogWalRcvSendReply() is also included for XLogWalRcvSendHSFeedback > right? I think it's a problem given that XLogWalRcvSendReply() can > take a while. Earlier, this wasn't the case, each function calculating > 'now' separately. Any reason for changing this behaviour? I know that > GetCurrentTimestamp(); isn't cheaper all the time, but here it might > result in a change in the behaviour. Yes, if XLogWalRcvSendReply() takes a long time, we might defer sending the hot_standby_feedback message until a later time. The only reason I changed this was to avoid extra calls to GetCurrentTimestamp(), which might be expensive on some platforms. Outside of the code snippet you pointed out, I think WalReceiverMain() has a similar problem. That being said, I'm generally skeptical that this sort of thing is detrimental given the current behavior (i.e., wake up every 100ms), the usual values of these GUCs (i.e., tens of seconds), and the fact that any tasks that are inappropriately skipped will typically be retried in the next iteration of the loop. > 3. I understand that TimestampTz type is treated as microseconds. > Would you mind having a comment in below places to say why we're > multiplying with 1000 or 100 here? Also, do we need 1000L or > 100L or type cast to > TimestampTz? I used INT64CONST() in v5, as that seemed the most portable, but I stopped short of adding comments to explain all the conversions. IMO the conversions are relatively obvious, and the units of the GUCs can be easily seen in guc_tables.c. > 4. How about simplifying WalRcvComputeNextWakeup() something like below? Other than saving a couple lines of code, IMO this doesn't meaningfully simplify the function or improve readability. > 5. Can we move below code snippets to respective static functions for > better readability and code reuse? > This: > +/* Find the soonest wakeup time, to limit our nap. */ > +nextWakeup = INT64_MAX; > +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) > +nextWakeup = Min(state.wakeup[i], nextWakeup); > +nap = Max(0, (nextWakeup - now + 999) / 1000); > > And this: > > +now = GetCurrentTimestamp(); > +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) > +WalRcvComputeNextWakeup(&state, i, now); This did cross my mind, but I opted to avoid creating new functions because 1) they aren't likely to be reused very much, and 2) I actually think it might hurt readability by forcing developers to traipse around the file to figure out what these functions are actually doing. It's not like it would save many lines of code, either. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 9d274eecef8e66e60b34d14c24459e70e782cf86 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 5 Oct 2022 10:31:35 -0700 Subject: [PATCH v5 1/2] Move WAL receivers' non-shared state to a new struct. This is preparatory work for a follow-up change that will revamp the wakeup mechanism for periodic tasks that WAL receivers must perform. 
--- src/backend/replication/walreceiver.c | 90 ++- 1 file changed, 48 insertions(+), 42 deletions(-) diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index 6cbb67c92a..89985c54cf 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -116,6 +116,14 @@ static struct XLogRecPtr Flush; /* last byte + 1 flushed in the standby */ } LogstreamResult; +/* + * A struct to keep track of non-shared state. + */ +typedef struct WalRcvInfo +{ + TimeLineID startpointTLI; +} WalRcvInfo; + static StringInfoData reply_message; static StringInfoData incoming_message; @@ -123,12 +131,12 @@ static StringInfoData incoming_message; static void WalRcvFetchTimeLineHistoryFiles(TimeLineID first, TimeLineID last); static void WalRcvWaitForStartPosition(XLogRecPtr *startpoint, TimeLineID *startpointTLI); static void WalRcvDie(int code, Datum arg); -static void XLogWalRcvProcessMsg(unsigned char type, char *buf, Size len, - TimeLineID tli); -static void XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, - TimeLineID tli); -static void XLogWalRcvFlush(bool dying, TimeLineID tli); -static void XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli); +static void XLogWalRcvProcessMsg(WalRcvInfo *state, unsigned char type, + char *buf, Size len); +static void XLogWalRcvWrite(WalRcvInfo *state, char *buf, Size nbytes, + XLogRecPtr recptr); +static void XLogWalRcvFlush(WalRcvInfo *state, bool dying); +static void XLogWalRcvClose(WalRc
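For readers following the units discussion above: TimestampTz values are in microseconds, wal_receiver_timeout is in milliseconds, and wal_receiver_status_interval is in seconds, so the wakeup computations are conversions of roughly this shape (a sketch; the enum values and function name follow the shape of the patch but are assumptions here):

/* Sketch: compute the next wakeup time for one periodic task. */
static void
WalRcvComputeNextWakeup(WalRcvInfo *state, int reason, TimestampTz now)
{
    switch (reason)
    {
        case WALRCV_WAKEUP_TERMINATE:
            /* wal_receiver_timeout is in ms; 'now' is in microseconds */
            state->wakeup[reason] = now + wal_receiver_timeout * INT64CONST(1000);
            break;
        case WALRCV_WAKEUP_PING:
            /* ping the sender once half of wal_receiver_timeout has elapsed */
            state->wakeup[reason] = now + wal_receiver_timeout / 2 * INT64CONST(1000);
            break;
        case WALRCV_WAKEUP_REPLY:
            /* wal_receiver_status_interval is in seconds */
            state->wakeup[reason] = now + wal_receiver_status_interval * INT64CONST(1000000);
            break;
    }
}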
Re: [patch] \g with multiple result sets and \watch with copy queries
> > This is a bit more complicated than the usual tests, but not > that much. > Any opinions on this? +1 I think that because it is more complicated than typical psql usage, we may want to comment on the intention of the tests and some of the less-than-common psql elements (\set concatenation, resetting \o, etc). If you see value in that I can amend the patch. Are there any options on COPY (header, formats) that we think we should test as well?
Re: Suppressing useless wakeups in walreceiver
On Mon, Oct 10, 2022 at 10:51:14AM -0700, Nathan Bossart wrote: >> +/* Find the soonest wakeup time, to limit our nap. */ >> +nextWakeup = INT64_MAX; >> +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) >> +nextWakeup = Min(state.wakeup[i], nextWakeup); >> +nap = Max(0, (nextWakeup - now + 999) / 1000); Hm. We should probably be more cautious here since nextWakeup is an int64 and nap is an int. My guess is that we should just set nap to -1 if nextWakeup > INT_MAX. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
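Concretely, the guard might look like this (one possible sketch):

/* Sketch: nextWakeup is in microseconds and is an int64; the latch
 * timeout argument is an int number of milliseconds, so clamp. */
int64       nextWakeup = PG_INT64_MAX;
int64       napMs;

/* Find the soonest wakeup time, to limit our nap. */
for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
    nextWakeup = Min(state.wakeup[i], nextWakeup);

/* convert microseconds to milliseconds, rounding up, then clamp */
napMs = (nextWakeup - now + 999) / 1000;
napMs = Max(napMs, 0);
napMs = Min(napMs, (int64) INT_MAX);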
Re: [patch] \g with multiple result sets and \watch with copy queries
Bonjour Daniel, Good catch! Thanks for the quick fix! As usual, what is not tested does not work :-( Attached a tap test to check for the expected behavior with multiple command \g. -- Fabien.
diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl index f447845717..c81feadd4e 100644 --- a/src/bin/psql/t/001_basic.pl +++ b/src/bin/psql/t/001_basic.pl @@ -325,4 +325,22 @@ is($row_count, '10', 'client-side error commits transaction, no ON_ERROR_STOP and multiple -c switches' ); +# test \g +psql_like($node, "SELECT 'un' \\g g_file_1.out", qr//, "one command \\g"); +my $c1 = slurp_file("g_file_1.out"); +like($c1, qr/un/); +unlink "g_file_1.out" or die $!; + +psql_like($node, "SELECT 'deux' \\; SELECT 'trois' \\g g_file_2.out", qr//, "two commands \\g"); +my $c2 = slurp_file("g_file_2.out"); +like($c2, qr/deux.*trois/s); +unlink "g_file_2.out" or die $!; + +psql_like($node, "\\set SHOW_ALL_RESULTS 0\nSELECT 'quatre' \\; SELECT 'cinq' \\g g_file_3.out", qr//, + "two commands \\g with only last result"); +my $c3 = slurp_file("g_file_3.out"); +like($c3, qr/cinq/); +unlike($c3, qr/quatre/); +unlink "g_file_3.out" or die $!; + done_testing();
Re: subtransaction performance
On Fri, Oct 7, 2022 at 03:23:27PM -0700, Zhihong Yu wrote: > Hi, > I stumbled over: > > https://about.gitlab.com/blog/2021/09/29/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/ > > I wonder if SAVEPOINT / subtransaction performance has been boosted since the > blog was written. No, I have not seen any changes in this area since then. Seems there are two problems --- the 64-entry subtransaction cache per session and the 64k limit on the replica. In both cases, it seems sizing is not optimal, but sizing is never optimal. I guess we can look at allowing manual size adjustment, automatic size adjustment, or a different approach that is more graceful for larger savepoint workloads. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: list of acknowledgments for PG15
On Mon, Oct 10, 2022 at 02:41:06AM -0400, Tom Lane wrote: > I don't wish to object to adding Nakamori-san here, but I feel like we > need a policy that doesn't require last-minute updates to release notes. > > As far as I've understood, the idea is to credit people based on the > time frame in which their patches were committed, not on the branch(es) > that the patches were committed to. Otherwise we'd have to retroactively > add people to back-branch acknowledgements, and we have not been doing > that. So a patch that goes in during the v16 development cycle means > that the author should get acknowledged in the v16 release notes, > even if it got back-patched to older branches. What remains is to > define when is the cutoff point between "acknowledge in v15" versus > "acknowledge in v16". I don't have a strong opinion about that, > but I'd like it to be more than 24 hours before the 15.0 wrap. > Could we make the cutoff be, say, beta1? Is the issue that we are really only crediting people whose commits/work appears in major releases, and not in minor ones? -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: list of acknowledgments for PG15
Bruce Momjian writes: > On Mon, Oct 10, 2022 at 02:41:06AM -0400, Tom Lane wrote: >> As far as I've understood, the idea is to credit people based on the >> time frame in which their patches were committed, not on the branch(es) >> that the patches were committed to. > Is the issue that we are really only crediting people whose commits/work > appears in major releases, and not in minor ones? What Peter has said about this is that he lists everyone whose name has appeared in commit messages over thus-and-such a time frame. So it doesn't matter which branch is involved, just when the contribution was made. That process is fine with me; I'm just seeking a bit more clarity as to what the time frames are. regards, tom lane
Re: list of acknowledgments for PG15
On Mon, Oct 10, 2022 at 02:44:22PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > On Mon, Oct 10, 2022 at 02:41:06AM -0400, Tom Lane wrote: > >> As far as I've understood, the idea is to credit people based on the > >> time frame in which their patches were committed, not on the branch(es) > >> that the patches were committed to. > > > Is the issue that we are really only crediting people whose commits/work > > appears in major releases, and not in minor ones? > > What Peter has said about this is that he lists everyone whose name > has appeared in commit messages over thus-and-such a time frame. > So it doesn't matter which branch is involved, just when the contribution > was made. That process is fine with me; I'm just seeking a bit more > clarity as to what the time frames are. Oh, that's an interesting approach but it might mean that, for example, PG 16 patch authors appear in the PG 15 major release notes. It seems that the stable major release branch date should be the cut-off, so no one is missed. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
I've gone ahead and implemented option 1 (commented below). On Thu, Oct 6, 2022 at 6:23 PM Melanie Plageman wrote: > > v31 failed in CI, so > I've attached v32 which has a few issues fixed: > - addressed some compiler warnings I hadn't noticed locally > - autovac launcher and worker do indeed use bulkread strategy if they > end up starting before critical indexes have loaded and end up doing a > sequential scan of some catalog tables, so I have changed the > restrictions on BackendTypes allowed to track IO Operations in > IOCONTEXT_BULKREAD > - changed the name of the column "fsynced" to "files_synced" to make it > more clear what unit it is in (and that the unit differs from that of > the "unit" column) > > In an off-list discussion with Andres, he mentioned that he thought > buffers reused by a BufferAccessStrategy should be split from buffers > "acquired" and that "acquired" should be renamed "clocksweeps". > > I have started doing this, but for BufferAccessStrategy IO there are a > few choices about how we want to count the clocksweeps: > > Currently the following situations are counted under the following > IOContexts and IOOps: > > IOCONTEXT_[VACUUM,BULKREAD,BULKWRITE], IOOP_ACQUIRE > - reuse a buffer from the ring > > IOCONTEXT_SHARED, IOOP_ACQUIRE > - add a buffer to the strategy ring initially > - add a new shared buffer to the ring when all the existing buffers in > the ring are pinned > > And in the new paradigm, I think these are two good options: > > 1) > IOCONTEXT_[VACUUM,BULKREAD,BULKWRITE], IOOP_CLOCKSWEEP > - add a buffer to the strategy ring initially > - add a new shared buffer to the ring when all the existing buffers in > the ring are pinned > > IOCONTEXT_[VACUUM,BULKREAD,BULKWRITE], IOOP_REUSE > - reuse a buffer from the ring > I've implemented this option in attached v33. > 2) > IOCONTEXT_[VACUUM,BULKREAD,BULKWRITE], IOOP_CLOCKSWEEP > - add a buffer to the strategy ring initially > > IOCONTEXT_[VACUUM,BULKREAD,BULKWRITE], IOOP_REUSE > - reuse a buffer from the ring > > IOCONTEXT SHARED, IOOP_CLOCKSWEEP > - add a new shared buffer to the ring when all the existing buffers in > the ring are pinned - Melanie From 6a83f0028a69a56243fa5b036299185766b80629 Mon Sep 17 00:00:00 2001 From: Melanie Plageman Date: Thu, 6 Oct 2022 12:23:38 -0400 Subject: [PATCH v33 2/4] Aggregate IO operation stats per BackendType Stats on IOOps for all IOContexts for a backend are tracked locally. Add functionality for backends to flush these stats to shared memory and accumulate them with those from all other backends, exited and live. Also add reset and snapshot functions used by cumulative stats system for management of these statistics. The aggregated stats in shared memory could be extended in the future with per-backend stats -- useful for per connection IO statistics and monitoring. Some BackendTypes will not flush their pending statistics at regular intervals and explicitly call pgstat_flush_io_ops() during the course of normal operations to flush their backend-local IO operation statistics to shared memory in a timely manner. Because not all BackendType, IOOp, IOContext combinations are valid, the validity of the stats is checked before flushing pending stats and before reading in the existing stats file to shared memory. 
Author: Melanie Plageman Reviewed-by: Andres Freund Reviewed-by: Justin Pryzby Reviewed-by: Kyotaro Horiguchi Reviewed-by: Maciek Sakrejda Reviewed-by: Lukas Fittl Discussion: https://www.postgresql.org/message-id/flat/20200124195226.lth52iydq2n2uilq%40alap3.anarazel.de --- doc/src/sgml/monitoring.sgml | 2 + src/backend/utils/activity/pgstat.c | 35 src/backend/utils/activity/pgstat_bgwriter.c | 7 +- .../utils/activity/pgstat_checkpointer.c | 7 +- src/backend/utils/activity/pgstat_io_ops.c| 161 ++ src/backend/utils/activity/pgstat_relation.c | 15 +- src/backend/utils/activity/pgstat_shmem.c | 4 + src/backend/utils/activity/pgstat_wal.c | 4 +- src/backend/utils/adt/pgstatfuncs.c | 4 +- src/include/miscadmin.h | 2 + src/include/pgstat.h | 84 + src/include/utils/pgstat_internal.h | 36 src/tools/pgindent/typedefs.list | 3 + 13 files changed, 358 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 342b20ebeb..14dfd650f8 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -5360,6 +5360,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i the pg_stat_bgwriter view, archiver to reset all the counters shown in the pg_stat_archiver view, +io to reset all the counters shown in the +pg_stat_io view, wal to reset all the counters shown in the pg_stat_wal view or recovery_prefetch to re
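To spell out where option 1's counts would come from, the accounting sites in the buffer access strategy code would look roughly like this (a sketch; the helper and symbol names here are assumptions, not the patch's actual code):

/* Sketch of option 1; names are assumptions. */
if (strategy_buf != InvalidBuffer && BufferIsReusable(strategy_buf))
{
    /* reuse a buffer already in the strategy ring */
    pgstat_count_io_op(IOOP_REUSE, io_context);
}
else
{
    /*
     * Empty ring slot, or all ring buffers pinned: take a shared buffer
     * via the clock sweep.  Under option 1 this is still attributed to
     * the strategy IOContext (IOCONTEXT_VACUUM etc.), not IOCONTEXT_SHARED.
     */
    pgstat_count_io_op(IOOP_CLOCKSWEEP, io_context);
}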
Re: Simplifying our Trap/Assert infrastructure
Nathan Bossart writes: > On Sun, Oct 09, 2022 at 05:08:39PM -0400, Tom Lane wrote: >> Something I thought about but forgot to mention in the initial email: >> is it worth sprinkling these macros with "unlikely()"? > I don't see why not. I experimented with that, and found something that surprised me: there's a noticeable code-bloat effect. With the patch as given, $ size src/backend/postgres text data bss dec hex filename 9001199 86280 204496 9291975 8dc8c7 src/backend/postgres but with unlikely(), $ size src/backend/postgres text data bss dec hex filename 9035423 86280 204496 9326199 8e4e77 src/backend/postgres I don't quite understand why that's happening, but it seems to show that this requires some investigation of its own. So for now I just pushed the patch as-is. regards, tom lane
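For context, the variant being measured amounts to roughly this (a sketch of the assertion macro with unlikely() added; the real macro has a few more wrinkles):

#define Assert(condition) \
    do { \
        if (unlikely(!(condition))) \
            ExceptionalCondition(#condition, "FailedAssertion", \
                                 __FILE__, __LINE__); \
    } while (0)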
Re: Reducing the chunk header sizes on all memory context types
I wrote: > What I am mainly wondering about at this point is whether Asserts > are good enough or we should use actual test-and-elog checks for > these things. Hearing no comments on that, I decided that a good policy would be to use Asserts in the paths dealing with small chunks but test-and-elog in the paths dealing with large chunks. We already had test-and-elog sanity checks in the latter paths, at least in aset.c, which the new checks can reasonably be combined with. It's unlikely that the large-chunk case is at all performance-critical, too. But adding runtime checks in the small-chunk paths would probably risk losing some performance in production builds, and I'm not currently prepared to argue that the problem is big enough to justify that. Hence v2 attached, which cleans things up a tad in aset.c and then extends similar policy to generation.c and slab.c. Of note is that slab.c was doing things like this: SlabContext *slab = castNode(SlabContext, context); Assert(slab); which has about the same effect as what I'm proposing with AllocSetIsValid, but (a) it's randomly different from the other allocators, and (b) it's probably a hair less efficient, because I doubt the compiler can optimize away castNode's special handling of NULL. So I made these bits follow the style of aset.c. regards, tom lane diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c index ec423375ae..db402e3a41 100644 --- a/src/backend/utils/mmgr/aset.c +++ b/src/backend/utils/mmgr/aset.c @@ -132,6 +132,10 @@ typedef struct AllocFreeListLink #define GetFreeListLink(chkptr) \ (AllocFreeListLink *) ((char *) (chkptr) + ALLOC_CHUNKHDRSZ) +/* Validate a freelist index retrieved from a chunk header */ +#define FreeListIdxIsValid(fidx) \ + ((fidx) >= 0 && (fidx) < ALLOCSET_NUM_FREELISTS) + /* Determine the size of the chunk based on the freelist index */ #define GetChunkSizeFromFreeListIdx(fidx) \ ((((Size) 1) << ALLOC_MINBITS) << (fidx)) @@ -202,7 +206,15 @@ typedef struct AllocBlockData * AllocSetIsValid * True iff set is valid allocation set. */ -#define AllocSetIsValid(set) PointerIsValid(set) +#define AllocSetIsValid(set) \ + (PointerIsValid(set) && IsA(set, AllocSetContext)) + +/* + * AllocBlockIsValid + * True iff block is valid block of allocation set. + */ +#define AllocBlockIsValid(block) \ + (PointerIsValid(block) && AllocSetIsValid((block)->aset)) /* * We always store external chunks on a dedicated block.
This makes fetching @@ -530,8 +542,7 @@ AllocSetReset(MemoryContext context) { AllocSet set = (AllocSet) context; AllocBlock block; - Size keepersize PG_USED_FOR_ASSERTS_ONLY - = set->keeper->endptr - ((char *) set); + Size keepersize PG_USED_FOR_ASSERTS_ONLY; AssertArg(AllocSetIsValid(set)); @@ -540,6 +551,9 @@ AllocSetReset(MemoryContext context) AllocSetCheck(context); #endif + /* Remember keeper block size for Assert below */ + keepersize = set->keeper->endptr - ((char *) set); + /* Clear chunk freelists */ MemSetAligned(set->freelist, 0, sizeof(set->freelist)); @@ -598,8 +612,7 @@ AllocSetDelete(MemoryContext context) { AllocSet set = (AllocSet) context; AllocBlock block = set->blocks; - Size keepersize PG_USED_FOR_ASSERTS_ONLY - = set->keeper->endptr - ((char *) set); + Size keepersize PG_USED_FOR_ASSERTS_ONLY; AssertArg(AllocSetIsValid(set)); @@ -608,6 +621,9 @@ AllocSetDelete(MemoryContext context) AllocSetCheck(context); #endif + /* Remember keeper block size for Assert below */ + keepersize = set->keeper->endptr - ((char *) set); + /* * If the context is a candidate for a freelist, put it into that freelist * instead of destroying it. @@ -994,9 +1010,16 @@ AllocSetFree(void *pointer) if (MemoryChunkIsExternal(chunk)) { - + /* Release single-chunk block. */ AllocBlock block = ExternalChunkGetBlock(chunk); + /* + * Try to verify that we have a sane block pointer: the block header + * should reference an aset and the freeptr should match the endptr. + */ + if (!AllocBlockIsValid(block) || block->freeptr != block->endptr) + elog(ERROR, "could not find block containing chunk %p", chunk); + set = block->aset; #ifdef MEMORY_CONTEXT_CHECKING @@ -1011,14 +1034,6 @@ AllocSetFree(void *pointer) } #endif - - /* - * Try to verify that we have a sane block pointer, the freeptr should - * match the endptr. - */ - if (block->freeptr != block->endptr) - elog(ERROR, "could not find block containing chunk %p", chunk); - /* OK, remove block from aset's list and free it */ if (block->prev) block->prev->next = block->next; @@ -1036,12 +1051,23 @@ AllocSetFree(void *pointer) } else { - int fidx = MemoryChunkGetValue(chunk); AllocBlock block = MemoryChunkGetBlock(chunk); - AllocFreeListLink *link = GetFreeListLink(chunk); + int fidx; + AllocFreeListLink *link; + /* + * In this path, for speed reasons we just Assert that t
Re: Reducing the chunk header sizes on all memory context types
On Tue, 11 Oct 2022 at 08:35, Tom Lane wrote: > Hearing no comments on that, I decided that a good policy would be > to use Asserts in the paths dealing with small chunks but test-and-elog > in the paths dealing with large chunks. This seems like a good policy. I think it's good to get at least the Asserts in there. If we have any troubles in the future then we can revisit this and reconsider if we need to elog them instead. > Hence v2 attached, which cleans things up a tad in aset.c and then > extends similar policy to generation.c and slab.c. Looking at your changes to SlabFree(), I don't really think that change is well aligned to the newly proposed policy. My understanding of the rationale behind this policy is that large chunks get malloced and will be slower anyway, so the elog(ERROR) is less overhead for those. In SlabFree(), we're most likely not doing any free()s, so I don't quite understand why you've added the elog rather than the Assert for this case. The slab allocator *should* be very fast. I don't have any issue with any of the other changes. David
Re: Reducing the chunk header sizes on all memory context types
David Rowley writes: > Looking at your changes to SlabFree(), I don't really think that > change is well aligned to the newly proposed policy. My understanding > of the rationale behind this policy is that large chunks get malloced > and will be slower anyway, so the elog(ERROR) is less overhead for > those. In SlabFree(), we're most likely not doing any free()s, so I > don't quite understand why you've added the elog rather than the > Assert for this case. The slab allocator *should* be very fast. Yeah, slab.c hasn't any distinction between large and small chunks, so we have to just pick one policy or the other. I'd hoped to get away with the more robust runtime test on the basis that slab allocation is not used so much that this'd result in any noticeable performance change. SlabRealloc, at least, is not used *at all* per the code coverage tests, and if we're there at all we should be highly suspicious that something is wrong. However, I could be wrong about SlabFree, and if you're going to hold my feet to the fire then I'll change it rather than try to produce some performance evidence. regards, tom lane
Re: problems with making relfilenodes 56-bits
Hi, On 2022-10-10 08:10:22 -0400, Robert Haas wrote: > On Mon, Oct 10, 2022 at 5:16 AM Dilip Kumar wrote: > > I have also executed my original test after applying these patches on > > top of the 56 bit relfilenode patch. So earlier we saw the WAL size > > increased by 11% (66199.09375 kB to 73906.984375 kB) and after this > > patch now the WAL generated is 58179.2265625. That means in this > > particular example this patch is reducing the WAL size by 12% even > > with the 56 bit relfilenode patch. > > That's a very promising result, but the question in my mind is how > much work would be required to bring this patch to a committable > state? The biggest part clearly is to review the variable width integer patch. It's not a large amount of code, but probably more complicated than average. One complication there is that currently the patch assumes: * Note that this function, for efficiency, reads 8 bytes, even if the * variable integer is less than 8 bytes long. The buffer has to be * allocated sufficiently large to account for that fact. The maximum * amount of memory read is 9 bytes. We could make a less efficient version without that assumption, but I think it might be easier to just guarantee it in the xlog*.c case. Using it in xloginsert.c is pretty darn simple, code-wise. xlogreader is a bit harder, although not for intrinsic reasons - the logic underlying COPY_HEADER_FIELD seems unnecessarily complicated to me. The minimal solution would likely be to just wrap the varint reads in another weird macro. Leaving the code issues themselves aside, one important thing would be to evaluate what the performance impacts of the varint encoding/decoding are as part of a "full" server. I suspect it'll vanish in the noise, but we'd need to validate that. Greetings, Andres Freund
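For readers unfamiliar with the idea, here is a minimal variable-width integer sketch in the LEB128 style. Note this is only illustrative: the patch's actual encoding differs, since it is designed so the decoder can do the fixed-width 8-byte read described above.

/* A minimal varint sketch (LEB128-style); not the patch's encoding. */
static int
varint_encode_uint64(uint64 value, uint8 *buf)
{
    int     len = 0;

    do
    {
        uint8   byte = value & 0x7F;

        value >>= 7;
        if (value != 0)
            byte |= 0x80;       /* continuation bit */
        buf[len++] = byte;
    } while (value != 0);

    return len;                 /* 1..10 bytes for a uint64 */
}

static uint64
varint_decode_uint64(const uint8 *buf, int *len)
{
    uint64  value = 0;
    int     shift = 0;
    int     i = 0;

    for (;;)
    {
        uint8   byte = buf[i++];

        value |= (uint64) (byte & 0x7F) << shift;
        if ((byte & 0x80) == 0)
            break;
        shift += 7;
    }
    *len = i;
    return value;
}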
Re: meson PGXS compatibility
Hi, On 2022-10-05 13:07:10 -0700, Andres Freund wrote: > 0004: solaris: Check for -Wl,-E directly instead of checking for gnu LD > > This allows us to get rid of the nontrivial detection of with_gnu_ld, > simplifying meson PGXS compatibility. It's also nice to delete libtool.m4. > > I don't like the SOLARIS_EXPORT_DYNAMIC variable I invented. If somebody has > a better idea... A cleaner approach could be to add a LDFLAGS_BE and emit the -Wl,-E into it from both configure and meson, solely based on whether -Wl,-E is supported. I haven't verified cygwin, but on our other platforms that seems to do the right thing. Greetings, Andres Freund
Re: [PATCH] Fix build with LLVM 15 or above
On Tue, Oct 4, 2022 at 10:45 AM Zhihong Yu wrote: > On Mon, Oct 3, 2022 at 2:41 PM Andres Freund wrote: >> On 2022-10-03 12:16:12 -0700, Andres Freund wrote: >> > I haven't yet run through the whole regression test with an assert enabled >> > llvm because an assert-enabled llvm is *SLOW*, but it got through the first >> > few parallel groups ok. Using an optimized llvm 15, all tests pass with >> > PGOPTIONS=-cjit_above_cost=0. +/* + * When targetting an llvm version with opaque pointers enabled by + * default, turn them off for the context we build our code in. Don't need + * to do so for other contexts (e.g. llvm_ts_context) - once the IR is + * generated, it carries the necessary information. + */ +#if LLVM_VERSION_MAJOR > 14 +LLVMContextSetOpaquePointers(LLVMGetGlobalContext(), false); +#endif Ahh, right, thanks! >> That did pass. But to be able to use clang >= 15 one more piece is >> needed. Updated patch attached. + bitcode_cflags += ['-Xclang', '-no-opaque-pointers'] Oh, right. That makes sense. > I think `targetting` should be spelled as targeting Yeah. OK, I'll wait for the dust to settle on our 15 release and then back-patch this. Then I'll keep working on the opaque pointer support for master, which LLVM 16 will need (I expect we'll eventually want to back-patch that too, but first things first...).
Re: Reducing the chunk header sizes on all memory context types
On Tue, 11 Oct 2022 at 10:07, Tom Lane wrote: > Yeah, slab.c hasn't any distinction between large and small chunks, > so we have to just pick one policy or the other. I'd hoped to get > away with the more robust runtime test on the basis that slab allocation > is not used so much that this'd result in any noticeable performance > change. SlabRealloc, at least, is not used *at all* per the code > coverage tests, and if we're there at all we should be highly suspicious > that something is wrong. However, I could be wrong about SlabFree, > and if you're going to hold my feet to the fire then I'll change it > rather than try to produce some performance evidence. The main reason I brought it up was that only yesterday I was looking into fixing the slowness of the Slab allocator. It's currently quite far behind the performance of both generation.c and aset.c and it would be very nice to bring it up to at least be on-par with those. Ideally there would be some performance advantages of the fixed-sized chunks. I'd just rather not have any additional things go in to make that goal harder to reach. The proposed patches in [1] do aim to make additional usages of the slab allocator, and I have a feeling that we'll want to fix the performance of slab.c before those. Perhaps the Asserts are a better option if we're to get the proposed radix tree implementation. David [1] https://postgr.es/m/cad21aod3w76wers_lq7_ua6+gtaooerpji+yz8ac6aui4jw...@mail.gmail.com
Re: Reducing the chunk header sizes on all memory context types
David Rowley writes: > The main reason I brought it up was that only yesterday I was looking > into fixing the slowness of the Slab allocator. It's currently quite > far behind the performance of both generation.c and aset.c and it > would be very nice to bring it up to at least be on-par with those. Really!? That's pretty sad, because surely it should be handling a simpler case. Anyway, I'm about to push this with an Assert in SlabFree and run-time test in SlabRealloc. That should be enough to assuage my safety concerns, and then we can think about better performance. regards, tom lane
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Thanks for working on this! Like Lukas, I'm excited to see more visibility into important parts of the system like this. On Mon, Oct 10, 2022 at 11:49 AM Melanie Plageman wrote: > > I've gone ahead and implemented option 1 (commented below). No strong opinion on 1 versus 2, though at least partly that's because I don't understand the implications (I do understand the difference, just not when it might be important in terms of stats). Can we think of a situation where combining stats about initial additions with pinned additions hides some behavior that might be good to understand and hard to pinpoint otherwise? I took a look at the latest docs (as someone mostly familiar with internals at only a pretty high level, so probably somewhat close to the target audience) and have some feedback. + + + backend_type text + + + Type of backend (e.g. background worker, autovacuum worker). + + Not critical, but is there a list of backend types we could cross-reference elsewhere in the docs? From the io_context column description: + The autovacuum daemon, explicit VACUUM, explicit + ANALYZE, many bulk reads, and many bulk writes use a + fixed amount of memory, acquiring the equivalent number of shared + buffers and reusing them circularly to avoid occupying an undue portion + of the main shared buffer pool. + I don't understand how this is relevant to the io_context column. Could you expand on that, or am I just missing something obvious? + + + extended bigint + + + Extends of relations done by this backend_type in + order to write data in this io_context. + + I understand what this is, but not why this is something I might want to know about. And from your earlier e-mail: On Thu, Oct 6, 2022 at 10:42 AM Melanie Plageman wrote: > > Because we want to add non-block-oriented IO in the future (like > temporary file IO) to this view and want to use the same "read", > "written", "extended" columns, I would prefer not to prefix the columns > with "blks_". I have added a column "unit" which would contain the unit > in which read, written, and extended are in. Unfortunately, fsyncs are > not per block, so "unit" doesn't really work for this. I documented > this. > > The most correct thing to do to accommodate block-oriented and > non-block-oriented IO would be to specify all the values in bytes. > However, I would like this view to be usable visually (as opposed to > just in scripts and by tools). The only current value of unit is > "block_size" which could potentially be combined with the value of the > GUC to get bytes. > > I've hard-coded the string "block_size" into the view generation > function pg_stat_get_io(), so, if this idea makes sense, perhaps I > should do something better there. That seems broadly reasonable, but pg_settings also has a 'unit' field, and in that view, unit is '8kB' on my system--i.e., it (presumably) reflects the block size. Is that something we should try to be consistent with (not sure if that's a good idea, but thought it was worth asking)? > On Fri, Sep 30, 2022 at 7:18 PM Lukas Fittl wrote: > > - Overall it would be helpful if we had a dedicated documentation page on > > I/O statistics that's linked from the pg_stat_io view description, and > > explains how the I/O statistics tie into the various concepts of shared > > buffers / buffer access strategies / etc (and what is not tracked today) > > I haven't done this yet. How specific were you thinking -- like > interpretations of all the combinations and what to do with what you > see? Like you should run pg_prewarm if you see X?
Specific checkpointer > or bgwriter GUCs to change? Or just links to other docs pages on > recommended tunings? > > Were you imagining the other IO statistics views (like > pg_statio_all_tables and pg_stat_database) also being included in this > page? Like would it be a comprehensive guide to IO statistics and what > their significance/purposes are? I can't speak for Lukas here, but I encouraged him to suggest more thorough documentation in general, so I can speak to my concerns: in general, these stats should be usable for someone who does not know much about Postgres internals. It's pretty low-level information, sure, so I think you need some understanding of how the system broadly works to make sense of it. But ideally you should be able to find what you need to understand the concepts involved within the docs. I think your updated docs are much clearer (with the caveats of my specific comments above). It would still probably be helpful to have a dedicated page on I/O stats (and yeah, something with a broad scope, along the lines of a comprehensive guide), but I think that can wait until a future patch. Thanks, Maciek
autovacuum_freeze_max_age reloption seems broken
The autovacuum_freeze_max_age reloption exists so that the DBA can optionally have antiwraparound autovacuums run against a table that requires more frequent antiwraparound autovacuums. This has problems because there are actually two types of VACUUM right now (aggressive and non-aggressive), which, strictly speaking, is an independent condition of antiwraparound-ness. There is a tacit assumption within autovacuum.c that all antiwraparound autovacuums are also aggressive, I think. But that just isn't true, which leads to clearly broken behavior when the autovacuum_freeze_max_age reloption is in use. Note that the VacuumParams state that gets passed down to vacuum_set_xid_limits() does not include anything about the "reloption version" of autovacuum_freeze_max_age. So quite naturally vacuum_set_xid_limits() can only work off of the autovacuum_freeze_max_age GUC, even when the reloption happens to have been used over in autovacuum.c. In practice this means that we can easily see autovacuum spin uselessly when the reloption is in use -- it'll launch antiwraparound autovacuums that never advance relfrozenxid and so never address the relfrozenxid age issue from the point of view of autovacuum.c. There is no reason to think that the user will also (say) set the autovacuum_freeze_table_age reloption separately (not to be confused with the vacuum_freeze_table_age GUC!). We'll usually just work off the GUC (I mean why wouldn't we?). I don't see why vacuumlazy.c doesn't just force aggressive mode whenever it sees an antiwraparound autovacuum, no matter what. Recall the problem scenario that led to bugfix commit dd9ac7d5 -- that also could have been avoided by making sure that every antiwraparound autovacuum was aggressive (actually the original problem was that we'd suppress non-aggressive antiwraparound autovacuums as redundant). I only noticed this problem because I am in the process of writing a patch series that demotes vacuum_freeze_table_age to a mere compatibility option (and even gets rid of the whole concept of aggressive VACUUM). The whole way that vacuum_freeze_table_age and autovacuum_freeze_max_age are supposed to work together seems very confusing to me. I'm not surprised that this was overlooked for so long. -- Peter Geoghegan
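The vacuumlazy.c change implied here is essentially a one-liner (a sketch; exact placement inside heap_vacuum_rel() is approximate):

/* Sketch: after "aggressive" has been computed from
 * vacuum_freeze_table_age and friends, force it for any
 * antiwraparound autovacuum. */
if (params->is_wraparound)
    aggressive = true;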
Re: Adding Support for Copy callback functionality on COPY TO api
On Mon, Oct 10, 2022 at 09:38:59AM -0700, Nathan Bossart wrote: > This new callback allows the use of COPY TO's machinery in extensions. A > couple of generic use-cases are listed upthread [0], and one concrete > use-case is the aws_s3 extension [1]. FWIW, I understand that the proposal is to have easier control of how, what and where the data is processed. COPY TO PROGRAM provides that with exactly the same kind of interface (data input, its length) once you have a program able to process the data piped out the same way. However, it is in the shape of an external process that receives the data through a pipe, hence it provides a much wider attack surface, which is something that all cloud providers care about. The thing is that this allows extension developers to avoid running arbitrary commands on the backend as the OS user running the Postgres instance, while still being able to process the data the way they want (auditing, analytics, whatever) within the strict context of the process running the extension's code. I'd say that this is a very cheap change to allow people to have more fun with the backend engine (similar to the recent changes with archive libraries for archive_command, but much less complex): src/backend/commands/copy.c | 2 +- src/backend/commands/copyto.c | 18 +++--- 2 files changed, 16 insertions(+), 4 deletions(-) (Not to mention that we've had our share of CVEs regarding COPY PROGRAM even if it is superuser-only). > I really doubt that this small test case is going to cause anything > approaching undue maintenance burden. I think it's important to ensure > this functionality continues to work as expected long into the future. I like these toy modules, they provide test coverage while acting as a template for new developers. I am wondering whether it should have something for the copy from callback, actually, as it is named "test_copy_callbacks" but I see no need to extend the module more than necessary in the context of this thread (logical decoding uses it, anyway). -- Michael
Re: Adding Support for Copy callback functionality on COPY TO api
On Tue, Oct 11, 2022 at 09:01:41AM +0900, Michael Paquier wrote: > I like these toy modules, they provide test coverage while acting as a > template for new developers. I am wondering whether it should have > something for the copy from callback, actually, as it is named > "test_copy_callbacks" but I see no need to extend the module more than > necessary in the context of this thread (logical decoding uses it, > anyway). Yeah, I named it that way because I figured we might want a test for the COPY FROM callback someday. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: shadow variables - pg15 edition
On Tue, 11 Oct 2022 at 06:02, Tom Lane wrote: > > Alvaro Herrera writes: > > On 2022-Oct-10, Andres Freund wrote: > >> We could, but is it really a useful thing for something fixed 6 years ago? > > > Well, for people purposefully installing using older installs of Perl > > (not me, admittedly), it does seem useful, because you get the benefit > > of checking shadow vars for the rest of the tree and still get no > > warnings if everything is clean. > > Meh --- people purposefully using old Perls are likely using old > compilers too. Let's wait and see if any devs actually complain. I can't really add much here, apart from thinking it would be a shame if some 6-year-old third-party code were to hold us back on this. I'm also keen to wait for complaints and, only if we really have to, stop using the shadow flag in just the places where that proves necessary. Aside from this issue, if anything I'd be keen to go a little further with this and upgrade to -Wshadow=local. The reason is that I noticed that the const qualifier is not classed as "compatible" with the equivalently named and typed variable without the const qualifier. ISTM that there's close to as much opportunity to mix up two variables with the same name that are const and non-const as there is with two variables having the same const qualifier. However, I'll be waiting for the dust to settle on the current flags before thinking any more about that. David
Re: Perform streaming logical transactions by background workers and parallel apply
On Fri, Oct 7, 2022 at 2:00 PM Amit Kapila wrote: > > On Fri, Oct 7, 2022 at 8:47 AM Masahiko Sawada wrote: > > > > On Thu, Oct 6, 2022 at 9:04 PM houzj.f...@fujitsu.com > > wrote: > > > > > > I think the root reason for this kind of deadlock problems is the table > > > structure difference between publisher and subscriber(similar to the > > > unique > > > difference reported earlier[1]). So, I think we'd better disallow this > > > case. For > > > example to avoid the reported problem, we could only support parallel > > > apply if > > > pubviaroot is false on publisher and replicated tables' types(relkind) > > > are the > > > same between publisher and subscriber. > > > > > > Although it might restrict some use cases, but I think it only restrict > > > the > > > cases when the partitioned table's structure is different between > > > publisher and > > > subscriber. User can still use parallel apply for cases when the table > > > structure is the same between publisher and subscriber which seems > > > acceptable > > > to me. And we can also document that the feature is expected to be used > > > for the > > > case when tables' structure are the same. Thoughts ? > > > > I'm concerned that it could be a big restriction for users. Having > > different partitioned table's structures on the publisher and the > > subscriber is quite common use cases. > > > > From the feature perspective, the root cause seems to be the fact that > > the apply worker does both receiving and applying changes. Since it > > cannot receive the subsequent messages while waiting for a lock on a > > table, the parallel apply worker also cannot move forward. If we have > > a dedicated receiver process, it can off-load the messages to the > > worker while another process waiting for a lock. So I think that > > separating receiver and apply worker could be a building block for > > parallel-apply. > > > > I think the disadvantage that comes to mind is the overhead of passing > messages between receiver and applier processes even for non-parallel > cases. Now, I don't think it is advisable to have separate handling > for non-parallel cases. The other thing is that we need to someway > deal with feedback messages which helps to move synchronous replicas > and update subscriber's progress which in turn helps to keep the > restart point updated. These messages also act as heartbeat messages > between walsender and walapply process. > > To deal with this, one idea is that we can have two connections to > walsender process, one with walreceiver and the other with walapply > process which according to me could lead to a big increase in resource > consumption and it will bring another set of complexities in the > system. Now, in this, I think we have two possibilities, (a) The first > one is that we pass all messages to the leader apply worker and then > it decides whether to execute serially or pass it to the parallel > apply worker. However, that can again deadlock in the truncate > scenario we discussed because the main apply worker won't be able to > receive new messages once it is blocked at the truncate command. (b) > The second one is walreceiver process itself takes care of passing > streaming transactions to parallel apply workers but if we do that > then walreceiver needs to wait at the transaction end to maintain > commit order which means it can also lead to deadlock in case the > truncate happens in a streaming xact. I imagined (b) but I had missed the point of preserving the commit order. 
Separating the receiver and apply worker cannot resolve this problem. > > The other alternative is that we allow walreceiver process to wait for > apply process to finish transaction and send the feedback but that > seems to be again an overhead if we have to do it even for small > transactions, especially it can delay sync replication cases. Even, if > we don't consider overhead, it can still lead to a deadlock because > walreceiver won't be able to move in the scenario we are discussing. > > About your point that having different partition structures for > publisher and subscriber, I don't know how common it will be once we > have DDL replication. Also, the default value of > publish_via_partition_root is false which doesn't seem to indicate > that this is a quite common case. So how can we consider these concurrency issues that could happen only when streaming = 'parallel'? Can we restrict some use cases to avoid the problem, or can we have a safeguard against these conflicts? We could find a new problematic scenario in the future, and if that happens, logical replication gets stuck in a way that cannot be resolved by the apply workers themselves. Regards, -- Masahiko Sawada PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: autovacuum_freeze_max_age reloption seems broken
On Mon, Oct 10, 2022 at 4:46 PM Peter Geoghegan wrote: > There is no reason to think that the user will also (say) set the > autovacuum_freeze_table_age reloption separately (not to be confused > with the vacuum_freeze_table_age GUC!). We'll usually just work off > the GUC (I mean why wouldn't we?). I don't see why vacuumlazy.c > doesn't just force aggressive mode whenever it sees an antiwraparound > autovacuum, no matter what. Actually, even forcing every antiwraparound autovacuum to use aggressive mode isn't enough to stop autovacuum.c from spinning. It might be a good start, but it still leaves the freeze_min_age issue. The only way that autovacuum.c is going to be satisfied and back off with launching antiwraparound autovacuums is if relfrozenxid is advanced, and advanced by a significant amount. But what if the autovacuum_freeze_max_age reloption happens to have been set to something that's significantly less than the value of the vacuum_freeze_min_age GUC (or the autovacuum_freeze_min_age reloption, even)? Most of the time we can rely on vacuum_set_xid_limits() making sure that the FreezeLimit cutoff (cutoff that determines which XID we'll freeze) isn't unreasonably old relative to other cutoffs. But that won't work if we're forcing an aggressive VACUUM in vacuumlazy.c. I suppose that this separate freeze_min_age issue could be fixed by teaching autovacuum.c's table_recheck_autovac() function to set freeze_min_age to something less than the current value of reloptions like autovacuum_freeze_min_age and autovacuum_freeze_table_age for the same table (when either of the table-level reloptions happened to be set). In other words, autovacuum.c could be taught to make sure that these reloption-based cutoffs have sane values relative to each other by applying roughly the same approach taken in vacuum_set_xid_limits() for the GUCs. -- Peter Geoghegan
Re: Suppressing useless wakeups in walreceiver
On Mon, Oct 10, 2022 at 11:10:14AM -0700, Nathan Bossart wrote: > On Mon, Oct 10, 2022 at 10:51:14AM -0700, Nathan Bossart wrote: >>> +/* Find the soonest wakeup time, to limit our nap. */ >>> +nextWakeup = INT64_MAX; >>> +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i) >>> +nextWakeup = Min(state.wakeup[i], nextWakeup); >>> +nap = Max(0, (nextWakeup - now + 999) / 1000); > > Hm. We should probably be more cautious here since nextWakeup is an int64 > and nap is an int. My guess is that we should just set nap to -1 if > nextWakeup > INT_MAX. Here's an attempt at fixing that. I ended up just changing "nap" to an int64 and then ensuring it's no larger than INT_MAX in the call to WaitLatchOrSocket(). IIUC we can't use -1 here because WL_TIMEOUT is set. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From c844cf35d483bc7ab847fdc430057295d0f949c9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 5 Oct 2022 10:31:35 -0700 Subject: [PATCH v6 1/2] Move WAL receivers' non-shared state to a new struct. This is preparatory work for a follow-up change that will revamp the wakeup mechanism for periodic tasks that WAL receivers must perform. --- src/backend/replication/walreceiver.c | 90 ++- 1 file changed, 48 insertions(+), 42 deletions(-) diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index 6cbb67c92a..89985c54cf 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -116,6 +116,14 @@ static struct XLogRecPtr Flush; /* last byte + 1 flushed in the standby */ } LogstreamResult; +/* + * A struct to keep track of non-shared state. + */ +typedef struct WalRcvInfo +{ + TimeLineID startpointTLI; +} WalRcvInfo; + static StringInfoData reply_message; static StringInfoData incoming_message; @@ -123,12 +131,12 @@ static StringInfoData incoming_message; static void WalRcvFetchTimeLineHistoryFiles(TimeLineID first, TimeLineID last); static void WalRcvWaitForStartPosition(XLogRecPtr *startpoint, TimeLineID *startpointTLI); static void WalRcvDie(int code, Datum arg); -static void XLogWalRcvProcessMsg(unsigned char type, char *buf, Size len, - TimeLineID tli); -static void XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, - TimeLineID tli); -static void XLogWalRcvFlush(bool dying, TimeLineID tli); -static void XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli); +static void XLogWalRcvProcessMsg(WalRcvInfo *state, unsigned char type, + char *buf, Size len); +static void XLogWalRcvWrite(WalRcvInfo *state, char *buf, Size nbytes, + XLogRecPtr recptr); +static void XLogWalRcvFlush(WalRcvInfo *state, bool dying); +static void XLogWalRcvClose(WalRcvInfo *state, XLogRecPtr recptr); static void XLogWalRcvSendReply(bool force, bool requestReply); static void XLogWalRcvSendHSFeedback(bool immed); static void ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime); @@ -175,7 +183,6 @@ WalReceiverMain(void) char slotname[NAMEDATALEN]; bool is_temp_slot; XLogRecPtr startpoint; - TimeLineID startpointTLI; TimeLineID primaryTLI; bool first_stream; WalRcvData *walrcv = WalRcv; @@ -185,6 +192,7 @@ WalReceiverMain(void) char *err; char *sender_host = NULL; int sender_port = 0; + WalRcvInfo state = {0}; /* * WalRcv should be set up already (if we are a backend, we inherit this @@ -238,7 +246,7 @@ WalReceiverMain(void) strlcpy(slotname, (char *) walrcv->slotname, NAMEDATALEN); is_temp_slot = walrcv->is_temp_slot; startpoint = walrcv->receiveStart; - startpointTLI = walrcv->receiveStartTLI; + state.startpointTLI = 
walrcv->receiveStartTLI; /* * At most one of is_temp_slot and slotname can be set; otherwise, @@ -258,7 +266,7 @@ WalReceiverMain(void) pg_atomic_write_u64(&WalRcv->writtenUpto, 0); /* Arrange to clean up at walreceiver exit */ - on_shmem_exit(WalRcvDie, PointerGetDatum(&startpointTLI)); + on_shmem_exit(WalRcvDie, PointerGetDatum(&state)); /* Properly accept or ignore signals the postmaster might send us */ pqsignal(SIGHUP, SignalHandlerForConfigReload); /* set flag to read config @@ -345,11 +353,11 @@ WalReceiverMain(void) * Confirm that the current timeline of the primary is the same or * ahead of ours. */ - if (primaryTLI < startpointTLI) + if (primaryTLI < state.startpointTLI) ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("highest timeline %u of the primary is behind recovery timeline %u", - primaryTLI, startpointTLI))); + primaryTLI, state.startpointTLI))); /* * Get any missing history files. We do this always, even when we're @@ -361,7 +369,7 @@ WalReceiverMain(void) * but let's avoid the confusion of timeline id collisions where we * can. */ - WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
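For readers skimming the thread, the nap clamping described at the top of this message might look roughly like the following. This is a minimal sketch, not the patch itself: the event flags and variable declarations are assumptions based on the surrounding walreceiver code.

    int64       nextWakeup = INT64_MAX;
    int64       nap;
    int         rc;

    /* Find the soonest wakeup time, to limit our nap. */
    for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
        nextWakeup = Min(state.wakeup[i], nextWakeup);

    /* microseconds to milliseconds, rounded up, never negative */
    nap = Max(0, (nextWakeup - now + 999) / 1000);

    /* "nap" is an int64 now, so clamp it to INT_MAX for the latch wait */
    rc = WaitLatchOrSocket(MyLatch,
                           WL_EXIT_ON_PM_DEATH | WL_SOCKET_READABLE |
                           WL_TIMEOUT | WL_LATCH_SET,
                           wait_fd,
                           Min(nap, INT_MAX),
                           WAIT_EVENT_WAL_RECEIVER_MAIN);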
Re: Suppressing useless wakeups in walreceiver
On Mon, Oct 10, 2022 at 11:21 PM Nathan Bossart wrote:
> Thanks for the updated patches.
>
> > 2. With the below change, the time walreceiver spends in
> > XLogWalRcvSendReply() is also included for XLogWalRcvSendHSFeedback
> > right? I think it's a problem given that XLogWalRcvSendReply() can
> > take a while. Earlier, this wasn't the case, each function calculating
> > 'now' separately. Any reason for changing this behaviour? I know that
> > GetCurrentTimestamp() isn't cheap all the time, but here it might
> > result in a change in the behaviour.
>
> Yes, if XLogWalRcvSendReply() takes a long time, we might defer sending the
> hot_standby_feedback message until a later time. The only reason I changed
> this was to avoid extra calls to GetCurrentTimestamp(), which might be
> expensive on some platforms.

Agree. However...

> Outside of the code snippet you pointed out,
> I think WalReceiverMain() has a similar problem. That being said, I'm
> generally skeptical that this sort of thing is detrimental given the
> current behavior (i.e., wake up every 100ms), the usual values of these
> GUCs (i.e., tens of seconds), and the fact that any tasks that are
> inappropriately skipped will typically be retried in the next iteration of
> the loop.

AFAICS, the aim of the patch isn't optimizing away GetCurrentTimestamp() calls, and yet the patch does change quite a few of them, which may change the behaviour. I think the GetCurrentTimestamp() optimizations can go into an 0003 patch in this thread itself, or we can discuss them in a separate thread to seek more thoughts.

In many instances in the patch, 'now' may not actually represent the true current time, as it also includes the time taken by other operations. I think this will cause problems.

+XLogWalRcvSendReply(state, now, false, false);
+XLogWalRcvSendHSFeedback(state, now, false);

+now = GetCurrentTimestamp();
+for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
+WalRcvComputeNextWakeup(&state, i, now);
+XLogWalRcvSendHSFeedback(&state, now, true);

+now = GetCurrentTimestamp();
+for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
+WalRcvComputeNextWakeup(&state, i, now);
+XLogWalRcvSendHSFeedback(&state, now, true);

> > 4. How about simplifying WalRcvComputeNextWakeup() with something like below?
>
> Other than saving a couple of lines of code, IMO this doesn't meaningfully
> simplify the function or improve readability.

IMO the duplicated lines of code in WalRcvComputeNextWakeup() aren't necessary, and the function isn't that complex without them.

> > 5. Can we move the below code snippets to respective static functions for
> > better readability and code reuse?
> > This:
> > +/* Find the soonest wakeup time, to limit our nap. */
> > +nextWakeup = INT64_MAX;
> > +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
> > +nextWakeup = Min(state.wakeup[i], nextWakeup);
> > +nap = Max(0, (nextWakeup - now + 999) / 1000);
> >
> > And this:
> >
> > +now = GetCurrentTimestamp();
> > +for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
> > +WalRcvComputeNextWakeup(&state, i, now);
>
> This did cross my mind, but I opted to avoid creating new functions because
> 1) they aren't likely to be reused very much, and 2) I actually think it
> might hurt readability by forcing developers to traipse around the file to
> figure out what these functions are actually doing. It's not like it would
> save many lines of code, either.

The second snippet is already used twice. IMO having static functions (perhaps inline) is much better here; see the sketch below.
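For illustration, the two snippets could be factored out along these lines. This is only a sketch of the suggestion; the helper names are hypothetical and not from any posted patch:

    /* Hypothetical helper: find the soonest wakeup and derive the nap time. */
    static inline int64
    WalRcvGetNap(WalRcvInfo *state, TimestampTz now)
    {
        int64       nextWakeup = INT64_MAX;

        for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
            nextWakeup = Min(state->wakeup[i], nextWakeup);

        /* microseconds to milliseconds, rounded up, never negative */
        return Max(0, (nextWakeup - now + 999) / 1000);
    }

    /* Hypothetical helper: recompute every wakeup time from a fresh timestamp. */
    static inline void
    WalRcvComputeAllWakeups(WalRcvInfo *state, TimestampTz now)
    {
        for (int i = 0; i < NUM_WALRCV_WAKEUPS; ++i)
            WalRcvComputeNextWakeup(state, i, now);
    }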
-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Adding Support for Copy callback functionality on COPY TO api
On Wed, Sep 30, 2020 at 01:48:12PM +0500, Andrey V. Lepikhov wrote:
> Your code is almost exactly the same as that proposed in [1] as part of the
> 'Fast COPY FROM' command. But it seems there are differences.
>
> [1]
> https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru

I have been looking at what you have here while reviewing the contents of this thread, and it seems to me that you should basically be able to achieve the row-level control that your patch is doing with the callback for the per-row processing posted here. The main difference, though, is that you want to have more control at the beginning and the end of the COPY TO processing, which explains the split of DoCopyTo(). To be honest, I am a bit surprised to see this much footprint in the backend code once there are two FDW callbacks to control the beginning and the end of the COPY TO; it sacrifices a lot of the existing symmetry between the COPY TO and COPY FROM code paths, where there is currently strict control of the pre-row and post-row processing, like the per-row memory context.

-- Michael

signature.asc Description: PGP signature
Re: Suppressing useless wakeups in walreceiver
On Tue, Oct 11, 2022 at 07:01:26AM +0530, Bharath Rupireddy wrote:
> On Mon, Oct 10, 2022 at 11:21 PM Nathan Bossart
>> Outside of the code snippet you pointed out,
>> I think WalReceiverMain() has a similar problem. That being said, I'm
>> generally skeptical that this sort of thing is detrimental given the
>> current behavior (i.e., wake up every 100ms), the usual values of these
>> GUCs (i.e., tens of seconds), and the fact that any tasks that are
>> inappropriately skipped will typically be retried in the next iteration of
>> the loop.
>
> AFAICS, the aim of the patch isn't optimizing away
> GetCurrentTimestamp() calls, and yet the patch does change quite a
> few of them, which may change the behaviour. I think the
> GetCurrentTimestamp() optimizations can go into an 0003 patch in this
> thread itself, or we can discuss them in a separate thread to seek more
> thoughts.
>
> In many instances in the patch, 'now' may not actually represent
> the true current time, as it also includes the time taken by other
> operations. I think this will cause problems.

What problems do you think this will cause?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Re: Adding Support for Copy callback functionality on COPY TO api
On Mon, Oct 10, 2022 at 05:06:39PM -0700, Nathan Bossart wrote: > Yeah, I named it that way because I figured we might want a test for the > COPY FROM callback someday. Okay. So, I have reviewed the whole thing, added a description of all the fields of BeginCopyTo() in its top comment, tweaked a few things and added in the module an extra NOTICE with the number of processed rows. The result seemed fine, so applied. -- Michael signature.asc Description: PGP signature
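For reference, the committed interface lets a caller receive each row of COPY-formatted output through a callback. Below is a minimal sketch of such a callback; the buffer name and its setup are assumptions for illustration, not part of the commit:

    #include "lib/stringinfo.h"

    /*
     * Hypothetical caller-owned buffer; it must be initialized with
     * initStringInfo() before the COPY TO is started.
     */
    static StringInfoData output_buf;

    /* Data destination callback: receives one row of COPY output at a time. */
    static void
    copy_dest_cb(void *data, int len)
    {
        appendBinaryStringInfo(&output_buf, (const char *) data, len);
    }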
Re: Query Jumbling for CALL and SET utility statements
On Mon, Oct 10, 2022 at 09:16:47PM +0800, Julien Rouhaud wrote: > Unless I'm missing something both can be handled using pg_node_attr() > annotations that already exists? Indeed, that should work for the locators. -- Michael signature.asc Description: PGP signature
Re: Adding Support for Copy callback functionality on COPY TO api
On Tue, Oct 11, 2022 at 11:52:03AM +0900, Michael Paquier wrote: > Okay. So, I have reviewed the whole thing, added a description of all > the fields of BeginCopyTo() in its top comment, tweaked a few things > and added in the module an extra NOTICE with the number of processed > rows. The result seemed fine, so applied. Thanks! -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Switching XLog source from archive to streaming when primary available
On Mon, Oct 10, 2022 at 11:33:57AM +0530, Bharath Rupireddy wrote: > On Mon, Oct 10, 2022 at 3:17 AM Nathan Bossart > wrote: >> I wonder if it would be better to simply remove this extra polling of >> pg_wal as a prerequisite to your patch. The existing commentary leads me >> to think there might not be a strong reason for this behavior, so it could >> be a nice way to simplify your patch. > > I don't think it's a good idea to remove that completely. As said > above, it might help someone, we never know. It would be great to hear whether anyone is using this functionality. If no one is aware of existing usage and there is no interest in keeping it around, I don't think it would be unreasonable to remove it in v16. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Reducing the chunk header sizes on all memory context types
On Tue, Oct 11, 2022 at 5:31 AM David Rowley wrote: > > The proposed patches in [1] do aim to make additional usages of the > slab allocator, and I have a feeling that we'll want to fix the > performance of slab.c before those. Perhaps the Asserts are a better > option if we're to get the proposed radix tree implementation. Going by [1], that use case is not actually a natural fit for slab because of memory fragmentation. The motivation to use slab there was that the allocation sizes are just over a power of two, leading to a lot of wasted space for aset. FWIW, I have proposed in that thread a scheme to squeeze things into power-of-two sizes without wasting quite as much space. That's not a done deal, of course, but it could work today without adding memory management code. [1] https://www.postgresql.org/message-id/20220704220038.at2ane5xkymzzssb%40awork3.anarazel.de -- John Naylor EDB: http://www.enterprisedb.com
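To make the fragmentation argument concrete, here is a toy sketch, not code from either patch, of how a power-of-two allocator rounds requests; pg_nextpower2_32() is the existing helper from src/include/port/pg_bitutils.h:

    #include "port/pg_bitutils.h"

    /*
     * Toy illustration: a request just over a power of two is rounded up
     * to the next power, wasting almost half the chunk, e.g. 68 -> 128
     * wastes 60 bytes. This is the overhead the squeezing scheme targets.
     */
    static uint32
    powtwo_waste(uint32 request)
    {
        uint32      chunk = pg_nextpower2_32(request);

        return chunk - request;
    }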
Re: Non-decimal integer literals
Hi Peter,

/* process digits */
+ if (ptr[0] == '0' && (ptr[1] == 'x' || ptr[1] == 'X'))
+ {
+     ptr += 2;
+     while (*ptr && isxdigit((unsigned char) *ptr))
+     {
+         int8 digit = hexlookup[(unsigned char) *ptr];
+
+         if (unlikely(pg_mul_s16_overflow(tmp, 16, &tmp)) ||
+             unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+             goto out_of_range;
+
+         ptr++;
+     }
+ }
+ else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O'))
+ {
+     ptr += 2;
+
+     while (*ptr && (*ptr >= '0' && *ptr <= '7'))
+     {
+         int8 digit = (*ptr++ - '0');
+
+         if (unlikely(pg_mul_s16_overflow(tmp, 8, &tmp)) ||
+             unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+             goto out_of_range;
+     }
+ }
+ else if (ptr[0] == '0' && (ptr[1] == 'b' || ptr[1] == 'B'))
+ {
+     ptr += 2;
+
+     while (*ptr && (*ptr >= '0' && *ptr <= '1'))
+     {
+         int8 digit = (*ptr++ - '0');
+
+         if (unlikely(pg_mul_s16_overflow(tmp, 2, &tmp)) ||
+             unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
+             goto out_of_range;
+     }
+ }
+ else
+ {
      while (*ptr && isdigit((unsigned char) *ptr))
      {
          int8 digit = (*ptr++ - '0');
@@ -128,6 +181,7 @@ pg_strtoint16(const char *s)
              unlikely(pg_sub_s16_overflow(tmp, digit, &tmp)))
              goto out_of_range;
      }
+ }

What do you think about moving this code into a static inline function? Something like:

static inline char *
process_digits(char *ptr, int32 *result)
{
    ...
}

On Mon, Oct 10, 2022 at 10:17 PM Peter Eisentraut wrote:
>
> On 16.02.22 11:11, Peter Eisentraut wrote:
> > The remaining patches are material for PG16 at this point, and I will
> > set the commit fest item to returned with feedback in the meantime.
>
> Time to continue with this.
>
> Attached is a rebased and cleaned up patch for non-decimal integer
> literals. (I don't include the underscores-in-numeric literals patch.
> I'm keeping that for later.)
>
> Two open issues from my notes:
>
> Technically, numeric_in() should be made aware of this, but that seems
> relatively complicated and maybe not necessary for the first iteration.
>
> Taking another look around ecpg to see how this interacts with C-syntax
> integer literals. I'm not aware of any particular issues, but it's
> understandably tricky.
>
> Other than that, this seems pretty complete as a start.

--
Regards
Junwang Zhao
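A minimal sketch of what one such helper could look like, for the hexadecimal case only; the name and exact signature are illustrative, while hexlookup and the overflow helpers are the ones already used in the quoted hunk:

    /*
     * Sketch: consume hex digits at *ptr, accumulating into *tmp as a
     * negative value, the way pg_strtoint16() does. Returns false on
     * overflow, leaving *ptr at the offending digit.
     */
    static inline bool
    process_hex_digits(const char **ptr, int16 *tmp)
    {
        while (**ptr && isxdigit((unsigned char) **ptr))
        {
            int8        digit = hexlookup[(unsigned char) **ptr];

            if (unlikely(pg_mul_s16_overflow(*tmp, 16, tmp)) ||
                unlikely(pg_sub_s16_overflow(*tmp, digit, tmp)))
                return false;
            (*ptr)++;
        }
        return true;
    }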
Re: subtransaction performance
On Mon, Oct 10, 2022 at 02:20:37PM -0400, Bruce Momjian wrote: > On Fri, Oct 7, 2022 at 03:23:27PM -0700, Zhihong Yu wrote: >> I wonder if SAVEPOINT / subtransaction performance has been boosted since the >> blog was written. > > No, I have not seen any changes in this area since then. Seems there > are two problems --- the 64 cache per session and the 64k on the > replica. In both cases, it seems sizing is not optimal, but sizing is > never optimal. I guess we can look at allowing manual size adjustment, > automatic size adjustment, or a different approach that is more graceful > for larger savepoint workloads. I believe the following commitfest entries might be relevant to this discussion: https://commitfest.postgresql.org/39/2627/ https://commitfest.postgresql.org/39/3514/ https://commitfest.postgresql.org/39/3806/ -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: create subscription - improved warning message
On Mon, Oct 10, 2022 at 8:14 PM Tom Lane wrote: > > Peter Smith writes: > > On Mon, Oct 10, 2022 at 4:04 PM Amit Kapila wrote: > >> I think the below gives accurate information: > >> WARNING: subscription was created, but is not connected > >> HINT: You should create the slot manually, enable the subscription, > >> and run %s to initiate replication. > > > +1 > > It feels a little strange to me that we describe two steps rather vaguely > and then give exact SQL for the third step. How about, say, > > HINT: To initiate replication, you must manually create the replication > slot, enable the subscription, and refresh the subscription. > LGTM. BTW, do we want to slightly adjust the documentation for the connect option on the CREATE SUBSCRIPTION page [1]? It says: "Since no connection is made when this option is false, no tables are subscribed, and so after you enable the subscription nothing will be replicated. You will need to then run ALTER SUBSCRIPTION ... REFRESH PUBLICATION for tables to be subscribed." It doesn't say anything about manually creating the slot, and the wording could probably be made similar. [1] - https://www.postgresql.org/docs/devel/sql-createsubscription.html -- With Regards, Amit Kapila.
RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Dear Önder,

Thanks for updating the patch! I checked it, and it is almost good. The following are just cosmetic comments.

===
01. relation.c - GetCheapestReplicaIdentityFullPath

```
* The reason that the planner would not pick partial indexes and indexes
* with only expressions based on the way currently baserestrictinfos are
* formed (e.g., col_1 = $1 ... AND col_N = $2).
```

Is "col_N = $2" a typo? I think it should be "col_N = $N" or "attr1 = $1 ... AND attrN = $N".

===
02. 032_subscribe_use_index.pl

If a table has a primary key on the subscriber, it will be used even if enable_indexscan is false (legacy behavior). Should we test that?

~~~
03. 032_subscribe_use_index.pl - SUBSCRIPTION RE-CALCULATES INDEX AFTER CREATE/DROP INDEX

This test seems non-trivial, so could you write down the motivation?

~~~
04. 032_subscribe_use_index.pl - SUBSCRIPTION RE-CALCULATES INDEX AFTER CREATE/DROP INDEX

```
# wait until the index is created
$node_subscriber->poll_query_until(
	'postgres',
	q{select count(*)=1 from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
) or die "Timed out while waiting for check subscriber tap_sub_rep_full_0 updates one row via index";
```

CREATE INDEX is synchronous, right? If so, we don't have to wait here. ...And the comment in the die message may be wrong. (There are more cases like this.)

~~~
05. 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX UPDATEs MULTIPLE ROWS

```
# Testcase start: SUBSCRIPTION USES INDEX UPDATEs MULTIPLE ROWS
#
# Basic test where the subscriber uses index
# and touches 50 rows with UPDATE
```

"touches 50 rows with UPDATE" -> "updates 50 rows", per the other tests.

~~~
06. 032_subscribe_use_index.pl - SUBSCRIPTION CAN UPDATE THE INDEX IT USES AFTER ANALYZE

This test seems non-trivial as well, so could you write down the motivation? (Same as the re-calculate test above.)

~~~
07. 032_subscribe_use_index.pl - SUBSCRIPTION CAN UPDATE THE INDEX IT USES AFTER ANALYZE

```
# show that index_b is not used
$node_subscriber->poll_query_until(
	'postgres',
	q{select idx_scan=0 from pg_stat_all_indexes where indexrelname = 'index_b';}
) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates two rows via index scan with index on high cardinality column-2";
```

I think we don't have to wait here; is() should be used instead. poll_query_until() should be used only when idx_scan > 0 is checked. (There are more cases like this.)

~~~
08. 032_subscribe_use_index.pl - SUBSCRIPTION USES INDEX ON PARTITIONED TABLES

```
# make sure that the subscriber has the correct data
$node_subscriber->poll_query_until(
	'postgres',
	q{select sum(user_id+value_1+value_2)=550070 AND count(DISTINCT(user_id,value_1, value_2))=981 from users_table_part;}
) or die "ensure subscriber has the correct data at the end of the test";
```

I think we can replace this with wait_for_catchup() and is()... Moreover, we don't have to wait here, because the line above already waits until the index is used on the subscriber. (There are more cases like this.)

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
PostgreSQL Logical decoding
Hello,

We are looking for an example of how to consume the WAL changes produced by logical decoding (streaming or SQL interface) in another PostgreSQL server.

Basically, we are trying to create a replica/standby PostgreSQL server for a primary PostgreSQL server. Between logical replication and logical decoding, we chose logical decoding due to the limitations of logical replication (materialized views, external views/tables, and sequences are not replicated). However, we have not found a good example with instructions on how to set up a consumer PostgreSQL server.

Thanks
Ankit
Re: Suppressing useless wakeups in walreceiver
On Tue, Oct 11, 2022 at 8:20 AM Nathan Bossart wrote:
> > AFAICS, the aim of the patch isn't optimizing away
> > GetCurrentTimestamp() calls, and yet the patch does change quite a
> > few of them, which may change the behaviour. I think the
> > GetCurrentTimestamp() optimizations can go into an 0003 patch in this
> > thread itself, or we can discuss them in a separate thread to seek more
> > thoughts.
> >
> > In many instances in the patch, 'now' may not actually represent
> > the true current time, as it also includes the time taken by other
> > operations. I think this will cause problems.
>
> What problems do you think this will cause?

now = t1;
XLogWalRcvSendReply();      /* say it ran for a minute or so, for whatever reason */
XLogWalRcvSendHSFeedback(); /* with the patch, walreceiver sends hot standby
                             * feedback more often without properly honouring
                             * wal_receiver_status_interval, because the 'now' this
                             * function sees isn't actually the current time; it is
                             * t1 + XLogWalRcvSendReply()'s run time. */

Well, is this really a problem? I'm not sure about that. Let's hear from others.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
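To make the concern concrete, one possible mitigation, sketched here only for discussion (whether the extra GetCurrentTimestamp() calls are acceptable is exactly the open question), is to re-read the clock between the potentially slow steps:

    /*
     * Sketch: refresh 'now' after a potentially slow call so the next
     * periodic task sees an honest timestamp. Argument shapes follow the
     * patch and are otherwise illustrative.
     */
    now = GetCurrentTimestamp();
    XLogWalRcvSendReply(&state, now, false, false);

    now = GetCurrentTimestamp();    /* SendReply may have taken a while */
    XLogWalRcvSendHSFeedback(&state, now, false);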
Re: use has_privs_of_role() for pg_hba.conf
On Sun, Oct 09, 2022 at 02:13:48PM -0700, Nathan Bossart wrote: > Here you go. Thanks, applied. It took me a few minutes to note that regress_regression_* is required in the object names because we need to use the same name for the parent role and the database, with "regress_" being required for the role and "regression" being required for the database. I have added an extra section where pg_hba.conf is set to match only the parent role, while on it. perltidy has reshaped things in an interesting way, because the generated log_[un]like is long, it seems. -- Michael signature.asc Description: PGP signature
Re: PostgreSQL Logical decoding
Hi Ankit,

On Tue, Oct 11, 2022 at 9:32 AM Ankit Oza wrote:
>
> Hello,
>
> We are looking for an example of how to consume the WAL changes produced
> by logical decoding (streaming or SQL interface) in another PostgreSQL server.

Built-in logical replication is a good example to start with: https://www.postgresql.org/docs/current/logical-replication.html

> Basically, we are trying to create a replica/standby PostgreSQL server for a
> primary PostgreSQL server. Between logical replication and logical decoding,
> we chose logical decoding due to the limitations of logical replication
> (materialized views, external views/tables, and sequences are not replicated).
> However, we have not found a good example with instructions on how to set up
> a consumer PostgreSQL server.

Logical decoding is the process of converting WAL into logical changes; logical replication deals with transferring these changes to another server and applying them there. They work in tandem; one cannot be used without the other. So I am confused about your requirements.

-- 
Best Wishes,
Ashutosh Bapat
Re: Remove an unnecessary LSN calculation while validating WAL page header
At Mon, 10 Oct 2022 08:53:55 +0530, Bharath Rupireddy wrote in
> It looks like we have an unnecessary XLogSegNoOffsetToRecPtr() in
> XLogReaderValidatePageHeader(). We pass the start LSN of the WAL page
> and check if it matches the LSN that was stored in the WAL page
> header (xlp_pageaddr). We find segno, offset and LSN again using
> XLogSegNoOffsetToRecPtr(). This happens to be the same as the passed-in
> LSN 'recptr'.

Yeah, that's obviously useless. It looks like a thinko from pg93, when recptr started being passed directly from the caller instead of being calculated from static variables for the file, segment, and in-segment offset.

> Here's a tiny patch removing the unnecessary XLogSegNoOffsetToRecPtr()
> and using the passed-in 'recptr'.

Looks good to me.

# Mysteriously, I couldn't find any code changing readId in the pg92 tree..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
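For context, the check after the cleanup reads roughly as below; this is a sketch, with the surrounding error reporting elided:

    /*
     * Sketch of the relevant part of XLogReaderValidatePageHeader(): the
     * caller-supplied recptr is the page start LSN, so it can be compared
     * against the stored page address directly, with no recomputation.
     */
    if (hdr->xlp_pageaddr != recptr)
    {
        /* report "unexpected pageaddr" using hdr->xlp_pageaddr and recptr */
        return false;
    }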
Re: src/test/perl/PostgreSQL/Test/*.pm not installed
On Mon, Oct 10, 2022 at 11:34:15AM +0200, Alvaro Herrera wrote: > I noticed that the new Perl test modules are not installed, so if you > try to use PostgreSQL/Test/Cluster.pm in an external test from pgxs, it > fails with the modules not being found. > > I see no reason for this other than having overseen it in b235d41d9646, > so I propose the attached (for all branches, naturally.) +1, good catch. The patch looks fine. -- Michael signature.asc Description: PGP signature
Re: Fast COPY FROM based on batch insert
On 10/7/22 11:18, Etsuro Fujita wrote:
On Tue, Sep 27, 2022 at 6:03 PM Etsuro Fujita wrote:
I will review the patch a bit more, but I feel that it is in good shape.
One thing I noticed is this bit added to CopyMultiInsertBufferFlush() to run triggers on the foreign table.

+ /* Run AFTER ROW INSERT triggers */
+ if (resultRelInfo->ri_TrigDesc != NULL &&
+     (resultRelInfo->ri_TrigDesc->trig_insert_after_row ||
+      resultRelInfo->ri_TrigDesc->trig_insert_new_table))
+ {
+     Oid relid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
+
+     for (i = 0; i < inserted; i++)
+     {
+         TupleTableSlot *slot = rslots[i];
+
+         /*
+          * AFTER ROW Triggers might reference the tableoid column,
+          * so (re-)initialize tts_tableOid before evaluating them.
+          */
+         slot->tts_tableOid = relid;
+
+         ExecARInsertTriggers(estate, resultRelInfo,
+                              slot, NIL,
+                              cstate->transition_capture);
+     }
+ }

Since foreign tables cannot have transition tables, we have trig_insert_new_table=false. So I simplified the if test and added an assertion ensuring trig_insert_new_table=false. Attached is a new version of the patch. I tweaked some comments a bit as well. I think the patch is committable, so I plan on committing it next week if there are no objections.

I reviewed the patch one more time. Only one question: bistate and ri_FdwRoutine are tightly coupled. Maybe add an assertion on (ri_FdwRoutine XOR bistate)? Just to prevent possible errors in the future.

-- 
Regards
Andrey Lepikhov
Postgres Professional
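For illustration, the suggested assertion might look like this; where exactly it would sit in the COPY code is an assumption:

    /*
     * Hypothetical assertion: a multi-insert buffer targets either a
     * foreign table (ri_FdwRoutine set, no bistate) or a plain table
     * (bistate set, no ri_FdwRoutine), never both and never neither.
     */
    Assert((resultRelInfo->ri_FdwRoutine == NULL) !=
           (buffer->bistate == NULL));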
Re: PostgreSQL Logical decoding
On Tue, Oct 11, 2022 at 9:32 AM Ankit Oza wrote:
>
> Hello,
>
> We are looking for an example of how to consume the WAL changes produced
> by logical decoding (streaming or SQL interface) in another PostgreSQL server.
>
> Basically, we are trying to create a replica/standby PostgreSQL server for a
> primary PostgreSQL server. Between logical replication and logical decoding,
> we chose logical decoding due to the limitations of logical replication
> (materialized views, external views/tables, and sequences are not replicated).
> However, we have not found a good example with instructions on how to set up
> a consumer PostgreSQL server.
>

I think from a code perspective, you can look at contrib/test_decoding and src/backend/replication/pgoutput to see how to consume changes and send them to the replica. You can refer to the docs [1] for the SQL functions to consume changes.

[1] - https://www.postgresql.org/docs/devel/functions-admin.html#FUNCTIONS-REPLICATION

-- 
With Regards,
Amit Kapila.
Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
On Mon, Oct 10, 2022 at 09:00:06AM +0200, Drouvot, Bertrand wrote:
> foreach(cell, tokens)
> {
> [...]
> + tokreg = lfirst(cell);
> + if (!token_is_regexp(tokreg))
>   {
> -     if (strcmp(dbname, role) == 0)
> +     if (am_walsender && !am_db_walsender)
> +     {
> +         /*
> +          * physical replication walsender connections can only match
> +          * replication keyword
> +          */
> +         if (token_is_keyword(tokreg->authtoken, "replication"))
> +             return true;
> +     }
> +     else if (token_is_keyword(tokreg->authtoken, "all"))
>           return true;

When checking the list of databases in check_db(), physical WAL senders (aka am_walsender && !am_db_walsender) would be able to accept regexps, but these should only accept "replication" and never a regexp, no? The second check on "replication", placed in the branch for token_is_regexp() in your patch, would be correctly placed, though.

This is kind of special in the HBA logic, coming back to 9.0 where physical replication and this special role property were introduced. WAL senders gained an actual database property later on in 9.4 with logical decoding, keeping "replication" for compatibility (connection strings can use replication=database to connect as a non-physical WAL sender and connect to a specific database).

> +typedef struct AuthToken
> +{
> +     char       *string;
> +     bool        quoted;
> +} AuthToken;
> +
> +/*
> + * Distinguish the case a token has to be treated as a regular
> + * expression or not.
> + */
> +typedef struct AuthTokenOrRegex
> +{
> +     bool        is_regex;
> +
> +     /*
> +      * Not a union as we still need the token string for fill_hba_line().
> +      */
> +     AuthToken  *authtoken;
> +     regex_t    *regex;
> +} AuthTokenOrRegex;

Hmm. With is_regex there to check whether a regex_t exists, both structures may not be necessary. I have not put my hands on that directly, but my guess is that I would shape things to have only AuthToken (with the regex_t checked first for a match when it is set), carrying:
- the string
- quoted
- regex_t

A list member should never have (regex_t != NULL && quoted), right? Hostnames would never be quoted, either.

> +# test with a comma in the regular expression
> +reset_pg_hba($node, 'all', '"/^.*5,.*e$"', 'password');
> +test_conn($node, 'user=md5,role', 'password', 'matching regexp for username',
> + 0);

So, we check here that the role includes "5," in its name. This is getting fun to parse ;)

> elsif ($ENV{PG_TEST_EXTRA} !~ /\bssl\b/)
> {
> - plan skip_all => 'Potentially unsafe test SSL not enabled in
> PG_TEST_EXTRA';
> + plan skip_all =>
> + 'Potentially unsafe test SSL not enabled in PG_TEST_EXTRA';
> }

Unrelated noise from perltidy.

-- Michael

signature.asc Description: PGP signature
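A sketch of that single-struct shape, purely illustrative since the exact field set is whatever the next patch revision settles on:

    /* Sketch: one token type covering raw strings and regexes alike. */
    typedef struct AuthToken
    {
        char       *string;     /* raw token string, always kept */
        bool        quoted;     /* was the token double-quoted? */
        regex_t    *regex;      /* compiled regex, or NULL for a plain token */
    } AuthToken;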
Re: create subscription - improved warning message
On Tue, Oct 11, 2022 at 2:46 PM Amit Kapila wrote: > > On Mon, Oct 10, 2022 at 8:14 PM Tom Lane wrote: > > > > Peter Smith writes: > > > On Mon, Oct 10, 2022 at 4:04 PM Amit Kapila > > > wrote: > > >> I think the below gives accurate information: > > >> WARNING: subscription was created, but is not connected > > >> HINT: You should create the slot manually, enable the subscription, > > >> and run %s to initiate replication. > > > > > +1 > > > > It feels a little strange to me that we describe two steps rather vaguely > > and then give exact SQL for the third step. How about, say, > > > > HINT: To initiate replication, you must manually create the replication > > slot, enable the subscription, and refresh the subscription. > > > > LGTM. PSA patch v3-0001 where the message/hint is worded as suggested above. > BTW, do we want to slightly adjust the documentation for the > connect option on the CREATE SUBSCRIPTION page [1]? It says: "Since no > connection is made when this option is false, no tables are > subscribed, and so after you enable the subscription nothing will be > replicated. You will need to then run ALTER SUBSCRIPTION ... REFRESH > PUBLICATION for tables to be subscribed." > > It doesn't say anything about manually creating the slot, and the > wording could probably be made similar. > PSA patch v3-0002 which changes CREATE SUBSCRIPTION pgdocs to use the same wording as in the HINT message. -- Kind Regards, Peter Smith. Fujitsu Australia v3-0002-create-subscription-pgdocs.patch Description: Binary data v3-0001-create-subscription-warning-message.patch Description: Binary data
Re: Startup process on a hot standby crashes with an error "invalid memory alloc request size 1073741824" while replaying "Standby/LOCK" records
> On Wed, 5 Oct 2022 at 16:30, Tom Lane wrote: > > One other point to discuss: should we consider back-patching? I've > > got mixed feelings about that myself. I don't think that cases where > > this helps significantly are at all mainstream, so I'm kind of leaning > > to "patch HEAD only". At Mon, 10 Oct 2022 13:24:34 +0100, Simon Riggs wrote in > It looks fine to eventually backpatch, since StandbyReleaseLockTree() > was optimized to only be called when the transaction had actually done > some AccessExclusiveLocks. > > So the performance loss is minor and isolated to the users of such > locks, so I see no problems with it. At Wed, 5 Oct 2022 12:00:55 -0700, Nathan Bossart wrote in > +1. It can always be back-patched in the future if there are additional > reports. The third +1 from me. regards. -- Kyotaro Horiguchi NTT Open Source Software Center