Re: Duplicate history file?
Hi Horiguchi-san,

> (To recap: In a replication setup using archiving, startup tries to restore WAL files from the archive before checking the pg_wal directory for the desired file. The behavior itself is intentionally designed and reasonable. However, the restore code notifies of a restored file regardless of whether it has already been archived or not. If archive_command is written so as to return an error on overwriting, as we suggest in the documentation, that behavior causes archive failures.)
>
> After playing with this, I see the problem just by restarting a standby even in a simple archive-replication setup, with no special prerequisites. So I think this is worth fixing.
>
> With this patch, KeepFileRestoredFromArchive compares the contents of the just-restored file and the existing file for the same segment only when:
> - archive_mode = always, and
> - the file to restore already exists in pg_wal, and
> - it has a .done and/or .ready status file,
> which doesn't usually happen. Then the function skips the archive notification if the contents are identical. The included TAP test works on both Linux and Windows.

Thank you for the analysis and the patch. I'll try the patch tomorrow.

I just noticed that this thread is still tied to another thread (it's not an independent thread). To fix that, it may be better to create a new thread again.

Regards,
Tatsuro Yamada
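(For reference, the overwrite-refusing style of archive_command discussed above is the one shown in the PostgreSQL documentation; a sketch with a placeholder archive directory:

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

With such a setting, notifying an already-archived segment a second time makes "test ! -f" fail, so the archiver keeps retrying and reporting failures.)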
back-port one-line gcc-10+ warning fix to REL_10_STABLE
Hello, hackers,

Currently, REL_10_STABLE can't be compiled with gcc-10 or 11, -Werror, and "./configure" without arguments. E.g. gcc-11 gives an error:

objectaddress.c:1618:99: error: ‘typeoids’ may be used uninitialized [-Werror=maybe-uninitialized]
 1618 |                 ObjectIdGetDatum(typeoids[1]),
...
objectaddress.c: In function ‘get_object_address’:
objectaddress.c:1578:33: note: ‘typeoids’ declared here
 1578 |                 Oid             typeoids[2];
      |                                 ^~~~

gcc-10 gives a similar error.

I propose to back-port a small part of Tom Lane's commit 9a725f7b5cb7, which was somehow never back-ported to REL_10_STABLE. The fix is explicit initialization of the typeoids[2] variable involved to InvalidOid. Even if, technically, the initialization is probably not required (or so I've heard), in PostgreSQL 11+ it was deemed that explicit initialization is acceptable here to avoid the compiler warning.

Please note that the above-mentioned commit 9a725f7b5cb7 also adds initialization for a variable from the previous line, typenames[2], but since gcc 10 and 11 don't warn on that, I guess there is no need to back-port that part.

The proposed one-line patch is attached, but basically it is:

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index b0ff255a593..8cc9dc003c8 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1591,6 +1591,7 @@ get_object_address_opf_member(ObjectType objtype,
 	famaddr = get_object_address_opcf(OBJECT_OPFAMILY, copy, false);
 
 	/* find out left/right type names and OIDs */
+	typeoids[0] = typeoids[1] = InvalidOid;
 	i = 0;
 	foreach(cell, lsecond(object))
 	{

I've verified that all other current branches, i.e. REL9_6_STABLE..REL_13_STABLE (excluding REL_10_STABLE) and master, can compile cleanly even with bare ./configure without arguments using gcc-11.

-- 
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
Re: Logical replication keepalive flood
At Sat, 5 Jun 2021 16:08:00 +0500, Abbas Butt wrote in
> Hi,
> I have observed the following behavior with PostgreSQL 13.3.
>
> The WAL sender process sends approximately 500 keepalive messages per second to pg_recvlogical.
> These keepalive messages are totally unnecessary.
> Keepalives should be sent only if there is no network traffic and a certain time (half of wal_sender_timeout) passes.
> These keepalive messages not only choke the network but also impact the performance of the receiver, because the receiver has to process the received message and then decide whether to reply to it or not.
> The receiver remains busy doing this activity 500 times a second.

I can reproduce the problem.

> On investigation it is revealed that the following code fragment in function WalSndWaitForWal in file walsender.c is responsible for sending these frequent keepalives:
>
> if (MyWalSnd->flush < sentPtr &&
>     MyWalSnd->write < sentPtr &&
>     !waiting_for_ping_response)
>     WalSndKeepalive(false);

The immediate cause is that pg_recvlogical doesn't send a reply before sleeping. Currently it sends replies at 10-second intervals. So the attached first patch stops the flood.

That said, I don't think it is intended that the logical walsender sends keepalive packets with such a high frequency. It happens because the walsender actually doesn't wait at all: it waits on WL_SOCKET_WRITEABLE, and the keepalive packet inserted just before is always pending.

So, as in the attached second patch, we should try to flush out the keepalive packets if possible before checking pq_is_send_pending().

Either one alone can "fix" the issue, but I think each of them is reasonable by itself.

Any thoughts, suggestions and/or opinions?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index 5efec160e8..4497ff1071 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -362,6 +362,10 @@ StreamLogicalLog(void)
 			goto error;
 		}
 
+		/* send reply for all writes so far */
+		if (!flushAndSendFeedback(conn, &now))
+			goto error;
+
 		FD_ZERO(&input_mask);
 		FD_SET(PQsocket(conn), &input_mask);

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 109c723f4e..fcea56d1c1 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1469,6 +1469,9 @@ WalSndWaitForWal(XLogRecPtr loc)
 		/* Send keepalive if the time has come */
 		WalSndKeepaliveIfNecessary();
 
+		/* We may have queued a keepalive packet; flush it before sleeping. */
+		pq_flush_if_writable();
+
 		/*
 		 * Sleep until something happens or we time out. Also wait for the
 		 * socket becoming writable, if there's still pending output.
Re: Duplicate history file?
At Mon, 07 Jun 2021 16:13:08 +0900, Tatsuro Yamada wrote in
> I just noticed that this thread is still tied to another thread (it's not an independent thread). To fix that, it may be better to create a new thread again.

Mmm. Maybe my mailer automatically inserted an In-Reply-To field for the cited message. Do we (the two of us) bother re-launching a new thread?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS
On Mon, Jun 7, 2021 at 11:15 AM Tom Lane wrote: > > husky just reported $SUBJECT: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-06-05%2013%3A42%3A17 > > and I find I can reproduce that locally: > > diff -U3 > /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out > /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out > --- /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out > 2021-01-20 11:12:24.854346717 -0500 > +++ /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out > 2021-06-06 22:12:07.948890104 -0400 > @@ -215,7 +215,8 @@ > 0 | f | f > 1 | f | f > 2 | t | t > -(3 rows) > + 3 | t | t > +(4 rows) > > select * from pg_check_frozen('copyfreeze'); > t_ctid > @@ -235,7 +236,8 @@ > 0 | t | t > 1 | f | f > 2 | t | t > -(3 rows) > + 3 | t | t > +(4 rows) > > select * from pg_check_frozen('copyfreeze'); > t_ctid > > > The test cases that are failing date back to January (7db0cd2145f), > so I think this is some side-effect of a recent commit, but I have > no idea which one. It seems like the recent revert (8e03eb92e9a) is relevant. After committing 7db0cd2145f we had the same regression test failure in January[1]. Then we fixed that issue by 39b66a91b. But since we recently reverted most of 39b66a91b, the same issue happened again. Regards, [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-01-19+20%3A27%3A46 -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Duplicate history file?
At Mon, 07 Jun 2021 15:57:00 +0900, Tatsuro Yamada wrote in
> Hi Horiguchi-san,
>
>>> Regarding "test ! -f", I am wondering how many people are using the test command for archive_command. If I remember correctly, the guide provided by NTT OSS Center that we are using does not recommend using the test command.
>> I think, as the PG-REX documentation says, the simple cp works well as far as the assumption of PG-REX - that no double failure happens and the instructions are followed - holds.
>
> I believe that this assumption started to be wrong after archive_mode=always was introduced. As far as I can tell, it doesn't happen when it's archive_mode=on.

?? Doesn't *simple* cp (without "test") work for you? I meant that the operating assumption of PG-REX ensures that overwriting doesn't cause a problem.

>> Otherwise the documentation would need something like the following if we assume the current behavior.
>>
>>> The archive command should generally be designed to refuse to overwrite any pre-existing archive file. This is an important safety feature to preserve the integrity of your archive in case of administrator error (such as sending the output of two different servers to the same archive directory).
>> + For a standby with the setting archive_mode=always, there's a case where
>> + the same file is archived more than once. For safety, it is
>> + recommended that when the destination file exists, the archive_command
>> + returns zero if it is byte-identical to the source file.
>
> Agreed.
> That is the same solution as I mentioned earlier.
> If possible, it would also be better to write it in postgresql.conf (though that might be overkill?!).

Mmmm, I hadn't noticed that. I don't think such a complex caveat fits the configuration file. And if we need such a caveat there, it might be a sign that we need to fix the causal behavior...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
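(A sketch of an archive_command following that recommendation, assuming a POSIX cmp, which exits 0 only when the two files are byte-identical, and with /mnt/server/archivedir as a placeholder:

archive_command = 'cmp -s %p /mnt/server/archivedir/%f || (test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f)'

It returns zero when the destination already holds an identical copy, copies the file when the destination is absent, and fails when a different file is already there.)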
Re: DELETE CASCADE
On 05.06.21 14:21, David Christensen wrote:
>> On Jun 5, 2021, at 2:30 AM, Peter Eisentraut wrote:
>> On 03.06.21 22:49, David Christensen wrote:
>>> Presented for discussion is a POC for a DELETE CASCADE functionality, which will allow you one-shot usage of treating existing NO ACTION and RESTRICT FK constraints as if they were originally defined as CASCADE constraints. I can't tell you how many times this functionality would have been useful in the field, and despite the expected answer of "define your constraints right in the first place", this is not always an option, nor is the ability to change that easily (or create new constraints that need to revalidate against big tables) always the best option.
>> I think, if we think this is useful, the other way around would also be useful: Override a foreign key defined as ON DELETE CASCADE to behave as RESTRICT for a particular command.
> I am not opposed to this, but I am struggling to come up with a use case. Where would this be useful?

If you suspect a primary key row is no longer used, you want to delete it, but don't want to accidentally delete it if it's still used. I sense more complicated concurrency and permission issues, however.
Re: Duplicate history file?
On 2021/06/07 16:31, Kyotaro Horiguchi wrote:
> At Mon, 07 Jun 2021 16:13:08 +0900, Tatsuro Yamada wrote in
>> I just noticed that this thread is still tied to another thread (it's not an independent thread). To fix that, it may be better to create a new thread again.
> Mmm. Maybe my mailer automatically inserted an In-Reply-To field for the cited message. Do we (the two of us) bother re-launching a new thread?

The reason I suggested it was that I thought it might be confusing if the threads were not independent when registered in a commitfest. If that is not a problem, then I'm fine with it as is. :-D

Regards,
Tatsuro Yamada
Re: DELETE CASCADE
On 05.06.21 14:25, David Christensen wrote:
>> On Jun 5, 2021, at 2:29 AM, Peter Eisentraut wrote:
>> On 04.06.21 22:24, David Christensen wrote:
>>> So what are the necessary and sufficient conditions to check at this point? The constraint already exists, so what permissions would we need to check against which table(s) in order to grant this action?
>> I think you would need DELETE privilege on all affected tables.
> So basically where we are dispatching to the CASCADE guts, first check the session user's DELETE permission and throw the normal permissions error if they can't delete?

Actually, you also need appropriate SELECT permissions that correspond to the WHERE clause of the DELETE statement.
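(To make the permission question concrete, here is a minimal sketch with invented table names. Today the parent delete fails outright; a one-shot cascade would effectively also run the child delete, which is why DELETE privilege on the child, and SELECT on the referencing column, come into play:

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (id int PRIMARY KEY,
                     parent_id int REFERENCES parent (id));  -- default: NO ACTION
INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (10, 1);

DELETE FROM parent WHERE id = 1;   -- currently fails with an FK violation
-- a one-shot cascade would be roughly equivalent to:
DELETE FROM child  WHERE parent_id = 1;
DELETE FROM parent WHERE id = 1;

)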
Re: logical decoding bug: segfault in ReorderBufferToastReplace()
On Sat, Jun 5, 2021 at 5:41 AM Alvaro Herrera wrote:
>
>> This indicates that a toast record was present for that relation, despite:
>
> Can you explain what it is you saw that indicates that a toast record was present for the relation? I may be misreading the code, but there's nothing obvious that says that if we reach there, then a toast datum exists anywhere for this relation. We only know that txn->toast_hash is set, but that could be because the transaction touched a toast record in some other table. Right?

Is this problem related to the thread [1], where due to a spec abort the toast hash was not deleted, and after that, if the next record is for some other relation which does not have a toast table, you will see this error? There are a few other problems if the toast hash is not cleaned up due to a spec abort. I have submitted patches with 2 approaches in that thread.

[1] https://www.postgresql.org/message-id/CAFiTN-szfpMXF2H%2Bmk3m_9AB610v103NTv7Z1E8uDBr9iQg1gg%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] logical decoding of two-phase transactions
On Wed, Jun 2, 2021 at 9:04 AM Peter Smith wrote:
>
> Please find attached the latest patch set v82*
>

Some suggested changes to the 0001 patch comments (and note also the typo "doumentation"); diff of before and after follows:

8c8
< built-in logical replication, we need to do the below things:
---
> built-in logical replication, we need to do the following things:
16,17c16,17
< * Add a new SUBSCRIPTION option "two_phase" to allow users to enable it.
< We enable the two_phase once the initial data sync is over.
---
> * Add a new SUBSCRIPTION option "two_phase" to allow users to enable two-phase
> transactions. We enable the two_phase once the initial data sync is over.
23c23
< * Adds new subscription TAP tests, and new subscription.sql regression tests.
---
> * Add new subscription TAP tests, and new subscription.sql regression tests.
25c25
< * Updates PG doumentation.
---
> * Update PG documentation.
33c33
< * Prepare API for in-progress transactions is not supported.
---
> * Prepare API for in-progress transactions.

Regards,
Greg Nancarrow
Fujitsu Australia
Duplicate history file?
So, this is the new new thread. This thread should have been started here: https://www.postgresql.org/message-id/20210531.165825.921389284096975508.horikyota.ntt%40gmail.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Duplicate history file?
(Sorry for the noise on the old thread..)

At Mon, 07 Jun 2021 16:54:49 +0900, Tatsuro Yamada wrote in
> On 2021/06/07 16:31, Kyotaro Horiguchi wrote:
>> At Mon, 07 Jun 2021 16:13:08 +0900, Tatsuro Yamada wrote in
>>> I just noticed that this thread is still tied to another thread (it's not an independent thread). To fix that, it may be better to create a new thread again.
>> Mmm. Maybe my mailer automatically inserted an In-Reply-To field for the cited message. Do we (the two of us) bother re-launching a new thread?
>
> The reason I suggested it was that I thought it might be confusing if the threads were not independent when registered in a commitfest. If that is not a problem, then I'm fine with it as is. :-D

(You can freely do that, too :p)

Hmm. I found that the pgsql-hackers archive treats the new thread as a part of the old thread, so the CF app would do the same. Anyway I re-launched a new standalone thread.

https://www.postgresql.org/message-id/20210607.173108.348241508233844279.horikyota.ntt%40gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: SQL-standard function body
On 06.06.21 09:32, Julien Rouhaud wrote:
> On Sat, Jun 05, 2021 at 09:44:18PM -0700, Noah Misch wrote:
>> I get a NULL pointer dereference if the function body has a doubled semicolon:
>> create function f() returns int language sql begin atomic select 1;; end;
> You don't even need a statement to reproduce the problem; a body containing only semicolon(s) will behave the same. Attached patch should fix the problem.

Your patch filters out empty statements at the parse transformation phase, so they are no longer present when you dump the body back out. So your edits in the test expected files don't fit.

I suggest we just prohibit empty statements at the parse stage. I don't see a strong reason to allow them, and if we wanted to, we'd have to do more work, e.g., in ruleutils.c to print them back out correctly.

From 19817fa97da9afa4fd35365c6ecb968b53aff7fe Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Mon, 7 Jun 2021 10:49:36 +0200
Subject: [PATCH] Prevent empty statement in unquoted SQL function body

This currently crashes. It was not meant to be supported, and there is no support for example in ruleutils.c, so just prohibit it in the parser.

Reported-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/20210606044418.ga297...@rfd.leadboat.com
---
 src/backend/parser/gram.y                       | 7 ---
 src/test/regress/expected/create_function_3.out | 8 
 src/test/regress/sql/create_function_3.sql      | 6 ++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9ee90e3f13..3cd30c15ec 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -893,11 +893,14 @@ stmtmulti:	stmtmulti ';' toplevel_stmt
 
 /*
  * toplevel_stmt includes BEGIN and END. stmt does not include them, because
- * those words have different meanings in function bodys.
+ * those words have different meanings in function bodys. Also, empty
+ * statements are allowed at the top level but not in function bodies.
  */
 toplevel_stmt:
 			stmt
 			| TransactionStmtLegacy
+			| /*EMPTY*/
+				{ $$ = NULL; }
 		;
 
 stmt:
@@ -1024,8 +1027,6 @@ stmt:
 			| VariableSetStmt
 			| VariableShowStmt
 			| ViewStmt
-			| /*EMPTY*/
-				{ $$ = NULL; }
 		;
 
 /*
diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out
index 5b6bc5eddb..41d8e49a23 100644
--- a/src/test/regress/expected/create_function_3.out
+++ b/src/test/regress/expected/create_function_3.out
@@ -290,6 +290,14 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement
     LANGUAGE SQL
     RETURN x[1];
 ERROR:  SQL function with unquoted function body cannot have polymorphic arguments
+-- error: empty statement
+CREATE FUNCTION functest_S_xx() RETURNS boolean
+BEGIN ATOMIC
+RETURN false;;
+END;
+ERROR:  syntax error at or near ";"
+LINE 3: RETURN false;;
+                     ^
 -- check reporting of parse-analysis errors
 CREATE FUNCTION functest_S_xx(x date) RETURNS boolean
     LANGUAGE SQL
diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql
index 4b778999ed..5c28ddcbce 100644
--- a/src/test/regress/sql/create_function_3.sql
+++ b/src/test/regress/sql/create_function_3.sql
@@ -191,6 +191,12 @@ CREATE FUNCTION functest_S_xx(x anyarray) RETURNS anyelement
     LANGUAGE SQL
     RETURN x[1];
 
+-- error: empty statement
+CREATE FUNCTION functest_S_xx() RETURNS boolean
+BEGIN ATOMIC
+RETURN false;;
+END;
+
 -- check reporting of parse-analysis errors
 CREATE FUNCTION functest_S_xx(x date) RETURNS boolean
     LANGUAGE SQL
-- 
2.31.1
Re: installcheck failure in indirect_toast with default_toast_compression = lz4
On Sun, Jun 06, 2021 at 03:52:57PM -0500, Justin Pryzby wrote: > See also a prior discussion: > https://www.postgresql.org/message-id/CAFiTN-sm8Dpx3q92g5ohTdZu1_wKsw96-KiEMf3SoK8DhRPfWw%40mail.gmail.com Ah, thanks for the reference. So this was discussed but not actually fixed. I can see the data getting stored inline rather than externalized with lz4. So, as the goal of the test is to stress the case of externalized values, we'd better make sure that pglz is used. I'll push something doing that with more comments added to the test. -- Michael signature.asc Description: PGP signature
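(One way to pin that down is to force the compression method in the test itself, using the column-level clause added in v14; the table and column names here are only illustrative:

CREATE TABLE indtoasttest (f1 text COMPRESSION pglz);
-- or, for an existing column:
ALTER TABLE indtoasttest ALTER COLUMN f1 SET COMPRESSION pglz;

so that the externalization behavior the test exercises no longer depends on default_toast_compression.)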
Re: missing GRANT on pg_subscription columns
On Thu, Jun 3, 2021 at 10:39 PM Tom Lane wrote: > > "Euler Taveira" writes: > > I was checking the GRANT on pg_subscription and noticed that the command is > > not > > correct. There is a comment that says "All columns of pg_subscription except > > subconninfo are readable". However, there are columns that aren't included: > > oid > > and subsynccommit. It seems an oversight in the commits 6f236e1eb8c and > > 887227a1cc8. > > Ugh. > > > There are monitoring tools and data collectors that aren't using a > > superuser to read catalog information (I usually recommend using > > pg_monitor). > > Hence, you cannot join pg_subscription with relations such as > > pg_subscription_rel or pg_stat_subscription because column oid has no > > column-level privilege. I'm attaching a patch to fix it (indeed, 2 patches > > because of additional columns for v14). We should add instructions in the > > minor > > version release notes too. > > I agree with fixing this in HEAD. But given that this has been wrong > since v10 with zero previous complaints, I doubt that it is worth the > complication of trying to do something about it in the back branches. > Maybe we could just adjust the docs there, instead. > This sounds reasonable to me. Euler, can you provide the doc updates for back-branches? -- With Regards, Amit Kapila.
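(For reference, the shape of the fix is a catalog GRANT along these lines; this is only a sketch based on the v13 catalog columns, leaving out subconninfo, and the exact column list differs per branch:

GRANT SELECT (oid, subdbid, subname, subowner, subenabled,
              subslotname, subsynccommit, subpublications)
  ON pg_subscription TO public;

)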
Re: SQL-standard function body
On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut wrote:
>
> Your patch filters out empty statements at the parse transformation phase, so they are no longer present when you dump the body back out. So your edits in the test expected files don't fit.

Oh, somehow the tests aren't failing here; I'm not sure what I did wrong.

> I suggest we just prohibit empty statements at the parse stage. I don't see a strong reason to allow them, and if we wanted to, we'd have to do more work, e.g., in ruleutils.c to print them back out correctly.

I always thought extraneous semicolons were tokens to be ignored, which happen to be internally implemented as empty statements, so deparsing them is not required, similar to deparsing extraneous whitespace. If the spec says otherwise then I agree it's not worth implementing, but otherwise I'm not sure it's really helpful to error out.
Re: locking [user] catalog tables vs 2pc vs logical rep
On Mon, Jun 7, 2021 at 10:44 AM Amit Kapila wrote:
>
> One more comment is that for HEAD, first just create a patch with synchronous replication-related doc changes and then write a separate patch for prepared transactions.
>

I noticed that the docs for "Synchronous replication support for Logical Decoding" were introduced by commit 49c0864d7ef5227faa24f903902db90e5c9d5d69, which goes back as far as 9.6. So, I think you need to create a patch for 9.6 as well, unless one of the existing patches already applies in 9.6.

-- 
With Regards,
Amit Kapila.
Re: Asynchronous Append on postgres_fdw nodes.
On Fri, Jun 4, 2021 at 12:33 AM Andrey Lepikhov wrote: > Good, this text is clear for me. Cool! I created a patch for that, which I'm attaching. I'm planning to commit the patch. Thanks for reviewing! Best regards, Etsuro Fujita note-about-sync-vs-async.patch Description: Binary data
Re: Fix dropped object handling in pg_event_trigger_ddl_commands
Hi hackers,

> Any opinions on the patch? Is this not worth the effort to fix or is there a better way to fix this?

I confirm that the bug still exists in master (be57f216). The patch fixes it and looks good to me. I changed the wording a little and added a regression test. The updated patch is attached. I added this thread to the CF and updated the status to "Ready for Committer".

-- 
Best regards,
Aleksander Alekseev

v2-0001-Fix-pg_event_trigger_ddl_commands.patch
Description: Binary data
Re: SSL SNI
On 03.06.21 20:14, Tom Lane wrote:
> I wrote:
>> It looks like the immediate problem can be resolved by just adding a check for conn->pghost not being NULL, ...
> scratch that. There's another problem here, which is that this code should not be looking at conn->pghost AT ALL. That will do the wrong thing with a multi-element host list. The right thing to be looking at is conn->connhost[conn->whichhost].host --- with a test to make sure it's not NULL or an empty string. (I didn't stop to study this code close enough to see if it'll ignore an empty string without help.)

Patch attached. Empty host string was handled implicitly by the IP detection expression, but I added an explicit check for sanity. (I wasn't actually able to get an empty string to this point, but it's clearly better to be prepared for it.)

From 095bb64fc735c54d0ebf17ba6c13e0de5af44a36 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Mon, 7 Jun 2021 11:50:01 +0200
Subject: [PATCH] libpq: Fix SNI host handling

Fix handling of NULL host name (possibly by using hostaddr). It previously crashed. Also, we should look at connhost, not pghost, to handle multi-host specifications.

Reported-by: Jacob Champion
Discussion: https://www.postgresql.org/message-id/504c276ab6eee000bb23d571ea9b0ced4250774e.ca...@vmware.com
---
 src/interfaces/libpq/fe-secure-openssl.c | 27 ++--
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 00d43f3eff..36fec96ef1 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -1087,20 +1087,25 @@ initialize_SSL(PGconn *conn)
 	 * Per RFC 6066, do not set it if the host is a literal IP address (IPv4
 	 * or IPv6).
 	 */
-	if (conn->sslsni && conn->sslsni[0] &&
-		!(strspn(conn->pghost, "0123456789.") == strlen(conn->pghost) ||
-		  strchr(conn->pghost, ':')))
+	if (conn->sslsni && conn->sslsni[0])
 	{
-		if (SSL_set_tlsext_host_name(conn->ssl, conn->pghost) != 1)
+		const char *host = conn->connhost[conn->whichhost].host;
+
+		if (host && host[0] &&
+			!(strspn(host, "0123456789.") == strlen(host) ||
+			  strchr(host, ':')))
 		{
-			char	   *err = SSLerrmessage(ERR_get_error());
+			if (SSL_set_tlsext_host_name(conn->ssl, host) != 1)
+			{
+				char	   *err = SSLerrmessage(ERR_get_error());
 
-			appendPQExpBuffer(&conn->errorMessage,
-							  libpq_gettext("could not set SSL Server Name Indication (SNI): %s\n"),
-							  err);
-			SSLerrfree(err);
-			SSL_CTX_free(SSL_context);
-			return -1;
+				appendPQExpBuffer(&conn->errorMessage,
+								  libpq_gettext("could not set SSL Server Name Indication (SNI): %s\n"),
+								  err);
+				SSLerrfree(err);
+				SSL_CTX_free(SSL_context);
+				return -1;
+			}
 		}
 	}
-- 
2.31.1
Re: Tid scan improvements
This patch didn't add _outTidRangePath() even though we have outNode() coverage for most/all path nodes. Was this just forgotten? See attached patch.

From 3c696f812d4c6f8c66bc75105c3c1af79c3b2922 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Mon, 7 Jun 2021 12:04:49 +0200
Subject: [PATCH] Add _outTidRangePath()

We have outNode() coverage for all path nodes, but this one was missed when it was added.
---
 src/backend/nodes/outfuncs.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 04696f613c..e32b92e299 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1859,6 +1859,16 @@ _outTidPath(StringInfo str, const TidPath *node)
 	WRITE_NODE_FIELD(tidquals);
 }
 
+static void
+_outTidRangePath(StringInfo str, const TidRangePath *node)
+{
+	WRITE_NODE_TYPE("TIDRANGEPATH");
+
+	_outPathInfo(str, (const Path *) node);
+
+	WRITE_NODE_FIELD(tidrangequals);
+}
+
 static void
 _outSubqueryScanPath(StringInfo str, const SubqueryScanPath *node)
 {
@@ -4166,6 +4176,9 @@ outNode(StringInfo str, const void *obj)
 		case T_TidPath:
 			_outTidPath(str, obj);
 			break;
+		case T_TidRangePath:
+			_outTidRangePath(str, obj);
+			break;
 		case T_SubqueryScanPath:
 			_outSubqueryScanPath(str, obj);
 			break;
-- 
2.31.1
Re: Logical replication keepalive flood
On Mon, Jun 7, 2021 at 12:54 PM Kyotaro Horiguchi wrote:
>
> At Sat, 5 Jun 2021 16:08:00 +0500, Abbas Butt wrote in
>> Hi,
>> I have observed the following behavior with PostgreSQL 13.3.
>>
>> The WAL sender process sends approximately 500 keepalive messages per second to pg_recvlogical.
>> These keepalive messages are totally unnecessary.
>> Keepalives should be sent only if there is no network traffic and a certain time (half of wal_sender_timeout) passes.
>> These keepalive messages not only choke the network but also impact the performance of the receiver, because the receiver has to process the received message and then decide whether to reply to it or not.
>> The receiver remains busy doing this activity 500 times a second.
>
> I can reproduce the problem.
>
>> On investigation it is revealed that the following code fragment in function WalSndWaitForWal in file walsender.c is responsible for sending these frequent keepalives:
>>
>> if (MyWalSnd->flush < sentPtr &&
>>     MyWalSnd->write < sentPtr &&
>>     !waiting_for_ping_response)
>>     WalSndKeepalive(false);
>
> The immediate cause is that pg_recvlogical doesn't send a reply before sleeping. Currently it sends replies at 10-second intervals.

Yeah, but one can use the -s option to send it at lesser intervals.

> So the attached first patch stops the flood.

I am not sure sending feedback every time before sleep is a good idea; this might lead to unnecessarily sending more messages. Can we try using a one-second interval with the -s option to see how it behaves? As a matter of comparison, the similar logic in worker.c uses wal_receiver_timeout to send such an update message rather than sending it every time before sleep.

> That said, I don't think it is intended that the logical walsender sends keepalive packets with such a high frequency. It happens because the walsender actually doesn't wait at all: it waits on WL_SOCKET_WRITEABLE, and the keepalive packet inserted just before is always pending.
>
> So, as in the attached second patch, we should try to flush out the keepalive packets if possible before checking pq_is_send_pending().

/* Send keepalive if the time has come */
WalSndKeepaliveIfNecessary();

+ /* We may have queued a keepalive packet; flush it before sleeping. */
+ pq_flush_if_writable();

We already call pq_flush_if_writable() from WalSndKeepaliveIfNecessary after sending the keepalive message, so I'm not sure how this helps?

-- 
With Regards,
Amit Kapila.
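(Concretely, the suggested experiment could look like the following, where the slot name is a placeholder and -s/--status-interval is pg_recvlogical's feedback interval in seconds:

pg_recvlogical -d postgres -S test_slot --start -s 1 -f -

)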
Re: Logical replication keepalive flood
On Mon, Jun 7, 2021 at 3:13 PM Amit Kapila wrote:
> On Mon, Jun 7, 2021 at 12:54 PM Kyotaro Horiguchi wrote:
>>
>> At Sat, 5 Jun 2021 16:08:00 +0500, Abbas Butt wrote in
>>> Hi,
>>> I have observed the following behavior with PostgreSQL 13.3.
>>>
>>> The WAL sender process sends approximately 500 keepalive messages per second to pg_recvlogical.
>>> These keepalive messages are totally unnecessary.
>>> Keepalives should be sent only if there is no network traffic and a certain time (half of wal_sender_timeout) passes.
>>> These keepalive messages not only choke the network but also impact the performance of the receiver, because the receiver has to process the received message and then decide whether to reply to it or not.
>>> The receiver remains busy doing this activity 500 times a second.
>>
>> I can reproduce the problem.
>>
>>> On investigation it is revealed that the following code fragment in function WalSndWaitForWal in file walsender.c is responsible for sending these frequent keepalives:
>>>
>>> if (MyWalSnd->flush < sentPtr &&
>>>     MyWalSnd->write < sentPtr &&
>>>     !waiting_for_ping_response)
>>>     WalSndKeepalive(false);
>>
>> The immediate cause is that pg_recvlogical doesn't send a reply before sleeping. Currently it sends replies at 10-second intervals.
>
> Yeah, but one can use the -s option to send it at lesser intervals.

That option can impact pg_recvlogical; it will not stop the server from sending keepalives too frequently. By default the status interval is 10 secs, yet we are still getting 500 keepalives a second from the server.

>> So the attached first patch stops the flood.
>
> I am not sure sending feedback every time before sleep is a good idea; this might lead to unnecessarily sending more messages. Can we try using a one-second interval with the -s option to see how it behaves? As a matter of comparison, the similar logic in worker.c uses wal_receiver_timeout to send such an update message rather than sending it every time before sleep.
>
>> That said, I don't think it is intended that the logical walsender sends keepalive packets with such a high frequency. It happens because the walsender actually doesn't wait at all: it waits on WL_SOCKET_WRITEABLE, and the keepalive packet inserted just before is always pending.
>>
>> So, as in the attached second patch, we should try to flush out the keepalive packets if possible before checking pq_is_send_pending().
>
> /* Send keepalive if the time has come */
> WalSndKeepaliveIfNecessary();
>
> + /* We may have queued a keepalive packet; flush it before sleeping. */
> + pq_flush_if_writable();
>
> We already call pq_flush_if_writable() from WalSndKeepaliveIfNecessary after sending the keepalive message, so I'm not sure how this helps?
>
> --
> With Regards,
> Amit Kapila.

-- 
Abbas
Senior Architect
Ph: 92.334.5100153
Skype ID: gabbasb
edbpostgres.com
Follow us on Twitter @EnterpriseDB
Re: Fast COPY FROM based on batch insert
Second version of the patch fixes problems detected by the FDW regression tests and shows differences of error reports in tuple-by-tuple and batched COPY approaches. -- regards, Andrey Lepikhov Postgres Professional From 68ad02038d7477e005b65bf5aeeac4efbb41073e Mon Sep 17 00:00:00 2001 From: "Andrey V. Lepikhov" Date: Fri, 4 Jun 2021 13:21:43 +0500 Subject: [PATCH] Implementation of a Bulk COPY FROM operation into foreign table. --- .../postgres_fdw/expected/postgres_fdw.out| 45 +++- contrib/postgres_fdw/sql/postgres_fdw.sql | 47 src/backend/commands/copyfrom.c | 216 -- src/backend/executor/execMain.c | 45 src/backend/executor/execPartition.c | 8 + src/include/commands/copyfrom_internal.h | 10 - src/include/executor/executor.h | 1 + src/include/nodes/execnodes.h | 8 +- 8 files changed, 246 insertions(+), 134 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index f320a7578d..146a3be576 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -8143,6 +8143,7 @@ drop table loct2; -- === -- test COPY FROM -- === +alter server loopback options (add batch_size '2'); create table loc2 (f1 int, f2 text); alter table loc2 set (autovacuum_enabled = 'false'); create foreign table rem2 (f1 int, f2 text) server loopback options(table_name 'loc2'); @@ -8165,7 +8166,7 @@ copy rem2 from stdin; -- ERROR ERROR: new row for relation "loc2" violates check constraint "loc2_f1positive" DETAIL: Failing row contains (-1, xyzzy). CONTEXT: remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES ($1, $2) -COPY rem2, line 1: "-1 xyzzy" +COPY rem2, line 2 select * from rem2; f1 | f2 +- @@ -8176,6 +8177,19 @@ select * from rem2; alter foreign table rem2 drop constraint rem2_f1positive; alter table loc2 drop constraint loc2_f1positive; delete from rem2; +create table foo (a int) partition by list (a); +create table foo1 (like foo); +create foreign table ffoo1 partition of foo for values in (1) + server loopback options (table_name 'foo1'); +create table foo2 (like foo); +create foreign table ffoo2 partition of foo for values in (2) + server loopback options (table_name 'foo2'); +create function print_new_row() returns trigger language plpgsql as $$ + begin raise notice '%', new; return new; end; $$; +create trigger ffoo1_br_trig before insert on ffoo1 + for each row execute function print_new_row(); +copy foo from stdin; +NOTICE: (1) -- Test local triggers create trigger trig_stmt_before before insert on rem2 for each statement execute procedure trigger_func(); @@ -8284,6 +8298,34 @@ drop trigger rem2_trig_row_before on rem2; drop trigger rem2_trig_row_after on rem2; drop trigger loc2_trig_row_before_insert on loc2; delete from rem2; +alter table loc2 drop column f1; +alter table loc2 drop column f2; +copy rem2 from stdin; +ERROR: column "f1" of relation "loc2" does not exist +CONTEXT: remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES ($1, $2), ($3, $4) +COPY rem2, line 3 +alter table loc2 add column f1 int; +alter table loc2 add column f2 int; +select * from rem2; + f1 | f2 ++ +(0 rows) + +-- dropped columns locally and on the foreign server +alter table rem2 drop column f1; +alter table rem2 drop column f2; +copy rem2 from stdin; +select * from rem2; +-- +(2 rows) + +alter table loc2 drop column f1; +alter table loc2 drop column f2; +copy rem2 from stdin; +select * from rem2; +-- +(4 rows) + -- test COPY FROM with foreign table created in the same transaction create table loc3 
(f1 int, f2 text); begin; @@ -8300,6 +8342,7 @@ select * from rem3; drop foreign table rem3; drop table loc3; +alter server loopback options (drop batch_size); -- === -- test for TRUNCATE -- === diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 17dba77d7e..8371d16c6a 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -2226,6 +2226,7 @@ drop table loct2; -- test COPY FROM -- === +alter server loopback options (add batch_size '2'); create table loc2 (f1 int, f2 text); alter table loc2 set (autovacuum_enabled = 'false'); create foreign table rem2 (f1 int, f2 text) server loopback options(table_name 'loc2'); @@ -2258,6 +2259,23 @@ alter table loc2 drop constraint loc2_f1positive; delete from rem2; +create table foo (a int) partition by list (a); +create table foo1 (like
Confused about extension and shared_preload_libraries
Hi, hackers

When we write an extension using the C language, we often add the dynamic library to shared_preload_libraries; however, I found that bloom, btree_gist and btree_gin do not follow this rule. I'm a bit confused by this; could anybody explain it for me? Thanks in advance.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co., Ltd.
Re: Confused about extension and shared_preload_libraries
Hi Japin,

> When we write an extension using the C language, we often add the dynamic library to shared_preload_libraries; however, I found that bloom, btree_gist and btree_gin do not follow this rule. I'm a bit confused by this; could anybody explain it for me?

In the general case, you don't need to modify shared_preload_libraries to use an extension, regardless of the language in which it's implemented. That's it. Some extensions may however require this. See the description of the GUC [1].

[1]: https://www.postgresql.org/docs/13/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES

-- 
Best regards,
Aleksander Alekseev
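(A quick illustration of the difference: bloom's shared library is loaded on demand the first time one of its objects is used, so this works with no preloading:

CREATE EXTENSION bloom;

whereas an extension such as pg_stat_statements must install hooks and reserve shared memory at server start, so its documentation requires, in postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'  # takes effect after a restart

)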
RE: Skip partition tuple routing with constant partition key
Hi Amit-san

From: Amit Langote
> On Fri, Jun 4, 2021 at 6:05 PM Amit Langote wrote:
>> On Fri, Jun 4, 2021 at 4:38 PM Amit Langote wrote:
>>> On Thu, Jun 3, 2021 at 8:48 PM Amit Langote wrote:
>>>> On Tue, Jun 1, 2021 at 5:43 PM houzj.f...@fujitsu.com wrote:
>>>>> So, if we want to share the cached partition between statements, it seems we cannot use ExecPartitionCheck. Instead, I tried directly invoking the partition support function (partsupfunc) to check if the cached info is correct. In this approach I tried caching the *bound offset* in PartitionDescData, and we can use the bound offset to get the bound datum from PartitionBoundInfoData and invoke the partsupfunc to do the CHECK.
>>>>>
>>>>> Attached is a POC patch about it. Just to share an idea about sharing cached partition info between statements.
>>>> I have not looked at your patch yet, but yeah that's what I would imagine doing it.
>>> Just read it and think it looks promising.
>>>
>>> On code, I wonder why not add the rechecking-cached-offset code directly in get_partition_for_tuple(), instead of adding a whole new function for that. Can you please check the attached revised version?
> I should have clarified a bit more on why I think a new function looked unnecessary to me. The thing about that function that bothered me was that it appeared to duplicate a lot of code fragments of get_partition_for_tuple(). That kind of duplication often leads to bugs of omission later if something from either function needs to change.

Thanks for the patch and explanation. I think you are right that it's better to add the rechecking-cached-offset code directly in get_partition_for_tuple().

And now, I think maybe it's time to try to optimize the performance. Currently, if every row to be inserted in a statement belongs to a different partition, the cache check code brings a slight performance degradation (AFAICS 2% ~ 4%). So, if we want to solve this, we may need 1) a reloption to let the user control whether to use the cache, or 2) some simple strategy to control whether to use the cache automatically.

I have not written a patch for 1), the reloption, because I think it would be nice if we could enable this cache feature by default. So, I attached a WIP patch for approach 2). The rough idea is to check the average batch size (rows per run of consecutive same-partition rows) every 1000 rows: if the average batch size is greater than 1, enable the cache check; if not, disable it. This is similar to what 0d5f05cde0 did. Thoughts?

Best regards,
houzj

0001-WIP-cache-bound-offset-adaptively_v4.patch
Description: 0001-WIP-cache-bound-offset-adaptively_v4.patch
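(To illustrate the adaptive idea in 2), here is a rough self-contained C sketch; all names are invented and are not taken from the attached patch:

/* Invented bookkeeping for the adaptive toggle (not the patch's actual names). */
typedef struct CacheDecisionState
{
	int			nrows;		/* rows routed since the last check */
	int			nswitches;	/* times the target partition changed */
	bool		use_cache;	/* try the cached bound offset next? */
} CacheDecisionState;

#define CACHE_CHECK_INTERVAL 1000

static void
maybe_toggle_cache(CacheDecisionState *state)
{
	if (state->nrows < CACHE_CHECK_INTERVAL)
		return;

	/*
	 * Average batch size is rows per run of consecutive same-partition
	 * rows.  Greater than 1 means rows tend to cluster by partition, so
	 * the cached offset is likely to hit; otherwise skip the cache check.
	 */
	state->use_cache = (state->nrows >
						(state->nswitches > 0 ? state->nswitches : 1));

	state->nrows = 0;
	state->nswitches = 0;
}

)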
Re: Tid scan improvements
On Mon, 7 Jun 2021 at 22:11, Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > This patch didn't add _outTidRangePath() even though we have outNode() > coverage for most/all path nodes. Was this just forgotten? See > attached patch. > Yes, it looks like an omission. Thanks for spotting it. Patch looks good to me. Edmund
Re: Tid scan improvements
On Mon, 7 Jun 2021 at 23:46, Edmund Horner wrote: > > On Mon, 7 Jun 2021 at 22:11, Peter Eisentraut > wrote: >> >> This patch didn't add _outTidRangePath() even though we have outNode() >> coverage for most/all path nodes. Was this just forgotten? See >> attached patch. > > > Yes, it looks like an omission. Thanks for spotting it. Patch looks good to > me. Yeah. That was forgotten. Patch also looks fine to me. Do you want to push it, Peter? David
Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS
On Mon, Jun 7, 2021 at 4:30 PM Masahiko Sawada wrote: > > On Mon, Jun 7, 2021 at 11:15 AM Tom Lane wrote: > > > > husky just reported $SUBJECT: > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-06-05%2013%3A42%3A17 > > > > and I find I can reproduce that locally: > > > > diff -U3 > > /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out > > /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out > > --- /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out > > 2021-01-20 11:12:24.854346717 -0500 > > +++ /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out > > 2021-06-06 22:12:07.948890104 -0400 > > @@ -215,7 +215,8 @@ > > 0 | f | f > > 1 | f | f > > 2 | t | t > > -(3 rows) > > + 3 | t | t > > +(4 rows) > > > > select * from pg_check_frozen('copyfreeze'); > > t_ctid > > @@ -235,7 +236,8 @@ > > 0 | t | t > > 1 | f | f > > 2 | t | t > > -(3 rows) > > + 3 | t | t > > +(4 rows) > > > > select * from pg_check_frozen('copyfreeze'); > > t_ctid > > > > > > The test cases that are failing date back to January (7db0cd2145f), > > so I think this is some side-effect of a recent commit, but I have > > no idea which one. > > It seems like the recent revert (8e03eb92e9a) is relevant. > > After committing 7db0cd2145f we had the same regression test failure > in January[1]. Then we fixed that issue by 39b66a91b. But since we > recently reverted most of 39b66a91b, the same issue happened again. > So the cause of this failure seems the same as before. The failed test is, begin; truncate copyfreeze; copy copyfreeze from stdin freeze; 1 '1' 2 '2' 3 '3' 4 '4' 5 '5' \. copy copyfreeze from stdin; 6 '6' \. copy copyfreeze from stdin freeze; 7 '7' 8 '8' 9 '9' 10 '10' 11 '11' 12 '12' \. commit; If the target block cache is invalidated before the third COPY, we will start to insert the frozen tuple into a new page, resulting in adding two blocks in total during the third COPY. I think we still need the following part of the reverted code so that we don't leave the page partially empty after relcache invalidation: --- a/src/backend/access/heap/hio.c +++ b/src/backend/access/heap/hio.c @@ -407,19 +407,19 @@ RelationGetBufferForTuple(Relation relation, Size len, * target. */ targetBlock = GetPageWithFreeSpace(relation, targetFreeSpace); - } - /* -* If the FSM knows nothing of the rel, try the last page before we give -* up and extend. This avoids one-tuple-per-page syndrome during -* bootstrapping or in a recently-started system. -*/ - if (targetBlock == InvalidBlockNumber) - { - BlockNumber nblocks = RelationGetNumberOfBlocks(relation); + /* +* If the FSM knows nothing of the rel, try the last page before we +* give up and extend. This avoids one-tuple-per-page syndrome during +* bootstrapping or in a recently-started system. +*/ + if (targetBlock == InvalidBlockNumber) + { + BlockNumber nblocks = RelationGetNumberOfBlocks(relation); - if (nblocks > 0) - targetBlock = nblocks - 1; + if (nblocks > 0) + targetBlock = nblocks - 1; + } } Attached the patch that brings back the above change. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ use_last_block.patch Description: Binary data
Re: speed up verifying UTF-8
On 03/06/2021 21:58, John Naylor wrote:
>> What test set have you been using for performance testing this? I'd like
> The microbenchmark is the same one you attached to [1], which I extended with a 95% multibyte case.

Could you share the exact test you're using? I'd like to test this on my old raspberry pi, out of curiosity.

- Heikki
Re: Decoding speculative insert with toast leaks memory
On Mon, Jun 7, 2021 at 8:46 AM Dilip Kumar wrote:
> On Mon, 7 Jun 2021 at 8:30 AM, Amit Kapila wrote:
>> On Wed, Jun 2, 2021 at 11:52 AM Amit Kapila wrote:
>>>
>>> On Wed, Jun 2, 2021 at 11:38 AM Dilip Kumar wrote:
>>>>
>>>> On Wed, Jun 2, 2021 at 11:25 AM Amit Kapila wrote:
>>>>>
>>>>> I think the same relation case might not create a problem because it won't find the entry for it in the toast_hash, so it will return from there but the other two problems will be there.
>>>>
>>>> Right
>>>>
>>>>> So, one idea could be to just avoid those two cases (by simply adding return for those cases) and still we can rely on toast clean up on the next insert/update/delete. However, your approach looks more natural to me as that will allow us to clean up memory in all cases instead of waiting till the transaction end. So, I still vote for going with your patch's idea of cleaning at spec_abort but I am fine if you and others decide not to process spec_abort message. What do you think? Tomas, do you have any opinion on this matter?
>>>>
>>>> I agree that processing with spec abort looks more natural and ideally the current code expects it to be getting cleaned after the change, that's the reason we have those assertions and errors.
>>>>
>>>
>>> Okay, so, let's go with that approach. I have thought about whether it creates any problem in back-branches but couldn't think of any, primarily because we are not going to send anything additional to the plugin/subscriber. Do you see any problems with back branches if we go with this approach?
>>
>> I will check this and let you know.
>>
>>>> OTOH I agree that we can just return from those conditions because now we know that with the current code those situations are possible. My vote is with handling the spec abort option (Option1) because that looks the more natural way of handling these issues and we also don't have to clean up the hash in "ReorderBufferReturnTXN" if there is no followup change after the spec abort.
>>>>
>>>
>>> Even if we decide to go with the spec_abort approach, it might be better to still keep the toastreset call in ReorderBufferReturnTXN so that it can be freed in case of error.
>>>
>>
>> Please take care of this as well.
>
> Ok

I have fixed all pending review comments and also added a test case, which is working fine. I haven't yet checked on the back branches. Let's discuss; if we think this patch looks fine, then I can apply and test it on the back branches.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

From 0b9c93398ef108a3d71cbac6f793a0314964aaa2 Mon Sep 17 00:00:00 2001
From: Dilip Kumar
Date: Tue, 1 Jun 2021 19:53:47 +0530
Subject: [PATCH v3] Bug fix for speculative abort

If a speculative insert has a toast table insert and that tuple is not confirmed, then the toast hash is not cleaned up, and that creates various problems, like a) a memory leak and b) the next insert using these uncleaned toast data for its insertion, plus other errors and assertion failures. So this patch handles that by queuing the spec abort changes and cleaning up the toast hash on spec abort. Currently, in this patch we are queuing up all the spec abort changes, but as an optimization we could avoid queuing the spec abort for toast tables; for that we would need to log that as a flag in WAL.
--- contrib/test_decoding/Makefile | 2 +- .../test_decoding/expected/speculative_abort.out | 64 + contrib/test_decoding/specs/speculative_abort.spec | 67 ++ src/backend/replication/logical/decode.c | 14 ++--- src/backend/replication/logical/reorderbuffer.c| 43 +- src/include/replication/reorderbuffer.h| 1 + 6 files changed, 180 insertions(+), 11 deletions(-) create mode 100644 contrib/test_decoding/expected/speculative_abort.out create mode 100644 contrib/test_decoding/specs/speculative_abort.spec diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile index 9a31e0b..1cab935 100644 --- a/contrib/test_decoding/Makefile +++ b/contrib/test_decoding/Makefile @@ -8,7 +8,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \ spill slot truncate stream stats twophase twophase_stream ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \ oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \ - twophase_snapshot + twophase_snapshot speculative_abort REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf diff --git a/contrib/test_decoding/expected/speculative_abort.out b/contrib/test_decoding/expected/speculative_abort.out new file mode 100644 index 000..d672deb --- /dev/null +++ b/contrib/test
Re: speed up verifying UTF-8
On Mon, Jun 7, 2021 at 8:24 AM Heikki Linnakangas wrote:
> On 03/06/2021 21:58, John Naylor wrote:
>> The microbenchmark is the same one you attached to [1], which I extended with a 95% multibyte case.
>
> Could you share the exact test you're using? I'd like to test this on my old raspberry pi, out of curiosity.

Sure, attached.

-- 
John Naylor
EDB: http://www.enterprisedb.com

(multibyte UTF-8 test data attachment, repeated Chinese text, omitted)
Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS
On 6/7/21 2:11 PM, Masahiko Sawada wrote: > On Mon, Jun 7, 2021 at 4:30 PM Masahiko Sawada wrote: >> >> On Mon, Jun 7, 2021 at 11:15 AM Tom Lane wrote: >>> >>> husky just reported $SUBJECT: >>> >>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-06-05%2013%3A42%3A17 >>> >>> and I find I can reproduce that locally: >>> >>> diff -U3 >>> /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out >>> /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out >>> --- /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out >>> 2021-01-20 11:12:24.854346717 -0500 >>> +++ /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out >>> 2021-06-06 22:12:07.948890104 -0400 >>> @@ -215,7 +215,8 @@ >>> 0 | f | f >>> 1 | f | f >>> 2 | t | t >>> -(3 rows) >>> + 3 | t | t >>> +(4 rows) >>> >>> select * from pg_check_frozen('copyfreeze'); >>> t_ctid >>> @@ -235,7 +236,8 @@ >>> 0 | t | t >>> 1 | f | f >>> 2 | t | t >>> -(3 rows) >>> + 3 | t | t >>> +(4 rows) >>> >>> select * from pg_check_frozen('copyfreeze'); >>> t_ctid >>> >>> >>> The test cases that are failing date back to January (7db0cd2145f), >>> so I think this is some side-effect of a recent commit, but I have >>> no idea which one. >> >> It seems like the recent revert (8e03eb92e9a) is relevant. >> >> After committing 7db0cd2145f we had the same regression test failure >> in January[1]. Then we fixed that issue by 39b66a91b. But since we >> recently reverted most of 39b66a91b, the same issue happened again. >> > > So the cause of this failure seems the same as before. The failed test is, > > begin; > truncate copyfreeze; > copy copyfreeze from stdin freeze; > 1 '1' > 2 '2' > 3 '3' > 4 '4' > 5 '5' > \. > copy copyfreeze from stdin; > 6 '6' > \. > copy copyfreeze from stdin freeze; > 7 '7' > 8 '8' > 9 '9' > 10 '10' > 11 '11' > 12 '12' > \. > commit; > > If the target block cache is invalidated before the third COPY, we > will start to insert the frozen tuple into a new page, resulting in > adding two blocks in total during the third COPY. I think we still > need the following part of the reverted code so that we don't leave > the page partially empty after relcache invalidation: > > --- a/src/backend/access/heap/hio.c > +++ b/src/backend/access/heap/hio.c > @@ -407,19 +407,19 @@ RelationGetBufferForTuple(Relation relation, Size len, > * target. > */ > targetBlock = GetPageWithFreeSpace(relation, targetFreeSpace); > - } > > - /* > -* If the FSM knows nothing of the rel, try the last page before we give > -* up and extend. This avoids one-tuple-per-page syndrome during > -* bootstrapping or in a recently-started system. > -*/ > - if (targetBlock == InvalidBlockNumber) > - { > - BlockNumber nblocks = RelationGetNumberOfBlocks(relation); > + /* > +* If the FSM knows nothing of the rel, try the last page before we > +* give up and extend. This avoids one-tuple-per-page syndrome during > +* bootstrapping or in a recently-started system. > +*/ > + if (targetBlock == InvalidBlockNumber) > + { > + BlockNumber nblocks = RelationGetNumberOfBlocks(relation); > > - if (nblocks > 0) > - targetBlock = nblocks - 1; > + if (nblocks > 0) > + targetBlock = nblocks - 1; > + } > } > > Attached the patch that brings back the above change. > Thanks for the analysis! I think you're right - this bit should have been kept. Partial reverts are tricky :-( I'll get this fixed / pushed later today, after a bit more testing. I'd swear I ran tests with CCA, but it's possible I skipped contrib. 
regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Decoding speculative insert with toast leaks memory
On Mon, Jun 7, 2021 at 6:04 PM Dilip Kumar wrote:
>
> I have fixed all pending review comments and also added a test case which is
> working fine.
>

A few observations and questions on the testcase:

1.
+step "s1_lock_s2" { SELECT pg_advisory_lock(2); }
+step "s1_lock_s3" { SELECT pg_advisory_lock(2); }
+step "s1_session" { SET spec.session = 1; }
+step "s1_begin" { BEGIN; }
+step "s1_insert_tbl1" { INSERT INTO tbl1 VALUES(1, repeat('a', 4000)) ON CONFLICT DO NOTHING; }
+step "s1_abort" { ROLLBACK; }
+step "s1_unlock_s2" { SELECT pg_advisory_unlock(2); }
+step "s1_unlock_s3" { SELECT pg_advisory_unlock(2); }

Here, s1_lock_s3 and s1_unlock_s3 use 2 as the identifier. Don't you need to use 3 in that part of the test?

2. In the test, there seems to be an assumption that we can unlock s2 and s3 one after another, and then both will start waiting on s-1, but isn't it possible that before s2 starts waiting on s1, s3 completes its insertion and then s2 will never proceed for speculative insertion?

> I haven't yet checked on the back branches. Let's discuss if we think this
> patch looks fine then I can apply and test on the back branches.
>

Sure, that makes sense.

-- With Regards, Amit Kapila.
Re: Strangeness with UNIQUE indexes and UTF-8
On Sun, Jun 6, 2021 at 11:20 PM Omar Kilani wrote:
>
> I mean, maybe it's because I've been awake since... 7am yesterday, but
> it seems to me that if Postgres fails catastrophically silently (and I
> would say "it looks like all your data in this table disappeared
> because of some arcane locale / btree issue that no one except Tom
> Lane even knows exists" -- see the replies about hardware issues and
> ON CONFLICT as an example) -- then maybe that is... not good, and
> Postgres shouldn't do that?

It is most definitely not an "arcane issue no one except Tom Lane even knows exists". I would assume most people who work with consulting or support around PostgreSQL know it exists, because some of their customers have hit it :/ I think it's more in the other direction -- those people are more likely to dismiss that issue as "the person reporting this will already have checked this, it must be something else"...

> Not only that, it's only indices which have non-ASCII or whatever in
> them that silently fail, so it's like 95% of your indices work just
> fine, but the ones that don't... look fine. They're not corrupt on
> disk, they have their full size, etc.

No it's not. ASCII will also fail in many cases. Did you read the page that you were linked to? It even includes an example of why ASCII cases will also fail. It's only non-text indexes that are "safe".

> How is anyone supposed to know about this issue? I've been using
> Postgres since 1999, built the Postgres website, worked with Neil and
> Gavin on Postgres, submitted patches to Postgres and various
> Postgres-related projects, and this is the first time I've become
> aware of it. I mean, maybe I'm dumb, and... fine. But your average
> user is going to have no idea about this.

This problem has been around before, just usually doesn't affect the English locale. Surely if you've spent that much time around Postgres and in the community you must've heard about it before? And this particular issue has been written about numerous times, which has been published through the postgres website and blog aggregators. It is definitely a weakness in how PostgreSQL does things, but it's a pretty well known weakness by now.

> Why can't some "locale signature" or something be encoded into the
> index so Postgres can at least warn you? Or not use the messed up
> index altogether instead of silently returning no data?

If you use ICU for your text indexes, it does exactly that. The page at https://www.postgresql.org/docs/13/sql-altercollation.html shows you examples of what would happen in that case. (This page also documents that there is no version tracking for the built-in collations, but I definitely agree that's pretty well hidden away by being on the reference page of ALTER COLLATION.)

-- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
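For readers who want to act on this: with ICU collations the recorded versions can be inspected from SQL. A minimal sketch (the collation name in the last statement is illustrative):

-- List ICU collations whose recorded version no longer matches what the
-- ICU library currently reports; indexes depending on them are suspect.
SELECT collname, collversion, pg_collation_actual_version(oid)
FROM pg_collation
WHERE collprovider = 'i'
  AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);

-- After REINDEXing the affected indexes, record the new version:
ALTER COLLATION my_icu_collation REFRESH VERSION;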
Re: Decoding speculative insert with toast leaks memory
On Mon, Jun 7, 2021 at 6:34 PM Amit Kapila wrote:
> On Mon, Jun 7, 2021 at 6:04 PM Dilip Kumar wrote:
> >
> > I have fixed all pending review comments and also added a test case
> > which is working fine.
>
> A few observations and questions on the testcase:
> 1.
> +step "s1_lock_s2" { SELECT pg_advisory_lock(2); }
> +step "s1_lock_s3" { SELECT pg_advisory_lock(2); }
> +step "s1_session" { SET spec.session = 1; }
> +step "s1_begin" { BEGIN; }
> +step "s1_insert_tbl1" { INSERT INTO tbl1 VALUES(1, repeat('a', 4000)) ON CONFLICT DO NOTHING; }
> +step "s1_abort" { ROLLBACK; }
> +step "s1_unlock_s2" { SELECT pg_advisory_unlock(2); }
> +step "s1_unlock_s3" { SELECT pg_advisory_unlock(2); }
>
> Here, s1_lock_s3 and s1_unlock_s3 use 2 as the identifier. Don't you need
> to use 3 in that part of the test?

Yeah, this should be 3.

> 2. In the test, there seems to be an assumption that we can unlock s2
> and s3 one after another, and then both will start waiting on s-1, but
> isn't it possible that before s2 starts waiting on s1, s3 completes its
> insertion and then s2 will never proceed for speculative insertion?

I agree, such race conditions are possible. Currently, I am not able to think of what we can do here, but I will think more on this.

-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: when the startup process doesn't
On Sun, Jun 6, 2021 at 6:23 PM Justin Pryzby wrote: > Should it show the rusage ? It's shown at startup completion since 10a5b35a0, > so it seems strange not to show it here. I don't know, that seems like it's going to make the messages awfully long, and I'm not sure of what use it is to see that for every report. I don't like the name very much. log_min_duration_startup_process seems to have been chosen to correspond to log_min_duration_statement, but the semantics are different. That one is a threshold, whereas this one is an interval. Maybe something like log_startup_progress_interval? As far as the patch itself goes, I think that the overhead of this approach is going to be unacceptably high. I was imagining having a timer running in the background that fires periodically, with the interval handler just setting a flag. Then in the foreground we just need to check whether the flag is set. I doubt that we can get away with a GetCurrentTimestamp() after applying every WAL record ... that seems like it will be slow. -- Robert Haas EDB: http://www.enterprisedb.com
Re: when the startup process doesn't
Robert Haas writes: > ... I doubt that we can get away > with a GetCurrentTimestamp() after applying every WAL record ... that > seems like it will be slow. Yeah, that's going to be pretty awful even on machines with fast gettimeofday, never mind ones where it isn't. It should be possible to use utils/misc/timeout.c to manage the interrupt, I'd think. regards, tom lane
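A minimal sketch of what that could look like with the existing timeout.c API; STARTUP_PROGRESS_TIMEOUT and log_startup_progress_interval are hypothetical names here, and a real patch would need to add the former to the TimeoutId enum:

static volatile sig_atomic_t startup_progress_timer_expired = false;

/* interrupt handler: just set a flag, do no real work here */
static void
startup_progress_timeout_handler(void)
{
    startup_progress_timer_expired = true;
}

/* once, before starting WAL replay */
RegisterTimeout(STARTUP_PROGRESS_TIMEOUT, startup_progress_timeout_handler);
enable_timeout_after(STARTUP_PROGRESS_TIMEOUT, log_startup_progress_interval);

/* in the redo loop: a cheap flag test instead of GetCurrentTimestamp() */
if (startup_progress_timer_expired)
{
    startup_progress_timer_expired = false;
    ereport(LOG, (errmsg("redo in progress, ...")));
    enable_timeout_after(STARTUP_PROGRESS_TIMEOUT, log_startup_progress_interval);
}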
Re: [bug?] Missed parallel safety checks, and wrong parallel safety
On Fri, Jun 4, 2021 at 6:17 AM Amit Kapila wrote:
> Thoughts?

As far as I can see, trying to error out at function call time if the function is parallel-unsafe doesn't fix any problem we have, and just makes the design of this part of the system less consistent with what we've done elsewhere. For example, if you create a stable function that internally calls a volatile function, you don't get an error. You can use your stable function in an index definition if you wish. That may break, but if so, that's your problem. Also, when it breaks, it probably won't blow up the entire world; you'll just have a messed-up index.

Currently, the parallel-safety stuff works the same way. If we notice that something is marked parallel-unsafe, we'll skip parallelism. But you can lie to us and claim that things are safe when they're not, and if you do, it may break, but that's your problem. Most likely your query will just error out, and there will be no worse consequences than that, though if your parallel-unsafe function is written in C, it could do horrible things like crash, which is unavoidable because C code can do anything.

Now, the reason for all of this work, as I understand it, is because we want to enable parallel inserts, and the problem there is that a parallel insert could involve a lot of different things: it might need to compute expressions, or fire triggers, or check constraints, and any of those things could be parallel-unsafe. If we enable parallelism and then find out that we need to do one of those things, we have a problem. Something probably will error out.

The thing is, with this proposal, that issue is not solved. Something will definitely error out. You'll probably get the error in a different place, but nobody fires off an INSERT hoping to get one error message rather than another. What they want is for it to work. So I'm kind of confused how we ended up going in this direction, which seems to me at least to be a tangent from the real issue, and somewhat at odds with the way the rest of PostgreSQL is designed.

It seems to me that we could simply add a flag to each relation saying whether or not we think that INSERT operations - or perhaps DML operations generally - are believed to be parallel-safe for that relation. Like the marking on functions, it would be the user's responsibility to get that marking correct. If they don't, they might call a parallel-unsafe function in parallel mode, and that will probably error out. But that's no worse than what we already have in existing cases, so I don't see why it requires doing what's proposed here first.

Now, it does have the disadvantage of not being very convenient for users, who, I'm sure, would prefer that the system figure out for them automatically whether or not parallel inserts are likely to be safe, rather than making them declare it, especially since presumably the default declaration would have to be "unsafe," as it is for functions. But I don't have a better idea right now.

-- Robert Haas EDB: http://www.enterprisedb.com
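To make the function analogy concrete, this is the kind of mislabeling the system already accepts on trust (a sketch; the names are made up):

CREATE SEQUENCE some_seq;

-- The PARALLEL SAFE label is a lie: nextval() is parallel-unsafe.
-- The declaration is nonetheless accepted without complaint.
CREATE FUNCTION not_actually_safe() RETURNS bigint
LANGUAGE sql PARALLEL SAFE
AS $$ SELECT nextval('some_seq') $$;

-- If the planner trusts the label and a parallel worker executes this,
-- the query fails at runtime with something like:
--   ERROR:  cannot execute nextval() during a parallel operation
-- which is the same "your problem" outcome described above.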
Re: Multiple hosts in connection string failed to failover in non-hot standby mode
Michael Paquier writes: > On Sun, Jun 06, 2021 at 05:27:49PM -0400, Tom Lane wrote: >> So I took >> a look at disabling GSSENC in these test cases to try to silence >> hamerkop's test failure that way. Here's a proposed patch. >> It relies on setenv() being available, but I think that's fine >> because we link the ECPG test programs with libpgport. > No, that's not it. The compilation of the tests happens when > triggering the tests as of ecpgcheck() in vcregress.pl so I think that > this is going to fail. This requires at least the addition of a > reference to libpgport in ecpg_regression.proj, perhaps more. Hmm. We do include "-lpgcommon -lpgport" when building the ecpg test programs on Unix, so I'd assumed that the MSVC scripts did the same. Is there a good reason not to make them do so? regards, tom lane
Re: Race condition in recovery?
On Mon, Jun 7, 2021 at 12:57 AM Kyotaro Horiguchi wrote:
> Unfortunately no. The backslashes in the binary path need to be
> escaped. (taken from PostgresNode.pm:1008)
>
> > (my $perlbin = $^X) =~ s{\\}{\\\\}g if ($TestLib::windows_os);
> > $node_primary->append_conf(
> > 'postgresql.conf', qq(
> > archive_command = '$perlbin "$FindBin::RealBin/cp_history_files" "%p" "$archivedir_primary/%f"'
> > ));
>
> This works for me.

Hmm, OK. Do you think we also need to use perl2host in this case?

> Ugh! Sorry. I meant "The explicit teardowns are useless". That's not
> harmful but it is done by PostgresNode.pm automatically (implicitly)
> and we don't do that in the existing scripts.

OK. I don't think it's a big deal, but we can remove them.

> As I said upthread the relationship between receiveTLI and
> recoveryTargetTLI is not confirmed yet at the point.
> findNewestTimeLine() simply searches for the history file with the
> largest timeline id, so there's a case where the timeline id that the
> function returns is not a future of the latest checkpoint TLI. I think
> that the fact that rescanLatestTimeLine() checks the relationship is
> telling us that we need to do the same in the path as well.
>
> In my previous proposal, it is done just after the line the patch
> touches but it can be in the if (fetching_ckpt) branch.

I went back and looked at your patch again, now that I understand the issue better. I believe it's not necessary to do this here, because StartupXLOG() already contains a check for the same thing:

/*
 * If the location of the checkpoint record is not on the expected
 * timeline in the history of the requested timeline, we cannot proceed:
 * the backup is not part of the history of the requested timeline.
 */
Assert(expectedTLEs); /* was initialized by reading checkpoint record */
if (tliOfPointInHistory(checkPointLoc, expectedTLEs) != checkPoint.ThisTimeLineID) ...

This code is always run after ReadCheckpointRecord() returns. And I think that your only concern here is about the case where the checkpoint record is being fetched, because otherwise expectedTLEs must already be set.

By the way, I also noticed that your version of the patch contains a few words which are spelled incorrectly: hearafter, and incosistent.

-- Robert Haas EDB: http://www.enterprisedb.com
Re: back-port one-line gcc-10+ warning fix to REL_10_STABLE
On 2021-Jun-07, Anton Voloshin wrote:
> Hello, hackers,
>
> Currently, REL_10_STABLE can't be compiled with gcc-10 or 11, -Werror and
> "./configure" without arguments. E.g. gcc-11 gives an error:

Hi, thanks for the report. I noticed that the commit that introduced this (41306a511c01) first appeared in 9.5, so I was surprised that you report it doesn't complain in 9.6. Turns out that Peter E had fixed the issue, but only in 9.5 and 9.6; I don't really understand why no fix was applied to 10. I forward-ported that commit to 10, which should also fix the problem. Branches 11 and up already have Tom Lane's fix.

-- Álvaro Herrera Valdivia, Chile "[PostgreSQL] is a great group; in my opinion it is THE best open source development communities in existence anywhere."(Lamar Owen)
Re: SQL-standard function body
Julien Rouhaud writes:
> On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut wrote:
>> Your patch filters out empty statements at the parse transformation
>> phase, so they are no longer present when you dump the body back out.
>> So your edits in the test expected files don't fit.
> Oh, somehow the tests aren't failing here, I'm not sure what I did wrong.

Modulo getting the tests right ...

>> I suggest we just prohibit empty statements at the parse stage. I don't
>> see a strong reason to allow them, and if we wanted to, we'd have to do
>> more work, e.g., in ruleutils.c to print them back out correctly.
> I always thought extraneous semicolons were tokens to be ignored, ...

I tend to agree with Julien's position here. It seems really ugly to prohibit empty statements just for implementation convenience. However, the way I'd handle it is to have the grammar remove them, which is what it does in other contexts. I don't think there's any need to preserve them in ruleutils output --- there's a lot of other normalization we do on the way to that, and this seems to fit in.

BTW, is it just me, or does SQL:2021 fail to permit multiple statements in a procedure at all? After much searching, I found the BEGIN ATOMIC ... END syntax, but it's in <triggered SQL statement>, in other words the body of a trigger not a procedure. I cannot find any production that connects a <SQL procedure statement> to that. There's an example showing use of BEGIN ATOMIC as a procedure statement, so they clearly *meant* to allow it, but it looks like somebody messed up the grammar.

regards, tom lane
Re: SSL SNI
Peter Eisentraut writes: > Patch attached. Empty host string was handled implicitly by the IP > detection expression, but I added an explicit check for sanity. (I > wasn't actually able to get an empty string to this point, but it's > clearly better to be prepared for it.) Yeah, I'd include the empty-string test just because it's standard practice in this area of libpq. Whether those tests are actually triggerable in every case is obscure, but ... Patch looks sane by eyeball, though I didn't test it. regards, tom lane
Re: libpq debug log
On Sat, Jun 5, 2021 at 11:03 AM Alvaro Herrera wrote: > > This added a PQtraceSetFlags() function. We have a dozen PQset*() > > functions, > > but this and PQresultSetInstanceData() are the only PQSomethingSet() > > functions. Is it okay if I rename it to PQsetTraceFlags()? I think that > > would be more idiomatic for libpq. > > Hmm, there is an obvious parallel between PQtrace() and > PQtraceSetFlags() which is lost with your proposed rename. I'm not > wedded to the name though, so I'm just -0.1 on the rename. If you feel > strongly about it, I won't oppose it. I'm +1 for the rename. I think it's a lot more idiomatic. -- Robert Haas EDB: http://www.enterprisedb.com
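For context, a minimal usage sketch of the API being discussed, with the current names (whatever the setter ends up being called):

#include <stdio.h>
#include <libpq-fe.h>

/* Assumes conn is an established PGconn; error handling omitted. */
static void
enable_tracing(PGconn *conn)
{
    FILE *trace = fopen("libpq.trace", "w");

    if (trace == NULL)
        return;
    PQtrace(conn, trace);
    /* e.g. suppress timestamps to get diffable output in regression tests */
    PQtraceSetFlags(conn, PQTRACE_SUPPRESS_TIMESTAMPS);
}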
Re: SQL-standard function body
On Mon, Jun 7, 2021 at 11:27 PM Tom Lane wrote: > > Julien Rouhaud writes: > > On Mon, Jun 7, 2021 at 4:52 PM Peter Eisentraut > > wrote: > >> Your patch filters out empty statements at the parse transformation > >> phase, so they are no longer present when you dump the body back out. > >> So your edits in the test expected files don't fit. > > > Oh, somehow the tests aren't failing here, I'm not sure what I did wrong. > > Modulo getting the tests right ... I can certainly accept that my patch broke the tests, but I just ran another make check-world and it passed without any problem. What am I missing?
Re: please update ps display for recovery checkpoint
On 6/6/21, 7:14 PM, "Justin Pryzby" wrote: > Now, I wonder whether the startup process should also include some detail > about > "syncing data dir". It's possible to strace the process to see what it's > doing, but most DBA would probably not know that, and it's helpful to know the > status of recovery and what part of recovery is slow: sync, replay, or > checkpoint. commit df9274adf improved the situation between replay and > ckpoint, but it's still not clear what "postgres: startup" means before replay > starts. I've seen a few functions cause lengthy startups, including SyncDataDirectory() (for which I was grateful to see 61752afb), StartupReorderBuffer(), and RemovePgTempFiles(). I like the idea of adding additional information in the ps title, but I also think it is worth exploring additional ways to improve on these O(n) startup tasks. Nathan
Re: please update ps display for recovery checkpoint
On Mon, Jun 7, 2021 at 12:02 PM Bossart, Nathan wrote: > On 6/6/21, 7:14 PM, "Justin Pryzby" wrote: > > Now, I wonder whether the startup process should also include some detail > > about > > "syncing data dir". It's possible to strace the process to see what it's > > doing, but most DBA would probably not know that, and it's helpful to know > > the > > status of recovery and what part of recovery is slow: sync, replay, or > > checkpoint. commit df9274adf improved the situation between replay and > > ckpoint, but it's still not clear what "postgres: startup" means before > > replay > > starts. > > I've seen a few functions cause lengthy startups, including > SyncDataDirectory() (for which I was grateful to see 61752afb), > StartupReorderBuffer(), and RemovePgTempFiles(). I like the idea of > adding additional information in the ps title, but I also think it is > worth exploring additional ways to improve on these O(n) startup > tasks. See also the nearby thread entitled "when the startup process doesn't" which touches on this same issue. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Duplicate history file?
Greetings,

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> So, this is the new new thread.

This is definitely not the way I would recommend starting up a new thread as you didn't include the actual text of the prior discussion for people to be able to read and respond to, instead making them go hunt for the prior discussion on the old thread and negating the point of starting a new thread.. Still, I went and found the other email-

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> At Mon, 31 May 2021 11:52:05 +0900, Tatsuro Yamada wrote in
> > Since the above behavior is different from the behavior of the
> > test command in the following example in postgresql.conf, I think
> > we should write a note about this example.
> >
> > # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p
> > # /mnt/server/archivedir/%f'
> >
> > Let me describe the problem we faced.
> > - When archive_mode=always, archive_command is (sometimes) executed
> > in a situation where the history file already exists on the standby side.
> >
> > - In this case, if "test ! -f" is written in the archive_command of
> > postgresql.conf on the standby side, the command will keep failing.
> >
> > Note that this problem does not occur when archive_mode=on.
> >
> > So, what should we do for the user? I think we should put some notes
> > in postgresql.conf or in the documentation. For example, something
> > like this:

First off, we should tell them to not use test or cp in their actual archive command because they don't do things like make sure that the WAL that's been archived has actually been fsync'd. Multiple people have tried to make improvements in this area but the long and short of it is that trying to provide a simple archive command in the documentation that actually *works* isn't enough - you need a real tool. Maybe someone will write one some day that's part of core, but it's not happened yet and instead there's external solutions which actually do the correct things. The existing documentation should be taken as purely "this is how the variables which are passed in get expanded" not as "this is what you should do", because it's very much not the latter in any form.

Thanks, Stephen
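To spell out the hazards with the documented example (annotations are this thread's, not the documentation's):

# The documentation's example only illustrates how %p and %f get expanded:
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

# Why it is not a real archiving solution:
# - cp does not fsync, so an "archived" segment can vanish in a crash even
#   though success was already reported to the server;
# - the copy is not atomic, so a crash can leave a partial file behind,
#   after which the "test ! -f" guard fails forever for that segment;
# - and, per this thread, with archive_mode=always an identical file may
#   already exist legitimately, which the guard also treats as failure.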
Re: Make unlogged table resets detectable
On Fri, Jun 4, 2021 at 8:41 PM Jeff Davis wrote: > Stepping back, maybe unlogged tables are the wrong level to solve this > problem. We could just have a "crash counter" in pg_control that would > be incremented every time a crash happened (and all unlogged tables are > reset). It might be a number or maybe the LSN of the startup checkpoint > after the most recent crash. > > A SQL function could read the value. Perhaps we'd also have a SQL > function to reset it, but I don't see a use case for it. > > Then, it's up to the client to check it against a stored value, and > clear/repopulate unlogged tables as necessary. I think this would be useful for a variety of purposes. Both being able to know the last time that it happened and being able to know the number of times that it happened could be useful, depending on the scenario. For example, if one of my employer's customers began complaining about a problem that started happening recently, it would be useful to be able to establish whether there had also been a crash recently, and a timestamp or LSN would help a lot. On the other hand, if we had a counter, we'd probably find out some interesting things, too. Maybe someone would report that the value of the counter was surprisingly large. For example, if a customer's pg_control output showed that the database cluster had performed crash recovery 162438 times, I might have some, err, followup questions. This is not a vote for or against any specific proposal; it's just a general statement that I support trying to do something in this area, and that it feels like anything we do will likely have some value. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Make unlogged table resets detectable
Robert Haas writes: > On Fri, Jun 4, 2021 at 8:41 PM Jeff Davis wrote: >> Stepping back, maybe unlogged tables are the wrong level to solve this >> problem. We could just have a "crash counter" in pg_control that would >> be incremented every time a crash happened (and all unlogged tables are >> reset). It might be a number or maybe the LSN of the startup checkpoint >> after the most recent crash. > I think this would be useful for a variety of purposes. Both being > able to know the last time that it happened and being able to know the > number of times that it happened could be useful, depending on the > scenario. +1. I'd support recording the time of the last crash recovery, as well as having a counter. I think an LSN would not be as useful as a timestamp. regards, tom lane
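A sketch of what the client-visible side of this might look like; every name here is hypothetical, since nothing has been designed yet:

-- Hypothetical functions exposing the proposed pg_control fields:
SELECT pg_crash_recovery_count();      -- times crash recovery has run
SELECT pg_last_crash_recovery_time();  -- when it last happened

-- Client-side use per Jeff's scheme: compare the counter against a stored
-- value, and clear/repopulate unlogged tables if it has advanced.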
Re: Race condition in recovery?
Hi, I tried back-porting my version of this patch to 9.6 to see what would happen there. One problem is that some of the functions have different names before v10. So 9.6 needs this: -"SELECT pg_walfile_name(pg_current_wal_lsn());"); +"SELECT pg_xlogfile_name(pg_current_xlog_location());"); But there's also another problem, which is that this doesn't work before v12: $node_standby->psql('postgres', 'SELECT pg_promote()'); So I tried changing it to this: $node_standby->promote; But then the test fails, because pg_promote() has logic built into it to wait until the promotion actually happens, but ->promote doesn't, so SELECT pg_walfile_name(pg_current_wal_lsn()) errors out because the system is still in recovery. I'm not sure what to do about that. I quickly tried adding -w to 'sub promote' in PostgresNode.pm, but that didn't fix it, so I guess we'll have to find some other way to wait until the promotion is complete. -- Robert Haas EDB: http://www.enterprisedb.com
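One possible way to get the wait without pg_promote(), sketched with an existing PostgresNode helper (untested):

$node_standby->promote;
$node_standby->poll_query_until('postgres',
    'SELECT NOT pg_is_in_recovery()')
  or die "timed out waiting for promotion";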
Re: SQL-standard function body
I wrote: > ... I tend to agree with Julien's position here. It seems really ugly > to prohibit empty statements just for implementation convenience. > However, the way I'd handle it is to have the grammar remove them, > which is what it does in other contexts. Concretely, I think the right fix is per attached. Like Julien, I don't see any additional change in regression test outputs. Maybe Peter thinks there should be some? But I think the reverse-listing we get for functest_S_3a is fine. regards, tom lane diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 9ee90e3f13..52a254928f 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -7990,7 +7990,11 @@ opt_routine_body: routine_body_stmt_list: routine_body_stmt_list routine_body_stmt ';' { - $$ = lappend($1, $2); + /* As in stmtmulti, discard empty statements */ + if ($2 != NULL) + $$ = lappend($1, $2); + else + $$ = $1; } | /*EMPTY*/ { diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out index 5b6bc5eddb..5955859bb5 100644 --- a/src/test/regress/expected/create_function_3.out +++ b/src/test/regress/expected/create_function_3.out @@ -267,7 +267,7 @@ CREATE FUNCTION functest_S_3() RETURNS boolean RETURN false; CREATE FUNCTION functest_S_3a() RETURNS boolean BEGIN ATOMIC -RETURN false; +;;RETURN false;; END; CREATE FUNCTION functest_S_10(a text, b date) RETURNS boolean LANGUAGE SQL diff --git a/src/test/regress/sql/create_function_3.sql b/src/test/regress/sql/create_function_3.sql index 4b778999ed..6e8b838ff2 100644 --- a/src/test/regress/sql/create_function_3.sql +++ b/src/test/regress/sql/create_function_3.sql @@ -165,7 +165,7 @@ CREATE FUNCTION functest_S_3() RETURNS boolean RETURN false; CREATE FUNCTION functest_S_3a() RETURNS boolean BEGIN ATOMIC -RETURN false; +;;RETURN false;; END; CREATE FUNCTION functest_S_10(a text, b date) RETURNS boolean
Re: Tid scan improvements
On 07.06.21 13:50, David Rowley wrote:
> On Mon, 7 Jun 2021 at 23:46, Edmund Horner wrote:
>> On Mon, 7 Jun 2021 at 22:11, Peter Eisentraut wrote:
>>> This patch didn't add _outTidRangePath() even though we have outNode()
>>> coverage for most/all path nodes. Was this just forgotten? See attached patch.
>> Yes, it looks like an omission. Thanks for spotting it. Patch looks good to me.
> Yeah. That was forgotten. Patch also looks fine to me. Do you want to push it, Peter?

done
Re: SQL-standard function body
On 07.06.21 17:27, Tom Lane wrote:
> ... I tend to agree with Julien's position here. It seems really ugly
> to prohibit empty statements just for implementation convenience.
> However, the way I'd handle it is to have the grammar remove them,
> which is what it does in other contexts. I don't think there's any need
> to preserve them in ruleutils output --- there's a lot of other
> normalization we do on the way to that, and this seems to fit in.

Ok, if that's what people prefer.

> BTW, is it just me, or does SQL:2021 fail to permit multiple statements
> in a procedure at all? After much searching, I found the BEGIN ATOMIC
> ... END syntax, but it's in <triggered SQL statement>, in other words
> the body of a trigger not a procedure. I cannot find any production
> that connects a <SQL procedure statement> to that. There's an example
> showing use of BEGIN ATOMIC as a procedure statement, so they
> clearly *meant* to allow it, but it looks like somebody messed up
> the grammar.

It's in the SQL/PSM part.
Re: CALL versus procedures with output-only arguments
On 04.06.21 23:07, Tom Lane wrote:
> I wrote:
>> It would likely not be very hard to fix pg_dump to include explicit IN
>> markers. I don't think this results in a compatibility problem for
>> existing dumps, since they won't be taken from databases in which there
>> are procedures with OUT arguments.
> Actually, all we have to do to fix pg_dump is to tweak ruleutils.c
> (although this has some effects on existing regression test outputs, of
> course). So maybe it's not as bad as all that.
> Here's a draft-quality patch to handle ALTER/DROP this way. I think the
> code may be finished, but I've not looked at the docs at all. 0001 is
> the same patch I posted earlier, 0002 is a delta to enable handling
> ALTER/DROP per spec.

I checked these patches. They appear to match what was talked about. I didn't find anything surprising. I couldn't apply the 0002 after applying 0001 to today's master, so I wasn't able to do more exploratory testing. What are these patches based on? Are there any more open issues to focus on?

One thing I was wondering is whether we should force CALL arguments in direct SQL to be null rather than allowing arbitrary expressions. Since there is more elaborate code now to process the CALL arguments, maybe it would be easier than before to integrate that.
automatically generating node support functions
I wrote a script to automatically generate the node support functions (copy, equal, out, and read, as well as the node tags enum) from the struct definitions.

The first eight patches are to clean up various inconsistencies to make parsing or generation easier. The interesting stuff is in patch 0009. For each of the four node support files, it creates two include files, e.g., copyfuncs.inc1.c and copyfuncs.inc2.c, to include in the main file. All the scaffolding of the main file stays in place. In this patch, I have only ifdef'ed out the code that could be removed, mainly so that it won't constantly have merge conflicts. Eventually, that should all be changed to delete the code. When we do that, some code comments should probably be preserved elsewhere, so that will need another pass of consideration.

I have tried to mostly make the coverage of the output match what is currently there. For example, one could do out/read coverage of utility statement nodes easily with this, but I have manually excluded those for now. The reason is mainly that it's easier to diff the before and after, and adding a bunch of stuff like this might require a separate analysis and review.

Subtyping (TidScan -> Scan) is supported. For the hard cases, you can just write a manual function and exclude generating one. For the not so hard cases, there is a way of annotating struct fields to get special behaviors. For example, pg_node_attr(equal_ignore) has the field ignored in equal functions.

There are a couple of additional minor issues mentioned in the script source. But basically, it all seems to work.

From c782871d6cc59e2fed232c78c307d63e72cbb3d5 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 7 Jun 2021 15:45:14 +0200 Subject: [PATCH v1 01/10] Rename NodeTag of ExprState Rename from tag to type, for consistency with all other node structs. --- src/backend/executor/execExpr.c | 4 ++-- src/include/nodes/execnodes.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c index 8c9f8a6aeb..c6ba11d035 100644 --- a/src/backend/executor/execExpr.c +++ b/src/backend/executor/execExpr.c @@ -363,7 +363,7 @@ ExecBuildProjectionInfo(List *targetList, projInfo->pi_exprContext = econtext; /* We embed ExprState into ProjectionInfo instead of doing extra palloc */ - projInfo->pi_state.tag = T_ExprState; + projInfo->pi_state.type = T_ExprState; state = &projInfo->pi_state; state->expr = (Expr *) targetList; state->parent = parent; @@ -531,7 +531,7 @@ ExecBuildUpdateProjection(List *targetList, projInfo->pi_exprContext = econtext; /* We embed ExprState into ProjectionInfo instead of doing extra palloc */ - projInfo->pi_state.tag = T_ExprState; + projInfo->pi_state.type = T_ExprState; state = &projInfo->pi_state; if (evalTargetList) state->expr = (Expr *) targetList; diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index 7795a69490..8fa9c8aff6 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -60,7 +60,7 @@ typedef Datum (*ExprStateEvalFunc) (struct ExprState *expression, typedef struct ExprState { - NodeTag tag; + NodeTag type; uint8 flags; /* bitmask of EEO_FLAG_* bits, see above */ -- 2.31.1 From 7746ddd4f2a9534322b2b9226007638d3142c0c7 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 7 Jun 2021 15:47:56 +0200 Subject: [PATCH v1 02/10] Rename argument of _outValue() Rename from value to node, for consistency with similar functions.
--- src/backend/nodes/outfuncs.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 04696f613c..b54c57d09f 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -3402,12 +3402,12 @@ _outAExpr(StringInfo str, const A_Expr *node) } static void -_outValue(StringInfo str, const Value *value) +_outValue(StringInfo str, const Value *node) { - switch (value->type) + switch (node->type) { case T_Integer: - appendStringInfo(str, "%d", value->val.ival); + appendStringInfo(str, "%d", node->val.ival); break; case T_Float: @@ -3415,7 +3415,7 @@ _outValue(StringInfo str, const Value *value) * We assume the value is a valid numeric literal and so does not * need quoting. */ - appendStringInfoString(str, value->val.str); + appendStringInfoString(str, node->val.str); break; case T_String: @@ -3424,20 +3424,20 @@ _outValue(StringInfo str
Re: CALL versus procedures with output-only arguments
Peter Eisentraut writes:
> On 04.06.21 23:07, Tom Lane wrote:
>> 0001 is the same patch I posted earlier, 0002 is a delta to enable
>> handling ALTER/DROP per spec.
> I checked these patches. They appear to match what was talked about. I
> didn't find anything surprising. I couldn't apply the 0002 after
> applying 0001 to today's master, so I wasn't able to do more exploratory
> testing. What are these patches based on? Are there any more open
> issues to focus on?

Hmm, these are atop HEAD from a week or so back. The cfbot seems to think they still apply. In any case, I was about to spend some effort on the docs, so I'll post an updated version soon (hopefully today).

> One thing I was wondering is whether we should force CALL arguments in
> direct SQL to be null rather than allowing arbitrary expressions. Since
> there is more elaborate code now to process the CALL arguments, maybe it
> would be easier than before to integrate that.

Yeah. We could possibly do that, but at first glance it seems like it would be adding code for little purpose except nanny-ism.

One angle that maybe needs discussion is what about CALL in SQL-language functions. I see that's disallowed right now. If we're willing to keep it that way until somebody implements local variables a la SQL/PSM, then we could transition smoothly to having the same definition as in plpgsql, where you MUST write a variable. If we wanted to open it up sooner, we'd have to plan on ending with a definition like "write either a variable, or NULL to discard the value", so that enforcing must-be-NULL in the interim would make sense to prevent future surprises. But IMO that would be best done as a SQL-language-function specific restriction.

I suppose if you imagine that we might someday have variables in top-level SQL, then the same argument would apply there. But we already guaranteed ourselves some conversion pain for that scenario with respect to INOUT parameters, so I doubt that locking down OUT parameters will help much. My inclination is to not bother adding the restriction, but it's only a mild preference.

regards, tom lane
A modest proposal vis hierarchical queries: MINUS in the column list
One of the friction points I have found in migrating from Oracle to PostgreSQL is in the conversion of hierarchical queries from the Oracle START WITH/CONNECT BY/ORDER SIBLINGS BY pattern to using the ANSI recursive subquery form. Once you wrap your head around it, the ANSI form is not so bad, with one major exception. In order to achieve the equivalent of Oracle's ORDER SIBLINGS BY clause, you need to add an additional column containing an array with the accumulated path back to the root of the hierarchy for each row.

The problem with that is that it leaves you with an unfortunate choice: either accept the inefficiency of returning the array with the path back to the client (which the client doesn't need or want), or require the application to explicitly list all of the columns that it wants just to exclude the hierarchy column, which can be hard to maintain, especially if your application needs to support both databases. If you have an ORM model where there could be multiple queries that share the same client code to read the result set, you might have to change multiple queries when new columns are added to a table or view even though you have centralized the processing of the result set.

The ideal solution for this would be for PostgreSQL to support the Oracle syntax and internally convert it to the ANSI form. Failing that, I have a modest suggestion that I would like to start a discussion around. What if you could use the MINUS keyword in the column list of a select statement to remove a column from the result set returned to the client? What I have in mind is something like this. To achieve the equivalent of the following Oracle query:

SELECT T.*
FROM T
START WITH T.ParentID IS NULL
CONNECT BY T.ParentID = PRIOR T.ID
ORDER SIBLINGS BY T.OrderVal

You could use:

WITH RECURSIVE TT AS (
    SELECT T0.*, ARRAY[]::INTEGER[] || T0.OrderVal AS Sortable
    FROM T T0
    WHERE T0.ParentID IS NULL
    UNION ALL
    SELECT T1.*, TT.Sortable || T1.OrderVal AS Sortable
    FROM TT INNER JOIN T T1 ON (T1.ParentID = TT.ID)
)
SELECT TT.* MINUS TT.Sortable
FROM TT
ORDER BY TT.Sortable

Now the Sortable column can be used to order the result set but is not returned to the client. Not knowing the internals of the parser, I'm assuming that the use of MINUS in this construct would be distinguishable from the set difference use case because the expression being subtracted is a column (or perhaps even a list of columns) rather than a SELECT expression.
Re: security_definer_search_path GUC
On Fri, Jun 4, 2021, at 18:03, Pavel Stehule wrote: > > > pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson napsal: >> __ >> Maybe this could work: >> CREATE SCHEMA schema_name UNQUALIFIED; >> Which would explicitly make all the objects created in the schema accessible >> unqualified, but also enforce there are no conflicts with other objects in >> existence in all unqualified schemas, upon the creation of new objects. > > Yes, it can work. I am not sure if "unqualified" is the best keyword, but the > idea is workable. So maybe a combination of some kind of GUC to restrict search_path in some way, and a settable/unsettable new boolean pg_namespace column to control if the schema should be accessible unqualified or not? If we don't like "UNQUALIFIED" as a keyword, maybe we could reuse "PUBLIC"? Or will that be confusing since "PUBLIC" is also a role_specification? I think unqualified=true should mean a schema doesn't need to be part of the search_path, to be accessible unqualified. This means, "pg_catalog" and "public", might have unqualified=false, as their default values. Setting unqualified=true for "pg_catalog" and "public" would enforce there are no overlapping objects between the two. Marko, in your use-case with the "compat" schema, do you think it would work to just do ALTER SCHEMA compat DROP UNQUALIFIED (or whatever the command should be) when upgrading to the new major version, where compat isn't necessary, similar to changing the GUC to not include "compat"? IMO, the biggest disadvantage with this idea is that it undeniably increases complexity of name resolution further, since it's then yet another thing to take into account. But maybe it's worth it, if the GUC to restrict search_path, can effectively reduce complexity, when used in combination with this other proposed feature. I think it's a really difficult question. I strongly feel something should be done in this area to improve the situation, but it's far from obvious what the right thing to do is. /Joel
Re: Add PortalDrop in exec_execute_message
On 2021-May-27, Yura Sokolov wrote:
> Alvaro Herrera wrote 2021-05-26 23:59:
> > I don't think they should do that. The portal remains open, and the
> > libpq interface does that. The portal gets closed at end of transaction
> > without the need for any client message. I think if the client wanted
> > to close the portal ahead of time, it would need a new libpq entry point
> > to send the message to do that.
>
> - PQsendQuery issues a Query message, and exec_simple_query closes its portal.
> - People don't expect PQsendQueryParams to be different from PQsendQuery
>   aside from parameter sending. The fact that the portal remains open is a
>   significant, unexpected and undesired difference.
> - PQsendQueryGuts is used in PQsendQueryParams and PQsendQueryPrepared.
>   It always sends an empty portal name and always a "send me all rows"
>   limit (zero). Both usages are certainly meant to just perform a query,
>   and certainly no one expects the portal to remain open.

Thinking about it some more, Yura's argument about PQsendQuery does make sense -- since what we're doing is replacing the use of a 'Q' message just because we can't use it when in pipeline mode, then it is reasonable to think that the replacement ought to have the same behavior. Upon receipt of a 'Q' message, the portal is closed automatically, and ISTM that that behavior should be preserved.

That change would not solve the problem he complains about, because IIUC his framework is using PQsendQueryPrepared, which I'm not proposing to change. It just removes the other discrepancy that was discussed in the thread.

The attached patch does it. Any opinions?

-- Álvaro Herrera Valdivia, Chile "[PostgreSQL] is a great group; in my opinion it is THE best open source development communities in existence anywhere."(Lamar Owen)
Re: security_definer_search_path GUC
On Fri, Jun 4, 2021 at 9:03 AM Pavel Stehule wrote: > pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson napsal: > >> Maybe this could work: >> CREATE SCHEMA schema_name UNQUALIFIED; >> Which would explicitly make all the objects created in the schema >> accessible unqualified, but also enforce there are no conflicts with other >> objects in existence in all unqualified schemas, upon the creation of new >> objects. >> > > Yes, it can work. I am not sure if "unqualified" is the best keyword, but > the idea is workable. > Sounds like a job for an event trigger listening to CREATE/ALTER schema. David J.
Re: security_definer_search_path GUC
On Mon, Jun 7, 2021 at 2:09 PM David G. Johnston wrote: > On Fri, Jun 4, 2021 at 9:03 AM Pavel Stehule > wrote: > >> pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson >> napsal: >> >>> Maybe this could work: >>> CREATE SCHEMA schema_name UNQUALIFIED; >>> Which would explicitly make all the objects created in the schema >>> accessible unqualified, but also enforce there are no conflicts with other >>> objects in existence in all unqualified schemas, upon the creation of new >>> objects. >>> >> >> Yes, it can work. I am not sure if "unqualified" is the best keyword, but >> the idea is workable. >> > > Sounds like a job for an event trigger listening to CREATE/ALTER schema. > Never mind...I got mixed up a bit on what this all is purporting to do. My intent was to try and solve the problem with existing features (event triggers) instead of inventing new ones, which is still desirable. David J.
Re: security_definer_search_path GUC
On Mon, Jun 7, 2021 at 1:55 PM Joel Jacobson wrote: > If we don't like "UNQUALIFIED" as a keyword, maybe we could reuse "PUBLIC"? > Or will that be confusing since "PUBLIC" is also a role_specification? > > For me the concept resembles explicitly denoting certain schemas as being simple tags, while the actual "namespace" is the GLOBAL namespace. Today there is no global namespace, all schemas generate their own individual namespace in addition to "tagging" their objects with a textual label. Avoiding "public" is highly desirable. To access a global object you should be able to still specify its schema tag. Unqualified means "use search_path"; and "use search_path" includes global. But there is a truth table waiting to be created to detail what combinations result in errors (including where those errors occur - runtime or creation time). David J.
Re: A modest proposal vis hierarchical queries: MINUS in the column list
On Mon, Jun 7, 2021 at 1:54 PM Mark Zellers wrote: > Failing that, I have a modest suggestion that I would like to start a > discussion around. What if you could use the MINUS keyword in the column > list of a select statement to remove a column from the result set returned > to the client? > I asked this a decade ago and got no useful responses. https://www.postgresql.org/message-id/flat/02e901cc2bb4%2476bc2090%24643461b0%24%40yahoo.com#3784fab26b0f946b3239266e3b70a6ce I will say I've still had the itch to want it occasionally in the years since, though not frequently. David J.
Re: Add PortalDrop in exec_execute_message
Alvaro Herrera writes: > The attached patch does it. Any opinions? My opinion is there's no patch here. regards, tom lane
Re: Add PortalDrop in exec_execute_message
On 2021-Jun-07, Alvaro Herrera wrote: > The attached patch does it. Any opinions? Eh, really attached. -- Álvaro Herrera39°49'30"S 73°17'W "No es bueno caminar con un hombre muerto" >From c5c6e8860e9d425ddea82e32868fedc7562ec51c Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Mon, 7 Jun 2021 18:06:28 -0400 Subject: [PATCH] Add 'Portal Close' to pipelined PQsendQuery() --- src/interfaces/libpq/fe-exec.c| 8 +++- .../modules/libpq_pipeline/traces/pipeline_abort.trace| 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c index 03592bdce9..213e5576a1 100644 --- a/src/interfaces/libpq/fe-exec.c +++ b/src/interfaces/libpq/fe-exec.c @@ -1329,7 +1329,8 @@ PQsendQueryInternal(PGconn *conn, const char *query, bool newQuery) { /* * In pipeline mode we cannot use the simple protocol, so we send - * Parse, Bind, Describe Portal, Execute. + * Parse, Bind, Describe Portal, Execute, Close Portal (with the + * unnamed portal). */ if (pqPutMsgStart('P', conn) < 0 || pqPuts("", conn) < 0 || @@ -1355,6 +1356,11 @@ PQsendQueryInternal(PGconn *conn, const char *query, bool newQuery) pqPutInt(0, 4, conn) < 0 || pqPutMsgEnd(conn) < 0) goto sendFailed; + if (pqPutMsgStart('C', conn) < 0 || + pqPutc('P', conn) < 0 || + pqPuts("", conn) < 0 || + pqPutMsgEnd(conn) < 0) + goto sendFailed; entry->queryclass = PGQUERY_EXTENDED; entry->query = strdup(query); diff --git a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace index 254e485997..3fce548b99 100644 --- a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace +++ b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace @@ -38,6 +38,7 @@ F 26 Parse "" "SELECT 1; SELECT 2" 0 F 12 Bind "" "" 0 0 0 F 6 Describe P "" F 9 Execute "" 0 +F 6 Close P "" F 4 Sync B NN ErrorResponse S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00 B 5 ReadyForQuery I @@ -45,6 +46,7 @@ F 54 Parse "" "SELECT 1.0/g FROM generate_series(3, -1, -1) g" 0 F 12 Bind "" "" 0 0 0 F 6 Describe P "" F 9 Execute "" 0 +F 6 Close P "" F 4 Sync B 4 ParseComplete B 4 BindComplete -- 2.20.1
Re: A modest proposal vis hierarchical queries: MINUS in the column list
"David G. Johnston" writes: > On Mon, Jun 7, 2021 at 1:54 PM Mark Zellers > wrote: >> What if you could use the MINUS keyword in the column >> list of a select statement to remove a column from the result set returned >> to the client? > I asked this a decade ago and got no useful responses. > https://www.postgresql.org/message-id/flat/02e901cc2bb4%2476bc2090%24643461b0%24%40yahoo.com#3784fab26b0f946b3239266e3b70a6ce I can recall more-recent requests for that too, though I'm too lazy to go search the archives right now. I'm fairly disinclined to do anything about it though, because I'm afraid of the SQL committee standardizing some other syntax for the same idea in future (or maybe worse, commandeering the same keyword for some other feature). It doesn't seem quite valuable enough to take those risks for. Note that it's not like SQL hasn't heard of projections before. You can always do "SELECT a, b, d FROM subquery-yielding-a-b-c-d". So the proposed syntax would save a small amount of typing, but it's not adding any real new functionality. regards, tom lane
Re: Add PortalDrop in exec_execute_message
Alvaro Herrera writes: > On 2021-Jun-07, Alvaro Herrera wrote: >> The attached patch does it. Any opinions? > Eh, really attached. No particular objection. I'm not sure this will behave quite the same as simple-Query in error cases, but probably it's close enough. I'm still wondering though why Yura is observing resources remaining held by an executed-to-completion Portal. I think investigating that might be more useful than tinkering with pipeline mode. regards, tom lane
logical decoding and replication of sequences
Hi,

One of the existing limitations of logical decoding / replication is that it does not care about sequences. The annoying consequence is that after a failover to a logical replica, all the table data may be replicated but the sequences are still at the initial values, requiring some custom solution that moves the sequences forward enough to prevent duplicates. There have been attempts to address this in the past, most recently [1], but none of them got in due to various issues.

This is an attempt, based on [1] (but with many significant parts added or reworked), aiming to deal with this. The primary purpose of sharing it is getting feedback and opinions on the design decisions. It's still a WIP - it works fine AFAICS, but some of the bits may be a bit hackish.

The overall goal is to have the same sequence data on the primary and logical replica, or something sufficiently close to that, so that the replica after a failover does not generate duplicate values.

This patch does a couple of basic things:

1) extends the logical decoding to handle sequences. It adds a new callback, similarly to what we have for messages. There's a bit of complexity with transactional and non-transactional behavior, more about that later

2) extends test_decoding to support this new callback, printing the sequence increments (the decoded WAL records)

3) extends built-in replication to support sequences, so publications may contain both tables and sequences, etc., and sequence data is synced when creating subscriptions, etc.

transactional vs. non-transactional
-----------------------------------

The first part (extending logical decoding) is simple in principle. We simply decode the sequence updates, but then comes a challenge - should we just treat it transactionally and stash it in the reorder buffer, or just pass it to the output plugin right away?

For messages, this can be specified as a flag when adding the message, so the user can decide depending on the message purpose. For sequences, all we do is nextval(), and it depends on the context in which it's used; we can't just pick one of those approaches. Consider this, for example:

CREATE SEQUENCE s;
BEGIN;
SELECT nextval('s') FROM generate_series(1,1000) s(i);
ROLLBACK;

If we handle this "transactionally", we'd stash the "nextval" increment into the transaction, and then discard it due to the rollback, so the output plugin (and replica) would never get it. So this is an argument for non-transactional behavior.

On the other hand, consider this:

CREATE SEQUENCE s;
BEGIN;
ALTER SEQUENCE s RESTART WITH 2000;
SELECT nextval('s') FROM generate_series(1,1000) s(i);
ROLLBACK;

In this case the ALTER creates a new relfilenode, and the ROLLBACK does discard it including the effects of the nextval calls. So here we should treat it transactionally, stash the increment(s) in the transaction and just discard it all on rollback.

A somewhat similar example is this:

BEGIN;
CREATE SEQUENCE s;
SELECT nextval('s') FROM generate_series(1,1000) s(i);
COMMIT;

Again - the decoded nextval needs to be handled transactionally, because otherwise it's going to be very difficult for custom plugins to combine this with DDL replication.

So the patch does a fairly simple thing:

1) By default, sequences are treated non-transactionally, i.e. sent to the output plugin right away.

2) We track sequences created in running (sub)transactions, and those are handled transactionally. This includes ALTER SEQUENCE cases, which create a new relfilenode, which is used as an identifier.
The transactional handling is a bit more complex, because of cases like this:

BEGIN;
CREATE SEQUENCE s;
SAVEPOINT a;
SELECT nextval('s') FROM generate_series(1,1000) s(i);
ROLLBACK TO a;
COMMIT;

because we must not discard the nextval changes - this is handled by always stashing the nextval changes in the subxact where the sequence relfilenode was created.

The tracking is a bit cumbersome - there's a hash table mapping each relfilenode to the XID in which it was created. AFAIK that works, but it might be an issue with many sequences created in running transactions. Not sure.

detecting sequence creation
---

Detecting that a sequence (or rather the relfilenode) was created is done by adding a "created" flag to the xl_seq_rec, and setting it to "true" in the first WAL record after the creation. There might be some other way, but this seemed simple enough.

applying the sequence (ResetSequence2)
--

The decoding pretty much just extracts last_value, log_cnt and is_called from the sequence, and passes them to the output plugin. On the replica we extract those from the message, and write them to the local sequence using a new ResetSequence2 function.

It's possible we don't really need log_cnt and is_called. After all, log_cnt is zero most of the time anyway, and the worst thing that could happen if we ignore it is w
Re: CALL versus procedures with output-only arguments
I wrote: > Hmm, these are atop HEAD from a week or so back. The cfbot seems to > think they still apply. In any case, I was about to spend some effort > on the docs, so I'll post an updated version soon (hopefully today). Here is said update (rolled up into one patch this time; maybe that will avoid the apply problems you had). I noticed that there is one other loose end in the patch: should LookupFuncName() really be passing OBJECT_ROUTINE to LookupFuncNameInternal()? This matches its old behavior, in which no particular routine type restriction was applied; but I think that the callers are nearly all expecting that only plain functions will be returned. That's more of a possible pre-existing bug than it is the fault of the patch, but nonetheless this might be a good time to resolve it. regards, tom lane diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 16493209c6..f517a7d4af 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -5905,9 +5905,8 @@ SCRAM-SHA-256$:&l An array of the data types of the function arguments. This includes only input arguments (including INOUT and - VARIADIC arguments), as well as - OUT parameters of procedures, and thus represents - the call signature of the function or procedure. + VARIADIC arguments), and thus represents + the call signature of the function. diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index 52f60c827c..0cd6e8b071 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -480,7 +480,7 @@ $$ LANGUAGE plpgsql; To call a function with OUT parameters, omit the - output parameter in the function call: + output parameter(s) in the function call: SELECT sales_tax(100.00); @@ -523,16 +523,20 @@ $$ LANGUAGE plpgsql; In a call to a procedure, all the parameters must be specified. For - output parameters, NULL may be specified. + output parameters, NULL may be specified when + calling the procedure from plain SQL: CALL sum_n_product(2, 4, NULL, NULL); sum | prod -+-- 6 |8 - Output parameters in procedures become more interesting in nested calls, - where they can be assigned to variables. See for details. + + However, when calling a procedure + from PL/pgSQL, you should instead write a + variable for any output parameter; the variable will receive the result + of the call. See + for details. @@ -2030,6 +2034,9 @@ BEGIN END; $$; + The variable corresponding to an output parameter can be a simple + variable or a field of a composite-type variable. Currently, + it cannot be an element of an array. diff --git a/doc/src/sgml/ref/alter_extension.sgml b/doc/src/sgml/ref/alter_extension.sgml index 38fd60128b..c819c7bb4e 100644 --- a/doc/src/sgml/ref/alter_extension.sgml +++ b/doc/src/sgml/ref/alter_extension.sgml @@ -212,12 +212,11 @@ ALTER EXTENSION name DROP IN, OUT, INOUT, or VARIADIC. If omitted, the default is IN. - Note that ALTER EXTENSION does not actually pay any - attention to OUT arguments for functions and - aggregates (but not procedures), since only the input arguments are - needed to determine the function's identity. So it is sufficient to - list the IN, INOUT, and - VARIADIC arguments for functions and aggregates. + Note that ALTER EXTENSION does not actually pay + any attention to OUT arguments, since only the input + arguments are needed to determine the function's identity. + So it is sufficient to list the IN, INOUT, + and VARIADIC arguments. 
diff --git a/doc/src/sgml/ref/alter_procedure.sgml b/doc/src/sgml/ref/alter_procedure.sgml index 9cbe2c7cea..033fda92ee 100644 --- a/doc/src/sgml/ref/alter_procedure.sgml +++ b/doc/src/sgml/ref/alter_procedure.sgml @@ -96,7 +96,7 @@ ALTER PROCEDURE name [ ( [ [ The data type(s) of the procedure's arguments (optionally schema-qualified), if any. + See for the details of how + the procedure is looked up using the argument data type(s). diff --git a/doc/src/sgml/ref/call.sgml b/doc/src/sgml/ref/call.sgml index abaa81c78b..9e83a77b7c 100644 --- a/doc/src/sgml/ref/call.sgml +++ b/doc/src/sgml/ref/call.sgml @@ -55,9 +55,24 @@ CALL name ( [ argument - An input argument for the procedure call. - See for the full details on - function and procedure call syntax, including use of named parameters. + An argument expression for the procedure call. + + + + Arguments can include parameter names, using the syntax + name => value. + This works the same as in ordinary function calls; see + for details. + + + + Arguments must be supplied for all proc
Re: pgsql: Support parallel btree index builds.
On 2018-Feb-02, Robert Haas wrote:
> Support parallel btree index builds.

While looking at a complaint related to progress report of parallel index builds[1], I noticed this comment

+	/*
+	 * Execute this worker's part of the sort.
+	 *
+	 * Unlike leader and serial cases, we cannot avoid calling
+	 * tuplesort_performsort() for spool2 if it ends up containing no dead
+	 * tuples (this is disallowed for workers by tuplesort).
+	 */
+	tuplesort_performsort(btspool->sortstate);
+	if (btspool2)
+		tuplesort_performsort(btspool2->sortstate);

I've been trying to understand why this says "Unlike leader and serial cases, ...". I understand the "serial" part -- it refers to _bt_leafbuild. So I'm to understand that that one works differently; see below. But why does it say "the leader case"? As far as I can see, the leader executes exactly the same code, so what is the comment talking about?

Now, if you do look at _bt_leafbuild(), it can be seen that nothing is done differently there either; we're not actually skipping any calls to tuplesort_performsort(). Any differentiation between serial/leader/worker cases seems to be done inside that routine. So the comment is not very useful there either.

I am wondering if these comments are leftovers from early development versions of this patch. Maybe we could remove them -- or rewrite them to indicate not that we avoid calling tuplesort_performsort(), but instead to say that that function behaves differently.

[1] https://postgr.es/m/caeze2wgm-nnze3ronwjytvris8xsvhzzvbcgj34h06cdnua...@mail.gmail.com

-- 
Álvaro Herrera   39°49'30"S 73°17'W
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"
Re: A test for replay of regression tests
Tue, 8 Jun 2021 at 02:25, Thomas Munro:
> Ok, here's a new version incorporating feedback so far.
>
> 1. Invoke pg_regress directly (no make).
>
> 2. Use PG_TEST_EXTRA="wal_consistency_checking" as a way to opt in to
> the more expensive test.
>
> 3. Use parallel schedule rather than serial. It's faster but also
> the non-determinism might discover more things. This required
> changing the TAP test max_connections setting from 10 to 25.
>
> 4. Remove some extraneous print statements and
> check-if-data-is-replicated-using-SELECT tests that are technically
> not needed (I had copied those from 001_stream_rep.pl).

Thank you for working on this test set! I was especially glad to see the skip-tests option for pg_regress. I think it will become a very handy tool for hackers. To try the patch I had to resolve a few merge conflicts; see a rebased version in the attachments.

> auth_extra => [ '--create-role', 'repl_role' ]);

This line and the comment above it look like some copy-paste artifacts. Did I get it right? If so, I suggest removing them.

Other than that, the patch looks good to me.

-- 
Best regards,
Lubennikova Anastasia

commit 2eeaf5802c060612831b3fdeb33401a07c230f83
Author: anastasia
Date:   Tue Jun 8 02:01:35 2021 +0300

    v3-0001-Test-replay-of-regression-tests.patch

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index cb401a45b3..7a10c83d8a 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -289,6 +289,17 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl'
+
+
+ wal_consistency_checking
+
+
+ Uses wal_consistency_checking=all while running
+ some of the tests under src/test/recovery. Not
+ enabled by default because it is resource intensive.
+
+
+
 Tests for features that are not supported by the current build
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 45d1636128..7b0af9fd78 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -484,7 +484,7 @@ sub init
 	print $conf "hot_standby = on\n";
 	# conservative settings to ensure we can run multiple postmasters:
 	print $conf "shared_buffers = 1MB\n";
-	print $conf "max_connections = 10\n";
+	print $conf "max_connections = 25\n";
 	# limit disk space consumption, too:
 	print $conf "max_wal_size = 128MB\n";
 }
diff --git a/src/test/recovery/t/025_stream_rep_regress.pl b/src/test/recovery/t/025_stream_rep_regress.pl
new file mode 100644
index 00..8b125a1d67
--- /dev/null
+++ b/src/test/recovery/t/025_stream_rep_regress.pl
@@ -0,0 +1,62 @@
+# Run the standard regression tests with streaming replication
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+# A specific role is created to perform some tests related to replication,
+# and it needs proper authentication configuration.
+$node_primary->init(
+	allows_streaming => 1,
+	auth_extra => [ '--create-role', 'repl_role' ]);
+
+# WAL consistency checking is resource intensive so require opt-in with the
+# PG_TEST_EXTRA environment variable.
+if ($ENV{PG_TEST_EXTRA} &&
+	$ENV{PG_TEST_EXTRA} =~ m/\bwal_consistency_checking\b/) {
+	$node_primary->append_conf('postgresql.conf',
+		'wal_consistency_checking = all');
+}
+
+$node_primary->start;
+is( $node_primary->psql(
+		'postgres',
+		qq[SELECT pg_create_physical_replication_slot('standby_1');]),
+	0,
+	'physical slot created on primary');
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = get_new_node('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf('postgresql.conf',
+	"primary_slot_name = standby_1");
+$node_standby_1->start;
+
+# XXX The tablespace tests don't currently work when the standby shares a
+# filesystem with the primary, due to colliding absolute paths.  We'll skip
+# that for now.
+
+# Run the regression tests against the primary.
+system_or_bail("../regress/pg_regress",
+	"--port=" . $node_primary->port,
+	"--schedule=../regress/parallel_schedule",
+	"--dlpath=../regress",
+	"--inputdir=../regress",
+	"--skip-tests=tablespace");
+
+# Wait for standby to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+ok(1, "caught up");
+
+$node_standby_1->stop;
+$node_primary->stop;
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index e04d365258..510afaadbf 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -83,6 +83,7 @@ static int max_connections = 0;
 static int max_concurrent_tests = 0;
 static char *encoding = NULL;
 static _stringlist *schedulelist = NULL;
+st
Remove server and libpq support for the version 2 wire protocol
In release-14.sgml:

    Remove server and libpq support for the version 2 wire protocol
    (Heikki Linnakangas)

    This was last used as the default in Postgres 7.2 (year 2002).

I thought the last version which used the protocol as the default was 7.3, not 7.2, because v3 was introduced in 7.4.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
RE: Transactions involving multiple postgres foreign servers, take 2
From: Masahiko Sawada
> I think we should not reinterpret the severity of the error and lower
> it. Especially, in this case, any kind of errors can be thrown. It
> could be such a serious error that FDW developer wants to report to
> the client. Do we lower even PANIC to a lower severity such as
> WARNING? That's definitely a bad idea. If we don't lower PANIC whereas
> lowering ERROR (and FATAL) to WARNING, why do we regard only them as
> non-error?

Why does the client have to know the error on a remote server, whereas the global transaction itself is destined to commit? FYI, the tx_commit() in the X/Open TX interface and the UserTransaction.commit() in JTA don't return such an error, IIRC. Do TX_FAIL and SystemException serve such a purpose? I don't feel like that.

[Tuxedo manual (Japanese)]
https://docs.oracle.com/cd/F25597_01/document/products/tuxedo/tux80j/atmi/rf3c91.htm

[JTA]
public interface javax.transaction.UserTransaction

public void commit() throws RollbackException, HeuristicMixedException,
    HeuristicRollbackException, SecurityException, IllegalStateException,
    SystemException

Throws: RollbackException
    Thrown to indicate that the transaction has been rolled back rather than committed.

Throws: HeuristicMixedException
    Thrown to indicate that a heuristic decision was made and that some relevant updates have been committed while others have been rolled back.

Throws: HeuristicRollbackException
    Thrown to indicate that a heuristic decision was made and that all relevant updates have been rolled back.

Throws: SecurityException
    Thrown to indicate that the thread is not allowed to commit the transaction.

Throws: IllegalStateException
    Thrown if the current thread is not associated with a transaction.

Throws: SystemException
    Thrown if the transaction manager encounters an unexpected error condition.

Regards
Takayuki Tsunakawa
Re: pgsql: Support parallel btree index builds.
On Mon, Jun 7, 2021 at 4:11 PM Alvaro Herrera wrote: > Now, if you do look at _bt_leafbuild(), it can be seen that nothing is > done differently there either; we're not actually skipping any calls to > tuplesort_performsort(). Any differentiation between serial/leader/ > worker cases seems to be done inside that routine. So the comment is > not very useful there either. > > I am wondering if these comments are leftovers from early development > versions of this patch. Maybe we could remove them -- or rewrite them > to indicate not that we avoid calling tuplesort_performsort(), but > instead to say that that function behaves differently. It's talking about something described in the tuplesort.h contract. It applies to a tuplesort state, not a process -- the leader always has two tuplesort states (the leader tuplesort state, plus its own worker tuplesort state). The leader tuplesort is very much like a serial tuplesort. In particular, as step 8 in tuplesort.h points out, the leader doesn't have to call tuplesort_performsort() for the leader tuplesort state if it already knows that there is no input to sort. This matters less than it might in a world where we had a user of parallel tuplesort that doesn't always simply make the leader participate as a worker. There is a build-time testing option in nbtsort.c that does this for parallel CREATE INDEX, actually -- see DISABLE_LEADER_PARTICIPATION. You kind of have a point about this being something that made more sense in revisions of the patch from before commit, though. There was a question about the cost model and the role of the leader that was ultimately resolved by inventing the current simple behavior. So feel free to change the wording now. -- Peter Geoghegan
Re: Logical replication keepalive flood
At Mon, 7 Jun 2021 15:26:05 +0500, Abbas Butt wrote in
> On Mon, Jun 7, 2021 at 3:13 PM Amit Kapila wrote:
> > > The immediate cause is pg_recvlogical doesn't send a reply before
> > > sleeping. Currently it sends replies at 10 second intervals.
> > >
> >
> > Yeah, but one can use -s option to send it at lesser intervals.
>
> That option can impact pg_recvlogical, it will not impact the server
> sending keepalives too frequently.
> By default the status interval is 10 secs, still we are getting 500
> keepalives a second from the server.
>
> > > So the attached first patch stops the flood.
> > >
> >
> > I am not sure sending feedback every time before sleep is a good idea,
> > this might lead to unnecessarily sending more messages. Can we try by
> > using one-second interval with -s option to see how it behaves? As a
> > matter of comparison the similar logic in workers.c uses
> > wal_receiver_timeout to send such an update message rather than
> > sending it every time before sleep.

The logical walreceiver sends feedback when walrcv_receive() doesn't receive a byte. If it's not good that pg_recvlogical does the same thing, do we need to improve the logical walsender's behavior as well?

> > > That said, I don't think it is intended that the logical walsender
> > > sends keep-alive packets with such a high frequency. It happens
> > > because the walsender actually doesn't wait at all: it waits on
> > > WL_SOCKET_WRITEABLE because the keep-alive packet inserted just before
> > > is always pending.
> > >
> > > So as the attached second patch, we should try to flush out the keep-alive
> > > packets if possible before checking pg_is_send_pending().
> > >
> >
> > /* Send keepalive if the time has come */
> > WalSndKeepaliveIfNecessary();
> >
> > + /* We may have queued a keep alive packet. flush it before sleeping. */
> > + pq_flush_if_writable();
> >
> > We already call pq_flush_if_writable() from WalSndKeepaliveIfNecessary
> > after sending the keep-alive message, so not sure how this helps?

No. WalSndKeepaliveIfNecessary calls it only when the walsender has not received a reply message for a long time. So the keepalive sent by the direct call to WalSndKeepalive() from WalSndWaitForWal is not flushed out in most cases, which causes the flood.

I rechecked all callers of WalSndKeepalive():

WalSndKeepalive()
+- *WalSndWaitForWal
+- ProcessStandbyReplyMessage
|  +- ProcessStandbyMessage
|     +- ProcessRepliesIfAny
|        +- $WalSndWriteData
|        +- *WalSndWaitForWal
|        +- WalSndLoop
|           (calls pq_flush_if_writable() after sending the packet, but the
|           keepalive packet prevents following stream data from being sent,
|           since the pending keepalive packet causes pq_is_send_pending()
|           to return (falsely) true.)
+- WalSndDone
+- *WalSndLoop
+- WalSndKeepaliveIfNecessary
   (always calls pq_flush_if_writable(), but only after calling
   WalSndKeepalive())

The callers prefixed by '*' above misunderstand that some of the data they sent is still pending even when the only pending bytes are the keepalive packet. Of course the keepalive packet should be sent *before* the sleep, and the unsent keepalive packet prevents those callers from sleeping; they immediately retry sending another keepalive packet and repeat that until the condition changes. (The caller prefixed by '$' also enters a sleep before flushing, but doesn't repeat sending keepalives.)

The callers are forgetting that a keepalive packet may be queued but not flushed after calling WalSndKeepalive. So a more sensible fix would be calling pq_flush_if_writable() in WalSndKeepalive itself?

regards.
-- Kyotaro Horiguchi NTT Open Source Software Center
Re: Race condition in recovery?
At Mon, 7 Jun 2021 10:40:27 -0400, Robert Haas wrote in
> On Mon, Jun 7, 2021 at 12:57 AM Kyotaro Horiguchi
> wrote:
> > Unfortunately no. The backslashes in the binary path need to be
> > escaped. (taken from PostgresNode.pm:1008)
> >
> > > (my $perlbin = $^X) =~ s{\\}{\\\\}g if ($TestLib::windows_os);
> > > $node_primary->append_conf(
> > > 'postgresql.conf', qq(
> > > archive_command = '$perlbin "$FindBin::RealBin/cp_history_files" "%p"
> > > "$archivedir_primary/%f"'
> > > ));
> >
> > This works for me.
>
> Hmm, OK. Do you think we also need to use perl2host in this case?

I understand that perl2host converts a '/some/where' style path to the native Windows path 'X:/any/where' if needed. Since perl's $^X is already in native style, I think we don't need to use it.

> > Ugh! Sorry. I meant "The explicit teardowns are useless". That's not
> > harmful but it is done by PostgresNode.pm automatically (implicitly)
> > and we don't do that in the existing scripts.
>
> OK. I don't think it's a big deal, but we can remove them.

Thanks.

> I went back and looked at your patch again, now that I understand the
> issue better. I believe it's not necessary to do this here, because
> StartupXLOG() already contains a check for the same thing:
>
> /*
>  * If the location of the checkpoint record is not on the expected
>  * timeline in the history of the requested timeline, we cannot proceed:
>  * the backup is not part of the history of the requested timeline.
>  */
> Assert(expectedTLEs); /* was initialized by reading checkpoint
>  * record */
> if (tliOfPointInHistory(checkPointLoc, expectedTLEs) !=
> checkPoint.ThisTimeLineID)
> ...
>
> This code is always run after ReadCheckpointRecord() returns. And I
> think that your only concern here is about the case where the
> checkpoint record is being fetched, because otherwise expectedTLEs
> must already be set.

Sure. Thanks for confirming that, and agreed.

> By the way, I also noticed that your version of the patch contains a
> few words which are spelled incorrectly: hearafter, and incosistent.

Mmm. Sorry for them.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: SQL-standard function body
On Mon, Jun 07, 2021 at 03:24:33PM -0400, Tom Lane wrote: > > Concretely, I think the right fix is per attached. +1, I agree that this approach is better.
Re: Confused about extension and shared_preload_libraries
On Mon, 07 Jun 2021 at 19:25, Aleksander Alekseev wrote:
> Hi Japin,
>
>> When we write a extension using C language, we often add the dynamic library
>> into shared_preload_libraries, however, I found that the bloom, btree_gist and
>> btree_gin do not follow this rule. I'm a bit confused with this, could anybody
>> explain it for me?
>
> In the general case, you don't need to modify shared_preload_libraries to
> use an extension, regardless of the language in which it's implemented.
> That's it.

Thanks for your explanation.

> Some extensions may however require this. See the description of the GUC [1].
>
> [1]: https://www.postgresql.org/docs/13/runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES

Sorry for my poor reading of the documentation.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
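P.S. For what it's worth, this is easy to confirm with one of the modules mentioned above; assuming the contrib modules are installed, the following works without any shared_preload_libraries entry:

CREATE EXTENSION bloom;
CREATE TABLE tbloom (i1 int, i2 int);
-- the bloom access method's shared library is loaded automatically on
-- first use; no preloading is required
CREATE INDEX tbloom_idx ON tbloom USING bloom (i1, i2);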
Re: Duplicate history file?
On 2021/06/07 17:32, Kyotaro Horiguchi wrote:
>> I just noticed that this thread is still tied to another thread (it's
>> not an independent thread). To fix that, it may be better to create a
>> new thread again.
>
> Mmm. Maybe my mailer automatically inserted In-Reply-To field for the
> cited message. Do we (the two of us) bother re-launching a new thread?

The reason I suggested it was because I thought it might be confusing if the threads were not independent when registered in a commitfest. If that is not a problem, then I'm fine with it as is. :-D

> (You can freely do that, too:p)

I should have told you that I would be happy to create a new thread. Why I didn't create a new thread is because I didn't want people to think I had hijacked the thread. :)

> Hmm. I found that the pgsql-hackers archive treats the new thread as a
> part of the old thread, so CF-app would do the same. Anyway I
> re-launched a new standalone thread.
>
> https://www.postgresql.org/message-id/20210607.173108.348241508233844279.horikyota.ntt%40gmail.com

Thank you!

Regards,
Tatsuro Yamada
Re: security_definer_search_path GUC
On Mon, Jun 7, 2021, at 23:26, David G. Johnston wrote: > On Mon, Jun 7, 2021 at 1:55 PM Joel Jacobson wrote: >> __ >> If we don't like "UNQUALIFIED" as a keyword, maybe we could reuse "PUBLIC"? >> Or will that be confusing since "PUBLIC" is also a role_specification? >> > > For me the concept resembles explicitly denoting certain schemas as being > simple tags, while the actual "namespace" is the GLOBAL namespace. Today > there is no global namespace, all schemas generate their own individual > namespace in addition to "tagging" their objects with a textual label. > > > Avoiding "public" is highly desirable. > > To access a global object you should be able to still specify its schema tag. > Unqualified means "use search_path"; and "use search_path" includes global. > But there is a truth table waiting to be created to detail what combinations > result in errors (including where those errors occur - runtime or creation > time). +1 /Joel
Re: Misplaced superuser check in pg_log_backend_memory_contexts()
On Sun, Jun 06, 2021 at 11:13:40AM -0400, Tom Lane wrote: > Julien Rouhaud writes: >> However +1 for the patch, as it seems more consistent to always get a >> permission failure if you're not a superuser. > > Yeah, it's just weird if such a check is not the first thing > in the function. Even if you can convince yourself that the > actions taken before that don't create any security issue today, > it's not hard to imagine that innocent future code rearrangements > could break that argument. What's the value of postponing the > check anyway? Thanks for the input, I have applied the patch. -- Michael signature.asc Description: PGP signature
Re: A modest proposal vis hierarchical queries: MINUS in the column list
On Mon, Jun 07, 2021 at 06:10:58PM -0400, Tom Lane wrote: > > I'm fairly disinclined to do anything about it though, because I'm > afraid of the SQL committee standardizing some other syntax for the > same idea in future (or maybe worse, commandeering the same keyword > for some other feature). It doesn't seem quite valuable enough to > take those risks for. Also, isn't the OP problem already solved by the SEARCH / CYCLE grammar handling added in 3696a600e2292?
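For reference, the standard syntax from that commit handles cycle detection without any column-list tricks; a minimal illustration (the graph table of (id, link) edges is hypothetical):

-- standard SQL cycle detection, per the grammar added in 3696a600e2292;
-- is_cycle and path are generated automatically for each row
WITH RECURSIVE search_graph(id, link) AS (
    SELECT g.id, g.link FROM graph g
  UNION ALL
    SELECT g.id, g.link
    FROM graph g JOIN search_graph sg ON g.id = sg.link
) CYCLE id SET is_cycle USING path
SELECT * FROM search_graph;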
Re: Make unlogged table resets detectable
On Mon, Jun 07, 2021 at 02:56:57PM -0400, Tom Lane wrote: > Robert Haas writes: > > On Fri, Jun 4, 2021 at 8:41 PM Jeff Davis wrote: > >> Stepping back, maybe unlogged tables are the wrong level to solve this > >> problem. We could just have a "crash counter" in pg_control that would > >> be incremented every time a crash happened (and all unlogged tables are > >> reset). It might be a number or maybe the LSN of the startup checkpoint > >> after the most recent crash. > > > I think this would be useful for a variety of purposes. Both being > > able to know the last time that it happened and being able to know the > > number of times that it happened could be useful, depending on the > > scenario. > > +1. I'd support recording the time of the last crash recovery, as > well as having a counter. I think an LSN would not be as useful > as a timestamp. +1 It's been suggested before ;) https://www.postgresql.org/message-id/20180228221653.GB32095%40telsasoft.com PS. I currently monitor for crashes by checking something hacky like: | SELECT backend_start - pg_postmaster_start_time() ORDER BY 1
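(For the record, a runnable version of that check might look like the following; the FROM clause and the aggregate are my additions:)

-- if even the oldest surviving backend started well after the postmaster,
-- a crash-and-restart cycle probably happened in between
SELECT min(backend_start) - pg_postmaster_start_time() AS gap
FROM pg_stat_activity;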
Re: Duplicate history file?
(Mmm. thunderbird or gmail connects this thread to the previous one..)

At Mon, 7 Jun 2021 14:20:38 -0400, Stephen Frost wrote in
> Greetings,
>
> * Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> > So, this is the new new thread.
>
> This is definitely not the way I would recommend starting up a new
> thread as you didn't include the actual text of the prior discussion for
> people to be able to read and respond to, instead making them go hunt
> for the prior discussion on the old thread and negating the point of
> starting a new thread..

Sorry for that. I'll do that next time.

> Still, I went and found the other email-

Thanks!

> * Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> > At Mon, 31 May 2021 11:52:05 +0900, Tatsuro Yamada
> > wrote in
> > > Since the above behavior is different from the behavior of the
> > > test command in the following example in postgresql.conf, I think
> > > we should write a note about this example.
> > >
> > > # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p
> > > # /mnt/server/archivedir/%f'
> > >
> > > Let me describe the problem we faced.
> > > - When archive_mode=always, archive_command is (sometimes) executed
> > > in a situation where the history file already exists on the standby
> > > side.
> > >
> > > - In this case, if "test ! -f" is written in the archive_command of
> > > postgresql.conf on the standby side, the command will keep failing.
> > >
> > > Note that this problem does not occur when archive_mode=on.
> > >
> > > So, what should we do for the user? I think we should put some notes
> > > in postgresql.conf or in the documentation. For example, something
> > > like this:
>
> First off, we should tell them to not use test or cp in their actual
> archive command because they don't do things like make sure that the WAL
> that's been archived has actually been fsync'd. Multiple people have
> tried to make improvements in this area but the long and short of it is
> that trying to provide a simple archive command in the documentation
> that actually *works* isn't enough- you need a real tool. Maybe someone
> will write one some day that's part of core, but it's not happened yet
> and instead there's external solutions which actually do the correct
> things.

Ideally, I agree that that is definitely right. But the documentation doesn't say anything like "don't use the simple copy command in any case (or at least in the cases where more than a certain level of durability and integrity guarantee is required)".

Actually, many people are satisfied with just "cp/copy", and I think they know that the command doesn't guarantee the integrity of archived files in, say, some disastrous situation like a sudden power cut.

However, the use of "test ! -f..." is a suggestion of a somewhat different kind.

https://www.postgresql.org/docs/13/continuous-archiving.html
| The archive command should generally be designed to refuse to
| overwrite any pre-existing archive file. This is an important safety
| feature to preserve the integrity of your archive in case of
| administrator error (such as sending the output of two different
| servers to the same archive directory)

This implies that no WAL segment is archived more than once, at least under any valid operation. Some people follow this suggestion to prevent the archive from being broken by some *wrong* operations.

> The existing documentation should be taken as purely "this is how the
> variables which are passed in get expanded" not as "this is what you
> should do", because it's very much not the latter in any form.
It describes "how archive_command should be like" and showing examples among the description implies that the example conforms the should-be's. Nevertheless, the issue here is that there's a case where archiving stalls when following the suggestion above under a certain condition. Even if it is written premising "set .. archive_mode to on", I don't believe that people can surmise that the same archive_command might fail when setting archive_mode to always, because the description implies So I think we need to revise the documentation, or need to *fix* the revealed problem that is breaking the assumption of the documentation. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Multiple hosts in connection string failed to failover in non-hot standby mode
On Mon, Jun 07, 2021 at 10:38:03AM -0400, Tom Lane wrote: > Hmm. We do include "-lpgcommon -lpgport" when building the ecpg test > programs on Unix, so I'd assumed that the MSVC scripts did the same. > Is there a good reason not to make them do so? I was looking at that this morning, and yes we need to add more references here. Actually, adding only libpgport.lib allows the compilation and the tests to work, but I agree to add also libpgcommon.lib so as we don't fall into the same compilation trap again in the future. Now, I also see that using pgwin32_setenv() instead of src/port/setenv.c causes cl to be confused once we update ecpg_regression.proj because it cannot find setenv(). Bringing the question, why is it necessary to have both setenv.c and pgwin32_setenv() on HEAD? setenv.c should be enough once you have the fallback implementation of putenv() available. Attached is the patch I am finishing with, that also brings all this stuff closer to what I did in 12 and 13 for hamerkop. The failing test is passing for me now with MSVC and GSSAPI builds. Thoughts? -- Michael diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h index 05c5a53442..e25f65b054 100644 --- a/src/include/port/win32_port.h +++ b/src/include/port/win32_port.h @@ -490,11 +490,9 @@ extern void _dosmaperr(unsigned long); /* in port/win32env.c */ extern int pgwin32_putenv(const char *); -extern int pgwin32_setenv(const char *name, const char *value, int overwrite); extern int pgwin32_unsetenv(const char *name); #define putenv(x) pgwin32_putenv(x) -#define setenv(x,y,z) pgwin32_setenv(x,y,z) #define unsetenv(x) pgwin32_unsetenv(x) /* in port/win32security.c */ diff --git a/src/port/win32env.c b/src/port/win32env.c index a03556078c..c8d43af381 100644 --- a/src/port/win32env.c +++ b/src/port/win32env.c @@ -1,7 +1,7 @@ /*- * * win32env.c - * putenv(), setenv(), and unsetenv() for win32. + * putenv() and unsetenv() for win32. * * These functions update both the process environment and caches in * (potentially multiple) C run-time library (CRT) versions. 
@@ -117,35 +117,6 @@ pgwin32_putenv(const char *envval) return _putenv(envval); } -int -pgwin32_setenv(const char *name, const char *value, int overwrite) -{ - int res; - char *envstr; - - /* Error conditions, per POSIX */ - if (name == NULL || name[0] == '\0' || strchr(name, '=') != NULL || - value == NULL) - { - errno = EINVAL; - return -1; - } - - /* No work if variable exists and we're not to replace it */ - if (overwrite == 0 && getenv(name) != NULL) - return 0; - - envstr = (char *) malloc(strlen(name) + strlen(value) + 2); - if (!envstr)/* not much we can do if no memory */ - return -1; - - sprintf(envstr, "%s=%s", name, value); - - res = pgwin32_putenv(envstr); - free(envstr); - return res; -} - int pgwin32_unsetenv(const char *name) { diff --git a/src/interfaces/ecpg/test/connect/test5.pgc b/src/interfaces/ecpg/test/connect/test5.pgc index e712fa8778..f7263935ce 100644 --- a/src/interfaces/ecpg/test/connect/test5.pgc +++ b/src/interfaces/ecpg/test/connect/test5.pgc @@ -18,6 +18,9 @@ exec sql begin declare section; char *user="regress_ecpg_user1"; exec sql end declare section; + /* disable GSSENC to ensure stability of connection failure reports */ + setenv("PGGSSENCMODE", "disable", 1); + ECPGdebug(1, stderr); exec sql connect to ecpg2_regression as main; diff --git a/src/interfaces/ecpg/test/expected/connect-test5.c b/src/interfaces/ecpg/test/expected/connect-test5.c index 6ae5b589de..a86a5e4331 100644 --- a/src/interfaces/ecpg/test/expected/connect-test5.c +++ b/src/interfaces/ecpg/test/expected/connect-test5.c @@ -38,37 +38,33 @@ main(void) #line 19 "test5.pgc" + /* disable GSSENC to ensure stability of connection failure reports */ + setenv("PGGSSENCMODE", "disable", 1); + ECPGdebug(1, stderr); { ECPGconnect(__LINE__, 0, "ecpg2_regression" , NULL, NULL , "main", 0); } -#line 23 "test5.pgc" - - { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "alter user regress_ecpg_user2 encrypted password 'insecure'", ECPGt_EOIT, ECPGt_EORT);} -#line 24 "test5.pgc" - - { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "alter user regress_ecpg_user1 encrypted password 'connectpw'", ECPGt_EOIT, ECPGt_EORT);} -#line 25 "test5.pgc" - - { ECPGtrans(__LINE__, NULL, "commit");} #line 26 "test5.pgc" - { ECPGdisconnect(__LINE__, "CURRENT");} + { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "alter user regress_ecpg_user2 encrypted password 'insecure'", ECPGt_EOIT, ECPGt_EORT);} #line 27 "test5.pgc" + + { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "alter user regress_ecpg_user1 encrypted password 'connectpw'", ECPGt_EOIT, ECPGt_EORT);} +#line 28 "test5.pgc" + + { ECPGtrans(__LINE__, NULL, "commit");} +#line 29 "test5.pgc" + + { ECPGdisconnect(__LINE__, "CURRENT");} +#line 30 "test5.pgc" /* <-- "main" not specified */ strc
Re: [bug?] Missed parallel safety checks, and wrong parallel safety
On Mon, Jun 7, 2021 at 7:29 PM Robert Haas wrote:
>
> On Fri, Jun 4, 2021 at 6:17 AM Amit Kapila wrote:
> > Thoughts?
>
> As far as I can see, trying to error out at function call time if the
> function is parallel-unsafe doesn't fix any problem we have, and just
> makes the design of this part of the system less consistent with what
> we've done elsewhere. For example, if you create a stable function
> that internally calls a volatile function, you don't get an error. You
> can use your stable function in an index definition if you wish. That
> may break, but if so, that's your problem. Also, when it breaks, it
> probably won't blow up the entire world; you'll just have a messed-up
> index. Currently, the parallel-safety stuff works the same way. If we
> notice that something is marked parallel-unsafe, we'll skip
> parallelism.

This is not true in all cases, which is one of the reasons for this thread. For example, we don't skip parallelism when I/O functions are parallel-unsafe, as is shown in the following case:

postgres=# CREATE FUNCTION text_w_default_in(cstring)
           RETURNS text_w_default
           AS 'textin'
           LANGUAGE internal STRICT IMMUTABLE;
NOTICE: type "text_w_default" is not yet defined
DETAIL: Creating a shell type definition.
CREATE FUNCTION
postgres=# CREATE FUNCTION text_w_default_out(text_w_default)
           RETURNS cstring
           AS 'textout'
           LANGUAGE internal STRICT IMMUTABLE;
NOTICE: argument type text_w_default is only a shell
CREATE FUNCTION
postgres=# CREATE TYPE text_w_default (
           internallength = variable,
           input = text_w_default_in,
           output = text_w_default_out,
           alignment = int4,
           default = 'zippo');
CREATE TYPE
postgres=# CREATE TABLE default_test (f1 text_w_default, f2 int);
CREATE TABLE
postgres=# INSERT INTO default_test DEFAULT VALUES;
INSERT 0 1
postgres=# SELECT * FROM default_test;
ERROR: parallel-safety execution violation of function "text_w_default_out" (u)

Note the error is raised after applying the patch; without the patch, the above won't show any error (the error message could be improved here). Such cases can lead to unpredictable behavior without the patch, because we won't be able to detect the execution of parallel-unsafe functions. There are similar examples in the regression tests. Now, one way to deal with similar cases could be to document them and say we don't consider parallel-safety in such cases, and the other way is to detect such cases and error out. Yet another way could be that we somehow try to check these cases as well before enabling parallelism, but I thought these cases fall in a similar category as aggregates' support functions.

> But you can lie to us and claim that things are safe when
> they're not, and if you do, it may break, but that's your problem.
> Most likely your query will just error out, and there will be no
> worse consequences than that, though if your parallel-unsafe function
> is written in C, it could do horrible things like crash, which is
> unavoidable because C code can do anything.

That is true, but I was worried about cases where users didn't lie to us but we still allowed those to choose parallelism.

> Now, the reason for all of this work, as I understand it, is because
> we want to enable parallel inserts, and the problem there is that a
> parallel insert could involve a lot of different things: it might need
> to compute expressions, or fire triggers, or check constraints, and
> any of those things could be parallel-unsafe. If we enable parallelism
> and then find out that we need to do one of those things, we have a
> problem.
> Something probably will error out. The thing is, with this
> proposal, that issue is not solved. Something will definitely error
> out. You'll probably get the error in a different place, but nobody
> fires off an INSERT hoping to get one error message rather than
> another. What they want is for it to work. So I'm kind of confused how
> we ended up going in this direction which seems to me at least to be a
> tangent from the real issue, and somewhat at odds with the way the
> rest of PostgreSQL is designed.
>
> It seems to me that we could simply add a flag to each relation saying
> whether or not we think that INSERT operations - or perhaps DML
> operations generally - are believed to be parallel-safe for that
> relation.

This is exactly the direction we are trying to pursue. The proposal [1] has semantics like:

CREATE TABLE table_name (...) PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };

This property is recorded in pg_class's relparallel column as 'u', 'r', or 's', just like pg_proc's proparallel. The default is UNSAFE.

This might require some bike-shedding to decide how exactly we want to expose it to the user, but I think it is on the lines of what you have described here.

> Like the marking on functions, it would be the user's
> responsibility to get that marking correct. If they don't, they might
> call a parallel-unsafe
Re: Make unlogged table resets detectable
On Mon, Jun 07, 2021 at 02:56:57PM -0400, Tom Lane wrote: > +1. I'd support recording the time of the last crash recovery, as > well as having a counter. I think an LSN would not be as useful > as a timestamp. One could guess a timestamp based on a LSN, no? So I'd like to think the opposite actually: a LSN would be more useful than a timestamp. -- Michael signature.asc Description: PGP signature
Re: Hook for extensible parsing.
On Sun, Jun 06, 2021 at 02:50:19PM +0800, Julien Rouhaud wrote: > On Sat, May 01, 2021 at 03:24:58PM +0800, Julien Rouhaud wrote: > > > > I'm attaching some POC patches that implement this approach to start a > > discussion. > > I just noticed that the cfbot fails with the v1 patch. Attached v2 that > should > fix that. The cfbot then revealed a missing dependency in the makefile to generate the contrib parser, which triggers in make check-world without a previous make -C contrib. Thanks a lot to Thomas Munro for getting me the logfile from the failed cfbot run and the fix! >From 3522fd2b0b27f52ab400abe1c9fbd5bb0c6169b4 Mon Sep 17 00:00:00 2001 From: Julien Rouhaud Date: Wed, 21 Apr 2021 22:47:18 +0800 Subject: [PATCH v3 1/4] Add a parser_hook hook. This does nothing but allow third-party plugins to implement a different syntax, and fallback on the core parser if they don't implement a superset of the supported core syntax. --- src/backend/tcop/postgres.c | 16 ++-- src/include/tcop/tcopprot.h | 5 + 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 8cea10c901..e941b59b85 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -99,6 +99,9 @@ int log_statement = LOGSTMT_NONE; /* GUC variable for maximum stack depth (measured in kilobytes) */ int max_stack_depth = 100; +/* Hook for plugins to get control in pg_parse_query() */ +parser_hook_type parser_hook = NULL; + /* wait N seconds to allow attach from a debugger */ int PostAuthDelay = 0; @@ -589,18 +592,27 @@ ProcessClientWriteInterrupt(bool blocked) * database tables. So, we rely on the raw parser to determine whether * we've seen a COMMIT or ABORT command; when we are in abort state, other * commands are not processed any further than the raw parse stage. + * + * To support loadable plugins that monitor the parsing or implements SQL + * syntactic sugar we provide a hook variable that lets a plugin get control + * before and after the standard parsing process. If the plugin only implement + * a subset of postgres supported syntax, it's its duty to call raw_parser (or + * the previous hook if any) for the statements it doesn't understand. 
*/ List * pg_parse_query(const char *query_string) { - List *raw_parsetree_list; + List *raw_parsetree_list = NIL; TRACE_POSTGRESQL_QUERY_PARSE_START(query_string); if (log_parser_stats) ResetUsage(); - raw_parsetree_list = raw_parser(query_string, RAW_PARSE_DEFAULT); + if (parser_hook) + raw_parsetree_list = (*parser_hook) (query_string, RAW_PARSE_DEFAULT); + else + raw_parsetree_list = raw_parser(query_string, RAW_PARSE_DEFAULT); if (log_parser_stats) ShowUsage("PARSER STATISTICS"); diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h index 968345404e..131dc2b22e 100644 --- a/src/include/tcop/tcopprot.h +++ b/src/include/tcop/tcopprot.h @@ -17,6 +17,7 @@ #include "nodes/params.h" #include "nodes/parsenodes.h" #include "nodes/plannodes.h" +#include "parser/parser.h" #include "storage/procsignal.h" #include "utils/guc.h" #include "utils/queryenvironment.h" @@ -43,6 +44,10 @@ typedef enum extern PGDLLIMPORT int log_statement; +/* Hook for plugins to get control in pg_parse_query() */ +typedef List *(*parser_hook_type) (const char *str, RawParseMode mode); +extern PGDLLIMPORT parser_hook_type parser_hook; + extern List *pg_parse_query(const char *query_string); extern List *pg_rewrite_query(Query *query); extern List *pg_analyze_and_rewrite(RawStmt *parsetree, -- 2.31.1 >From 51a4fd99b8c66b970c3f8819cc135e1095126c48 Mon Sep 17 00:00:00 2001 From: Julien Rouhaud Date: Wed, 21 Apr 2021 23:54:02 +0800 Subject: [PATCH v3 2/4] Add a sqlol parser. This is a toy example of alternative grammar that only accept a LOLCODE compatible version of a SELECT [column, ] column FROM tablename and fallback on the core parser for everything else. --- contrib/Makefile| 1 + contrib/sqlol/.gitignore| 7 + contrib/sqlol/Makefile | 33 ++ contrib/sqlol/sqlol.c | 107 +++ contrib/sqlol/sqlol_gram.y | 440 ++ contrib/sqlol/sqlol_gramparse.h | 61 contrib/sqlol/sqlol_keywords.c | 98 ++ contrib/sqlol/sqlol_keywords.h | 38 +++ contrib/sqlol/sqlol_kwlist.h| 21 ++ contrib/sqlol/sqlol_scan.l | 544 contrib/sqlol/sqlol_scanner.h | 118 +++ 11 files changed, 1468 insertions(+) create mode 100644 contrib/sqlol/.gitignore create mode 100644 contrib/sqlol/Makefile create mode 100644 contrib/sqlol/sqlol.c create mode 100644 contrib/sqlol/sqlol_gram.y create mode 100644 contrib/sqlol/sqlol_gramparse.h create mode 100644 contrib/sqlol/sqlol_keywords.c create mode 100644 contrib/sqlol/sqlol_keywords.h create mode 100644 contrib/sqlol/sqlol_kwlist.h create mode 100644 contrib/sqlol/sqlol_scan.l
Re: Duplicate history file?
Yeah, it's hot these days... At Tue, 08 Jun 2021 12:04:43 +0900 (JST), Kyotaro Horiguchi wrote in > (Mmm. thunderbird or gmail connects this thread to the previous one..) > > At Mon, 7 Jun 2021 14:20:38 -0400, Stephen Frost wrote > in > > Greetings, > > > > * Kyotaro Horiguchi (horikyota@gmail.com) wrote: > > > So, this is the new new thread. > > > > This is definitely not the way I would recommend starting up a new > > thread as you didn't include the actual text of the prior discussion for > > people to be able to read and respond to, instead making them go hunt > > for the prior discussion on the old thread and negating the point of > > starting a new thread.. > > Sorry for that. I'll do that next time. > > > Still, I went and found the other email- > > Thanks! > > > * Kyotaro Horiguchi (horikyota@gmail.com) wrote: > > > At Mon, 31 May 2021 11:52:05 +0900, Tatsuro Yamada > > > wrote in > > > > Since the above behavior is different from the behavior of the > > > > test command in the following example in postgresql.conf, I think > > > > we should write a note about this example. > > > > > > > > # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p > > > > # /mnt/server/archivedir/%f' > > > > > > > > Let me describe the problem we faced. > > > > - When archive_mode=always, archive_command is (sometimes) executed > > > > in a situation where the history file already exists on the standby > > > > side. > > > > > > > > - In this case, if "test ! -f" is written in the archive_command of > > > > postgresql.conf on the standby side, the command will keep failing. > > > > > > > > Note that this problem does not occur when archive_mode=on. > > > > > > > > So, what should we do for the user? I think we should put some notes > > > > in postgresql.conf or in the documentation. For example, something > > > > like this: > > > > First off, we should tell them to not use test or cp in their actual > > archive command because they don't do things like make sure that the WAL > > that's been archived has actually been fsync'd. Multiple people have > > tried to make improvements in this area but the long and short of it is > > that trying to provide a simple archive command in the documentation > > that actually *works* isn't enough- you need a real tool. Maybe someone > > will write one some day that's part of core, but it's not happened yet > > and instead there's external solutions which actually do the correct > > things. > > Ideally I agree that it is definitely right. But the documentation > doesn't say a bit of "don't use the simple copy command in any case > (or at least the cases where more than a certain level of durability > and integrity guarantee is required).". > > Actually many people are satisfied with just "cp/copy" and I think > they know that the command doesn't guarantee on the integrity of > archived files on , say, some disastrous situation like a sudden power > cut. > > However, the use of "test ! -f..." is in a bit different kind of > suggestion. > > https://www.postgresql.org/docs/13/continuous-archiving.html > | The archive command should generally be designed to refuse to > | overwrite any pre-existing archive file. This is an important safety > | feature to preserve the integrity of your archive in case of > | administrator error (such as sending the output of two different > | servers to the same archive directory) > > This implies that no WAL segment are archived more than once at least > under any valid operation. 
Some people are following this suggestion > to prevent archive from breaking by some *wrong* operations. > > > The existing documentation should be taken as purely "this is how the > > variables which are passed in get expanded" not as "this is what you > > should do", because it's very much not the latter in any form. > - It describes "how archive_command should be like" and showing examples + It describes "what archive_command should be like" and showing examples > among the description implies that the example conforms the > should-be's. > > Nevertheless, the issue here is that there's a case where archiving > stalls when following the suggestion above under a certain condition. > Even if it is written premising "set .. archive_mode to on", I don't > believe that people can surmise that the same archive_command might - fail when setting archive_mode to always, because the description - implies + fail when setting archive_mode to always. > > So I think we need to revise the documentation, or need to *fix* the > revealed problem that is breaking the assumption of the documentation. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Make unlogged table resets detectable
On Tue, Jun 08, 2021 at 12:46:05PM +0900, Michael Paquier wrote: > On Mon, Jun 07, 2021 at 02:56:57PM -0400, Tom Lane wrote: > > +1. I'd support recording the time of the last crash recovery, as > > well as having a counter. I think an LSN would not be as useful > > as a timestamp. > > One could guess a timestamp based on a LSN, no? So I'd like to think > the opposite actually: a LSN would be more useful than a timestamp. Wouldn't that work only if the LSN is recent enough, depending on the WAL activity?
Re: Asynchronous Append on postgres_fdw nodes.
On Mon, Jun 7, 2021 at 6:36 PM Etsuro Fujita wrote: > I created a patch for that, which I'm attaching. I'm planning > to commit the patch. Done. Best regards, Etsuro Fujita
Re: Logical replication keepalive flood
At Tue, 08 Jun 2021 10:05:36 +0900 (JST), Kyotaro Horiguchi wrote in
> At Mon, 7 Jun 2021 15:26:05 +0500, Abbas Butt
> wrote in
> > On Mon, Jun 7, 2021 at 3:13 PM Amit Kapila wrote:
> > > I am not sure sending feedback every time before sleep is a good idea,
> > > this might lead to unnecessarily sending more messages. Can we try by
> > > using one-second interval with -s option to see how it behaves? As a
> > > matter of comparison the similar logic in workers.c uses
> > > wal_receiver_timeout to send such an update message rather than
> > > sending it every time before sleep.
>
> The logical walreceiver sends feedback when walrcv_receive() doesn't
> receive a byte. If it's not good that pg_recvlogical does the same
> thing, do we need to improve the logical walsender's behavior as well?

For clarity: only a change on the walsender side can stop the flood.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch
On Sun, Jun 06, 2021 at 03:10:07PM -0400, Tom Lane wrote: > I wrote: > > We could make use of COMPARE_COERCIONFORM_FIELD 100% correct by removing > > these two tests of the funcformat value, but on the whole I doubt that > > would be better. > > On still closer inspection, that seems like it'd be fine. All of > the gram.y productions that emit COERCE_SQL_SYNTAX also produce > schema-qualified function names (via SystemFuncName); and it seems > hard to see a use-case where we'd not do that. This makes the two > checks I cited 100% redundant, because the conditions they are in > also insist on an unqualified function name. So let's just take them > out again, making it strictly OK to use COMPARE_COERCIONFORM_FIELD. I have little intuition on this exact topic, but I have no particular concerns about the change you pushed.
Re: Race condition in recovery?
On Tue, Jun 8, 2021 at 12:32 AM Robert Haas wrote:
>
> I tried back-porting my version of this patch to 9.6 to see what would
> happen there. One problem is that some of the functions have different
> names before v10. So 9.6 needs this:
>
> -"SELECT pg_walfile_name(pg_current_wal_lsn());");
> +"SELECT pg_xlogfile_name(pg_current_xlog_location());");
>
> But there's also another problem, which is that this doesn't work before v12:
>
> $node_standby->psql('postgres', 'SELECT pg_promote()');
>
> So I tried changing it to this:
>
> $node_standby->promote;
>
> But then the test fails, because pg_promote() has logic built into it
> to wait until the promotion actually happens, but ->promote doesn't,
> so SELECT pg_walfile_name(pg_current_wal_lsn()) errors out because the
> system is still in recovery. I'm not sure what to do about that. I
> quickly tried adding -w to 'sub promote' in PostgresNode.pm, but that
> didn't fix it, so I guess we'll have to find some other way to wait
> until the promotion is complete.

Maybe we can use this?

# Wait until the node exits recovery.
$standby->poll_query_until('postgres', "SELECT pg_is_in_recovery() = 'f';")
  or die "Timed out while waiting for promotion";

I will try to generate a version for 9.6 based on this idea and see how it goes.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] logical decoding of two-phase transactions
On Thu, Jun 3, 2021 at 7:56 PM Amit Kapila wrote: > > On Wed, Jun 2, 2021 at 4:34 AM Peter Smith wrote: > > > > Please find attached the latest patch set v82* > > > > Few comments on 0001: > > 1. > + /* > + * BeginTransactionBlock is necessary to balance the EndTransactionBlock > + * called within the PrepareTransactionBlock below. > + */ > + BeginTransactionBlock(); > + CommitTransactionCommand(); > + > + /* > + * Update origin state so we can restart streaming from correct position > + * in case of crash. > + */ > + replorigin_session_origin_lsn = prepare_data.end_lsn; > + replorigin_session_origin_timestamp = prepare_data.prepare_time; > + > + PrepareTransactionBlock(gid); > + CommitTransactionCommand(); > > Here, the call to CommitTransactionCommand() twice looks a bit odd. > Before the first call, can we write a comment like "This is to > complete the Begin command started by the previous call"? > Fixed in v83-0001 and v83-0002 > 2. > @@ -85,11 +85,16 @@ typedef struct LogicalDecodingContext > bool streaming; > > /* > - * Does the output plugin support two-phase decoding, and is it enabled? > + * Does the output plugin support two-phase decoding. > */ > bool twophase; > > /* > + * Is two-phase option given by output plugin? > + */ > + bool twophase_opt_given; > + > + /* > * State for writing output. > > I think we can write few comments as to why we need a separate > twophase parameter here? The description of twophase_opt_given can be > changed to: "Is two-phase option given by output plugin? This is to > allow output plugins to enable two_phase at the start of streaming. We > can't rely on twophase parameter that tells whether the plugin > provides all the necessary two_phase APIs for this purpose." Feel free > to add more to it. > TODO > 3. > @@ -432,10 +432,19 @@ CreateInitDecodingContext(const char *plugin, > MemoryContextSwitchTo(old_context); > > /* > - * We allow decoding of prepared transactions iff the two_phase option is > - * enabled at the time of slot creation. > + * We allow decoding of prepared transactions when the two_phase is > + * enabled at the time of slot creation, or when the two_phase option is > + * given at the streaming start. > */ > - ctx->twophase &= MyReplicationSlot->data.two_phase; > + ctx->twophase &= (ctx->twophase_opt_given || slot->data.two_phase); > + > + /* Mark slot to allow two_phase decoding if not already marked */ > + if (ctx->twophase && !slot->data.two_phase) > + { > + slot->data.two_phase = true; > + ReplicationSlotMarkDirty(); > + ReplicationSlotSave(); > + } > > Why do we need to change this during CreateInitDecodingContext which > is called at create_slot time? At that time, we don't need to consider > any options and there is no need to toggle slot's two_phase value. > > TODO > 4. > - /* Binary mode and streaming are only supported in v14 and higher */ > + /* > + * Binary, streaming, and two_phase are only supported in v14 and > + * higher > + */ > > We can say v15 for two_phase. > Fixed in v83-0001 > 5. > -#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM > +#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3 > +#define LOGICALREP_PROTO_MAX_VERSION_NUM 3 > > Isn't it better to define LOGICALREP_PROTO_MAX_VERSION_NUM as > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM instead of specifying directly > the number? > Fixed in v83-0001 > 6. 
> +/* Commit (and abort) information */ > typedef struct LogicalRepCommitData > { > XLogRecPtr commit_lsn; > @@ -122,6 +132,48 @@ typedef struct LogicalRepCommitData > TimestampTz committime; > } LogicalRepCommitData; > > Is there a reason for the above comment addition? If so, how is it > related to this patch? > The LogicalRepCommitData is used by the 0002 patch and during implementation it was not clear what was this struct, so I added the missing comment (all other nearby typedefs except this one were commented). But it is not strictly related to anything in patch 0001 so I have moved this change into the v83-0002 patch. > 7. > +++ b/src/test/subscription/t/021_twophase.pl > @@ -0,0 +1,299 @@ > +# logical replication of 2PC test > +use strict; > +use warnings; > +use PostgresNode; > +use TestLib; > > In the nearby test files, we have Copyright notice like "# Copyright > (c) 2021, PostgreSQL Global Development Group". We should add one to > the new test files in this patch as well. > Fixed in v83-0001 and v83-0002 > 8. > +# Also wait for two-phase to be enabled > +my $twophase_query = > + "SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT > IN ('e');"; > +$node_subscriber->poll_query_until('postgres', $twophase_query) > + or die "Timed out while waiting for subscriber to enable twophase"; > > Isn't it better to write this query as: "SELECT count(1) = 1 FROM > pg_subscription WHERE subtwophasestate ='e';"? It looks a bit odd to > use the NOT IN operator here. Similarly, change the same query used at > another place in the
Re: SSL SNI
On Mon, Jun 07, 2021 at 11:34:24AM -0400, Tom Lane wrote: > Yeah, I'd include the empty-string test just because it's standard > practice in this area of libpq. Whether those tests are actually > triggerable in every case is obscure, but ... Checking after a NULL string and an empty one is more libpq-ish. > Patch looks sane by eyeball, though I didn't test it. I did, and I could not break it. + SSLerrfree(err); + SSL_CTX_free(SSL_context); + return -1; It seems to me that there is no need to free SSL_context if SSL_set_tlsext_host_name() fails here, except if you'd like to move the check for the SNI above SSL_CTX_free() around L1082. There is no harm as SSL_CTX_free() is a no-op on NULL input. -- Michael signature.asc Description: PGP signature