GetSubscriptionRelations declares too many scan keys
The function GetSubscriptionRelations was declaring ScanKeyData skey[2]; but it actually only uses 1 scan key. It seems like the code was cut/pasted from other nearby functions which really are using 2 keys.

PSA a trivial patch to declare the correct number of keys for this function.

--
Kind Regards,
Peter Smith.
Fujitsu Australia

v1-0001-GetSubscriptionRelations-declare-only-1-scan-key.patch
Description: Binary data
Re: Remove "FROM" in "DELETE FROM" when using tab-completion
On Mon, May 10, 2021 at 06:36:19AM +, tanghy.f...@fujitsu.com wrote:
> On Monday, May 10, 2021 2:48 PM, Julien Rouhaud wrote:
> > I think the behavior now is correct. The goal of autocompletion is to save
> > keystrokes and time. As the only valid keyword after a DELETE (at least in a
> > DeleteStmt) is FROM, it's a good thing that you get back "DELETE FROM"
> > directly rather than asking that to autocomplete in multiple steps.
> >
> > Now, the \help command is for commands, which is a different thing as the
> > command in that case is DELETE not DELETE FROM, even if you will have to
> > follow your DELETE with a FROM.
>
> Thanks for your reply. I totally agree with you on the convenience of "DELETE
> FROM" autocompletion.
> But I also noticed some autocompletion for "DELETE" in some cases is just
> "DELETE" already.
>
> =# EXPLAIN[TAB]
> ANALYZE DECLARE DELETE INSERT SELECT UPDATE VERBOSE
>
> =# COPY ([TAB]
> DELETE INSERT SELECT TABLE UPDATE VALUES WITH
>
> Maybe we should keep the behavior consistent?

Definitely.

> I mean we can change all "DELETE" to "DELETE FROM" or just remove "FROM" for
> consistency.

We should change all to DELETE FROM (apart from \help of course), and same for INSERT, change to INSERT INTO everywhere it makes sense.
Re: seawasp failing, maybe in glibc allocator
On Mon, May 10, 2021 at 6:59 PM Fabien COELHO wrote: > > Is gdb installed, and are core files being dumped by that SIGABRT, and > > are they using the default name (/proc/sys/kernel/core_pattern = core), > > which the BF can find with the value it's using, namely 'core_file_glob' > > => 'core*'? > > Nope: > >sh> cat /proc/sys/kernel/core_pattern >|/usr/share/apport/apport %p %s %c %d %P %E If you don't care about Ubuntu "apport" on this system (something for sending crash/bug reports to developers with a GUI), you could uninstall it (otherwise it overwrites the core_pattern every time it restarts, no matter what you write in your sysctl.conf, apparently), and then sudo sysctl -w kernel.core_pattern=core to undo the setting immediately (or reboot). Then hopefully the build farm would succeed in dumping a backtrace into the log.
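For reference, the suggestion above boils down to something like the following (a sketch only; apport and apt-get are Ubuntu-specific, and the sysctl change requires root):

```shell
# Inspect the current handler (a pipe into apport, as shown above):
cat /proc/sys/kernel/core_pattern

# Remove apport so it stops rewriting core_pattern on restart:
sudo apt-get remove apport

# Restore plain "core" files immediately, without waiting for a reboot:
sudo sysctl -w kernel.core_pattern=core
```

With `core_pattern=core`, core files match the buildfarm's `core_file_glob => 'core*'` setting.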
Re: GetSubscriptionRelations declares too many scan keys
On Mon, May 10, 2021 at 12:36 PM Peter Smith wrote:
>
> The function GetSubscriptionRelations was declaring ScanKeyData skey[2];
> but actually only uses 1 scan key. It seems like the code was cut/paste
> from other nearby functions which really are using 2 keys.
>
> PSA a trivial patch to declare the correct number of keys for this function.

+1 for the change. It looks like a cut/paste typo introduced by the commit 7c4f52409a.

A comment on the patch: why do we need to declare an array of 1 element, ScanKeyData skey[1];? Instead, can we just do ScanKeyData skey;?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: row filtering for logical replication
On Wed, Mar 31, 2021 at 12:47 PM Euler Taveira wrote:
>
> Good question. IIRC the issue is that AlterPublicationStmt->tables has a list
> element that was a relation_expr_list and was converted to
> publication_table_list. If we share 'tables' with relation_expr_list (for ALTER
> PUBLICATION ... DROP TABLE) and publication_table_list (for the other ALTER
> PUBLICATION ... ADD|SET TABLE), the OpenTableList() has to know what list
> element it is dealing with. I think I came to the conclusion that it is less
> ugly to avoid changing OpenTableList() and CloseTableList().
>
> [Doing some experimentation...]
>
> Here is a patch that removes the referred code. It uses 2 distinct list
> elements: relation_expr_list for ALTER PUBLICATION ... DROP TABLE and
> publication_table_list for ALTER PUBLICATION ... ADD|SET TABLE. A new
> parameter was introduced to deal with the different elements of the list
> 'tables'.

AFAIK this is the latest patch available, but FYI it no longer applies cleanly on HEAD.

git apply ../patches_misc/0001-Row-filter-for-logical-replication.patch
../patches_misc/0001-Row-filter-for-logical-replication.patch:518: trailing whitespace.
error: patch failed: src/backend/parser/gram.y:426
error: src/backend/parser/gram.y: patch does not apply
error: patch failed: src/backend/replication/logical/worker.c:340
error: src/backend/replication/logical/worker.c: patch does not apply

Kind Regards,
Peter Smith.
Fujitsu Australia.
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 3:03 PM Bruce Momjian wrote:
>
> I have committed the first draft of the PG 14 release notes. You can
> see the most current build of them here:
>
> https://momjian.us/pgsql_docs/release-14.html

Thank you!

> Add system view pg_stat_replication_slots to report replication slot activity
> (Sawada Masahiko, Amit Kapila)
>
> Function pg_stat_reset_replication_slot() resets slot statistics. THIS IS
> LOGICAL ONLY, BUT NO "LOGICAL" IN THE NAME?

IIUC the pg_stat_replication_slots view supports only logical slots for now, but we might have it also show physical slots in the future. I'm fine with the current view name and description, but someone might want to use "logical replication slot" instead of just "replication slot".

> IS "ACTIVITY" THE RIGHT WORD?

The doc says "The pg_stat_replication_slots view will contain one row per logical replication slot, showing statistics about its usage.". So we can say "... to report replication slot statistics about its usage".

Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Re: Race condition in recovery?
At Fri, 7 May 2021 15:16:03 +0530, Dilip Kumar wrote in
> Okay, I got your point, now, consider the scenario that we are trying
> to get the checkpoint record in XLogFileReadAnyTLI, you are right that
> it returns history file 0002.history. I think I did not mention
> one point, basically, the tool while restarting node 3 after promoting
> node 2 is deleting all the local WAL of node3 (so that node 3 can
> follow node2). So now node3 doesn't have the checkpoint in the local
> segment. Suppose checkpoint record was in segment ...
> So now you come out of XLogFileReadAnyTLI, without reading checkpoint
> record and without setting expectedTLEs. Because expectedTLEs is only
> set if we are able to read the checkpoint record. Make sense?

Thanks. I understood the case and reproduced it. Although I don't think removing WAL files from a non-backup cluster is legit, I also think we can safely start archive recovery from a replicated segment.

> So now expectedTLEs is still NULL and you go to get the checkpoint
> record from primary and use checkPointCopy.ThisTimeLineID.

I don't think erasing expectedTLEs after it is once set is the right fix, because expectedTLEs is supposed to be set just once iff we are sure that we are going to follow that history, with rescan changing it as the only exception. It seems to me the issue here is not a race condition but WaitForWALToBecomeAvailable initializing expectedTLEs with the history of an improper timeline. So using recoveryTargetTLI instead of receiveTLI for that case fixes this issue.

- if (!expectedTLEs)
- expectedTLEs = readTimeLineHistory(receiveTLI);

I thought that the reason for using receiveTLI instead of recoveryTargetTLI here is that there's a case where receiveTLI is the future of recoveryTargetTLI, but I haven't successfully produced such a situation. If I set recoveryTargetTLI to a TLI that the standby doesn't know but the primary knows, validateRecoveryParameters immediately complains about that before reaching there.
Anyway the attached assumes receiveTLI may be the future of recoveryTargetTLI. Just inserting an if() into the existing code makes the added lines stick out past the right edge of 80 columns, so I refactored there a bit to lower the indentation.

I believe 004_timeline_switch.pl detects your issue, and the attached change fixes it. Any suggestions are welcome.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

>From 52eba6b6ef8d8fda078d3acd27b6ce7406078df8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Mon, 10 May 2021 16:57:24 +0900
Subject: [PATCH] Choose correct timeline when received historic timelines

When a standby starts streaming before determining expectedTLEs, it determines the expected timeline list based on the timeline of the WAL segment just streamed from the primary. If we fetched the last checkpoint in the older timeline, this behavior prevents recovery from proceeding. When the primary streamed over a WAL file in a historical timeline, use recoveryTargetTLI, which must be in the history of the primary.

There's one scenario where this makes a difference: suppose two standbys are connected to a primary both by archive and streaming. If one of the standbys is promoted, then the other reconnects to the promoted standby before it archives the first segment of the new timeline, and the content of pg_wal of the new standby is removed before reconnection, the standby fetches the history file for the new timeline but the first segment for the timeline is available only via streaming. In this case, the new standby assumed that the primary always sends a newer timeline than the current recovery target, but that is not the case in this scenario.
--- src/backend/access/transam/xlog.c | 45 ++--- src/test/recovery/t/004_timeline_switch.pl | 74 +- 2 files changed, 108 insertions(+), 11 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index c1d4415a43..1ca55b7ec0 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -12656,22 +12656,47 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, * info is set correctly and XLogReceiptTime isn't * changed. */ - if (readFile < 0) - { - if (!expectedTLEs) -expectedTLEs = readTimeLineHistory(receiveTLI); - readFile = XLogFileRead(readSegNo, PANIC, - receiveTLI, - XLOG_FROM_STREAM, false); - Assert(readFile >= 0); - } - else + + if (readFile >= 0) { /* just make sure source info is correct... */ readSource = XLOG_FROM_STREAM; XLogReceiptSource = XLOG_FROM_STREAM; return true; } + + readFile = XLogFileRead(readSegNo, PANIC, +receiveTLI, +XLOG_FROM_STREAM, false); + Assert(re
Re: Inaccurate error message when set fdw batch_size to 0
On Mon, May 10, 2021 at 12:00 PM Tom Lane wrote:
>
> Fujii Masao writes:
> > +1 for the change of the error messages.
>
> Yeah, this error message seems outright buggy. However, it's a minor
> matter. Also, some people think "positive" is the same thing as
> "non-negative", so maybe we need less ambiguous wording?

Since the value 0 can't be considered either a positive or a negative integer, I think we can do the following (roughly):

if (value < 0) "requires a zero or positive integer value"
if (value <= 0) "requires a positive integer value"

I'm not sure whether we should consider changing these messages:

remainder for hash partition must be a non-negative integer
parallel vacuum degree must be a non-negative integer
repeat count size must be a non-negative integer
number of workers must be a non-negative integer
%s requires a non-negative numeric value
distance in phrase operator should be non-negative and less than %d

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: PG 14 release notes, first draft
On Mon, 10 May 2021 at 18:03, Bruce Momjian wrote: > I need clarification on many items, and the document still needs its > items properly ordered, and markup added. I also expect a lot of > feedback. Thanks for drafting that up. > Add executor method to cache results from the inner-side of joins (David > Rowley) I think it would be more accurate to say "inner-side of nested loop joins". > Allow efficient retrieval of heap rows via tid (Edmund Horner, David Rowley) I'd say we already had that feature with TID Scan. Maybe it would be better to write: "Allow efficient heap scanning on ranges of tids (Edmund Horner, David Rowley)" > Improve the performance of parallel sequential scans (Thomas Munro, David > Rowley) I think it is worth mentioning "I/O" before "performance". This change won't really help cases if all the table's pages are already in shared buffers. David
Re: GetSubscriptionRelations declares too many scan keys
On Mon, May 10, 2021 at 01:39:31PM +0530, Bharath Rupireddy wrote:
> On Mon, May 10, 2021 at 12:36 PM Peter Smith wrote:
> >
> > The function GetSubscriptionRelations was declaring ScanKeyData skey[2];
> > but actually only uses 1 scan key. It seems like the code was cut/paste
> > from other nearby functions which really are using 2 keys.
> >
> > PSA a trivial patch to declare the correct number of keys for this
> > function.
>
> +1 for the change. It looks like a cut/paste typo introduced by the
> commit 7c4f52409a.
>
> A comment on the patch: why do we need to declare an array of 1
> element ScanKeyData skey[1];? Instead, can we just do ScanKeyData
> skey;?

+1, there are already many places where it's done this way if there's only 1 key.
Re: Race condition in recovery?
On Mon, May 10, 2021 at 2:05 PM Kyotaro Horiguchi wrote:
> I thought that the reason for using receiveTLI instead of
> recoveryTargetTLI here is that there's a case where receiveTLI is the
> future of recoveryTargetTLI, but I haven't successfully produced such a
> situation. If I set recoveryTargetTLI to a TLI that the standby doesn't
> know but the primary knows, validateRecoveryParameters immediately
> complains about that before reaching there. Anyway the attached
> assumes receiveTLI may be the future of recoveryTargetTLI.

If you look at the note in this commit, it says "without the timeline history file", so is it trying to say that although receiveTLI is the ancestor of recoveryTargetTLI, it can not detect that because of the absence of the timeline history file?

ee994272ca50f70b53074f0febaec97e28f83c4e
Author: Heikki Linnakangas 2013-01-03 14:11:58
Committer: Heikki Linnakangas 2013-01-03 14:11:58
. Without the timeline history file, recovering that file will fail as the older timeline ID is not recognized to be an ancestor of the target timeline. If you try to recover from such a backup, using only streaming replication to fetch the WAL, this patch is required for that to work.
=

> I believe the 004_timeline_switch.pl detects your issue. And the
> attached change fixes it.

I think this fix looks better to me, but I will think more about it and give my feedback. Thanks for quickly coming up with the reproducible test case.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: GetSubscriptionRelations declares too many scan keys
On Mon, May 10, 2021 at 6:09 PM Bharath Rupireddy wrote:
>
> On Mon, May 10, 2021 at 12:36 PM Peter Smith wrote:
> >
> > The function GetSubscriptionRelations was declaring ScanKeyData skey[2];
> > but actually only uses 1 scan key. It seems like the code was cut/paste
> > from other nearby functions which really are using 2 keys.
> >
> > PSA a trivial patch to declare the correct number of keys for this function.
>
> +1 for the change. It looks like a cut/paste typo introduced by the
> commit 7c4f52409a.
>
> A comment on the patch: why do we need to declare an array of 1
> element ScanKeyData skey[1];? Instead, can we just do ScanKeyData
> skey;?

IMO declaring skey[1] is better because then the code can share the same pattern as every other ScanKeyData skey[n] code.

Please search the PG source code for "ScanKeyData skey[1];" - there are dozens of precedents where other people felt the same as me about declaring single keys.

Kind Regards,
Peter Smith.
Fujitsu Australia
Executor code - found an instance of a WHILE that should just be an IF
Hi, During debugging I noticed some code in ExecResult() where a WHILE loop is being used with an unconditional RETURN at the end of the block (which is intentional, looking at the history of changes), but now there's no actual use of the loop in any way. The code should probably be changed to just use IF for clarity. I've attached a patch. Regards, Greg Nancarrow Fujitsu Australia v1-0001-Change-an-instance-of-WHILE-to-IF-in-executor-code.patch Description: Binary data
Re: GetSubscriptionRelations declares too many scan keys
On Mon, May 10, 2021 at 07:09:29PM +1000, Peter Smith wrote:
> On Mon, May 10, 2021 at 6:09 PM Bharath Rupireddy wrote:
> >
> > On Mon, May 10, 2021 at 12:36 PM Peter Smith wrote:
> > >
> > > The function GetSubscriptionRelations was declaring ScanKeyData skey[2];
> > > but actually only uses 1 scan key. It seems like the code was cut/paste
> > > from other nearby functions which really are using 2 keys.
> > >
> > > PSA a trivial patch to declare the correct number of keys for this
> > > function.
> >
> > +1 for the change. It looks like a cut/paste typo introduced by the
> > commit 7c4f52409a.
> >
> > A comment on the patch: why do we need to declare an array of 1
> > element ScanKeyData skey[1];? Instead, can we just do ScanKeyData
> > skey;?
>
> IMO declaring skey[1] is better because then the code can share the
> same pattern as every other ScanKeyData skey[n] code.
>
> Please search the PG source code for "ScanKeyData skey[1];" - there are
> dozens of precedents where other people felt the same as me about
> declaring single keys.

AFAICT there are 73 occurrences vs 62 of the "ScanKeyData skey;" form. I don't think there's a huge consensus for one over the other.
Re: seawasp failing, maybe in glibc allocator
> If you don't care about Ubuntu "apport" on this system (something for
> sending crash/bug reports to developers with a GUI), you could uninstall it
> (otherwise it overwrites the core_pattern every time it restarts, no matter
> what you write in your sysctl.conf, apparently), and then sudo sysctl -w
> kernel.core_pattern=core to undo the setting immediately (or reboot). Then
> hopefully the build farm would succeed in dumping a backtrace into the log.

I forced-removed apport (which meant removing xserver-xorg). Let's see whether the reports are better or whether I break something.

--
Fabien.
Re: GetSubscriptionRelations declares too many scan keys
On Mon, May 10, 2021 at 2:46 PM Julien Rouhaud wrote:
>
> On Mon, May 10, 2021 at 07:09:29PM +1000, Peter Smith wrote:
> > On Mon, May 10, 2021 at 6:09 PM Bharath Rupireddy wrote:
> > >
> > > On Mon, May 10, 2021 at 12:36 PM Peter Smith wrote:
> > > >
> > > > The function GetSubscriptionRelations was declaring ScanKeyData skey[2];
> > > > but actually only uses 1 scan key. It seems like the code was cut/paste
> > > > from other nearby functions which really are using 2 keys.
> > > >
> > > > PSA a trivial patch to declare the correct number of keys for this
> > > > function.
> > >
> > > +1 for the change. It looks like a cut/paste typo introduced by the
> > > commit 7c4f52409a.
> > >
> > > A comment on the patch: why do we need to declare an array of 1
> > > element ScanKeyData skey[1];? Instead, can we just do ScanKeyData
> > > skey;?
> >
> > IMO declaring skey[1] is better because then the code can share the
> > same pattern as every other ScanKeyData skey[n] code.
> >
> > Please search the PG source code for "ScanKeyData skey[1];" - there are
> > dozens of precedents where other people felt the same as me about
> > declaring single keys.
>
> AFAICT there are 73 occurrences vs 62 of the "ScanKeyData skey;" form. I
> don't think there's a huge consensus for one over the other.

I think both ScanKeyData skey[1]; and ScanKeyData skey; are used, but IMHO using ScanKeyData skey; looks better.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: Why do we have perl and sed versions of Gen_dummy_probes?
On 07.05.21 20:31, Andrew Dunstan wrote:
> On 5/7/21 1:20 PM, Andres Freund wrote:
> > On 2021-05-07 11:19:02 -0400, Andrew Dunstan wrote:
> > > Here's a patch that adds the README and also adds a Makefile recipe for
> > > regenerating Gen_dummy_probes.pl after the sed script is changed. On my
> > > system at least the recipe is idempotent.
> >
> > Nice! Thanks for this work.
>
> de nada. pushed.

This recipe doesn't produce a Gen_dummy_probes.pl that matches exactly the one that is there now. If this is going to be the preferred method, then we should generate it once so that it matches going forward.
Fix of file path in the file comments
Hi, The file path mentioned in the file comments of 'src/backend/utils/activity/backend_status.c' was incorrect. Modified it to the correct path. Please find the patch attached. Thanks & Regards, Nitin Jadhav v1_0001-fix_file_path_in_comments.patch Description: Binary data
Re: [Proposal] Global temporary tables
On Thu, Apr 22, 2021 at 1:11 PM wenjing wrote:
>

I have briefly looked into the design comments added by the patch. I have a few questions.

+Feature description
+
+
+Previously, temporary tables are defined once and automatically
+created (starting with empty contents) in every session before using them.

I don't think this statement is correct; I mean, if we define a temp table in one session then it doesn't automatically get created in all the sessions.

+
+Like local temporary table, Global Temporary Table supports ON COMMIT PRESERVE ROWS
+or ON COMMIT DELETE ROWS clause, so that data in the temporary table can be
+cleaned up or reserved automatically when a session exits or a transaction COMMITs.

/reserved/preserved

I was trying to look into the "Main design idea" section.

+1) CATALOG
+GTTs store session-specific data. The storage information of GTTs' data, their
+transaction information, and their statistics are not stored in the catalog.

I did not understand what you mean by "transaction information" not being stored in the catalog. I mean, what transaction information is stored in the catalog for a normal table that is not stored for a GTT?

+Changes to the GTT's metadata affect all sessions.
+The operations making those changes include truncate GTT, Vacuum/Cluster GTT,
+and Lock GTT.

How do Truncate or Vacuum affect all the sessions? I mean, truncate should only truncate the data of the current session, and the same is true for vacuum, no?

I will try to do a more detailed review.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: seawasp failing, maybe in glibc allocator
On Mon, May 10, 2021 at 9:30 PM Fabien COELHO wrote: > I forced-removed apport (which meant removing xserver-xorg). Let's see > whether the reports are better or whether I break something. And of course this time it succeeded :-) Just by the way, I noticed it takes ~40 minutes to compile. Is there a reason you don't install ccache and set eg CC="ccache /path/to/clang", CXX="ccache /path/to/clang++", CLANG="ccache /path/to/clang"? That doesn't seem to conflict with your goal of testing LLVM/Clang's main branch, because ccache checks the compiler's mtime as part of its cache invalidation strategy.
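Concretely, the setup being suggested could look something like this (a sketch only; the LLVM install path is a placeholder for wherever the animal's freshly built Clang lives):

```shell
# Hypothetical configure invocation wrapping the custom Clang in ccache.
# CC/CXX/CLANG and --with-llvm/LLVM_CONFIG are standard PostgreSQL
# configure knobs; adjust LLVM_DIR to the local build.
LLVM_DIR=/path/to/llvm/install
./configure \
    CC="ccache $LLVM_DIR/bin/clang" \
    CXX="ccache $LLVM_DIR/bin/clang++" \
    CLANG="ccache $LLVM_DIR/bin/clang" \
    --with-llvm LLVM_CONFIG="$LLVM_DIR/bin/llvm-config"
```

Because ccache keys its cache on the compiler binary's mtime (among other inputs), rebuilding Clang from main invalidates the cache automatically, so this shouldn't mask compiler changes.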
Re: Why do we have perl and sed versions of Gen_dummy_probes?
Peter Eisentraut writes: > On 07.05.21 20:31, Andrew Dunstan wrote: >> On 5/7/21 1:20 PM, Andres Freund wrote: >>> On 2021-05-07 11:19:02 -0400, Andrew Dunstan wrote: Here's a patch that adds the README and also adds a Makefile recipe for regenerating Gen_dummy_probes.pl after the sed script is changed. On my system at least the recipe is idempotent. >>> Nice! Thanks for this work. >> >> de nada. pushed. > > This recipe doesn't produce a Gen_dummy_probes.pl that matches exactly > the one that is there now. If this is going to be the preferred method, > then we should generate it once so that it matches going forward. Which version of perltidy do you have installed? For me it generates identical versions using any of 20170521 (per src/tools/pgindent/README), 20201207 (what I happened to have installed before), and 20210402 (the latest). Also, what does the difference look like? - ilmari -- - Twitter seems more influential [than blogs] in the 'gets reported in the mainstream press' sense at least. - Matt McLeod - That'd be because the content of a tweet is easier to condense down to a mainstream media article. - Calle Dybedahl
Re: seawasp failing, maybe in glibc allocator
> And of course this time it succeeded :-)

Hmmm. ISTM that failures are on and off every few attempts.

> Just by the way, I noticed it takes ~40 minutes to compile. Is there a
> reason you don't install ccache and set eg CC="ccache /path/to/clang",
> CXX="ccache /path/to/clang++", CLANG="ccache /path/to/clang"? That doesn't
> seem to conflict with your goal of testing LLVM/Clang's main branch,
> because ccache checks the compiler's mtime as part of its cache
> invalidation strategy.

Yep. I remember that I deactivated it for some reason once, but I cannot remember why. I just reactivated it, will see what happens.

--
Fabien.
Re: PG 14 release notes, first draft
On Mon, 10 May 2021 at 08:03, Bruce Momjian wrote: > > I have committed the first draft of the PG 14 release notes. You can > see the most current build of them here: > https://momjian.us/pgsql_docs/release-14.html > > I need clarification on many items, and the document still needs its > items properly ordered, and markup added. I also expect a lot of > feedback. I noticed that the improvement in bloat control in the HeapAM that I know of (3c3b8a4b, 0ff8bbde) weren't documented here. Although each can be considered minor, they together can decrease the bloating behaviour of certain workloads significantly (and limit the total damage), and in my opinion this should be mentioned. 3c3b8a4b: Returns space claimed for the line pointer array back to the page's empty space, so that it can also be used for tuple data. 0ff8bbde: Allows large tuples to be inserted on pages which have only a small amount of data, regardless of fillfactor. Together they should be able to help significantly in both bloat prevention and bloat reduction. > I plan to work on completing this document this coming week in > preparation for beta next week. Thanks! With regards, Matthias van de Meent
Re: [PATCH] Full support for index LP_DEAD hint bits on standby
Michail Nikolaev wrote: > Hello, Antonin. > > > I'm trying to review the patch, but not sure if I understand this problem, > > please see my comment below. > > Thanks a lot for your attention. It is strongly recommended to look at > version N3 (1) because it is a much more elegant, easy, and reliable > solution :) But the minRecoveryPoint-related issue affects it anyway. Indeed I'm reviewing (1), but I wanted to discuss this particular question in context, so I replied here. > > Why doesn't minRecoveryPoint get updated to 20? IMO that should happen by > > replaying the commit record. And if the standby happens to crash before the > > commit record could be replayed, no query should see the deletion and thus > > no > > hint bit should be set in the index. > > minRecoveryPoint is not affected by replaying the commit record in > most cases. It is updated in a lazy way, something like this: > minRecoveryPoint = max LSN of flushed page. Version 3 of a patch > contains a code_optional.patch to move minRecoveryPoint more > aggressively to get additional performance on standby (based on > Peter’s answer in (2). > So, “minRecoveryPoint will go here” is not because of “STANDBY INSERTS > NEW ROW IN INDEX” it is just a random event. > Michail. Sorry, I missed the fact that your example can be executed inside BEGIN - END block, in which case minRecoveryPoint won't advance after each command. 
I'll continue my review by replying to (1).

> [1]: https://www.postgresql.org/message-id/CANtu0ohHu1r1xQfTzEJuxeaOMYncG7xRxUQWdH%3DcMXZSf%2Bnzvg%40mail.gmail.com
> [2]: https://www.postgresql.org/message-id/CAH2-WzkSUcuFukhJdSxHFgtL6zEQgNhgOzNBiTbP_4u%3Dk6igAg%40mail.gmail.com
> ("Also, btw, do you know any reason to keep minRecoveryPoint at a low
> value?")

I'm not an expert in this area (I'm reviewing this patch also to learn more about recovery and replication), but after some brief research I think that postgres tries not to update the control file too frequently; see the comments in UpdateMinRecoveryPoint(). I don't know if what you do in code_optional.patch would be a problem. Actually I think that a commit record should be replayed more often than XLOG_RUNNING_XACTS, shouldn't it?

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Re: Executor code - found an instance of a WHILE that should just be an IF
On Mon, 10 May 2021 at 21:16, Greg Nancarrow wrote: > During debugging I noticed some code in ExecResult() where a WHILE > loop is being used with an unconditional RETURN at the end of the > block (which is intentional, looking at the history of changes), but > now there's no actual use of the loop in any way. The code should > probably be changed to just use IF for clarity. > I've attached a patch. Looks like leftovers from ea15e1867. I don't think this will affect any code generation but you are right, it should be an "if". David
Re: PG 14 release notes, first draft
On 5/10/21 2:03 AM, Bruce Momjian wrote: I have committed the first draft of the PG 14 release notes. You can see the most current build of them here: https://momjian.us/pgsql_docs/release-14.html I need clarification on many items, and the document still needs its items properly ordered, and markup added. I also expect a lot of feedback. I plan to work on completing this document this coming week in preparation for beta next week. While only a small change, this commit does affect user visible behavior and so should probably be noted: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b12bd4869b5e Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Re: [PATCH] Identify LWLocks in tracepoints
On 05.05.21 06:20, Craig Ringer wrote:
> On Wed, 5 May 2021 at 09:15, Craig Ringer wrote:
> > warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
> > 1322 | TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode);
> >      | ^
>
> Odd that I didn't get that.
>
> This compiler complaint is not due to the _ENABLED() test as such. It's due
> to the fact that we completely define out the TRACE_POSTGRESQL_ macros with
> src/backend/utils/Gen_dummy_probes.sed. While explicit braces could be added
> around each test, I suggest fixing Gen_dummy_probes.sed to emit the usual
> dummy statement instead. Patch attached.

Committed, with the Gen_dummy_probes.pl change added.
Re: Inherited UPDATE/DELETE vs async execution
Fujita-san, On Sat, May 8, 2021 at 1:21 AM Etsuro Fujita wrote: > I noticed this while working on the > EXPLAIN-ANALYZE-for-async-capable-nodes issue: > > EXPLAIN (VERBOSE, COSTS OFF) > DELETE FROM async_pt; >QUERY PLAN > > Delete on public.async_pt >Foreign Delete on public.async_p1 async_pt_1 >Foreign Delete on public.async_p2 async_pt_2 >Delete on public.async_p3 async_pt_3 >-> Append > -> Async Foreign Delete on public.async_p1 async_pt_1 >Remote SQL: DELETE FROM public.base_tbl1 > -> Async Foreign Delete on public.async_p2 async_pt_2 >Remote SQL: DELETE FROM public.base_tbl2 > -> Seq Scan on public.async_p3 async_pt_3 >Output: async_pt_3.tableoid, async_pt_3.ctid > (11 rows) > > DELETE FROM async_pt; > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > connection to server was lost > > The cause for this would be that direct-update plans are mistakenly > treated as async-capable ones, as shown in the EXPLAIN output. I guess that happens because the ForeignScan nodes responsible for scanning or direct-updating/deleting from child foreign tables appear under an Append as of 86dc90056, whereas before they would appear as child plans of a ModifyTable node. IIUC, it's the Append that causes the async_capable flag to be set in those ForeignScan nodes. > To > fix, I think we should modify postgresPlanDirectModify() so that it > clears the async-capable flag if it is set. Attached is a patch for > that. Maybe I am missing something, though. I see that your patch is to disable asynchronous execution in ForeignScan nodes responsible for direct update/delete, but why not do the same for other ForeignScan nodes too? Or the other way around -- is it because fixing the crash that occurs in the former's case would be a significant undertaking for little gain? -- Amit Langote EDB: http://www.enterprisedb.com
RE: Remove "FROM" in "DELETE FROM" when using tab-completion
On Monday, May 10, 2021 4:15 PM, Julien Rouhaud wrote
>We should change all to DELETE FROM (apart from \help of course), and same for
>INSERT, change to INSERT INTO everywhere it makes sense.

Thanks for the reply. Your advice sounds reasonable to me.
So I tried to change all "DELETE" to "DELETE FROM" and "INSERT" to "INSERT INTO" in the attached patch, except for the following cases, which I think are in accordance with what the PG docs say.
CREATE POLICY
CREATE [ OR REPLACE ] RULE
CREATE [ OR REPLACE ] TRIGGER
ALTER DEFAULT PRIVILEGES

After applying the patch, the TAP tests for psql pass.
Please feel free to tell me about anything insufficient you find in my fix. Thanks.

Regards,
Tang

0001-TAB-completion-modification-for-INSERT-INTO-and-DELE.patch Description: 0001-TAB-completion-modification-for-INSERT-INTO-and-DELE.patch
Re: Why do we have perl and sed versions of Gen_dummy_probes?
On 5/10/21 7:16 AM, Dagfinn Ilmari Mannsåker wrote: > Peter Eisentraut writes: > >> On 07.05.21 20:31, Andrew Dunstan wrote: >>> On 5/7/21 1:20 PM, Andres Freund wrote: On 2021-05-07 11:19:02 -0400, Andrew Dunstan wrote: > Here's a patch that adds the README and also adds a Makefile recipe for > regenerating Gen_dummy_probes.pl after the sed script is changed. On my > system at least the recipe is idempotent. Nice! Thanks for this work. >>> de nada. pushed. >> This recipe doesn't produce a Gen_dummy_probes.pl that matches exactly >> the one that is there now. If this is going to be the preferred method, >> then we should generate it once so that it matches going forward. > Which version of perltidy do you have installed? For me it generates > identical versions using any of 20170521 (per src/tools/pgindent/README), > 20201207 (what I happened to have installed before), and 20210402 (the > latest). > > Also, what does the difference look like? > Yep: andrew@emma:utils $ touch Gen_dummy_probes.sed andrew@emma:utils $ touch ../../../src/Makefile.global andrew@emma:utils $ make top_srcdir=../../.. Gen_dummy_probes.pl perl -ni -e ' print; exit if /^\$0/;' Gen_dummy_probes.pl s2p -f Gen_dummy_probes.sed | sed -e 1,4d -e '/# #/d' -e '$d' >> Gen_dummy_probes.pl perltidy --profile=../../tools/pgindent/perltidyrc Gen_dummy_probes.pl perl -pi -e '!$lb && ( /^\t+#/ || /^# prototypes/ ) && print qq{\n};'\ -e '$lb = m/^\n/; ' Gen_dummy_probes.pl andrew@emma:utils $ git diff andrew@emma:utils $ perltidy --version This is perltidy, v20170521 cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
postgres_fdw - make cached connection functions tests meaningful
Hi, While working on [1], I got to know that there is a new GUC debug_invalidate_system_caches_always that has been introduced in v14. It can be used to switch off cache invalidation in CLOBBER_CACHE_ALWAYS builds which makes cache sensitive tests stable. Using this GUC, it is quite possible to make cached connection management function tests more meaningful by returning original values(true/false, all the output columns) instead of SELECT 1. Note that the commit f77717b29 stabilized the tests for those functions - postgres_fdw_disconnect, postgres_fdw_disconnect_all and postgres_fdw_get_connections by masking actual return value of the functions. Attaching a patch to use the new GUC to make the functions return actual output. Thoughts? [1] - https://www.postgresql.org/message-id/CALj2ACVGSQsq68y-LmyXKZzbNVgSgsiVKSzsrVXzVgnsZTN26Q%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From a5ff26edbf54fca4d42833ef4aedf79bec9445bd Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Sat, 8 May 2021 19:51:58 +0530 Subject: [PATCH v1] postgres_fdw-make cached connection functions tests meaningful The new GUC debug_invalidate_system_caches_always which has been introduced in v14 can be used to switch off cache invalidation in CLOBBER_CACHE_ALWAYS builds which can make cache sensitive tests stable. Using this GUC, it is quite possible to make cached connection management function tests more meaningful by returning original values(true/false, all the output columns) instead of SELECT 1. Note that the commit f77717b29 stabilized the tests for those functions - postgres_fdw_disconnect, postgres_fdw_disconnect_all and postgres_fdw_get_connections by masking actual return value of the functions. 
--- .../postgres_fdw/expected/postgres_fdw.out| 135 ++ contrib/postgres_fdw/sql/postgres_fdw.sql | 58 2 files changed, 100 insertions(+), 93 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 6f533c745d..67d993edb8 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -9198,17 +9198,26 @@ PREPARE TRANSACTION 'fdw_tpc'; ERROR: cannot PREPARE a transaction that has operated on postgres_fdw foreign tables ROLLBACK; WARNING: there is no transaction in progress +-- Let's ensure to close all the existing cached connections so that they don't +-- make the following tests unstable. +SELECT 1 FROM postgres_fdw_disconnect_all(); + ?column? +-- +1 +(1 row) + +-- If debug_invalidate_system_caches_always is active, it results in dropping +-- remote connections after every transaction, making it impossible to test +-- connection retry use case meaningfully. And also, the new functions +-- introduced for cached connections management will be unstable with the flag +-- on. So turn it off for these tests. +SET debug_invalidate_system_caches_always = 0; -- === -- reestablish new connection -- === -- Change application_name of remote connection to special one -- so that we can easily terminate the connection later. ALTER SERVER loopback OPTIONS (application_name 'fdw_retry_check'); --- If debug_invalidate_system_caches_always is active, it results in --- dropping remote connections after every transaction, making it --- impossible to test termination meaningfully. So turn that off --- for this test. -SET debug_invalidate_system_caches_always = 0; -- Make sure we have a remote connection. SELECT 1 FROM ft1 LIMIT 1; ?column? 
@@ -9250,21 +9259,13 @@ SELECT 1 FROM ft1 LIMIT 1;-- should fail ERROR: 08006 \set VERBOSITY default COMMIT; -RESET debug_invalidate_system_caches_always; -- = -- test connection invalidation cases and postgres_fdw_get_connections function -- = --- Let's ensure to close all the existing cached connections. -SELECT 1 FROM postgres_fdw_disconnect_all(); - ?column? --- -1 -(1 row) - -- No cached connections, so no records should be output. -SELECT server_name FROM postgres_fdw_get_connections() ORDER BY 1; - server_name -- +SELECT * FROM postgres_fdw_get_connections() ORDER BY 1; + server_name | valid +-+--- (0 rows) -- This test case is for closing the connection in pgfdw_xact_callback @@ -9284,11 +9285,11 @@ SELECT 1 FROM ft7 LIMIT 1; -- List all the existing cached connections. loopback and loopback3 should be -- output. -SELECT server_name FROM postgres_fdw_get_connections() ORDER BY 1; - server_name -- - loopback - loopback3 +SELECT * FROM postgres_fdw_get_connections() ORDER BY 1; + server_name | valid +--
compatibility issue - problem with migrating from Postgres 11
Hi

my customer reported an issue related to the unsupported TABLESPACE pg_default for partitioned tables:

postgres=# CREATE TABLE IF NOT EXISTS foo2
(
    data bytea,
    guid character varying(255) COLLATE pg_catalog."default" NOT NULL,
    part date NOT NULL,
    retention_period integer,
    CONSTRAINT document_data_bytea_pkey1 PRIMARY KEY (guid, part)
) PARTITION BY RANGE (part)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;
ERROR:  cannot specify default tablespace for partitioned relations

This check is two years old:
https://github.com/postgres/postgres/commit/87259588d0ab0b8e742e30596afa7ae25caadb18#diff-f2c91c95b7f2a84d916138e0af4338859803a03cee0d7e2e710fbcb869c59d0c

Are there any plans to fix this issue?

Regards

Pavel
Re: PG 14 release notes, first draft
Hi, Bruce!

On Mon, May 10, 2021 at 9:03 AM Bruce Momjian wrote:
> I have committed the first draft of the PG 14 release notes.  You can
> see the most current build of them here:
>
> https://momjian.us/pgsql_docs/release-14.html
>
> I need clarification on many items, and the document still needs its
> items properly ordered, and markup added.  I also expect a lot of
> feedback.
>
> I plan to work on completing this document this coming week in
> preparation for beta next week.

Thank you very much for your work! Let me provide the missing descriptions for the items related to me.

* Improve handling of compound words in to_tsquery() and websearch_to_tsquery() (Alexander Korotkov)

Compound words are now transformed into parts connected with phrase search operators. For example, to_tsquery('pg_class') becomes 'pg <-> class' instead of 'pg & class'. This eliminates a bug in the handling of compound words connected with the phrase operator, and makes searches for compound words stricter.

* Fix extra distance in phrase operators for quoted text in websearch_to_tsquery() (Alexander Korotkov)

For example, websearch_to_tsquery('english', '"aaa: bbb"') becomes 'aaa <-> bbb' instead of 'aaa <2> bbb'.

Feel free to make stylistic and other corrections if needed.

--
Regards,
Alexander Korotkov
Re: [PATCH] Full support for index LP_DEAD hint bits on standby
Hello, Antonin.

> Sorry, I missed the fact that your example can be executed inside BEGIN - END
> block, in which case minRecoveryPoint won't advance after each command.

No, the block is not executed as a single transaction; all the commands are separate transactions (see below).

> Actually I think that a commit record should be replayed
> more often than XLOG_RUNNING_XACTS, shouldn't it?

Yes, but replaying commit records DOES NOT affect minRecoveryPoint in almost all cases.

UpdateMinRecoveryPoint is called by XLogFlush, but xact_redo_commit calls XLogFlush only in two cases:
* DropRelationFiles is called (some relations are dropped)
* ForceSyncCommit was used on the primary - a few “heavy” commands, like DropTableSpace, CreateTableSpace, movedb, etc.

But a “regular” commit record is replayed without XLogFlush and, as a result, without UpdateMinRecoveryPoint. So, in practice, minRecoveryPoint is updated in an “async” way by the checkpoint job. This is why it makes sense to call UpdateMinRecoveryPoint on XLOG_RUNNING_XACTS.

Thanks,
Michail.
Re: Enhanced error message to include hint messages for redundant options error
On Mon, May 10, 2021 at 6:00 AM houzj.f...@fujitsu.com wrote: > > > > > > > > > Thanks! The v5 patch looks good to me. Let's see if all > > > > > > > > agree on the goto duplicate_error approach which could reduce > > the LOC by ~80. > > > > > > > > > > > > > > I think the "goto duplicate_error" approach looks good, it > > > > > > > avoids duplicating the same error code multiple times. > > > > > > > > > > > > Thanks. I will mark the v5 patch "ready for committer" if no one has > > comments. > > > > > > > > > > Hi, > > > > > > > > > > I looked into the patch and noticed a minor thing. > > > > > > > > > > + return; /* keep compiler quiet */ > > > > > } > > > > > > > > > > I think we do not need the comment here. > > > > > The compiler seems not require "return" at the end of function > > > > > when function's return type is VOID. > > > > > > > > > > In addition, it seems better to remove these "return;" like what > > > > > commit "3974c4" did. > > > > > > > > It looks like that commit removed the plain return statements for a > > > > void returning functions. I see in the code that there are return > > > > statements that are there right after ereport(ERROR, just to keep > > > > the compiler quiet. Here in this patch also, we have return; > > > > statements after ereport(ERROR, for void returning functions. I'm > > > > not sure removing them would cause some compiler warnings on some > > > > platforms with some other compilers. If we're not sure, I'm okay to > > > > keep those return; statements. Thoughts? > > > > > > I felt we could retain the return statement and remove the comments. > > > If you are ok with that I will modify and post a patch for it. > > > Thoughts? > > > > I would like to keep it as is i.e. both return statement and /* keep > > compiler > > quiet */ comment. Having said that, it's better to leave it to the > > committer on > > whether to have the return statement at all. 
> > Yes, it's better to leave it to the committer on whether to have the > "return;". > But, I think at least removing "return;" which is at the *end* of the > function will not cause any warning. > Such as: > > + return; /* keep compiler quiet */ > } > > So, I'd vote for at least removing the comment " keep the compiler quiet ". That sounds fine to me, Attached v6 patch which has the changes for the same. Regards, Vignesh From e4db135ebba067c5ae0a53489acf687e4c6a6f33 Mon Sep 17 00:00:00 2001 From: vignesh Date: Mon, 26 Apr 2021 18:40:36 +0530 Subject: [PATCH v6] Enhance error message. Enhanced error message, so that the user can easily identify the error. --- contrib/file_fdw/file_fdw.c | 20 +-- src/backend/catalog/aclchk.c| 22 ++-- src/backend/commands/copy.c | 66 -- src/backend/commands/dbcommands.c | 90 +- src/backend/commands/extension.c| 27 ++-- src/backend/commands/foreigncmds.c | 30 +++-- src/backend/commands/functioncmds.c | 57 + src/backend/commands/publicationcmds.c | 39 +++--- src/backend/commands/sequence.c | 57 +++-- src/backend/commands/subscriptioncmds.c | 75 +-- src/backend/commands/tablecmds.c| 2 +- src/backend/commands/typecmds.c | 38 +++--- src/backend/commands/user.c | 131 +++- src/backend/parser/parse_utilcmd.c | 2 +- src/backend/replication/pgoutput/pgoutput.c | 31 +++-- src/backend/replication/walsender.c | 23 ++-- src/backend/tcop/utility.c | 20 +-- src/include/commands/defrem.h | 6 +- src/include/commands/publicationcmds.h | 4 +- src/include/commands/subscriptioncmds.h | 4 +- src/include/commands/typecmds.h | 2 +- src/include/commands/user.h | 2 +- src/test/regress/expected/copy2.out | 24 ++-- src/test/regress/expected/foreign_data.out | 8 +- src/test/regress/expected/publication.out | 4 +- 25 files changed, 353 insertions(+), 431 deletions(-) diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 2c2f149fb0..32398c227f 100644 --- a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -200,6 +200,7 @@ 
file_fdw_validator(PG_FUNCTION_ARGS) char *filename = NULL; DefElem*force_not_null = NULL; DefElem*force_null = NULL; + DefElem*def; List *other_options = NIL; ListCell *cell; @@ -209,7 +210,7 @@ file_fdw_validator(PG_FUNCTION_ARGS) */ foreach(cell, options_list) { - DefElem*def = (DefElem *) lfirst(cell); + def = (DefElem *) lfirst(cell); if (!is_valid_option(def->defname, catalog)) { @@ -290,10 +291,7 @@ file_fdw_validator(PG_FUNCTION_ARGS) else if (strcmp(def->defname, "force_not_null") == 0) { if (force_not_null) -ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), -
Re: row filtering for logical replication
On Mon, May 10, 2021, at 5:19 AM, Peter Smith wrote:
> AFAIK this is the latest patch available, but FYI it no longer applies
> cleanly on HEAD.

Peter, the last patch has been broken since f3b141c4825. I'm still working on it for the next CF. I have already addressed the points suggested by Amit in his last review; however, I'm still working on a cache for evaluating expressions, as suggested by Andres. I hope to post a new patch soon.

--
Euler Taveira
EDB   https://www.enterprisedb.com/
Re: Remove "FROM" in "DELETE FROM" when using tab-completion
On Mon, May 10, 2021 at 5:57 PM tanghy.f...@fujitsu.com wrote:
>
> On Monday, May 10, 2021 4:15 PM, Julien Rouhaud wrote
> >We should change all to DELETE FROM (apart from \help of course), and same
> >for
> >INSERT, change to INSERT INTO everywhere it makes sense.
>
> Thanks for the reply. Your advice sounds reasonable to me.
> So I tried to change all "DELETE" to "DELETE FROM" and "INSERT" to "INSERT INTO" in the attached patch, except for
> the following cases, which I think are in accordance with what the PG docs say.
> CREATE POLICY
> CREATE [ OR REPLACE ] RULE
> CREATE [ OR REPLACE ] TRIGGER
> ALTER DEFAULT PRIVILEGES
>
> After applying the patch, the TAP tests for psql pass.
> Please feel free to tell me about anything insufficient you find in my fix. Thanks.

LGTM.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: Corrected documentation of data type for the logical replication message formats.
On Sun, May 9, 2021 at 6:54 PM Euler Taveira wrote:
>
> On Sun, May 9, 2021, at 9:37 AM, vignesh C wrote:
>
> For some of the logical replication messages the data type documented
> was not correct, especially for lsn and xid. For lsn the actual datatype
> used is uint64 but it is documented as int64; similarly for xid, the datatype
> used is uint32 but it is documented as int32.
> Attached is a patch which has the fix for the same.
> Thoughts?
>
> There was a discussion [1] a few months ago about it. Read the Message Data
> Types definition [2]. It is confusing that an internal data type (int64) has a
> name similar to a generic data type in a protocol definition. As I said [1] we
> should probably inform that that piece of information (LSN) is a XLogRecPtr.
> Since this chapter is intended for developers, I think it is fine to include
> such internal detail.

I agree with specifying the actual datatypes, like XLogRecPtr for lsn, TimestampTz for timestamp, TransactionId for xid and Oid for the object id. Attached is a v2 patch changed along those lines. Thoughts?

Regards,
Vignesh

From e82378313f955699375f15f539a5267eb756b313 Mon Sep 17 00:00:00 2001 From: vignesh Date: Sun, 9 May 2021 17:57:08 +0530 Subject: [PATCH v2] Corrected data type for the logical replication message formats. Corrected data type for the logical replication message formats.
--- doc/src/sgml/protocol.sgml | 74 +++--- 1 file changed, 37 insertions(+), 37 deletions(-) diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 2f4dde3beb..bfc29a25f5 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -6403,7 +6403,7 @@ Begin -Int64 +XLogRecPtr @@ -6413,7 +6413,7 @@ Begin -Int64 +TimestampTz @@ -6424,7 +6424,7 @@ Begin -Int32 +TransactionId @@ -6458,7 +6458,7 @@ Message -Int32 +TransactionId @@ -6469,7 +6469,7 @@ Message -Int8 +Uint8 @@ -6480,7 +6480,7 @@ Message -Int64 +XLogRecPtr @@ -6534,7 +6534,7 @@ Commit -Int8 +Uint8(0) @@ -6544,7 +6544,7 @@ Commit -Int64 +XLogRecPtr @@ -6554,7 +6554,7 @@ Commit -Int64 +XLogRecPtr @@ -6564,7 +6564,7 @@ Commit -Int64 +TimestampTz @@ -6599,7 +6599,7 @@ Origin -Int64 +XLogRecPtr @@ -6648,7 +6648,7 @@ Relation -Int32 +TransactionId @@ -6659,7 +6659,7 @@ Relation -Int32 +Oid @@ -6702,7 +6702,7 @@ Relation -Int16 +Uint16 @@ -6715,7 +6715,7 @@ Relation -Int8 +Uint8 @@ -6736,7 +6736,7 @@ Relation -Int32 +Oid @@ -6780,7 +6780,7 @@ Type -Int32 +TransactionId @@ -6791,7 +6791,7 @@ Type -Int32 +Oid @@ -6845,7 +6845,7 @@ Insert -Int32 +TransactionId @@ -6856,7 +6856,7 @@ Insert -Int32 +Oid @@ -6912,7 +6912,7 @@ Update -Int32 +TransactionId @@ -6923,7 +6923,7 @@ Update -Int32 +Oid @@ -7026,7 +7026,7 @@ Delete -Int32 +TransactionId @@ -7037,7 +7037,7 @@ Delete -Int32 +Oid @@ -7115,7 +7115,7 @@ Truncate -Int32 +TransactionId @@ -7136,7 +7136,7 @@ Truncate -Int8 +Uint8 @@ -7147,7 +7147,7 @@ Truncate -Int32 +Oid @@ -7193,7 +7193,7 @@ Stream Start -Int32 +TransactionId @@ -7203,7 +7203,7 @@ Stream Start -Int8 +Uint8 @@ -7262,7 +7262,7 @@ Stream Commit -Int32 +TransactionId @@ -7272,7 +7272,7 @@ Stream Commit -Int8 +Uint8(0) @@ -7282,7 +7282,7 @@ Stream Commit -Int64 +XLogRecPtr @@ -7292,7 +7292,7 @@ Stream Commit -Int64 +XLogRecPtr @@ -7302,7 +7302,7 @@ Stream Commit -Int64 +TimestampTz @@ -7337,7 +7337,7 @@ Stream Abort -Int32 +TransactionId @@ -7347,7 +7347,7 @@ Stream Abort 
-Int32 +TransactionId @@ -7382,7 +7382,7 @@ TupleData -Int16 +Uint16 -- 2.25.1
Re: Corrected documentation of data type for the logical replication message formats.
On Sun, May 9, 2021 at 6:44 PM Peter Smith wrote: > > On Sun, May 9, 2021 at 10:38 PM vignesh C wrote: > > > > Hi, > > > > For some of the logical replication messages the data type documented > > was not correct, especially for lsn and xid. For lsn actual datatype > > used is uint64 but is documented as int64, similarly for xid, datatype > > used is uint32 but documented as int32. > > Attached is a patch which has the fix for the same. > > Thoughts? > > If you want to do this then there are more - e.g. Flags should be > Uint8 instead of Int8. Thanks for the comments. I have made this change in v2 patch posted at [1]. This also includes the fix to specify uint8(0) at appropriate places. [1] - https://www.postgresql.org/message-id/CALDaNm2G_BJ9G%3DCxy9A6ht-TXPn4nB8W9_BcawuA1uxsNvoWfQ%40mail.gmail.com Regards, Vignesh
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 15:03 Bruce Momjian wrote:
>
> I have committed the first draft of the PG 14 release notes.  You can
> see the most current build of them here:
>
> https://momjian.us/pgsql_docs/release-14.html
>
> I need clarification on many items, and the document still needs its
> items properly ordered, and markup added.  I also expect a lot of
> feedback.
>
> I plan to work on completing this document this coming week in
> preparation for beta next week.

This misses the change of the default value, and is a bit unclear:

> Remove password_encryption's support for boolean values, e.g. true (Peter Eisentraut)
>
> Previous boolean values enabled md5. Now, only the md5 string does this.

I'd suggest something along these lines:

> The default for password_encryption is now "scram-sha-256" (Peter Eisentraut)
>
> The pseudo-boolean values "true", "on", "yes" and "1" are no longer accepted
> as an alias for "md5". (It hasn't been a true boolean setting since Pg 9.6.)

Regards

Ian Barwick

--
EnterpriseDB: https://www.enterprisedb.com
Re: [PATCH] Full support for index LP_DEAD hint bits on standby
Michail Nikolaev wrote: > > Sorry, I missed the fact that your example can be executed inside BEGIN - > > END > > block, in which case minRecoveryPoint won't advance after each command. > > No, the block is not executed as a single transaction, all commands > are separated transactions (see below) > > > Actually I think that a commit record should be replayed > > more often than XLOG_RUNNING_XACTS, shouldn't it? > > Yes, but replaying commit records DOES NOT affect minRecoveryPoint in > almost all cases. > > UpdateMinRecoveryPoint is called by XLogFlush, but xact_redo_commit > calls XLogFlush only in two cases: > * DropRelationFiles is called (some relation are dropped) > * If ForceSyncCommit was used on primary - few “heavy” commands, like > DropTableSpace, CreateTableSpace, movedb, etc. > > But “regular” commit record is replayed without XLogFlush and, as > result, without UpdateMinRecoveryPoint. ok, I missed this. Thanks for explanation. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: PG 14 release notes, first draft
Thanks for putting it together.

I think these two should be merged:
| Remove containment operators @ and ~ from contrib modules cube, hstore, intarray, and seg (Justin Pryzby)
| Remove deprecated containment operators for built-in geometry data types (Justin Pryzby)

| Improve autovacuum's analyze of partitioned tables (Yuzuko Hosoya)
| DETAILS?
Should say: Autovacuum now analyzes partitioned tables.

| The server variable check_client_connection_interval allows supporting operating systems, e.g., Linux, to automatically cancel queries by disconnected clients.
The GUC is actually called client_connection_check_interval - the commit message used the wrong name.

| This is particularly helpful for reducing index bloat on tables that frequently update indexed columns.
Does it mean "..where indexed columns are frequently updated"?

| Allow multiple foreign table scans to be run in parallel (Robert Haas, Kyotaro Horiguchi, Thomas Munro, Etsuro Fujita)
I think it means multiple foreign table scan *nodes*

| If server variable compute_query_id is enabled, display the hash in pg_stat_activity, EXPLAIN VERBOSE, csvlog, and optionally in log_line_prefix (Julien Rouhaud)
I think it needs details, like: "If disabled, then the hash might be computed by an extension, instead". Later, you say:
| Extension pg_stat_statements will need to enable hash computation via the compute_query_id server variable to function properly. pg_stat_statements can now use a custom hash computation method.
Maybe it should say "will need hash computation to be enabled".

| Allow more than the common name (CN) to be matched for client certificate authentication (Andrew Dunstan)
Your description makes it sound like arbitrary attributes can be compared. But the option just allows comparing CN or DN.
| Allow file system sync at the start of crash recovery on Linux (Thomas Munro)
I think this should describe the existing, default behavior: Allow the syncfs method to sync the data directory during recovery; the default behavior is to open and fsync every data file, and the new setting recovery_init_sync_method=syncfs instead syncs each filesystem in the data directory.

| Add date_bin function (John Naylor)
This truncates timestamps on an arbitrary interval. Like date_trunc(), but it also supports e.g. '15 minutes', and also uses an arbitrary "origin".

| Support negative indexes in split_part() (Nikhil Benesch)
| Negative values count from the last field going forward.
Should say "start from the last field and count backward"?

| Add configure option --with-openssl to behave like --with-ssl={openssl} (Daniel Gustafsson, Michael Paquier)
| The option --with-openssl is kept for compatibility.
I think this is backwards. The new option is with-ssl=openssl, and (as you said) with-openssl is kept.

Should these be in the "compatibility" section?
| Force custom server variable names to match the pattern used for unquoted SQL identifiers (Tom Lane)
| Change password_encryption's default to scram-sha-256 (Peter Eisentraut)
| Change checkpoint_completion_target default to 0.9 (Stephen Frost)
| Reduce the default value of vacuum_cost_page_miss (Peter Geoghegan)

Nitpicks to follow:

| Allow some GiST index to be built by presorting the data (Andrey Borodin)
indexes

| with --with-lz4 support to enable this feature
I would change to say "to support" rather than "support to enable"

| Speed truncation of small tables on large shared buffer servers (Kirk Jamison)
"on servers with large settings of shared_buffers"

| Allow windowing functions to perform incremental sorts (David Rowley)
Just "window" functions

| Improve pg_stat_activity reporting for walsenders processes (Tom Lane)
walsender

| Previously these functions could only be executed by super-users, and still defaults do that.
..which is still the default behavior.

| This allows multiple queries to be send and only wait for completion when a specific synchronization message is sent.
be sent

| Enhance libpq libpq's target_session_attrs parameter options (Haribabu Kommi, Greg Nancarrow, Vignesh C, Tom Lane)
remove first "libpq"

| With the removal of the ! operator in this release, factorial() is the only built-in way to computer a factorial.
compute

| For example, GROUP BY CUBE (a,b), CUBE (b,c) will generated duplicate grouping combinations without DISTINCT.
will generate

| Allow VACUUM VERBOSE to report page deletion counts for each scan of an index (Peter Geoghegan)
I think "Allow" is wrong - should just say that VACUUM VERBOSE reports..

|By default, only the root of partitioned tables are imported.
*is* imported

Can these be merged:
Allow logical replication to stream long transactions to standbys (Dilip Kumar, Tomas Vondra, Amit Kapila, Nikhil Sontakke)
Improve the logical replication API to allow streaming large in-progress transactions (Tomas Vondra, Dilip Kumar, Amit Kapila)
Re: Inaccurate error message when set fdw batch_size to 0
Bharath Rupireddy writes: > On Mon, May 10, 2021 at 12:00 PM Tom Lane wrote: >> Yeah, this error message seems outright buggy. However, it's a minor >> matter. Also, some people think "positive" is the same thing as >> "non-negative", so maybe we need less ambiguous wording? > Since value 0 can't be considered as either a positive or negative > integer, I think we can do as following(roughly): > if (value < 0) "requires a zero or positive integer value" > if (value <= 0) "requires a positive integer value" I was thinking of avoiding the passive voice and writing "foo must be greater than zero" which removes all doubt. It's not necessary to keep the "integer" aspect of the existing text, because if someone had supplied a non-integer value, that would not have gotten this far anyway. > I'm not sure whether we should consider changing these messages: > remainder for hash partition must be a non-negative integer > parallel vacuum degree must be a non-negative integer > repeat count size must be a non-negative integer > number of workers must be a non-negative integer > %s requires a non-negative numeric value > distance in phrase operator should be non-negative and less than %d I think for consistency it'd be good to change 'em all. I'm almost tempted to put this matter into our message style guide too. regards, tom lane
Re: GetSubscriptionRelations declares too many scan keys
Julien Rouhaud writes:
> On Mon, May 10, 2021 at 07:09:29PM +1000, Peter Smith wrote:
>> Please search PG source code for "ScanKeyData skey[1];" - there are
>> dozens of precedents where other people felt the same as me for
>> declaring single keys.
> AFAICT there are 73 occurrences vs 62 of the "ScanKeyData skey;". I don't think
> there's a huge consensus for one over the other.

Yeah, there's no real consensus about that. But in this case there's a strong reason to use skey[1]: it makes the patch a very safe one-liner. To convert to the other pattern would require touching more code.

regards, tom lane
pg_stat_statements requires compute_query_id
Hi

I tested features of Postgres 14. The extension pg_stat_statements didn't work for me until I enabled compute_query_id. Is this the expected behaviour? I expected just an empty query_id column and a working extension. This doesn't look right.

Moreover, it slightly increases the complexity of migration to Postgres 14.

Regards

Pavel
Re: PG 14 release notes, first draft
Same as the last couple years, I checked for missing items in the release notes, running something like this. git log --cherry-pick --oneline origin/REL_13_STABLE...origin/master Should any of these be included? f82de5c46b Do COPY FROM encoding conversion/verification in larger chunks. 9e596b65f4 Add "LP_DEAD item?" column to GiST pageinspect functions 10a5b35a00 Report resource usage at the end of recovery 7e453634bb Add additional information in the vacuum error context. 1ea396362b Improve logging of bad parameter values in BIND messages. 86dc90056d Rework planning and execution of UPDATE and DELETE. a1115fa078 Postpone some more stuff out of ExecInitModifyTable. c5b7ba4e67 Postpone some stuff out of ExecInitModifyTable. 7db0cd2145 Set PD_ALL_VISIBLE and visibility map bits in COPY FREEZE 01e658fa74 Hash support for row types a929e17e5a Allow run-time pruning on nested Append/MergeAppend nodes 8c15a29745 Allow ALTER TYPE to update an existing type's typsubscript value. c7aba7c14e Support subscripting of arbitrary types, not only arrays. 7b94e99960 Remove catalog function currtid() 926fa801ac Remove undocumented IS [NOT] OF syntax. cd9c1b3e19 Rename PGPROC->vacuumFlags to statusFlags a04daa97a4 Remove es_result_relation_info from EState. 3d351d916b Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE. fea10a6434 Rename VariableCacheData.nextFullXid to nextXid. 9de9294b0c Stop archive recovery if WAL generated with wal_level=minimal is found. (see also 15251c0a6) f40c6969d0 Routine usage information schema tables b4af70cb21 Simplify state managed by VACUUM. 4753ef37e0 Use a WaitLatch for vacuum/autovacuum sleeping 9dd963ae25 Recycle nbtree pages deleted during same VACUUM. 3c3b8a4b26 Truncate line pointer array during VACUUM. ad1c36b070 Fix foreign-key selectivity estimation in the presence of constants.
Re: SQL-standard function body
On 27.04.21 18:16, Tom Lane wrote:
> That's kind of a lot of complication, and inefficiency, for a corner case
> that may never arise in practice. We've ignored the risk for default
> expressions, and AFAIR have yet to receive any field complaints about it.
> So maybe it's okay to do the same for SQL-style function bodies, at least
> for now.
>> Another option would be that we disallow this at creation time.
> Don't like that one much. The backend shouldn't be in the business of
> rejecting valid commands just because pg_dump might be unable to cope later.

Since this is listed as an open item, I want to clarify that I'm currently not planning to work on this, based on this discussion. Certainly something to look into sometime later, but it's not in my plans right now.
Re: pg_stat_statements requires compute_query_id
Hi Pavel, On Mon, May 10, 2021 at 04:36:16PM +0200, Pavel Stehule wrote: > > I tested features of Postgres 14. The extension pg_stat_statements didn't > work to me until I enabled compute_query_id. Is it expected behaviour? Yes. > I expected just an empty column query_id and workable extension. This > doesn't look well. > > More, it increases the (little bit) complexity of migration to Postgres 14. This was already raised multiple times, and the latest discussion can be found at [1]. Multiple options have been suggested, but AFAICT there isn't a clear consensus on what we should do exactly, so I've not been able to send a fix yet. [1]: https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
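To make the required steps concrete for anyone testing the betas: at this point in the v14 cycle, getting pg_stat_statements to collect statistics needs both of the settings sketched below (setting names as in the v14 documentation; shared_preload_libraries still requires a server restart):

```ini
# postgresql.conf -- what pg_stat_statements needs on the 14 betas
shared_preload_libraries = 'pg_stat_statements'  # same as previous releases; needs a restart
compute_query_id = on                            # new in 14; without it the extension
                                                 # collects nothing, which is the
                                                 # behaviour Pavel reports
```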
Re: PG 14 release notes, first draft
On Sun, May 9, 2021 at 11:03 PM Bruce Momjian wrote:
> I have committed the first draft of the PG 14 release notes.

This definitely isn't necessary, since the commit in question was a totally mechanical thing that cleaned up a minor inconsistency:

  Initialize work_mem and maintenance_work_mem using current guc.c default (Peter Geoghegan)

  Oversight in commit 848ae330a49, which increased the previous defaults for work_mem and maintenance_work_mem by 4X.

  IS THIS A BEHAVIORAL CHANGE?

-- Peter Geoghegan
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 4:44 AM Matthias van de Meent wrote: > I noticed that the improvement in bloat control in the HeapAM that I > know of (3c3b8a4b, 0ff8bbde) weren't documented here. Although each > can be considered minor, they together can decrease the bloating > behaviour of certain workloads significantly (and limit the total > damage), and in my opinion this should be mentioned. > > 3c3b8a4b: Returns space claimed for the line pointer array back to the > page's empty space, so that it can also be used for tuple data. > > 0ff8bbde: Allows large tuples to be inserted on pages which have only > a small amount of data, regardless of fillfactor. +1 on mentioning both things. -- Peter Geoghegan
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 7:00 AM Justin Pryzby wrote: > | Allow VACUUM VERBOSE to report page deletion counts for each scan of an > index (Peter Geoghegan) > > I think "Allow" is wrong - should just say that VACUUM VERBOSE reports.. It's also not accurate, since the count of deleted pages was always shown by VACUUM VERBOSE (once per index scan). The new feature has us show pages deleted by the VACUUM that actually ran (not some earlier VACUUM) -- these are "newly deleted pages". I don't think that this item is worth mentioning, though -- it's just a nice to have. If Bruce removes it from the release notes entirely I won't object. In addition to the items that I commented on in my response to Matthias just now, I should point out the following item as worthy of inclusion: 9dd963ae25 Recycle nbtree pages deleted during same VACUUM. I suggest that this item be phrased more or less as follows: "Allow VACUUM to eagerly place newly deleted B-Tree pages in the Free Space Map. Previously VACUUM could only place preexisting deleted pages in the Free Space Map for recycling." -- Peter Geoghegan
Re: SQL-standard function body
Peter Eisentraut writes: > On 27.04.21 18:16, Tom Lane wrote: >> That's kind of a lot of complication, and inefficiency, for a corner case >> that may never arise in practice. We've ignored the risk for default >> expressions, and AFAIR have yet to receive any field complaints about it. >> So maybe it's okay to do the same for SQL-style function bodies, at least >> for now. >>> Another option would be that we disallow this at creation time. >> Don't like that one much. The backend shouldn't be in the business >> of rejecting valid commands just because pg_dump might be unable >> to cope later. > Since this is listed as an open item, I want to clarify that I'm > currently not planning to work on this, based on this discussion. > Certainly something to look into sometime later, but it's not in my > plans right now. Right, I concur with moving it to the "won't fix" category. regards, tom lane
Re: [PATCH] Full support for index LP_DEAD hint bits on standby
Michail Nikolaev wrote:
> After some correspondence with Peter Geoghegan (1) and his ideas, I
> have reworked the patch a lot and now it is much more simple with even
> better performance (no new WAL or conflict resolution, hot standby
> feedback is unrelated).

My review that started in [1] continues here. (Please note that code.patch does not apply to the current master branch.)

I think I understand your approach now and couldn't find a problem by reading the code. What I consider worth improving is the documentation, both the code comments and nbtree/README. Especially for the problem discussed in [1], it should be explained what would happen if kill_prior_tuple_min_lsn were not checked.

Also, in IsIndexLpDeadAllowed() you say that an invalid deadness->latest_removed_xid means the following:

/*
 * Looks like it is tuple cleared by heap_page_prune_execute,
 * we must be sure if LSN of XLOG_HEAP2_CLEAN (or any subsequent
 * updates) less than minRecoveryPoint to avoid MVCC failure
 * after crash recovery.
 */

However I think there's one more case: heap_hot_search_buffer() considers all tuples in the chain to be "surely dead", but HeapTupleHeaderAdvanceLatestRemovedXid() skips them all for this reason:

/*
 * Ignore tuples inserted by an aborted transaction or if the tuple was
 * updated/deleted by the inserting transaction.
 *
 * Look for a committed hint bit, or if no xmin bit is set, check clog.
 */

I think that the dead tuples produced this way should never be visible on the standby (and even if they were, they would change the page LSN, so your algorithm would treat them correctly), so I see no correctness problem. But it might be worth explaining better the meaning of an invalid "latest_removed_xid" in the comments.

In the nbtree/README, you say "... if the commit record of latestRemovedXid is more ..." but it's not clear to me what "latestRemovedXid" is. If you mean the scan->kill_prior_tuple_min_lsn field, you probably need more words to explain it.
* IsIndexLpDeadAllowed()

  /* It all always allowed on primary if *all_dead. */

  should probably be

  /* It is always allowed on primary if *all_dead. */

* gistkillitems()

  As the function is only called if (so->numKilled > 0), I think both "killedsomething" and "dirty" variables should always have the same value, so one variable should be enough. Assert(so->numKilled) would be appropriate in that case. The situation is similar for btree and hash indexes.

doc.patch:

"+applying the fill page write."

[1] https://www.postgresql.org/message-id/61470.1620647290%40antos

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Re: [Patch] ALTER SYSTEM READ ONLY
On Sun, May 9, 2021 at 1:26 AM Amul Sul wrote: > The state in the control file also gets cleared. Though, after > clearing in memory the state patch doesn't really do the immediate > change to the control file, it relies on the next UpdateControlFile() > to do that. But when will that happen? If you're relying on some very nearby code, that might be OK, but perhaps a comment is in order. If you're just thinking it's going to happen eventually, I think that's not good enough. > Regarding log message I think I have skipped that intentionally, to > avoid confusing log as "system is now read write" when we do start as > hot-standby which is not really read-write. I think the message should not be phrased that way. In fact, I think now that we've moved to calling this pg_prohibit_wal() rather than ALTER SYSTEM READ ONLY, a lot of messages need to be rethought, and maybe some comments and function names as well. Perhaps something like: system is read only -> WAL is now prohibited system is read write -> WAL is no longer prohibited And then for this particular case, maybe something like: clearing WAL prohibition because the system is in archive recovery > > The second part of this proposal was: > > > > "2. Create a new function with a name like XLogAcceptWrites(). Move the > > following things from StartupXLOG() into that function: (1) the call > > to UpdateFullPageWrites(), (2) the following block of code that does > > either CreateEndOfRecoveryRecord() or RequestCheckpoint() or > > CreateCheckPoint(), (3) the next block of code that runs > > recovery_end_command, (4) the call to XLogReportParameters(), and (5) > > the call to CompleteCommitTsInitialization(). Call the new function > > from the place where we now call XLogReportParameters(). This would > > mean that (1)-(3) happen later than they do now, which might require > > some adjustments." 
> > Now you moved that code, but you also moved (6)
> > CompleteCommitTsInitialization(), (7) setting the control file to
> > DB_IN_PRODUCTION, (8) setting the state to RECOVERY_STATE_DONE, and
> > (9) requesting a checkpoint if we were just promoted. That's not what
> > was proposed. One result of this is that the server now thinks it's in
> > recovery even after the startup process has exited.
> > RecoveryInProgress() is still returning true everywhere. But that is
> > inconsistent with what Andres and I were recommending in
> > http://postgr.es/m/CA+TgmoZYQN=rcye-ixwnjdvmaoh+7jaqsif3u2k8xqxipba...@mail.gmail.com
>
> Regarding the modified approach, I tried to explain why I did this in
> http://postgr.es/m/CAAJ_b96Yb4jaW6oU1bVYEBaf=tq-ql+mmt1exfwvnzer7xr...@mail.gmail.com

I am not able to understand what problem you are seeing there. If we're in crash recovery, then nobody can connect to the database, so there can't be any concurrent activity. If we're in archive recovery, we now clear the WAL-is-prohibited flag so that we will go read-write directly at the end of recovery. We can and should refuse any effort to call pg_prohibit_wal() during recovery. If we reached the end of crash recovery and are now permitting read-only connections, why would anyone be able to write WAL before the system has been changed to read-write? If that can happen, it's a bug, not a reason to change the design.

Maybe your concern here is about ordering: the process that is going to run XLogAcceptWrites() needs to allow xlog writes locally before we tell other backends that they can also write xlog; otherwise, some other records could slip in before UpdateFullPageWrites() and similar have run, which we probably don't want. But that's why LocalSetXLogInsertAllowed() was invented, and if it doesn't quite do what we need in this situation, we should be able to tweak it so it does. If your concern is something else, can you spell it out for me again, because I'm not getting it?
-- Robert Haas EDB: http://www.enterprisedb.com
Re: Why do we have perl and sed versions of Gen_dummy_probes?
Andrew Dunstan writes: > On 5/10/21 7:16 AM, Dagfinn Ilmari Mannsåker wrote: >> Peter Eisentraut writes: >>> This recipe doesn't produce a Gen_dummy_probes.pl that matches exactly >>> the one that is there now. If this is going to be the preferred method, >>> then we should generate it once so that it matches going forward. >> Which version of perltidy do you have installed? For me it generates >> identical versions using any of 20170521 (per src/tools/pgindent/README), >> 20201207 (what I happened to have installed before), and 20210402 (the >> latest). > Yep: For me, using App-s2p-1.003 and perltidy v20170521, it works as long as I start with the previous version of Gen_dummy_probes.pl in place. I first tried to test this by "rm Gen_dummy_probes.pl; make Gen_dummy_probes.pl", and what I got was a script without all the initial commentary nor the first line of actual Perl code. I don't think this is good practice; it implies that any accidental corruption of the commentary would be carried forward. I think we should be extracting the commentary from Gen_dummy_probes.sed. regards, tom lane
Re: [PATCH] Identify LWLocks in tracepoints
On 2021-05-09 19:51:13 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2021-05-08 13:13:47 -0400, Tom Lane wrote:
> >> (I wonder why flaviventris and serinus are still using an "experimental"
> >> compiler version that is now behind mainstream.)
> > The upgrade script didn't install the newer version because it had to
> > remove some conflicting packages... Should be fixed for runs starting
> > now.
> Looks like that didn't work ...

Looks like it did, but turned out to have some unintended side-effects :(. The snapshot builds are now new:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=flaviventris&dt=2021-05-10%2015%3A43%3A56&stg=configure

configure:3966: ccache /usr/lib/gcc-snapshot/bin/gcc --version >&5
gcc (Debian 20210421-1) 11.0.1 20210421 (prerelease) [gcc-11 revision fbb7739892e:d13ce34bd01:3756d99dab6a268d0d8a17583980a86f23f0595a]

But the aforementioned dependencies that needed to be removed broke the installed old versions of gcc/clang. I started to build the old versions of llvm manually, but that then hits the issue that at least 3.9 doesn't build with halfway modern versions of gcc/clang. So I gotta do it stepwise (i.e. go backwards, build llvm n-2 with n-1), will take a bit of time.

Greetings, Andres Freund
Re: [PATCH] Identify LWLocks in tracepoints
Andres Freund writes: > Looks like it did, but turned out to have some unintended side-effects > :(. > The snapshot builds are now new: > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=flaviventris&dt=2021-05-10%2015%3A43%3A56&stg=configure > configure:3966: ccache /usr/lib/gcc-snapshot/bin/gcc --version >&5 > gcc (Debian 20210421-1) 11.0.1 20210421 (prerelease) [gcc-11 revision > fbb7739892e:d13ce34bd01:3756d99dab6a268d0d8a17583980a86f23f0595a] > But the aforementioned dependencies that needed to remove broke the > installed old versions of gcc/clang. > I started to build the old versions of llvm manually, but that then hits > the issue that at least 3.9 doesn't build with halfway modern versions > of gcc/clang. So I gotta do it stepwise (i.e. go backwards, build llvm > n-2 with n-1), will take a bit of time. Ugh. Memo to self: don't rag on other peoples' buildfarm configurations right before a release deadline :-(. Sorry to cause you trouble. regards, tom lane
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 08:16:16AM +0200, Laurenz Albe wrote:
> On Mon, 2021-05-10 at 02:03 -0400, Bruce Momjian wrote:
> > When using \e in psql, if the buffer is not modified by the editor, ignore
> > the editor contents and leave the buffer unchanged (Laurenz Albe)
> > The \ef and \ev commands also now have this behavior. DOCS SAY BUFFER IS
> > CLEARED.
>
> It's a bit more complicated: If you edit the current buffer with \e, the buffer is
> unchanged if you quit the editor.
> However, if you edit the previous statement, a file or the definition of a function
> or view, the query buffer is cleared if you quit the editor without saving.
>
> Suggested wording:
>
> When editing anything else than the current query buffer with \e, and you quit
> the editor, the query buffer is cleared. This makes the behavior less surprising
> and prevents the unintended re-execution of the previous statement.

OK, I figured it out. I was confused by \p because \? says:

test=> \?
Query Buffer
      \e [FILE] [LINE]       edit the query buffer (or file) with external editor
      \ef [FUNCNAME [LINE]]  edit function definition with external editor
      \ev [VIEWNAME [LINE]]  edit view definition with external editor
  --> \p                     show the contents of the query buffer
      \r                     reset (clear) the query buffer

... but the documentation says:

    \p or \print
        Print the current query buffer to the standard output. If
    --> the current query buffer is empty, the most recently executed
    --> query is printed instead.

I wasn't aware that \e loads the previous query if the buffer is empty. I came up with this release note text:

When editing the previous query or a file with psql's \e, ignore the contents if the editor exits without saving (Laurenz Albe)

Previously, editing the previous query or a file and not saving the editor contents would still execute the editor contents. The \ef and \ev commands also now have this behavior.
-- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: [PATCH] Identify LWLocks in tracepoints
Hi,

On 2021-05-10 12:14:46 -0400, Tom Lane wrote:
> Andres Freund writes:
> > Looks like it did, but turned out to have some unintended side-effects :(.
> > The snapshot builds are now new:
> > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=flaviventris&dt=2021-05-10%2015%3A43%3A56&stg=configure
> > configure:3966: ccache /usr/lib/gcc-snapshot/bin/gcc --version >&5
> > gcc (Debian 20210421-1) 11.0.1 20210421 (prerelease) [gcc-11 revision
> > fbb7739892e:d13ce34bd01:3756d99dab6a268d0d8a17583980a86f23f0595a]
> > But the aforementioned dependencies that needed to be removed broke the
> > installed old versions of gcc/clang.
> > I started to build the old versions of llvm manually, but that then hits
> > the issue that at least 3.9 doesn't build with halfway modern versions
> > of gcc/clang. So I gotta do it stepwise (i.e. go backwards, build llvm
> > n-2 with n-1), will take a bit of time.
> Ugh. Memo to self: don't rag on other peoples' buildfarm configurations
> right before a release deadline :-(. Sorry to cause you trouble.

No worries - I knew that I'd have to do this at some point, even though I hadn't planned to do that today... I should have all of them green before the end of today.

I found that I can actually build LLVM 3.9 directly, as clang-6 can still build it (whereas the oldest gcc still installed can't build it directly). So it's a bit less painful than I thought at first. The 3.9 instances (phycodurus, dragonet) tests are running right now, and I'm fairly sure they'll pass (most of a --noreport --nostatus run passed). Going forward, building LLVM 4, 5 and 6 now - the later versions take longer...

Greetings, Andres Freund
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 02:51:28PM +0800, Julien Rouhaud wrote: > On Mon, May 10, 2021 at 02:03:08AM -0400, Bruce Momjian wrote: > > I have committed the first draft of the PG 14 release notes. You can > > see the most current build of them here: > > > > https://momjian.us/pgsql_docs/release-14.html > > > > I need clarification on many items, and the document still needs its > > items properly ordered, and markup added. I also expect a lot of > > feedback. > > There's a small typo: > > +Improve tab completion (Vignesh C,, Michael [...] > > (duplicated comma) Fixed. > Also > > + > +Avoid retrieval of CHECK constraints and DEFAULT exprs in data-only dump > (Julien Rouhaud) > + > + > + > +IS THIS BACKWARD INCOMPATIBLE? > + > + > > The new behavior doesn't have any impact on the generated dump, as the > modification is to avoid retrieving data that won't be used. > > For users, it only means maybe slight faster pg_dump execution, or slightly > better change to be able to run a pg_dump --data-only if pg_constraint is > corrupted but not the rest of the user data, so maybe it's not necessary to > mention that in the release notes? Thanks, removed. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: [Patch] ALTER SYSTEM READ ONLY
On Mon, May 10, 2021 at 9:21 PM Robert Haas wrote: > > On Sun, May 9, 2021 at 1:26 AM Amul Sul wrote: > > The state in the control file also gets cleared. Though, after > > clearing in memory the state patch doesn't really do the immediate > > change to the control file, it relies on the next UpdateControlFile() > > to do that. > > But when will that happen? If you're relying on some very nearby code, > that might be OK, but perhaps a comment is in order. If you're just > thinking it's going to happen eventually, I think that's not good > enough. > Ok. > > Regarding log message I think I have skipped that intentionally, to > > avoid confusing log as "system is now read write" when we do start as > > hot-standby which is not really read-write. > > I think the message should not be phrased that way. In fact, I think > now that we've moved to calling this pg_prohibit_wal() rather than > ALTER SYSTEM READ ONLY, a lot of messages need to be rethought, and > maybe some comments and function names as well. Perhaps something > like: > > system is read only -> WAL is now prohibited > system is read write -> WAL is no longer prohibited > > And then for this particular case, maybe something like: > > clearing WAL prohibition because the system is in archive recovery > Ok, thanks for the suggestions. > > > The second part of this proposal was: > > > > > > "2. Create a new function with a name like XLogAcceptWrites(). Move the > > > following things from StartupXLOG() into that function: (1) the call > > > to UpdateFullPageWrites(), (2) the following block of code that does > > > either CreateEndOfRecoveryRecord() or RequestCheckpoint() or > > > CreateCheckPoint(), (3) the next block of code that runs > > > recovery_end_command, (4) the call to XLogReportParameters(), and (5) > > > the call to CompleteCommitTsInitialization(). Call the new function > > > from the place where we now call XLogReportParameters(). 
This would > > > mean that (1)-(3) happen later than they do now, which might require > > > some adjustments." > > > > > > Now you moved that code, but you also moved (6) > > > CompleteCommitTsInitialization(), (7) setting the control file to > > > DB_IN_PRODUCTION, (8) setting the state to RECOVERY_STATE_DONE, and > > > (9) requesting a checkpoint if we were just promoted. That's not what > > > was proposed. One result of this is that the server now thinks it's in > > > recovery even after the startup process has exited. > > > RecoveryInProgress() is still returning true everywhere. But that is > > > inconsistent with what Andres and I were recommending in > > > http://postgr.es/m/CA+TgmoZYQN=rcye-ixwnjdvmaoh+7jaqsif3u2k8xqxipba...@mail.gmail.com > > > > Regarding modified approach, I tried to explain that why I did > > this in > > http://postgr.es/m/CAAJ_b96Yb4jaW6oU1bVYEBaf=tq-ql+mmt1exfwvnzer7xr...@mail.gmail.com > > I am not able to understand what problem you are seeing there. If > we're in crash recovery, then nobody can connect to the database, so > there can't be any concurrent activity. If we're in archive recovery, > we now clear the WAL-is-prohibited flag so that we will go read-write > directly at the end of recovery. We can and should refuse any effort > to call pg_prohibit_wal() during recovery. If we reached the end of > crash recovery and are now permitting read-only connections, why would > anyone be able to write WAL before the system has been changed to > read-write? If that can happen, it's a bug, not a reason to change the > design. > > Maybe your concern here is about ordering: the process that is going > to run XLogAcceptWrites() needs to allow xlog writes locally before we > tell other backends that they also can xlog writes; otherwise, some > other records could slip in before UpdateFullPageWrites() and similar > have run, which we probably don't want. 
> But that's why LocalSetXLogInsertAllowed() was invented, and if it doesn't
> quite do what we need in this situation, we should be able to tweak it so it
> does.

Yes, we don't want any write to slip in before UpdateFullPageWrites(). Recently[1], we have decided to let the Checkpointer process call XLogAcceptWrites() unconditionally. Here the problem is that when a backend executes the pg_prohibit_wal(false) function to make the system read-write, the WAL prohibited state is set to in-progress (i.e. WALPROHIBIT_STATE_GOING_READ_WRITE) and then the Checkpointer is signaled. Next, the Checkpointer will convey this system change to all existing backends using a global barrier, and after that the final WAL prohibited state is set to read-write (i.e. WALPROHIBIT_STATE_READ_WRITE). While the Checkpointer is in the process of conveying this global barrier, any new backend can connect at that time and write a new record, because the in-progress read-write state is equivalent to the final read-write state iff LocalXLogInsertAllowed != 0 for that backend. And that new record could slip in before, or in between, the records to be written by XLogAcceptWrites().
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 05:28:24PM +0900, Masahiko Sawada wrote: > On Mon, May 10, 2021 at 3:03 PM Bruce Momjian wrote: > > > > I have committed the first draft of the PG 14 release notes. You can > > see the most current build of them here: > > > > https://momjian.us/pgsql_docs/release-14.html > > Thank you! > > > Add system view pg_stat_replication_slots to report replication slot > > activity (Sawada Masahiko, Amit Kapila) > > > > Function pg_stat_reset_replication_slot() resets slot statistics. THIS IS > > LOGICAL ONLY, BUT NO "LOGICAL" IN THE NAME? > > IIUC pg_stat_replication_slots view supports only logical slot for > now. But we might have it show also physical slot in the future. I'm > fine with the current view name and description but someone might want > to use "logical replication slot" instead of just "replication slot". OK, I was just confirming we are happy with the name. > > > IS "ACTIVITY" THE RIGHT WORD? > > The doc says "The pg_stat_replication_slots view will contain one row > per logical replication slot, showing statistics about its usage.". So > we can say "... to report replication slot statistics about its > usage". OK, I think I prefer "activity" so will just keep that. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: pg_stat_statements requires compute_query_id
On Mon, May 10, 2021 at 7:43 AM Julien Rouhaud wrote: > On Mon, May 10, 2021 at 04:36:16PM +0200, Pavel Stehule wrote: > > I expected just an empty column query_id and workable extension. This > > doesn't look well. > > > > More, it increases the (little bit) complexity of migration to Postgres 14. > > This was already raised multiple times, and the latest discussion can be found > at [1]. > > Multiple options have been suggested, but AFAICT there isn't a clear consensus > on what we should do exactly, so I've not been able to send a fix yet. > > [1]: > https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com Before it petered out, the thread seemed to be converging toward consensus that the situation could be improved. I work on pganalyze, and our product requires pg_stat_statements to be enabled for a lot of its functionality. We guide our users through enabling it, but if additional steps are required in 14, that may be confusing. I don't have any strong feelings on the particular mechanism that would work best here, but it would be nice if enabling pg_stat_statements in 14 did not require more work than in 13. Even if it's just one extra setting, it's another potential thing to get wrong and have to troubleshoot, plus it means all existing pg_stat_statements guides out there would become out of date. Thanks, Maciek
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 08:52:44PM +1200, David Rowley wrote: > On Mon, 10 May 2021 at 18:03, Bruce Momjian wrote: > > I need clarification on many items, and the document still needs its > > items properly ordered, and markup added. I also expect a lot of > > feedback. > > Thanks for drafting that up. > > > Add executor method to cache results from the inner-side of joins (David > > Rowley) > > I think it would be more accurate to say "inner-side of nested loop joins". OK, thanks. I suspected that was true. > > Allow efficient retrieval of heap rows via tid (Edmund Horner, David Rowley) > > I'd say we already had that feature with TID Scan. Maybe it would be > better to write: > > "Allow efficient heap scanning on ranges of tids (Edmund Horner, David > Rowley)" I went with: Allow efficient heap scanning of a range of tids (Edmund Horner, David Rowley) > > Improve the performance of parallel sequential scans (Thomas Munro, David > > Rowley) > > I think it is worth mentioning "I/O" before "performance". This > change won't really help cases if all the table's pages are already in > shared buffers. I went with: Improve the performance of parallel sequential I/O scans (Thomas Munro, David Rowley) -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: pg_stat_statements requires compute_query_id
po 10. 5. 2021 v 19:03 odesílatel Maciek Sakrejda napsal:
> On Mon, May 10, 2021 at 7:43 AM Julien Rouhaud wrote:
> > On Mon, May 10, 2021 at 04:36:16PM +0200, Pavel Stehule wrote:
> > > I expected just an empty column query_id and workable extension. This
> > > doesn't look well.
> > > More, it increases the (little bit) complexity of migration to Postgres 14.
> >
> > This was already raised multiple times, and the latest discussion can be found
> > at [1].
> >
> > Multiple options have been suggested, but AFAICT there isn't a clear consensus
> > on what we should do exactly, so I've not been able to send a fix yet.
> >
> > [1]: https://www.postgresql.org/message-id/flat/35457b09-36f8-add3-1d07-6034fa585ca8%40oss.nttdata.com
>
> Before it petered out, the thread seemed to be converging toward
> consensus that the situation could be improved. I work on pganalyze,
> and our product requires pg_stat_statements to be enabled for a lot of
> its functionality. We guide our users through enabling it, but if
> additional steps are required in 14, that may be confusing. I don't
> have any strong feelings on the particular mechanism that would work
> best here, but it would be nice if enabling pg_stat_statements in 14
> did not require more work than in 13. Even if it's just one extra
> setting, it's another potential thing to get wrong and have to
> troubleshoot, plus it means all existing pg_stat_statements guides out
> there would become out of date.

+1

At a minimum it requires extra notes in some migration guide. I understand that queryid is one of the key values, so it is not possible to merge data with and without a queryid.

My idea of the best solution is something like this: pg_stat_statements can work without a queryid, and when compute_query_id is changed, the values stored in pg_stat_statements are reset. I have no idea if it can be implemented.

The current state is not user friendly. People who know the previous behaviour can be very confused.
Regards

Pavel

> Thanks,
> Maciek
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 01:44:12PM +0200, Matthias van de Meent wrote:
> On Mon, 10 May 2021 at 08:03, Bruce Momjian wrote:
> > I have committed the first draft of the PG 14 release notes. You can
> > see the most current build of them here:
> > https://momjian.us/pgsql_docs/release-14.html
> >
> > I need clarification on many items, and the document still needs its
> > items properly ordered, and markup added. I also expect a lot of
> > feedback.
>
> I noticed that the improvement in bloat control in the HeapAM that I
> know of (3c3b8a4b, 0ff8bbde) weren't documented here. Although each
> can be considered minor, they together can decrease the bloating
> behaviour of certain workloads significantly (and limit the total
> damage), and in my opinion this should be mentioned.
>
> 3c3b8a4b: Returns space claimed for the line pointer array back to the
> page's empty space, so that it can also be used for tuple data.
>
> 0ff8bbde: Allows large tuples to be inserted on pages which have only
> a small amount of data, regardless of fillfactor.
>
> Together they should be able to help significantly in both bloat
> prevention and bloat reduction.

I looked at those items. I try to mention performance items that enable new workloads or require some user action to benefit from them. I am not sure these two qualify, but can others comment? Thanks.

-- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: PG 14 release notes, first draft
On Mon, 2021-05-10 at 12:38 -0400, Bruce Momjian wrote: > I came up with this release note text: > > > > > > When editing the previous query or a file with psql's \e, ignore the > contents if the editor exits without saving (Laurenz Albe) > > > > Previously, editing the previous query or a file and not saving the > editor contents would still execute the editor contents. The \ef and > \ev commands also now have this behavior. > > Thanks, that looks much better. The second paragraph starts by describing the previous behavior, but its second sentence details the changes. Perhaps it would be better to put that into the first paragraph: When editing the previous query or a file with psql's \e, or when a view or function definition is edited with \ev or \ef, ignore the contents if the editor exits without saving (Laurenz Albe) Previously, editing the previous query or a file and not saving the editor contents would still execute the editor contents. Yours, Laurenz Albe
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Hi, On 2021-05-08 15:44:57 -0400, Tom Lane wrote: > In a nearby thread I bemoaned the fact that the core regression tests > seem to have gotten significantly slower in the last couple of months, > at least with CCA enabled: hyrax reports completing them in 12:52:44 > on 18 March, while its most recent run on 1 May took 14:08:18. > > Trying to diagnose the cause overall seemed a bit daunting, but > I thought I'd dig into the opr_sanity test in particular, as it > is one of the slower tests under CCA to start with and had also > slowed down noticeably (from 3701581 ms to 4761183 ms, or 28%). > I was able to complete a bisection using just that test, and > got an unexpected result: most of the slowdown appeared at > ab596105b (BRIN minmax-multi indexes). Apparently the additional > time is simply from having to check the additional pg_amop and > pg_amproc entries, which that patch added quite a few of. I suspect that it might not be just that. From a quick profile it looks like debug_invalidate_system_caches_always spends a good chunk of its time in ResetCatalogCache() and hash_seq_search(). Their cost is linear in the size of the underlying hash tables. So what might be happening is that the additional catalog entries pushed some of the catcache hash tables into growing (RehashCatCache()). Which then makes all subsequent ResetCatalogCache() scans slower. Not that that changes much - your proposed fix still seems reasonable. > I noticed that all of the slowest queries in that test were dependent > on the binary_coercible() plpgsql function that it uses. Now, that > function has always been a rather lame attempt to approximate the > behavior of the parser's IsBinaryCoercible() function, so I've been > thinking for some time that we ought to get rid of it in favor of > actually using IsBinaryCoercible(). 
I tried that, by adding a > shim function to regress.c, and got a most gratifying result: > on my machine opr_sanity's runtime with > debug_invalidate_system_caches_always = 1 drops from > 29m9s to 3m19s. Without CCA the speedup is far less impressive, > 360ms to 305ms, but that's still useful. Especially since this > makes the test strictly more accurate. Cool! > Anyway, I propose that we ought to sneak this into HEAD, since > it's only touching test code and not anything production-critical. +1 Greetings, Andres Freund
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 07:39:17PM +0200, Laurenz Albe wrote: > On Mon, 2021-05-10 at 12:38 -0400, Bruce Momjian wrote: > > I came up with this release note text: > > > > > > > > > > > > When editing the previous query or a file with psql's \e, ignore the > > contents if the editor exits without saving (Laurenz Albe) > > > > > > > > Previously, editing the previous query or a file and not saving the > > editor contents would still execute the editor contents. The \ef and > > \ev commands also now have this behavior. > > > > > > Thanks, that looks much better. > > The second paragraph starts describing the previous behavior, but the second > sentence details on the changes. Perhaps it would be better to put that into > the first paragraph: > > > When editing the previous query or a file with psql's \e, or when a > view or function definition are edited with \ev or \ef, ignore the > contents if the editor exits without saving (Laurenz Albe) > > > > Previously, editing the previous query or a file and not saving the > editor contents would still execute the editor contents. > Uh, I try to keep the first sentence short so people can scan it more easily, so I am hesitant to make this change. I went with this change: When editing the previous query or a file with psql's \e, or using \ef and \ev, ignore the contents if the editor exits without saving (Laurenz Albe) Previously, such edits would still execute the editor contents. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Andres Freund writes: > On 2021-05-08 15:44:57 -0400, Tom Lane wrote: >> I was able to complete a bisection using just that test, and >> got an unexpected result: most of the slowdown appeared at >> ab596105b (BRIN minmax-multi indexes). Apparently the additional >> time is simply from having to check the additional pg_amop and >> pg_amproc entries, which that patch added quite a few of. > I suspect that it might not be just that. From a quick profile it looks > like debug_invalidate_system_caches_always spends a good chunk of its > time in ResetCatalogCache() and hash_seq_search(). Their cost is linear > in the size of the underlying hash tables. > So what might be happening is that the additional catalog entries > pushed some of the catcache hash tables into growing > (RehashCatCache()). Which then makes all subsequent ResetCatalogCache() > scans slower. Hm. But constantly flushing the caches should mean that they're never populated with very many entries at one time, which ought to forestall that, at least to some extent. I wonder if there's anything we could do to make ResetCatalogCache faster? It wouldn't help much for normal execution of course, but it might do something to bring CCA testing time down out of the stratosphere. regards, tom lane
Re: AlterSubscription_refresh "wrconn" wrong variable?
On 2021-May-10, Peter Smith wrote: > PSA v5 of the patch. It is the same as v4 but with the v4-0001 part > omitted because that was already pushed. I made a few whitespace adjustments on Friday that I didn't get time to push, so I left the whole set to after the minors are finalized this week. I'll get them pushed on Wednesday or so. (The back branches have a few conflicts, on every release, but I see no reason to post those and it'd upset the cfbot). -- Álvaro Herrera Valdivia, Chile >From 8494316bef64cde7476146bce0bda9a93890def7 Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Fri, 7 May 2021 17:51:12 +1000 Subject: [PATCH v6] Rename the logical replication global wrconn. The worker.c global wrconn is only meant to be used for logical apply/tablesync workers, but there are other variables with the same name. To reduce future confusion the global has renamed from "wrconn" to "LogRepWorkerWalRcvConn". --- src/backend/replication/logical/launcher.c | 4 +-- src/backend/replication/logical/tablesync.c | 35 - src/backend/replication/logical/worker.c| 23 +++--- src/include/replication/worker_internal.h | 2 +- 4 files changed, 36 insertions(+), 28 deletions(-) diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c index cb462a052a..f97ab12675 100644 --- a/src/backend/replication/logical/launcher.c +++ b/src/backend/replication/logical/launcher.c @@ -649,8 +649,8 @@ static void logicalrep_worker_onexit(int code, Datum arg) { /* Disconnect gracefully from the remote side. 
*/ - if (wrconn) - walrcv_disconnect(wrconn); + if (LogRepWorkerWalRcvConn) + walrcv_disconnect(LogRepWorkerWalRcvConn); logicalrep_worker_detach(); diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index 0638f5c7f8..67f907cdd9 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -302,8 +302,11 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) MyLogicalRepWorker->relstate, MyLogicalRepWorker->relstate_lsn); - /* End wal streaming so wrconn can be re-used to drop the slot. */ - walrcv_endstreaming(wrconn, &tli); + /* + * End streaming so that LogRepWorkerWalRcvConn can be used to drop + * the slot. + */ + walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli); /* * Cleanup the tablesync slot. @@ -322,7 +325,7 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) * otherwise, it won't be dropped till the corresponding subscription * is dropped. So passing missing_ok = false. */ - ReplicationSlotDropAtPubNode(wrconn, syncslotname, false); + ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false); finish_sync_worker(); } @@ -642,7 +645,7 @@ copy_read_data(void *outbuf, int minread, int maxread) for (;;) { /* Try read the data. */ - len = walrcv_receive(wrconn, &buf, &fd); + len = walrcv_receive(LogRepWorkerWalRcvConn, &buf, &fd); CHECK_FOR_INTERRUPTS(); @@ -715,7 +718,8 @@ fetch_remote_table_info(char *nspname, char *relname, " AND c.relname = %s", quote_literal_cstr(nspname), quote_literal_cstr(relname)); - res = walrcv_exec(wrconn, cmd.data, lengthof(tableRow), tableRow); + res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, + lengthof(tableRow), tableRow); if (res->status != WALRCV_OK_TUPLES) ereport(ERROR, @@ -752,9 +756,11 @@ fetch_remote_table_info(char *nspname, char *relname, " AND a.attrelid = %u" " ORDER BY a.attnum", lrel->remoteid, - (walrcv_server_version(wrconn) >= 12 ? 
"AND a.attgenerated = ''" : ""), + (walrcv_server_version(LogRepWorkerWalRcvConn) >= 12 ? + "AND a.attgenerated = ''" : ""), lrel->remoteid); - res = walrcv_exec(wrconn, cmd.data, lengthof(attrRow), attrRow); + res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, + lengthof(attrRow), attrRow); if (res->status != WALRCV_OK_TUPLES) ereport(ERROR, @@ -841,7 +847,7 @@ copy_table(Relation rel) appendStringInfo(&cmd, " FROM %s) TO STDOUT", quote_qualified_identifier(lrel.nspname, lrel.relname)); } - res = walrcv_exec(wrconn, cmd.data, 0, NULL); + res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL); pfree(cmd.data); if (res->status != WALRCV_OK_COPY_OUT) ereport(ERROR, @@ -957,8 +963,9 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) * application_name, so that it is different from the main apply worker, * so that synchronous replication can distinguish them. */ - wrconn = walrcv_connect(MySubscription->conninfo, true, slotname, &err); - if (wrconn == NULL) + LogRepWorkerWalRcvConn = + walrcv_connect(MySubscription->conninfo, true, slotname, &err); + if (LogRepWorkerWalRcvConn == NULL) ereport(ERROR, (errmsg("could not connect to the publisher: %s", err))); @@ -985,7 +992,7 @@ LogicalRepS
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Hi, On 2021-05-10 14:06:16 -0400, Tom Lane wrote: > Hm. But constantly flushing the caches should mean that they're never > populated with very many entries at one time, which ought to forestall > that, at least to some extent. That's probably true... > I wonder if there's anything we could do to make ResetCatalogCache > faster? It wouldn't help much for normal execution of course, > but it might do something to bring CCA testing time down out of > the stratosphere. We could make the hashtables shrink, not just grow... There's also the issue that most people, I assume, run CCA tests with -O0. In a quick test that does make a big difference in e.g. ResetCatalogCache(). I just added a function specific annotation to optimize just that function and the overall time in my test shrank 10% or so. Greetings, Andres Freund
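For readers who have not used it, the per-function annotation Andres mentions can be sketched roughly as follows. This is a hypothetical standalone example, not the actual annotation used; the "optimize" attribute is GCC-specific (Clang ignores it), so it is guarded to a no-op on other compilers.

```c
#include <assert.h>

/*
 * GCC lets a single function be optimized even when the whole file is
 * compiled at -O0 for easier debugging.  The attribute is GCC-only,
 * so it degrades to nothing elsewhere.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define OPTIMIZE_HOT __attribute__((optimize("O2")))
#else
#define OPTIMIZE_HOT
#endif

/* Stand-in for a hot loop such as ResetCatalogCache()'s bucket scan. */
static int OPTIMIZE_HOT
sum_buckets(const int *buckets, int n)
{
	int		total = 0;

	for (int i = 0; i < n; i++)
		total += buckets[i];
	return total;
}
```

With this, a file built at -O0 for comfortable debugging still gets an optimized body for the one hot function, which is enough to shift profiles like the ResetCatalogCache() case discussed here.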
libpq_pipeline in tmp_install
The test program libpq_pipeline produced by the test suite in src/test/modules/libpq_pipeline/ is installed into tmp_install as part of make check. This isn't a real problem by itself, but I think it creates a bit of an asymmetric situation that might be worth cleaning up. Before, the contents of tmp_install exactly matched an actual installation. There were no extra test programs installed. Also, the test suite code doesn't actually use that installed version, so it's not of any use, and it creates confusion about which copy is in use. The reason this is there is that the test suite uses PGXS to build the test program, and so things get installed automatically. I suggest that we should either write out the build system by hand to avoid this, or maybe extend PGXS to support building programs but not installing them. The advantage of the former approach is that it would allow additional test programs to be added later as well. (We should really collect the libpq tests under src/interfaces/libpq/ anyway at some point.)
Re: Another modest proposal for reducing CLOBBER_CACHE_ALWAYS runtime
David Rowley writes: > On Mon, 10 May 2021 at 18:04, Tom Lane wrote: >> real 293m31.054s >> to >> real 1m47.807s >> Yes, really. > That's quite impressive. > I'm very much in favour of this change. Making it more realistic to > run the regression tests on a CLOBBER_CACHE_ALWAYS build before a > commit is a very worthy goal and this is a big step towards that. > Nice. It occurred to me to check hyrax's results on the older branches (it also tests v12 and v13), and the slope of the curve is bad: Branch Latest "check" phase runtime HEAD 13:56:11 v13 11:00:33 v12 6:05:30 Seems like we'd better do something about that. About 2.5 hours worth of the jump from 12 to 13 can be blamed on the privileges test, looks like. The slowdown in that evidently can be blamed on 0c882e52a86, which added this: +-- results below depend on having quite accurate stats for atest12 +SET default_statistics_target = 1; VACUUM ANALYZE atest12; +RESET default_statistics_target; The slow queries in that test all cause the planner to apply the "leak()" function to every histogram entry for atest12, so this one change caused a 100X increase in the amount of work there. I find it a bit remarkable that we barely noticed that in normal operation. In CCA mode, though, each leak() call takes circa 100ms (at least on my workstation), so kaboom. Anyway, I'm now feeling that what I should do with this patch is wait for the release cycle to finish and then apply it to v13 as well as HEAD. The other patch I proposed, to cut opr_sanity's runtime, may be too invasive for back-patch. More generally, there is an upward creep in the test runtimes that doesn't seem to be entirely accounted for by our constantly adding more tests. I am suspicious that plpgsql may be largely to blame for this. The smoking gun I found for that is the runtimes for the plpgsql_control test, which hasn't changed *at all* since it was added in v11; but hyrax shows these runtimes: HEAD: test plpgsql_control ... 
ok 56105 ms v13: test plpgsql_control ... ok 46879 ms v12: test plpgsql_control ... ok 30809 ms In normal builds that test's time has held pretty steady. So I'm not sure what's underneath this rock, but I plan to try to find out. regards, tom lane
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Andres Freund writes: > On 2021-05-10 14:06:16 -0400, Tom Lane wrote: >> I wonder if there's anything we could do to make ResetCatalogCache >> faster? It wouldn't help much for normal execution of course, >> but it might do something to bring CCA testing time down out of >> the stratosphere. > We could make the hashtables shrink, not just grow... Maybe ... > There's also the issue that most people, I assume, run CCA tests with -O0. In > a quick test that does make a big difference in e.g. ResetCatalogCache(). I > just added a function specific annotation to optimize just that function and > the overall time in my test shrank 10% or so. If they do I think they're nuts ;-). CCA is slow enough already without hobbling it. hyrax appears to use the usual -O2, as does/did avocet. Not sure if we have any other CCA buildfarm members right now. regards, tom lane
Re: PG 14 release notes, first draft
Hi Bruce, Thanks for doing this work again! > Add date_bin function (John Naylor) > > WHAT DOES THIS DO? Hard to describe in a one-liner, but it lines up timestamps into regular intervals as specified by the user. It is clearer after seeing examples: https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTIONS-DATETIME-BIN > Dramatically improve Unicode normalization (John Naylor) > > WHAT OPERATIONS USE THIS? PG13 added the normalize() function to normalize Unicode sequences, as well as the IS NORMALIZED syntax to test for that. The commits* here do not change behavior and only improve performance. As such, this really belongs in the performance section. *There is one additional commit that belongs to this entry: Author: Michael Paquier 2020-10-11 [80f8eb79e] Use perfect hash for NFC and NFKC Unicode Normalization quick check -- John Naylor EDB: http://www.enterprisedb.com
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Andres Freund writes: > On 2021-05-10 14:06:16 -0400, Tom Lane wrote: >> I wonder if there's anything we could do to make ResetCatalogCache >> faster? It wouldn't help much for normal execution of course, >> but it might do something to bring CCA testing time down out of >> the stratosphere. > We could make the hashtables shrink, not just grow... I noticed that we already have counters that can tell whether a catcache or dynahash table is empty, so I experimented with the attached patch. Testing one of the slow queries from privileges.sql (which might not be very representative of the overall issue), I see: HEAD: Time: 536429.715 ms (08:56.430) with ResetCatalogCache hack: Time: 488938.597 ms (08:08.939) plus hash_seq_search hack: Time: 475400.634 ms (07:55.401) Of course, the issue with these patches is that they change these counters from things that (I think) we only trust for statistical purposes into things that had better be right or you're going to have impossible-to-track-down bugs with sometimes failing to invalidate cache entries. My gut feeling is that the risk-to-reward ratio is worth it for changing ResetCatalogCache, but not for hash_seq_search. This is partly because of the greater absolute payback and partly because ResetCatalogCache doesn't deal with shared data structures, reducing the risk of counting issues. 
regards, tom lane diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c index 4fbdc62d8c..df5f219254 100644 --- a/src/backend/utils/cache/catcache.c +++ b/src/backend/utils/cache/catcache.c @@ -644,6 +644,10 @@ ResetCatalogCache(CatCache *cache) dlist_mutable_iter iter; int i; + /* If the cache is entirely empty, there's nothing to do */ + if (cache->cc_ntup == 0) + return; + /* Remove each list in this cache, or at least mark it dead */ dlist_foreach_modify(iter, &cache->cc_lists) { diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c index 6546e3c7c7..1ac87725c0 100644 --- a/src/backend/utils/hash/dynahash.c +++ b/src/backend/utils/hash/dynahash.c @@ -1455,11 +1455,23 @@ hash_seq_search(HASH_SEQ_STATUS *status) } /* - * Search for next nonempty bucket starting at curBucket. + * Nothing to do if hash table is entirely empty. (For a partitioned + * hash table, we'd have to check all the freelists, which is unlikely + * to be a win.) */ - curBucket = status->curBucket; hashp = status->hashp; hctl = hashp->hctl; + if (hctl->freeList[0].nentries == 0 && + !IS_PARTITIONED(hctl)) + { + hash_seq_term(status); + return NULL; /* table is empty */ + } + + /* + * Search for next nonempty bucket starting at curBucket. + */ + curBucket = status->curBucket; ssize = hashp->ssize; max_bucket = hctl->max_bucket;
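The effect of the cc_ntup early exit in the first hunk can be modeled outside the catcache. In this toy sketch (hypothetical names and structures, not the real catcache API), a CCA-style workload that resets after every operation mostly hits the counter check and skips the bucket scan entirely:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NBUCKETS 128

typedef struct ToyEntry
{
	struct ToyEntry *next;
	int			key;
} ToyEntry;

typedef struct ToyCache
{
	ToyEntry   *buckets[TOY_NBUCKETS];
	int			ntup;			/* live entries, like catcache's cc_ntup */
	long		buckets_scanned;	/* instrumentation: reset work done */
} ToyCache;

static void
toy_insert(ToyCache *cache, int key)
{
	int			h = (unsigned) key % TOY_NBUCKETS;
	ToyEntry   *e = malloc(sizeof(ToyEntry));

	e->key = key;
	e->next = cache->buckets[h];
	cache->buckets[h] = e;
	cache->ntup++;
}

static void
toy_reset(ToyCache *cache)
{
	/* The patch's optimization: an empty cache needs no bucket scan. */
	if (cache->ntup == 0)
		return;

	for (int i = 0; i < TOY_NBUCKETS; i++)
	{
		ToyEntry   *e = cache->buckets[i];

		cache->buckets_scanned++;
		while (e != NULL)
		{
			ToyEntry   *next = e->next;

			free(e);
			e = next;
		}
		cache->buckets[i] = NULL;
	}
	cache->ntup = 0;
}
```

Under constant invalidation most resets find the cache already empty, so the counter check replaces a full 128-bucket walk with an immediate return — which is why the win shows up under CCA but barely registers in normal execution.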
Re: PG 14 release notes, first draft
On Mon, 10 May 2021 at 19:34, Bruce Momjian wrote: > > On Mon, May 10, 2021 at 01:44:12PM +0200, Matthias van de Meent wrote: > > On Mon, 10 May 2021 at 08:03, Bruce Momjian wrote: > > > > > > I have committed the first draft of the PG 14 release notes. You can > > > see the most current build of them here: > > > https://momjian.us/pgsql_docs/release-14.html > > > > > > I need clarification on many items, and the document still needs its > > > items properly ordered, and markup added. I also expect a lot of > > > feedback. > > > > I noticed that the improvement in bloat control in the HeapAM that I > > know of (3c3b8a4b, 0ff8bbde) weren't documented here. Although each > > can be considered minor, they together can decrease the bloating > > behaviour of certain workloads significantly (and limit the total > > damage), and in my opinion this should be mentioned. > > > > 3c3b8a4b: Returns space claimed for the line pointer array back to the > > page's empty space, so that it can also be used for tuple data. > > > > 0ff8bbde: Allows large tuples to be inserted on pages which have only > > a small amount of data, regardless of fillfactor. > > > > Together they should be able to help significantly in both bloat > > prevention and bloat reduction. > > I looked at those items. I try to mention performance items that enable > new workloads or require some user action to benefit from it. 0ff8bbde enables a workload that inserts (and non-locally updates) large (> FILLFACTOR %) tuples in tables that have a low FILLFACTOR. Previously this would fail dramatically by only inserting on new pages; this would extend the table indefinitely. See the thread [0]. 3c3b8a4b improves workloads with high local update-then-delete churn. Previously this would irreversibly claim space on the page for tuple identifiers even when they were later deleted; now we can reclaim this space when a tuple is deleted from the page. 
I see these two improvements in a similar light as the bottom-up index deletion in btree: No user action required, works out-of-the-box, decreases bloat / disk usage, but good to note as it fixes (known) bloating footguns that a user might have encountered. > I am not sure these two qualify, but can others comments? Thanks. I'd like to refer to Peter Geoghegan's reply [1] upthread. Thank you for your effort, Matthias van de Meent [0] https://www.postgresql.org/message-id/flat/6e263217180649339720afe2176c50aa%40opammb0562.comp.optiver.com [1] https://www.postgresql.org/message-id/CAH2-Wz%3D-A%3DjRxpB2Owj3KQadCue7%2BNLqj56Q566ees7TapMRvA%40mail.gmail.com
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Hi, On 2021-05-10 16:17:18 -0400, Tom Lane wrote: > I noticed that we already have counters that can tell whether a > catcache or dynahash table is empty, so I experimented with the > attached patch. Testing one of the slow queries from privileges.sql > (which might not be very representative of the overall issue), > I see: > > HEAD: > Time: 536429.715 ms (08:56.430) > > with ResetCatalogCache hack: > Time: 488938.597 ms (08:08.939) > > plus hash_seq_search hack: > Time: 475400.634 ms (07:55.401) Oh, nice. Perhaps we generally ought to lower the initial syscache sizes further? 20cb18db4668 did that, but we still have things like PROCNAMEARGNSP, PROCOID, RELNAMENSP, RELOID, STATRELATTINH, ... using 128 as the initial size. Not hard to imagine that some of these are larger than what simple workloads or CCA encounter. > Of course, the issue with these patches is that they change these > counters from things that (I think) we only trust for statistical > purposes into things that had better be right or you're going to > have impossible-to-track-down bugs with sometimes failing to > invalidate cache entries. My gut feeling is that the risk-to-reward > ratio is worth it for changing ResetCatalogCache, but not for > hash_seq_search. This is partly because of the greater absolute > payback and partly because ResetCatalogCache doesn't deal with > shared data structures, reducing the risk of counting issues. That sounds reasonable. We could mitigate the risk for dynahash by testing HASH_SHARED_MEM (which we don't store right now), but it's not clear it's worth it here. But I wonder if there are other cases where it could help? If we did make the check support shared memory *and* partitioned tables, I could easily see it be a win for things like LockReleaseAll(). Greetings, Andres Freund
Re: Enhanced error message to include hint messages for redundant options error
On 2021-May-10, vignesh C wrote: > That sounds fine to me, Attached v6 patch which has the changes for the same. What about defining a function (maybe a static inline function in defrem.h) that is marked noreturn and receives the DefElem and optionally pstate, and throws the error? I think that would avoid the patch's need to have half a dozen copies of the "duplicate_error:" label and ereport stanza. -- Álvaro Herrera 39°49'30"S 73°17'W
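A rough standalone sketch of the kind of helper being suggested here — the names, message text, and the fprintf()/exit() stand-in for ereport() are all illustrative assumptions, not the committed API:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for PostgreSQL's DefElem and ParseState. */
typedef struct DefElem
{
	const char *defname;
	int			location;
} DefElem;

typedef struct ParseState
{
	const char *p_sourcetext;
} ParseState;

/* Build the message once, so every option-parsing loop shares the text. */
static void
format_conflicting_defelem(char *buf, size_t buflen, const DefElem *defel)
{
	snprintf(buf, buflen, "conflicting or redundant options: %s",
			 defel->defname);
}

/*
 * One noreturn helper replaces the repeated "duplicate_error:" label and
 * ereport() stanza in each caller.  ereport() is stood in for by
 * fprintf()+exit() to keep this sketch self-contained.
 */
static _Noreturn void
error_conflicting_defelem(ParseState *pstate, const DefElem *defel)
{
	char		msg[128];

	format_conflicting_defelem(msg, sizeof(msg), defel);
	fprintf(stderr, "ERROR:  %s at character %d\n", msg, defel->location);
	(void) pstate;				/* would supply error position context */
	exit(1);
}
```

Each caller's goto-label stanza then collapses to a single `error_conflicting_defelem(pstate, defel);` call at the point where the duplicate option is detected.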
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
On Tue, May 11, 2021 at 8:52 AM Andres Freund wrote: > ... If we did make the check support shared memory *and* > partitioned tables, I could easily see it be a win for things like > LockReleaseAll(). For that case, has the idea of maintaining a dlist of local locks been considered?
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 07:54:24AM -0700, Peter Geoghegan wrote: > On Mon, May 10, 2021 at 4:44 AM Matthias van de Meent > wrote: > > I noticed that the improvement in bloat control in the HeapAM that I > > know of (3c3b8a4b, 0ff8bbde) weren't documented here. Although each > > can be considered minor, they together can decrease the bloating > > behaviour of certain workloads significantly (and limit the total > > damage), and in my opinion this should be mentioned. > > > > 3c3b8a4b: Returns space claimed for the line pointer array back to the > > page's empty space, so that it can also be used for tuple data. > > > > 0ff8bbde: Allows large tuples to be inserted on pages which have only > > a small amount of data, regardless of fillfactor. > > +1 on mentioning both things. OK, you are confirming what Matthias suggested. I added these two items, which both seem to apply only to heap pages, not index pages: --- Deallocate space reserved by trailing unused heap line pointers (Matthias van de Meent, Peter Geoghegan) --- Allow wide tuples to be always added to almost-empty heap pages (John Naylor, Floris van Nee) Previously tuples whose insertion would have exceeded the page's fill factor were instead added to new pages. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 3:58 PM Bruce Momjian wrote: > OK, you are confirming what Matthias suggested. I added these two > items, which both seem to apply only to heap pages, not index pages: That's right -- these two relate to heap pages only. I think that Matthias compared these two to bottom-up index deletion because all three patches are concerned about avoiding "a permanent solution to a temporary problem". They're conceptually similar despite being in fairly different areas. Evidently Matthias has a similar mental model to my own when it comes to this stuff. Unfortunately the practical significance of the line pointer patch is hard to demonstrate with a benchmark. I believe that it is very useful on a sufficiently long timeline and with certain workloads because of the behavior it avoids. As I pointed out on that other thread recently, once you have irreversible bloat very small adverse events will eventually add up and cause big problems. When this happens it'll be very hard or impossible to detect, since it just looks like heap fragmentation. That said, it's clearly an issue with one of the TPC-C tables if you run BenchmarkSQL for days and days (just one table, though). So there is hard evidence that line pointer bloat could get really out of hand in at least some tables. -- Peter Geoghegan
Re: Reducing opr_sanity test's runtime under CLOBBER_CACHE_ALWAYS
Hi, On 2021-05-11 10:57:03 +1200, Thomas Munro wrote: > On Tue, May 11, 2021 at 8:52 AM Andres Freund wrote: > > ... If we did make the check support shared memory *and* > > partitioned tables, I could easily see it be a win for things like > > LockReleaseAll(). Errr, that's not even a shared hashtable... So it would help even if we just excluded shared memory hashtables. > For that case, has the idea of maintaining a dlist of local locks been > considered? Yea, there's been a long discussion on that for LockReleaseAll(). Combined with alternatives around shrinking the hashtable... Greetings, Andres Freund
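For context, the dlist idea being discussed can be shown in miniature: each held lock is chained onto an intrusive doubly-linked list, so a LockReleaseAll()-style sweep costs O(locks actually held) rather than O(hash table size). The types below are a hypothetical toy, not the real LOCALLOCK structures; in PostgreSQL the entries would of course also stay in their dynahash for lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal intrusive doubly-linked list, in the spirit of PostgreSQL's dlist. */
typedef struct dnode
{
	struct dnode *prev;
	struct dnode *next;
} dnode;

typedef struct LockState
{
	dnode		held;			/* circular list of all held locks */
	int			nheld;
} LockState;

typedef struct LocalLock
{
	dnode		node;			/* chains this lock into LockState.held */
	int			lockid;
} LocalLock;

static void
lockstate_init(LockState *s)
{
	s->held.prev = s->held.next = &s->held;
	s->nheld = 0;
}

static LocalLock *
lock_acquire(LockState *s, int lockid)
{
	LocalLock  *l = malloc(sizeof(LocalLock));

	l->lockid = lockid;
	l->node.next = s->held.next;
	l->node.prev = &s->held;
	s->held.next->prev = &l->node;
	s->held.next = &l->node;
	s->nheld++;
	return l;
}

static void
lock_release(LockState *s, LocalLock *l)
{
	l->node.prev->next = l->node.next;
	l->node.next->prev = l->node.prev;
	free(l);
	s->nheld--;
}

/* LockReleaseAll-style sweep: O(locks held), not O(hash table size). */
static int
lock_release_all(LockState *s)
{
	int			released = 0;

	while (s->held.next != &s->held)
	{
		LocalLock  *l = (LocalLock *)
			((char *) s->held.next - offsetof(LocalLock, node));

		lock_release(s, l);
		released++;
	}
	return released;
}
```

The trade-off mentioned in the thread is the extra per-acquire/release bookkeeping versus the cheaper sweep; a hash table that has been rehashed larger never shrinks, so the sweep's advantage grows over a backend's lifetime.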
Is element access after HASH_REMOVE ever OK?
Hi, After hearing from a couple of directions about systems spending too much time scanning the local lock hash table, I wrote the trivial patch to put them in a linked list, before learning that people have considered that before, so I should probably go and read some history on that and find out why it hasn't been done... However, I noticed in passing that RemoveLocalLock() accesses *locallock after removing it from the hash table (in assertion builds only). So one question I have is whether it's actually a programming rule that you can't do that (at most you can compare the pointer against NULL), or whether it's supposed to be safe-if-you-know-what-you're-doing, as the existing comments hints. Here also is a patch that does wipe_mem on removed elements, as threatened last time this topic came up[1], which reveals the problem. I'm also not exactly sure why it's only a WARNING if your local lock table is out of sync, but perhaps that's in the archives too. [1] https://www.postgresql.org/message-id/flat/CAHut%2BPs-pL%2B%2Bf6CJwPx2%2BvUqXuew%3DXt-9Bi-6kCyxn%2BFwi2M7w%40mail.gmail.com From d267c3c0cb26ea97616ca1184ab13664acb4feb8 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 11 May 2021 11:11:15 +1200 Subject: [PATCH 1/3] Clobber memory released by HASH_REMOVE. 
--- src/backend/utils/hash/dynahash.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c index 6546e3c7c7..b977d0395c 100644 --- a/src/backend/utils/hash/dynahash.c +++ b/src/backend/utils/hash/dynahash.c @@ -102,6 +102,7 @@ #include "storage/shmem.h" #include "storage/spin.h" #include "utils/dynahash.h" +#include "utils/memdebug.h" #include "utils/memutils.h" @@ -1076,10 +1077,12 @@ hash_search_with_hash_value(HTAB *hashp, SpinLockRelease(&hctl->freeList[freelist_idx].mutex); /* - * better hope the caller is synchronizing access to this - * element, because someone else is going to reuse it the next - * time something is added to the table + * Clobber the memory, to find bugs in code that accesses + * it after removing. */ +#ifdef CLOBBER_FREED_MEMORY +wipe_mem(ELEMENTKEY(currBucket), hctl->entrysize); +#endif return (void *) ELEMENTKEY(currBucket); } return NULL; -- 2.30.2 From 1cf44ab0f7bfc9d5221ce19f162b3e07ab42c567 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 11 May 2021 11:11:37 +1200 Subject: [PATCH 2/3] Don't access a local locks after freeing them. --- src/backend/storage/lmgr/lock.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c index 108b4d9023..8642afc275 100644 --- a/src/backend/storage/lmgr/lock.c +++ b/src/backend/storage/lmgr/lock.c @@ -1390,15 +1390,15 @@ RemoveLocalLock(LOCALLOCK *locallock) SpinLockRelease(&FastPathStrongRelationLocks->mutex); } - if (!hash_search(LockMethodLocalHash, - (void *) &(locallock->tag), - HASH_REMOVE, NULL)) - elog(WARNING, "locallock table corrupted"); - /* * Indicate that the lock is released for certain types of locks */ CheckAndSetLockHeld(locallock, false); + + if (!hash_search(LockMethodLocalHash, + (void *) &(locallock->tag), + HASH_REMOVE, NULL)) + elog(WARNING, "locallock table corrupted"); } /* -- 2.30.2
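The behavior the first patch adds is easy to reproduce in a toy HASH_REMOVE analogue (hypothetical code, not dynahash's real path): unlink the entry, fill it with the 0x7F pattern that wipe_mem() uses, and still return the pointer, so a caller that dereferences the result afterwards reads conspicuous garbage rather than stale-but-plausible data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8
#define CLOBBER_BYTE 0x7f		/* same fill pattern wipe_mem() uses */

typedef struct Entry
{
	struct Entry *next;
	int			key;
	int			payload;
} Entry;

static Entry *buckets[NBUCKETS];

static Entry *
toy_insert(int key, int payload)
{
	int			h = (unsigned) key % NBUCKETS;
	Entry	   *e = calloc(1, sizeof(Entry));

	e->key = key;
	e->payload = payload;
	e->next = buckets[h];
	buckets[h] = e;
	return e;
}

/*
 * HASH_REMOVE analogue: unlink the entry, clobber it, and (as dynahash
 * does) still return the pointer.  A later dereference now reads
 * 0x7f7f... garbage instead of the old, plausible-looking contents.
 * The entry is deliberately not free()d so its clobbered bytes stay
 * inspectable; dynahash would move it to a freelist for reuse instead.
 */
static Entry *
toy_remove(int key)
{
	int			h = (unsigned) key % NBUCKETS;

	for (Entry **p = &buckets[h]; *p != NULL; p = &(*p)->next)
	{
		if ((*p)->key == key)
		{
			Entry	   *e = *p;

			*p = e->next;
			memset(e, CLOBBER_BYTE, sizeof(Entry));
			return e;
		}
	}
	return NULL;
}
```

This is exactly the failure mode the RemoveLocalLock() assertion would hit under the patch: the access doesn't crash, but it can no longer silently read the old field values.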
Re: Is element access after HASH_REMOVE ever OK?
Thomas Munro writes: > However, I noticed in passing that RemoveLocalLock() accesses > *locallock after removing it from the hash table (in assertion builds > only). So one question I have is whether it's actually a programming > rule that you can't do that (at most you can compare the pointer > against NULL), or whether it's supposed to be > safe-if-you-know-what-you're-doing, as the existing comments hints. I'd say it's, at best, unwarranted familiarity with the dynahash implementation ... > Here also is a patch that does wipe_mem on removed elements, as > threatened last time this topic came up[1], which reveals the problem. ... one good reason being that it'll fail under this sort of entirely-reasonable debugging aid. Can we get rid of the unsafe access easily? regards, tom lane
Re: Is element access after HASH_REMOVE ever OK?
I wrote: > ... Can we get rid of the unsafe > access easily? Oh, shoulda read your second patch first. Looking at that, I fear it might not be quite that simple, because the comment on CheckAndSetLockHeld says very clearly * It is callers responsibility that this function is called after * acquiring/releasing the relation extension/page lock. so your proposed patch violates that specification. I'm inclined to think that this API spec is very poorly thought out and should be changed --- why is it that the flags should change *after* the lock change in both directions? But we'd have to take a look at the usage of these flags to understand what's going on exactly. regards, tom lane
Re: seawasp failing, maybe in glibc allocator
On Mon, May 10, 2021 at 11:21 PM Fabien COELHO wrote: > > And of course this time it succeeded :-) > > Hmmm. ISTM that failures are on and off every few attempts. OK we got the SIGABRT this time, but still no backtrace. If the kernel's core_pattern is "core" and gdb is installed, then considering that the buildfarm core_file_glob is "core*" and the script version is recent (REL_12), I'm out of ideas. ulimit -c XXX shouldn't be needed because the perl script does that with rlimit.
Re: PG 14 release notes, first draft
Thanks for making the updates. On Tue, 11 May 2021 at 05:07, Bruce Momjian wrote: > > On Mon, May 10, 2021 at 08:52:44PM +1200, David Rowley wrote: > > > Improve the performance of parallel sequential scans (Thomas Munro, David > > > Rowley) > > > > I think it is worth mentioning "I/O" before "performance". This > > change won't really help cases if all the table's pages are already in > > shared buffers. > > I went with: > > Improve the performance of parallel sequential I/O scans (Thomas > Munro, > David Rowley) I think I'd have gone with: "Improve I/O performance of parallel sequential scans (Thomas Munro, David Rowley)" The operation we're speeding up is called sequential scan. We don't have any operation that's named sequential I/O scan. David
Zeroing partial pages in walreceiver
Hi, In https://www.postgresql.org/message-id/20210505010835.umylslxgq4a6rbwg%40alap3.anarazel.de I commented that we have a number of hacky workarounds to deal with the fact that walreceiver writes partial WAL pages into recycled segments. The problem with that practice is that within a page we cannot reliably detect invalid record headers. This is especially true when the record header spans across a page boundary - currently the only check in that case is if the record length is smaller than 1GB, and even that is just checked in the frontend. Note that we cannot rely on the CRC checksum here - it can only be validated once the whole record has been read. On a primary we *do* pad partial pages, see AdvanceXLInsertBuffer(): /* * Be sure to re-zero the buffer so that bytes beyond what we've * written will look like zeroes and not valid XLOG records... */ MemSet((char *) NewPage, 0, XLOG_BLCKSZ); Particularly the logic in allocate_recordbuf() is scary: In a completely working setup we'll regularly try to allocate large buffers that we'll never need - and the record buffer is not freed until the startup process exits. And we have no corresponding size check in the frontend (which doesn't make any sense to me). In the case of a record header across a page boundary, this check will pass in roughly 1/4 of the cases! As an example of the difference this makes, I ran a primary/standby setup with continuously running regression tests, and had a psql \watch terminate walsender every 1.5 s.
Decoding failures without zero-padding: 2021-05-10 16:52:51.448 PDT [2481446][1/0] LOG: record with incorrect prev-link 103FF/73 at 4/C154FD50 2021-05-10 16:52:53.001 PDT [2481446][1/0] LOG: record with incorrect prev-link 0/ at 4/C3531A88 2021-05-10 16:52:57.848 PDT [2481446][1/0] LOG: invalid resource manager ID 32 at 4/C3B67AD8 2021-05-10 16:52:58.773 PDT [2481446][1/0] LOG: record with incorrect prev-link 403FF/12 at 4/C47F35E8 2021-05-10 16:53:03.771 PDT [2481446][1/0] LOG: invalid page at 4/C562E000 2021-05-10 16:53:04.945 PDT [2481446][1/0] LOG: invalid record length at 4/C6E1C1E8: wanted 24, got 0 2021-05-10 16:53:06.176 PDT [2481446][1/0] LOG: invalid page at 4/C704 2021-05-10 16:53:07.624 PDT [2481446][1/0] LOG: record with incorrect prev-link 2FF/64 at 4/C7475078 ... With zero-padding: 2021-05-10 16:58:20.186 PDT [2489042][1/0] LOG: invalid record length at 5/7049A40: wanted 24, got 0 2021-05-10 16:58:22.832 PDT [2489042][1/0] LOG: invalid record length at 5/801AD70: wanted 24, got 0 2021-05-10 16:58:27.548 PDT [2489042][1/0] LOG: invalid record length at 5/8319908: wanted 24, got 0 2021-05-10 16:58:30.945 PDT [2489042][1/0] LOG: invalid record length at 5/AFDC770: wanted 24, got 0 ... 2021-05-10 16:59:24.546 PDT [2489042][1/0] LOG: invalid page at 5/19284000 The "invalid page" cases are a lot rarer - previously we would hit them whenever the record header itself passed [minimal] muster, even though it was just padding passing as e.g. a valid record length. Now it's only when the end of WAL actually is at the page boundary. On 13+ we could do a bit better than the current approach, and use pg_pwritev() to append the zeroed data. However, I'm not convinced it is a good idea - when pg_pwritev is emulated, we'd always do the zeroing as part of a separate write, which does seem like it increases the likelihood of encountering such partially written pages a bit. But perhaps it's too insignificant to matter. 
Greetings, Andres Freund diff --git c/src/backend/replication/walreceiver.c i/src/backend/replication/walreceiver.c index 9a0e3806fcf..3188743e59b 100644 --- c/src/backend/replication/walreceiver.c +++ i/src/backend/replication/walreceiver.c @@ -865,16 +865,38 @@ XLogWalRcvProcessMsg(unsigned char type, char *buf, Size len) /* * Write XLOG data to disk. + * + * This is a bit more complicated than immediately obvious: WAL files are + * recycled, which means they can contain almost arbitrary data. To ensure + * that we can reliably detect the end of the WAL, we need to pad partial + * pages with zeroes. If we didn't do so, the data beyond the end of the WAL + * is interpreted as a record header - while we can do some checks on the + * record, they are not bulletproof. Particularly because the record's CRC + * checksum can only be validated once the entire record has been read. + * + * There is no corresponding risk at the start of a page, because + * XLogPageHeader.xlp_pageaddr inside recycled segment data will always differ + * from the expected pageaddr. + * + * We cannot just pad 'buf' by whatever amount is necessary, as it is + * allocated outside of our control, preventing us from resizing it to be big + * enough. Copying the entire incoming data into a temporary buffer would be a + * noticeable overhead. Instead we separately write pages that need to be + * padded, and copy just that pa
Re: seawasp failing, maybe in glibc allocator
On 2021-05-11 12:16:44 +1200, Thomas Munro wrote: > OK we got the SIGABRT this time, but still no backtrace. If the > kernel's core_pattern is "core", gdb is installed, then considering > that the buildfarm core_file_glob is "core*" and the script version is > recent (REL_12), then I'm out of ideas. ulimit -c XXX shouldn't be > needed because the perl script does that with rlimit. Unless perhaps the hard rlimit for -c is set? ulimit -c -H should show that.
RE: Parallel scan with SubTransGetTopmostTransaction assert coredump
Hi Andres, Reproduce steps. 1, Modify and adjust NUM_SUBTRANS_BUFFERS to 128 from 32 in the file "src/include/access/subtrans.h" line number 15. 2, configure with enable assert and build it. 3, init a new database cluster. 4, modify postgres.conf and add some parameters as below. As the coredump comes from a parallel scan, we adjust the parallel settings to make it easier to reproduce. max_connections = 2000 parallel_setup_cost=0 parallel_tuple_cost=0 min_parallel_table_scan_size=0 max_parallel_workers_per_gather=8 max_parallel_workers = 32 5, start the database cluster. 6, use the script init_test.sql in the attachment to create tables. 7, use pgbench with the script sub_120.sql in the attachment to test it. Try it a few times; you should get a coredump file. pgbench -d postgres -p 33550 -n -r -f sub_120.sql -c 200 -j 200 -T 120 Thanks Pengcheng -Original Message- From: Andres Freund Sent: 7 May 2021 11:55 To: Pengchengliu Cc: pgsql-hack...@postgresql.org Subject: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump Hi, On 2021-05-07 11:32:57 +0800, Pengchengliu wrote: > Hi Hackers, > > Last email, format error, missing some information, so I resend this email. > > With PG 13.2(3fb4c75e857adee3da4386e947ba58a75f3e74b7), I tested > subtransaction with parallel scan, I got a subtransaction coredump as below: > So the root cause is the Parallel Workers process set the TransactionXmin > with later transaction snapshot. When parallel scan, Parallel Workers process > use the older active snapshot. > > It leads to subtrans assert coredump. I don't know how to fix it. Is there > any ideas? Do you have steps to reliably reproduce this? Greetings, Andres Freund test_script.tar Description: Unix tar archive
Re: Is element access after HASH_REMOVE ever OK?
Hi, On 2021-05-10 20:15:41 -0400, Tom Lane wrote: > I wrote: > > ... Can we get rid of the unsafe > > access easily? > > Oh, shoulda read your second patch first. Looking at that, > I fear it might not be quite that simple, because the > comment on CheckAndSetLockHeld says very clearly > > * It is callers responsibility that this function is called after > * acquiring/releasing the relation extension/page lock. > > so your proposed patch violates that specification. It wouldn't be too hard to fix this though - we can just copy the locktag into a local variable. Or use one of the existing local copies, higher in the stack. But: > I'm inclined to think that this API spec is very poorly thought out > and should be changed --- why is it that the flags should change > *after* the lock change in both directions? But we'd have to take > a look at the usage of these flags to understand what's going on > exactly. I can't see a need to do it after the HASH_REMOVE at least - as we don't return early if that fails, there's no danger of getting out of sync if we reverse the order. I think the comment could just be changed to say that the function has to be called after it is inevitable that the lock is acquired/released. Greetings, Andres Freund
Re: Small issues with CREATE TABLE COMPRESSION
On Mon, May 10, 2021 at 12:17:19PM +0530, Dilip Kumar wrote: > Even I was confused about that's the reason I used liblz4_static.lib, > but I see you have changed to liblz4.lib to make it consistent I > guess? That's the name the upstream code is using, yes. > Patch looks good to me, I can not verify though because I don't have > such an environment. Thanks for improving the patch. Thanks, I got that applied to finish the work of this thread for beta1. At least this will give people an option to test LZ4 on Windows. Perhaps this will require some adjustments, but let's see if that's necessary when that comes up. -- Michael
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 07:50:14AM -0400, Joe Conway wrote: > On 5/10/21 2:03 AM, Bruce Momjian wrote: > > I have committed the first draft of the PG 14 release notes. You can > > see the most current build of them here: > > > > https://momjian.us/pgsql_docs/release-14.html > > > > I need clarification on many items, and the document still needs its > > items properly ordered, and markup added. I also expect a lot of > > feedback. > > > > I plan to work on completing this document this coming week in > > preparation for beta next week. > > While only a small change, this commit does affect user visible behavior and > so should probably be noted: > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b12bd4869b5e I see your point. Here is the release entry I added: Return false for has_column_privilege() checks on non-existent or dropped columns (Joe Conway) Previously such columns returned an invalid column error. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: PG 14 release notes, first draft
On Tue, May 11, 2021 at 12:35:28PM +1200, David Rowley wrote: > Thanks for making the updates. > > On Tue, 11 May 2021 at 05:07, Bruce Momjian wrote: > > > > On Mon, May 10, 2021 at 08:52:44PM +1200, David Rowley wrote: > > > > Improve the performance of parallel sequential scans (Thomas Munro, > > > > David Rowley) > > > > > > I think it is worth mentioning "I/O" before "performance". This > > > change won't really help cases if all the table's pages are already in > > > shared buffers. > > > > I went with: > > > > Improve the performance of parallel sequential I/O scans (Thomas > > Munro, > > David Rowley) > > I think I'd have gone with: > > "Improve I/O performance of parallel sequential scans (Thomas Munro, > David Rowley)" > > The operation we're speeding up is called sequential scan. We don't > have any operation that's named sequential I/O scan. OK, new text: Improve the I/O performance of parallel sequential scans (Thomas Munro, David Rowley) This was done by allocating blocks in groups to parallel workers. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Re: Replication slot stats misgivings
On Fri, May 7, 2021 at 8:03 AM Amit Kapila wrote: > > Thanks for the summarization. I don't find anything that is left > unaddressed. I think we can wait for a day or two to see if Andres or > anyone else sees anything that is left unaddressed and then we can > close the open item. > I have closed this open item. -- With Regards, Amit Kapila.
Re: PG 14 release notes, first draft
On Mon, May 10, 2021 at 04:14:56PM -0700, Peter Geoghegan wrote: > On Mon, May 10, 2021 at 3:58 PM Bruce Momjian wrote: > > OK, you are confirming what Matthias suggested. I added these two > > items, which both seem to apply only to heap pages, not index pages: > > That's right -- these two relate to heap pages only. > > I think that Matthias compared these two to bottom-up index deletion > because all three patches are concerned about avoiding "a permanent > solution to a temporary problem". They're conceptually similar despite > being in fairly different areas. Evidently Matthias has a similar > mental model to my own when it comes to this stuff. Agreed, that is a very interesting distinction. > Unfortunately the practical significance of the line pointer patch is > hard to demonstrate with a benchmark. I believe that it is very useful > on a sufficiently long timeline and with certain workloads because of > the behavior it avoids. As I pointed out on that other thread > recently, once you have irreversible bloat very small adverse events > will eventually add up and cause big problems. When this happens it'll > be very hard or impossible to detect, since it just looks like heap > fragmentation. > > That said, it's clearly an issue with one of the TPC-C tables if you > run BenchmarkSQL for days and days (just one table, though). So there > is hard evidence that line pointer bloat could get really out of hand > in at least some tables. OK, once I dug into what you two were saying, I see the significance. I was frankly surprised we didn't already have these optimizations, and you are right they can lead to long-term problems. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
walreceiver that is behind doesn't quit, send replies
Hi, There are no interrupt checks in the WalReceiverMain() sub-loop for receiving WAL. There's one above /* See if we can read data immediately */ len = walrcv_receive(wrconn, &buf, &wait_fd); but none in the loop below: /* * Process the received data, and any subsequent data we * can read without blocking. */ for (;;) Similarly, that inner loop doesn't send status updates or fsyncs while there's network data - but that matters a bit less, because we'll send status updates upon request, and flush WAL at segment boundaries. This may explain why a low-ish wal_sender_timeout / wal_receiver_status_interval combo still sees plenty of timeouts. I suspect this is a lot easier to hit when the IO system on the standby is the bottleneck (with the kernel slowing us down inside the pg_pwrite()), because that makes it easier to always have incoming network data. It's probably not a good idea to just remove that two-level loop - we don't want to fsync at a much higher rate. But just putting a ProcessWalRcvInterrupts() call in the inner loop also seems unsatisfying, we should respect wal_receiver_status_interval... I've a couple times gotten into a situation where I was shutting down the primary while the standby was behind, and the system appeared to just lock up, with neither primary nor standby reacting to normal shutdown attempts. This seems to happen more often with larger wal segment size... Greetings, Andres Freund