Re: Invalid comment in ParallelQueryMain

2022-03-27 Thread Michael Paquier
On Sat, Mar 26, 2022 at 05:12:05PM +0100, Matthias van de Meent wrote:
> On Sat, 26 Mar 2022 at 17:01, Julien Rouhaud  wrote:
>> While reading code around I just noticed that I failed to adapt a comment a
>> couple of lines above a removed line in 0f61727b75b9.  Patch attached.
> 
> +1, seems OK to me.

Yep.  Will fix.
--
Michael




Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Michael Paquier
On Fri, Mar 25, 2022 at 12:27:24PM +0900, Michael Paquier wrote:
> Makes sense.  Fine by me to stick an extra "btree" here.

I have been able to look at that again today (earlier than expected),
and except for one issue in the GIN tests, where we were not testing
the correct pattern, the rest was fine.  So applied and backpatched.
The buildfarm is not complaining, based on the first reports.
--
Michael




Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-27 Thread Julien Rouhaud
Hi,

On Fri, Mar 25, 2022 at 08:18:31PM +0900, Michael Paquier wrote:
>
> Now looking at 0002.  The changes in hba.c are straightforward,
> that's a nice read.

Thanks!

>   if (!field) { \
> - ereport(LOG, \
> + ereport(elevel, \
>   (errcode(ERRCODE_CONFIG_FILE_ERROR), \
>errmsg("missing entry in file \"%s\" at end of 
> line %d", \
>   IdentFileName, line_num))); \
> + *err_msg = psprintf("missing entry at end of line"); \
>   return NULL; \
>   } \
> I think that we'd better add to err_msg the line number and the file
> name.  This would become particularly important once the facility to
> include files gets added.  We won't use IdentFileName for this
> purpose, but at least we would know which areas to change.  Also, even
> if the view proposes line_number, there is an argument in favor of
> consistency here.

I don't really like it.  The file inclusion patch adds a file_name column in
both views so that you have direct access to the information, whether the
line is in error or not.  Having the file name and line number in the error
message doesn't add any value, as it would be redundant, and it just makes the
view output bigger (on top of making testing more difficult).  I kept the
err_msg as-is (and fixed the ereport filename in the file inclusion patch,
which I had indeed missed).
>
> +select count(*) >= 0 as ok from pg_ident_file_mappings;
>
> I'd really like to see more tests for this stuff

I didn't like the various suggestions, as they would mean scattering the tests
all over the place.  The whole point of those views is indeed to check the
current content of a file without applying the configuration change (not on
Windows or EXEC_BACKEND, but there's nothing we can do there), so let's use
them that way.  I added a naive src/test/authentication/003_hba_ident_views.pl
test that validates that specific new valid and invalid lines in both files are
correctly reported.  Note that I didn't update those tests for the file
inclusion.

Note that those tests fail on Windows (and I'm assuming on EXEC_BACKEND
builds), as they're testing invalid files, which by definition prevent any
further connection attempt.  I'm not sure what would be best to do here, apart
from bypassing the invalid-config tests on such platforms.  I don't think that
validating connection attempts with an invalid pg_hba/pg_ident file on such
platforms brings anything.

> +a.pg_usernamee,
> [...]
> +  
> +   pg_username text
>
> Perhaps that's just a typo in the function output and you
> intended to use pg_username?

Yes, that was a typo :)  It's correctly documented in catalogs.sgml, so I just
fixed pg_proc.dat and rules.out.

> +   /* Free parse_hba_line memory */
> +   MemoryContextSwitchTo(oldcxt);
> +   MemoryContextDelete(identcxt);
> Incorrect comment, this should be parse_ident_line.

Indeed.  I actually fixed it before but lost the change when rebasing after the
2nd hbafuncs.c refactoring.  I also fixed an incorrect comment about
pg_hba_file_mappings.

From ab685e1db37239df06ccdd7be8cfc14789e73e8c Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 21 Feb 2022 17:38:34 +0800
Subject: [PATCH v4 1/3] Add a pg_ident_file_mappings view.

This view is similar to the pg_hba_file_rules view, and can also be helpful
for diagnosing configuration problems.

A follow-up commit will add the possibility of including files in pg_hba and
pg_ident configuration files, which will then make this view even more useful.

Catversion is bumped.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 doc/src/sgml/catalogs.sgml| 108 ++
 doc/src/sgml/client-auth.sgml |  10 ++
 doc/src/sgml/func.sgml|   5 +-
 src/backend/catalog/system_views.sql  |   6 +
 src/backend/libpq/hba.c   |  31 ++--
 src/backend/utils/adt/hbafuncs.c  | 136 ++
 src/include/catalog/pg_proc.dat   |   7 +
 src/include/libpq/hba.h   |   1 +
 .../authentication/t/003_hba_ident_views.pl   |  80 +++
 src/test/regress/expected/rules.out   |   6 +
 src/test/regress/expected/sysviews.out|   6 +
 src/test/regress/sql/sysviews.sql |   2 +
 12 files changed, 382 insertions(+), 16 deletions(-)
 create mode 100644 src/test/authentication/t/003_hba_ident_views.pl

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 94f01e4099..75fedfa07e 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9591,6 +9591,11 @@ SCRAM-SHA-256$:&l
   summary of client authentication configuration file 
contents
  
 
+ 
+  pg_ident_file_mappings
+  summary of client user name mapping configuration file 
contents
+  

GSoC: Improve PostgreSQL Regression Test Coverage

2022-03-27 Thread Christos Maris
Hi all,

This is Christos Maris, and
I'd like to declare my interest in the GSoC project mentioned above.
I'm an experienced full-stack developer, but still an open-source beginner.
That, combined with the fact that I'd really like to contribute to Postgres
and learn Perl, makes this particular project ideal for me.

I'd like to ask whether the mentor of this project (Stephen Frost) could help
me with my application (if that's allowed, of course).

I appreciate any help you can provide.
Christos


Re: Why is lorikeet so unstable in v14 branch only?

2022-03-27 Thread Andrew Dunstan


On 3/26/22 18:10, Tom Lane wrote:
>
>> If I understand correctly that you're only seeing this in v13 and
>> HEAD, then it seems like bf68b79e5 (Refactor ps_status.c API)
>> deserves a hard look.
> I still stand by this opinion.  Can you verify which of the ps_status.c
> code paths gets used on this build?
>
>   



It appears that it is using PS_USE_NONE, as it doesn't have any of the
defines required for the other paths. I note that the branch for that in
get_ps_display() doesn't set *displen, which looks a tad suspicious. It
could be left with any old junk. And maybe there's a good case for also
surrounding some of the code in WaitOnLock() with "if (len) ..."
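
To make the suspected failure mode concrete, here is a hedged sketch
(simplified and abridged, not the verbatim ps_status.c/lock.c sources;
the ps_buffer variables stand in for the statics in ps_status.c):

/* Sketch of the suspect branch in get_ps_display() */
const char *
get_ps_display(int *displen)
{
#ifndef PS_USE_NONE
	*displen = (int) (ps_buffer_cur_len - ps_buffer_fixed_size);
	return ps_buffer + ps_buffer_fixed_size;
#else
	return "";					/* BUG: *displen is never assigned */
#endif
}

/* Abridged caller, as in WaitOnLock() */
int			len;				/* stays uninitialized under PS_USE_NONE */
const char *old_status = get_ps_display(&len);
char	   *new_status = palloc(len + 8 + 1);

memcpy(new_status, old_status, len);	/* garbage len => crash */
strcpy(new_status + len, " waiting");
set_ps_display(new_status);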


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Add LZ4 compression in pg_dump

2022-03-27 Thread Robert Haas
On Sat, Mar 26, 2022 at 12:22 PM Justin Pryzby  wrote:
> 0002: I wonder if you're able to re-use any of the basebackup parsing stuff
> from commit ffd53659c.  You're passing both the compression method *and* level.
> I think there should be a structure which includes both.  In the future, that
> can also handle additional options.  I hope to re-use these same things for
> wal_compression=method:level.

Yeah, we should really try to use that infrastructure instead of
inventing a bunch of different ways to do it. It might require some
renaming here and there, and I'm not sure whether we really want to
try to rush all this into the current release, but I think we should
find a way to get it done.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Robert Haas
On Fri, Mar 25, 2022 at 12:57 AM Michael Paquier  wrote:
> On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote:
> > amcheck's palloc_btree_page() function validates that an 8KiB page is
> > in fact an nbtree page, in a maximally paranoid way. Might be an
> > example worth following here.
>
> Sure, we could do that.  However, I don't think that going down to
> that is something we have any need for in pageinspect, as the module
> is also useful to look at the contents of the full page, even if
> slightly corrupted, and too many checks would prevent a lookup at the
> internal contents of a page.

It's my impression that there are many ways of crashing the system
using pageinspect right now. We aren't very careful about making sure
that our code that reads from disk doesn't crash if the data on disk
is corrupted, and all of that systemic weakness is inherited by
pageinspect. But, with pageinspect, you can supply your own pages, not
just use the ones that actually exist on disk.

Fixing this seems like a lot of work and problematic for various
reasons. And I do agree that if we can allow people to look at what's
going on with a corrupt page, that's useful. On the other hand, if by
"looking" they crash the system, that sucks a lot. So overall I think
we need a lot more sanity checks here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-27 Thread Robert Haas
On Sat, Mar 26, 2022 at 6:25 PM James Coleman  wrote:
> I simply do not accept the claim that this is not a reasonable concern
> to have nor that this isn't worth documenting.

I don't think I said that the concern wasn't reasonable, but I don't
think the fact that one person or organization had a concern means
that it has to be worth documenting. And I didn't say either that it's
not intrinsically worth documenting. I said it doesn't fit nicely into
the documentation we have.

Since you didn't like my last example, let's try another one. If
someone shows up and proposes a documentation patch to explain what a
BitmapOr node means, we're probably going to reject it, because it
makes no sense to document that one node and not all the others. That
doesn't mean that people shouldn't want to know what BitmapOr means,
but it's just not sensible to document that one thing in isolation,
even if somebody somewhere happened to be confused by that thing and
not any of the other nodes.

In the same way, I think you're trying to jam documentation of one
particular point into the documentation when there are many other
similar points that are not documented, and I think it's very awkward.
It looks to me like you want to document that a table scan isn't
performed in a certain case when we haven't documented the rule that
would cause that table scan to be performed in other cases, or even
what a table scan means in this context, or any of the similar things
that are equally important, like a table rewrite or an index rebuild,
or any of the rules for when those things happen.

It's arguable in my mind whether it is worth documenting all of those
rules, although I am not opposed to it if somebody wants to do the
work. But I *am* opposed to documenting that a certain situation is an
exception to an undocumented rule about an undocumented concept.
That's going to create confusion, not dispel it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

2022-03-27 Thread Tom Lane
Julien Rouhaud  writes:
> [ v4-0001-Add-a-pg_get_query_def-wrapper-around-get_query_d.patch ]

This seems about ready to go to me, except for

(1) as an exported API, pg_get_querydef needs a full API spec in its
header comment.  "Read the code to figure out what to do" is not OK
in my book.

(2) I don't think this has been thought out too well:

> I also kept the wrapColument and startIdent as they
> can be easily used by callers.

The appropriate values for these are determined by macros that are
local in ruleutils.c, so it's not that "easy" for outside callers
to conform to standard practice.  I suppose we could move
WRAP_COLUMN_DEFAULT etc into ruleutils.h, but is there actually a
use-case for messing with those?  I don't see any other exported
functions that go beyond offering a "bool pretty" option, so
I think it might be a mistake for this one to be different.
(The pattern that I see is that a ruleutils function could have
"bool pretty", or it could have "int prettyFlags, int startIndent"
which is an expansion of that; but mixing those levels of detail
doesn't seem very wise.)
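
For reference, the convention being described looks roughly like this (a
sketch of the in-file idiom, not exported API; the PRETTYFLAG_* macros are
local to ruleutils.c, which is exactly the problem for outside callers):

/* How existing ruleutils.c entry points expand "bool pretty" */
int			prettyFlags;

prettyFlags = pretty
	? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA)
	: PRETTYFLAG_INDENT;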

regards, tom lane




Re: Add LZ4 compression in pg_dump

2022-03-27 Thread Justin Pryzby
On Sun, Mar 27, 2022 at 10:13:00AM -0400, Robert Haas wrote:
> On Sat, Mar 26, 2022 at 12:22 PM Justin Pryzby  wrote:
> > 0002: I wonder if you're able to re-use any of the basebackup parsing stuff
> > from commit ffd53659c.  You're passing both the compression method *and* level.
> > I think there should be a structure which includes both.  In the future, that
> > can also handle additional options.  I hope to re-use these same things for
> > wal_compression=method:level.
> 
> Yeah, we should really try to use that infrastructure instead of
> inventing a bunch of different ways to do it. It might require some
> renaming here and there, and I'm not sure whether we really want to
> try to rush all this into the current release, but I think we should
> find a way to get it done.

It seems like something a whole lot like parse_compress_options() should be in
common/.  Nobody wants to write it again, and I couldn't convince myself to
copy it when I looked at using it for wal_compression.

Maybe it should take an argument which specifies the default algorithm to use
when the input is a bare numeric "level", and reject such input if no default
is specified: wal_compression has never taken a "level", so it's not useful or
desirable to have that default to some new algorithm.
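
The shape might be something like this (a hypothetical declaration only; the
name follows the existing pg_basebackup helper, but this common/ variant and
its parameters are illustrative, not an existing API):

/*
 * Hypothetical shared helper: parses "algorithm", "algorithm:level", or a
 * bare numeric "level".  A bare level is resolved against
 * default_algorithm, and rejected with an error when default_algorithm
 * is NULL.
 */
extern void parse_compress_options(const char *option,
								   char **algorithm,
								   int *level,
								   const char *default_algorithm);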

I could write this down if you want, although I'm not sure how/if you intend
other people to use bc_algorithm and bc_specification.  I don't think it's
important to do for v15, but it seems like it could be done after feature
freeze.  pg_dump+lz4 is targeting v16, although there's a cleanup patch that
could also go in before branching.

-- 
Justin




Re: [WIP] Allow pg_upgrade to copy segments of the same relfilenode in parallel

2022-03-27 Thread Jaime Casanova
On Mon, Mar 21, 2022 at 05:34:31PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2022-02-01 21:57:00 -0500, Jaime Casanova wrote:
> > This patch adds a new option (-J num, --jobs-per-disk=num) in
> > pg_upgrade to speed up copy mode. This generates up to ${num}
> > processes per tablespace to copy segments of the same relfilenode
> > in parallel.
> > 
> > This can help when you have many multi-gigabyte tables (each segment
> > is 1GB by default) in different tablespaces (each tablespace on a
> > different disk) and multiple processors.
> > 
> > In a customer's database (~20Tb) it went down from 6h to 4h 45min.
> > 
> > It lacks documentation and I need help with WIN32 part of it, I created
> > this new mail to put the patch on the next commitfest.
> 
> The patch currently fails on cfbot due to warnings, likely related to the
> win32 issue: 
> https://cirrus-ci.com/task/4566046517493760?logs=mingw_cross_warning#L388
> 
> As it's a new patch submitted to the last CF, hasn't gotten any review yet and
> misses some platform support, it seems like there's no chance it can make it
> into 15?
> 

Hi,

Because I have zero experience on the Windows side of this, I will take
some time to complete that part.

Should we move this to the next commitfest (and make 16 the target for
this)?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Matthias van de Meent
On Sun, 27 Mar 2022 at 16:24, Robert Haas  wrote:
>
> On Fri, Mar 25, 2022 at 12:57 AM Michael Paquier  wrote:
> > On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote:
> > > amcheck's palloc_btree_page() function validates that an 8KiB page is
> > > in fact an nbtree page, in a maximally paranoid way. Might be an
> > > example worth following here.
> >
> > Sure, we could do that.  However, I don't think that going down to
> > that is something we have any need for in pageinspect, as the module
> > is also useful to look at the contents of the full page, even if
> > slightly corrupted, and too many checks would prevent a lookup at the
> > internal contents of a page.
>
> It's my impression that there are many ways of crashing the system
> using pageinspect right now. We aren't very careful about making sure
> that our code that reads from disk doesn't crash if the data on disk
> is corrupted, and all of that systemic weakness is inherited by
> pageinspect. But, with pageinspect, you can supply your own pages, not
> just use the ones that actually exist on disk.
>
> Fixing this seems like a lot of work and problematic for various
> reasons. And I do agree that if we can allow people to look at what's
> going on with a corrupt page, that's useful. On the other hand, if by
> "looking" they crash the system, that sucks a lot. So overall I think
> we need a lot more sanity checks here.

I noticed this thread due to recent commit 291e517a; it has some
overlap with one of my patches [0], in which I fix the corresponding
issue in core postgres by assuming (and, in assert-enabled builds,
verifying) the size and location of the special area.  As such, you
might be interested in that patch.

Note that currently in core postgres a corrupted special area pointer
will likely target neighbouring blocks in the buffer pool, resulting
in spreading corruption when the special area is updated.  With my
patch, this spreading corruption should be limited to the corrupted
block itself.
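
To illustrate the kind of check being described (a hedged sketch, not the
patch itself; it uses the standard bufpage.h macros): before trusting
pd_special, verify it points exactly where a page of this AM must keep its
special area.

/*
 * Sketch: reject a pd_special that cannot be this page's special area,
 * so a corrupted pointer can never reach a neighbouring buffer.
 */
static void
sketch_check_special(Page page, Size special_size)
{
	PageHeader	phdr = (PageHeader) page;

	if (phdr->pd_special != BLCKSZ - MAXALIGN(special_size))
		ereport(ERROR,
				(errcode(ERRCODE_DATA_CORRUPTED),
				 errmsg("invalid special area offset %u",
						(unsigned) phdr->pd_special)));
}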

Kind regards,

Matthias van de Meent

[0] https://commitfest.postgresql.org/37/3543/
https://www.postgresql.org/message-id/caeze2wje3+tgo9fs9+izmu+z6mmzko54w1zt98wkqbeuhbh...@mail.gmail.com




Re: Why is lorikeet so unstable in v14 branch only?

2022-03-27 Thread Tom Lane
Andrew Dunstan  writes:
> It appears that it is using PS_USE_NONE, as it doesn't have any of the
> defines required for the other paths. I note that the branch for that in
> get_ps_display() doesn't set *displen, which looks a tad suspicious.

Indeed.  I forced it to use PS_USE_NONE on my Linux machine, and got
a core dump on the first try of the regression tests:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memmove_avx_unaligned_erms ()
at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:516
516 VMOVNT  %VEC(0), (%r9)
(gdb) bt
#0  __memmove_avx_unaligned_erms ()
at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:516
#1  0x008299b3 in WaitOnLock (locallock=locallock@entry=0x2a5e700, 
owner=owner@entry=0x2aba8f0) at lock.c:1831
#2  0x0082adc6 in LockAcquireExtended (
locktag=locktag@entry=0x7ffc864fad90, lockmode=lockmode@entry=1, 
sessionLock=sessionLock@entry=false, dontWait=dontWait@entry=false, 
reportMemoryError=reportMemoryError@entry=true, 
locallockp=locallockp@entry=0x7ffc864fad88) at lock.c:1101
#3  0x0082861f in LockRelationOid (relid=1259, lockmode=1)
at lmgr.c:117
#4  0x0051c5ed in relation_open (relationId=1259, 
lockmode=lockmode@entry=1) at relation.c:56
...

(gdb) f 1
#1  0x008299b3 in WaitOnLock (locallock=locallock@entry=0x2a5e700, 
owner=owner@entry=0x2aba8f0) at lock.c:1831
1831memcpy(new_status, old_status, len);
(gdb) p len
$1 = -1

Problem explained, good detective work!

> And maybe there's a good case for also
> surrounding some of the code in WaitOnLock() with "if (len) ..."

+1.  I'll make it so, and check the other callers too.

Once I push this, you should remove the update_process_title hack
from lorikeet's config, since that was just a workaround until
we tracked down the problem, which I think we just did.

regards, tom lane




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-27 Thread James Coleman
On Sun, Mar 27, 2022 at 11:43 AM Robert Haas  wrote:
>
> On Sat, Mar 26, 2022 at 6:25 PM James Coleman  wrote:
> > I simply do not accept the claim that this is not a reasonable concern
> > to have nor that this isn't worth documenting.
>
> I don't think I said that the concern wasn't reasonable, but I don't
> think the fact that one person or organization had a concern means
> that it has to be worth documenting.

You said "My first reaction when I read this sentence was that it
was warning the user about the absence of a hazard that no one would
expect in the first place." That seemed to me even stronger than "not
a reasonable concern", and so while I agree that one organization
having a concern doesn't mean that it has to be documented, it does
seem clear to me that one such organization dispels the idea that "no
one would expect [this]", which is why I said it in response to that
statement.

> And I didn't say either that it's
> not intrinsically worth documenting. I said it doesn't fit nicely into
> the documentation we have.

That was not the critique I took away from your email at all. It is,
however, what Tom noted, and I agree it's a relevant question.

> Since you didn't like my last example, let's try another one. If
> someone shows up and proposes a documentation patch to explain what a
> BitmapOr node means, we're probably going to reject it, because it
> makes no sense to document that one node and not all the others. That
> doesn't mean that people shouldn't want to know what BitmapOr means,
> but it's just not sensible to document that one thing in isolation,
> even if somebody somewhere happened to be confused by that thing and
> not any of the other nodes.
>
> In the same way, I think you're trying to jam documentation of one
> particular point into the documentation when there are many other
> similar points that are not documented, and I think it's very awkward.
> It looks to me like you want to document that a table scan isn't
> performed in a certain case when we haven't documented the rule that
> would cause that table scan to be performed in other cases, or even
> what a table scan means in this context, or any of the similar things
> that are equally important, like a table rewrite or an index rebuild,
> or any of the rules for when those things happen.

In the ALTER TABLE docs page "table scan" is used in the section on
"SET NOT NULL", "full table scan" is used in the sections on "ADD
table_constraint_using_index" and "ATTACH PARTITION", and "table scan"
is used again in the "Note" section. Table rewrites are similarly
discussed repeatedly on that page. Indeed the docs make a clear effort
to point out where table scans and table rewrites do and do not occur
(albeit not in one unified place -- it's scattered through the various
subcommand and notes sections). The Notes section even explicitly
says "The main reason for providing the option to specify multiple
changes in a single ALTER TABLE is that multiple table scans or
rewrites can thereby be combined into a single pass over the table."

So I believe it is just factually incorrect to say that "we haven't
documented...what a table scan means in this context, or any of the
similar things that are equally important, like a table rewrite or an
index rebuild, or any of the rules for when those things happen."

> It's arguable in my mind whether it is worth documenting all of those
> rules, although I am not opposed to it if somebody wants to do the
> work. But I *am* opposed to documenting that a certain situation is an
> exception to an undocumented rule about an undocumented concept.
> That's going to create confusion, not dispel it.

As shown above, table scans (and specifically table scans used to
validate constraints, which is what this patch is about) are clearly
documented (more than once!) in the ALTER TABLE documentation. In fact
it's documented specifically in reference to SET NOT NULL, which is
even more specifically the type of constraint this patch is about.

So "undocumented concept" is just not accurate, and so I don't see it
as a valid reason to reject the patch.

Thanks,
James Coleman




Re: Why is lorikeet so unstable in v14 branch only?

2022-03-27 Thread Tom Lane
I wrote:
> Andrew Dunstan  writes:
>> And maybe there's a good case for also
>> surrounding some of the code in WaitOnLock() with "if (len) ..."

> +1.  I'll make it so, and check the other callers too.

I had second thoughts about that part after realizing that callers
cannot tell the difference between "ps_display is disabled" and
"the activity part of the display is currently empty".  In the latter
case I think we'd rather have WaitOnLock still append " waiting";
and it's not like PS_USE_NONE is so common as to be worth optimizing
for.  (Else we'd have identified this problem sooner.)
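
Concretely, the minimal fix presumably amounts to something like this in
get_ps_display()'s PS_USE_NONE branch (a sketch; the committed change may
differ in detail):

#else
	*displen = 0;				/* report an empty display explicitly */
	return "";
#endif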

> Once I push this, you should remove the update_process_title hack
> from lorikeet's config, since that was just a workaround until
> we tracked down the problem, which I think we just did.

Minimal fix pushed, so please adjust that animal's config.

regards, tom lane




Re: Add pg_freespacemap extension sql test

2022-03-27 Thread Tom Lane
Michael Paquier  writes:
> Yes, we could extend that more.  For now, I am curious to see what the
> buildfarm has to say with the current contents of the patch, and I can
> keep an eye on the buildfarm today, so I have applied it.

It seems this is unstable under valgrind [1]:

--- /mnt/resource/bf/build/skink-master/HEAD/pgsql/contrib/pg_freespacemap/expected/pg_freespacemap.out	2022-03-24 09:39:43.974477703 +
+++ /mnt/resource/bf/build/skink-master/HEAD/pgsql.build/contrib/pg_freespacemap/results/pg_freespacemap.out	2022-03-27 17:07:23.896287669 +
@@ -60,6 +60,7 @@
 ORDER BY 1, 2;
id| blkno | is_avail 
 -+---+--
+ freespace_tab   | 0 | t
  freespace_brin  | 0 | f
  freespace_brin  | 1 | f
  freespace_brin  | 2 | t
@@ -75,7 +76,7 @@
  freespace_hash  | 7 | f
  freespace_hash  | 8 | f
  freespace_hash  | 9 | f
-(15 rows)
+(16 rows)
 
 -- failures with incorrect block number
 SELECT * FROM pg_freespace('freespace_tab', -1);

skink has passed several runs since the commit went in, so it's
"unstable" not "fails consistently".  I see the test tries to
disable autovacuum on that table, so that doesn't seem to be
the problem ... what is?

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2022-03-27%2008%3A26%3A20




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-27 Thread David G. Johnston
On Sun, Mar 27, 2022 at 10:00 AM James Coleman  wrote:

> As shown above, table scans (and specifically table scans used to
> validate constraints, which is what this patch is about) are clearly
> documented (more than once!) in the ALTER TABLE documentation. In fact
> it's documented specifically in reference to SET NOT NULL, which is
> even more specifically the type of constraint this patch is about.
>
> So "undocumented concept" is just not accurate, and so I don't see it
> as a valid reason to reject the patch.
>
>
As you point out, where these scans are performed is documented.  Your
request, though, is to document a location where they are not performed,
instead of trusting that the absence of a statement means no such scan
happens.  In this case no such scan of the table is ever needed when adding
a column, and so ADD COLUMN doesn't mention table scanning.  We almost
always choose not to document things which do not happen.  I don't
always agree with this position, but it is valid and largely adhered to.  On
that documentation theory/policy basis alone this patch can be rejected.
0001 as proposed is an especially strong violation of this principle.

My two thoughts from yesterday take slightly different approaches to
mitigating the same misunderstanding, while also providing useful guidance
to the reader to make sure the hazard of ALTER COLUMN SET NOT NULL is
something they are thinking about even when adding a new column, since
forgetting to incorporate the NOT NULL during the add can be a costly
mistake.  Tweaking the Notes section seems to be the more productive of
the two approaches.

David J.


Re: refactoring basebackup.c (zstd workers)

2022-03-27 Thread Tom Lane
Robert Haas  writes:
> [ v5-0001-Replace-BASE_BACKUP-COMPRESSION_LEVEL-option-with.patch ]

Coverity has a nitpick about this:

/srv/coverity/git/pgsql-git/postgresql/src/common/backup_compression.c: 194 in 
parse_bc_specification()
193 /* Advance to next entry and loop around. */
>>> CID 1503251:  Null pointer dereferences  (REVERSE_INULL)
>>> Null-checking "vend" suggests that it may be null, but it has already 
>>> been dereferenced on all paths leading to the check.
194 specification = vend == NULL ? kwend + 1 : vend + 1;
195 }
196 }

Not sure if you should remove this null-check or add some other ones,
but I think you ought to do one or the other.
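
For readers unfamiliar with this Coverity class (REVERSE_INULL), the flagged
shape looks roughly like this; the identifiers below are illustrative, not
the actual backup_compression.c code:

#include <string.h>

static const char *
next_entry(char *kwend)
{
	char	   *vend = strchr(kwend + 1, ',');

	*vend = '\0';				/* dereference: vend assumed non-NULL */

	/* Coverity: this NULL check contradicts the dereference above */
	return (vend == NULL) ? kwend + 1 : vend + 1;
}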

regards, tom lane




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-27 Thread James Coleman
On Sun, Mar 27, 2022 at 1:46 PM David G. Johnston wrote:
>
> On Sun, Mar 27, 2022 at 10:00 AM James Coleman  wrote:
>>
>> As shown above, table scans (and specifically table scans used to
>> validate constraints, which is what this patch is about) are clearly
>> documented (more than once!) in the ALTER TABLE documentation. In fact
>> it's documented specifically in reference to SET NOT NULL, which is
>> even more specifically the type of constraint this patch is about.
>>
>> So "undocumented concept" is just not accurate, and so I don't see it
>> as a valid reason to reject the patch.
>>
>
> As you point out, where these scans are performed is documented.  Your 
> request, though, is to document a location where they are not performed, 
> instead of trusting that the absence of a statement means no such scan 
> happens.  In this case no such scan of the table is ever needed when adding a 
> column, and so ADD COLUMN doesn't mention table scanning.  We almost always 
> choose not to document things which do not happen.  I don't always 
> agree with this position, but it is valid and largely adhered to.  On that 
> documentation theory/policy basis alone this patch can be rejected.  0001 as 
> proposed is an especially strong violation of this principle.

Hmm, I didn't realize that was project policy, but I'm a bit
surprised given that the sentence which 0001 replaces seems like a
direct violation of that also: "In neither case is a rewrite of the
table required."

> My two thoughts from yesterday take slightly different approaches to 
> mitigating the same misunderstanding, while also providing useful guidance to 
> the reader to make sure the hazard of ALTER COLUMN SET NOT NULL is something 
> they are thinking about even when adding a new column, since forgetting to 
> incorporate the NOT NULL during the add can be a costly mistake.  Tweaking 
> the Notes section seems to be the more productive of the two approaches.

Yes, I like those suggestions. I've attached an updated patch that I
think fits a good bit more naturally into the Notes section
specifically addressing scans and rewrites on NOT NULL constraints.

Thanks for the feedback,
James Coleman


v4-0001-Document-atthasmissing-default-avoids-verificatio.patch
Description: Binary data


TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

2022-03-27 Thread Erik Rijkers

Hi,

On master I got a FailedAssertion("HaveRegisteredOrActiveSnapshot()")
on an assert-enabled instance, with (I think) data over a certain length.

I whittled it down to the attached bash script (careful - it drops stuff).  It 
has 5 tsv-data lines (one long line) that COPY slurps into a table.  The 
middle (third) line causes the problem later on.  Shortening the long 
line to somewhere below 2000 characters fixes it again.


More info in the attached .sh file.

If debug-assert is 'off', the problem does not occur. (REL_14_STABLE 
also does not have the problem, assertions or not)


thanks,

Erik Rijkers



bugsnapshot.sh
Description: application/shellscript


Re: Why is lorikeet so unstable in v14 branch only?

2022-03-27 Thread Andrew Dunstan


On 3/27/22 13:01, Tom Lane wrote:
>
>> Once I push this, you should remove the update_process_title hack
>> from lorikeet's config, since that was just a workaround until
>> we tracked down the problem, which I think we just did.
> Minimal fix pushed, so please adjust that animal's config.
>

Done.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Race conditions in 019_replslot_limit.pl

2022-03-27 Thread Tom Lane
Andres Freund  writes:
> How about the attached variation, which retries (for 15s, with 100ms sleeps)
> if there are multiple walsenders, printing the debugging info each time? It'll
> still fail the test if later iterations find just one walsender, which seems
> the right behaviour for now.

We have now four instances of failures with this version of the test,
and in every case the second iteration succeeded.  Is that enough
evidence yet?

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2022-03-27%2017%3A04%3A18
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2022-03-25%2009%3A00%3A09
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2022-03-25%2008%3A02%3A05
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2022-03-25%2003%3A00%3A18

I'd like to silence this noise so that we can start tracking
lower-probability failure modes, like say these:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=olingo&dt=2022-03-26%2002%3A59%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2022-03-26%2015%3A53%3A51

regards, tom lane




Re: Add LZ4 compression in pg_dump

2022-03-27 Thread Daniel Gustafsson
> On 27 Mar 2022, at 00:51, Justin Pryzby  wrote:
> 
> On Sun, Mar 27, 2022 at 12:37:27AM +0100, Daniel Gustafsson wrote:
>>> On 26 Mar 2022, at 17:21, Justin Pryzby  wrote:
>> 
>>> I suggested off-list to add an 0099 patch to change LZ4 to the default, to
>>> exercise it more on CI.
>> 
>> No need to change the defaults in autoconf for that.  The CFBot uses the cirrus
>> file in the tree so changing what the job includes can be easily done (assuming
>> the CFBot hasn't changed this recently, which I think it hasn't).  I used that
>> trick in the NSS patchset to add a completely new job for --with-ssl=nss beside
>> the --with-ssl=openssl job.
> 
> I think you misunderstood - I'm suggesting not only to use with-lz4 (which was
> always true since 93d973494), but to change pg_dump -Fc and -Fd to use LZ4 by
> default (the same as I suggested for toast_compression, wal_compression, and
> again in last year's patch to add zstd compression to pg_dump, for which
> postgres was not ready).

Right, I clearly misunderstood, thanks for the clarification.

--
Daniel Gustafsson   https://vmware.com/





Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Peter Geoghegan
On Sun, Mar 27, 2022 at 7:24 AM Robert Haas  wrote:
> It's my impression that there are many ways of crashing the system
> using pageinspect right now. We aren't very careful about making sure
> that our code that reads from disk doesn't crash if the data on disk
> is corrupted, and all of that systemic weakness is inherited by
> pageinspect.

It's significantly worse than the core code, I'd say (before we even
consider the fact that you can supply your own pages). With index AMs,
at least, functions like _bt_checkpage() and gistcheckpage() provide
the most basic sanity checks for any page that is read from disk.

> Fixing this seems like a lot of work and problematic for various
> reasons. And I do agree that if we can allow people to look at what's
> going on with a corrupt page, that's useful. On the other hand, if by
> "looking" they crash the system, that sucks a lot. So overall I think
> we need a lot more sanity checks here.

I don't think it's particularly hard to do a little more. That's all
it would take to prevent practically all crashes. We're not dealing
with adversarial page images here.

Sure, it would be overkill to fully adapt something like
palloc_btree_page() for pageinspect, for the reason Michael gave. But
there are a couple of checks that it does that would avoid practically
all real-world hard crashes, without any plausible downside:

* The basic _bt_checkpage() call that every nbtree page uses in the
core code (as well as in amcheck).

* The maxoffset > MaxIndexTuplesPerPage check.
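
Something along these lines, say (a rough sketch using the usual
bufpage.h/nbtree.h/itup.h macros, not actual pageinspect code):

static void
sketch_check_btree_page(Page page)
{
	/*
	 * _bt_checkpage()-style check: the special space must be exactly the
	 * size nbtree always uses, or this is not a btree page at all.
	 */
	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BTPageOpaqueData)))
		ereport(ERROR,
				(errcode(ERRCODE_DATA_CORRUPTED),
				 errmsg("input page is not a valid btree page")));

	/*
	 * Reject a line pointer array too large to fit on any real page, so
	 * later loops cannot run off the end of the supplied image.
	 */
	if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
		ereport(ERROR,
				(errcode(ERRCODE_DATA_CORRUPTED),
				 errmsg("input page has a corrupted line pointer array")));
}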

You could even go a bit further, and care about individual index
tuples. My commit 93ee38eade added some basic sanity checks for index
tuples to bt_page_items(). That could go a bit further as well. In
particular, the sanity checks from amcheck's PageGetItemIdCareful()
function could be carried over. That would make it impossible for
bt_page_items() to read past the end of the page when given a page
that has an index tuple whose item pointer offset has some wildly
unreasonable value.

I'm not volunteering. Just saying that this is quite possible.

-- 
Peter Geoghegan




Re: Small TAP tests cleanup for Windows and unused modules

2022-03-27 Thread Daniel Gustafsson
> On 27 Mar 2022, at 00:20, Daniel Gustafsson  wrote:

> I'll take it for a tour on the CI and will then apply.

Which is now done, thanks!

--
Daniel Gustafsson   https://vmware.com/





Re: multithreaded zstd backup compression for client and server

2022-03-27 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 06:57:04PM -0400, Robert Haas wrote:
> On Wed, Mar 23, 2022 at 5:52 PM Justin Pryzby  wrote:
> > Also because the library may not be compiled with threading.  A few days ago, I
> > tried to rebase the original "parallel workers" patch over the COMPRESS DETAIL
> > patch but then couldn't test it, even after trying various versions of the zstd
> > package and trying to compile it locally.  I'll try again soon...
> 
> Ah. Right, I can update the comment to mention that.

Actually, I suggest removing those comments:
| "We check for failure here because..."

That should be the rule rather than the exception, so it shouldn't require
justifying why one checks the return value of library and system calls.

In bbsink_zstd_new(), I think you need to check to see if workers were
requested (same as the issue you found with "level").  If someone builds
against a version of zstd which doesn't support some parameter, you'll
currently call SetParameter with that flag anyway, with a default value.
That's not currently breaking anything for me (even though workers=N doesn't
work) but I think it's fragile and could break, maybe when compiled against an
old zstd, or with future options.  SetParameter should only be called when the
user requested to set the parameter.  I handled that for workers in 003, but
didn't touch "level", which is probably fine, but maybe should change for
consistency.
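
A sketch of what that could look like for workers, assuming a
BACKUP_COMPRESSION_OPTION_WORKERS options flag and a compress->workers field
mirroring how the existing "level" option is tracked (both names are
illustrative here, not established API):

/* Sketch: push a parameter to zstd only if the user asked for it */
if ((compress->options & BACKUP_COMPRESSION_OPTION_WORKERS) != 0)
{
	size_t		ret;

	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_nbWorkers,
								 compress->workers);
	if (ZSTD_isError(ret))
		elog(ERROR, "could not set zstd worker count to %d: %s",
			 compress->workers, ZSTD_getErrorName(ret));
}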

src/backend/replication/basebackup_zstd.c:  elog(ERROR, "could not set zstd compression level to %d: %s",
src/bin/pg_basebackup/bbstreamer_gzip.c:  pg_log_error("could not set compression level %d: %s",
src/bin/pg_basebackup/bbstreamer_zstd.c:  pg_log_error("could not set compression level to: %d: %s",

I'm not sure why these messages sometimes mention the current compression
method and sometimes don't.  I suggest that they shouldn't - errcontext will
have the algorithm, and the user already specified it anyway.  It'd allow the
compiler to merge strings.

Here's a patch for zstd --long mode.  (I don't actually use pg_basebackup, but
I will want to use long mode with pg_dump).  The "strategy" params may also be
interesting, but I haven't played with them.  rsyncable is certainly
interesting, but it's currently an experimental, nonpublic interface - and a
good example of why not to call SetParameter for params which the user didn't
specify: PGDG might eventually compile postgres against a zstd which supports
the rsyncable flag, and someone might install it somewhere whose zstd doesn't
support rsyncable; then the server would try to call SetParameter(rsyncable, 0),
the rsyncable ID number would've changed, so zstd would probably reject it, and
basebackup would be unusable...

$ time src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method=none --no-manifest -Z zstd:long=1 --checkpoint fast |wc -c
4625935
real	0m1,334s

$ time src/bin/pg_basebackup/pg_basebackup -h /tmp -Ft -D- --wal-method=none --no-manifest -Z zstd:long=0 --checkpoint fast |wc -c
8426516
real	0m0,880s

From e73af18e791f784b3853511f10fe9e573984bcf4 Mon Sep 17 00:00:00 2001
From: Robert Haas 
Date: Thu, 24 Mar 2022 17:21:11 -0400
Subject: [PATCH 1/5] Fix a few goofs in new backup compression code.

When we try to set the zstd compression level either on the client
or on the server, check for errors.

For any algorithm, on the client side, don't try to set the compression
level unless the user specified one. This was visibly broken for
zstd, which managed to set -1 rather than 0 in this case, but tidy
up the code for the other methods, too.

On the client side, if we fail to create a ZSTD_CCtx, exit after
reporting the error. Otherwise we'll dereference a null pointer.
---
 src/backend/replication/basebackup_zstd.c |  8 ++--
 src/bin/pg_basebackup/bbstreamer_gzip.c   |  3 ++-
 src/bin/pg_basebackup/bbstreamer_lz4.c|  3 ++-
 src/bin/pg_basebackup/bbstreamer_zstd.c   | 19 +--
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/basebackup_zstd.c b/src/backend/replication/basebackup_zstd.c
index bb5b668c2ab..5496eaa72b7 100644
--- a/src/backend/replication/basebackup_zstd.c
+++ b/src/backend/replication/basebackup_zstd.c
@@ -98,13 +98,17 @@ bbsink_zstd_begin_backup(bbsink *sink)
 {
 	bbsink_zstd *mysink = (bbsink_zstd *) sink;
 	size_t		output_buffer_bound;
+	size_t		ret;
 
 	mysink->cctx = ZSTD_createCCtx();
 	if (!mysink->cctx)
 		elog(ERROR, "could not create zstd compression context");
 
-	ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
-		   mysink->compresslevel);
+	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
+ mysink->compresslevel);
+	if (ZSTD_isError(ret))
+		elog(ERROR, "could not set zstd compression level to %d: %s",
+			 mysink->compresslevel, ZSTD_getErrorName(ret));
 
 	/*
 	 * We need our own buffer, because we're going to pass different data to
diff --git a/src

Re: SQL/JSON: JSON_TABLE

2022-03-27 Thread Andrew Dunstan


On 3/24/22 18:54, Andrew Dunstan wrote:
> On 3/5/22 09:35, Andrew Dunstan wrote:
>> On 3/4/22 15:05, Andrew Dunstan wrote:
>>> On 3/4/22 13:13, Erikjan Rijkers wrote:
 On 04-03-2022 at 17:33, Andrew Dunstan wrote:
> This set of patches deals with items 1..7 above, but not yet the ERROR
> ON ERROR issue. It also makes some message cleanups, but there is more
> to come in that area.
>
> It is based on the latest SQL/JSON Functions patch set, which does not
> include the sql_json GUC patch.
>
> [0001-SQL-JSON-functions-without-sql_json-GUC-v56.patch]
> [0002-JSON_TABLE-v56.patch]
> [0003-JSON_TABLE-PLAN-DEFAULT-clause-v56.patch]
> [0004-JSON_TABLE-PLAN-clause-v56.patch]
 Hi,

 I quickly tried the tests I already had and there are two statements
 that stopped working:

 testdb=# SELECT JSON('{"a": 123, "b": [true, "foo"], "a2": "bar"}'
 RETURNING jsonb);
 ERROR:  syntax error at or near "RETURNING"
 LINE 1: ...SON('{"a": 123, "b": [true, "foo"], "a2": "bar"}' RETURNING
 ...

 testdb=# select JSON_SCALAR(123.45 returning jsonb);
 ERROR:  syntax error at or near "returning"
 LINE 1: select JSON_SCALAR(123.45 returning jsonb)

  (the '^' pointer in both cases is underneath 'RETURNING')



>>> Yes, you're right, that was implemented as part of the GUC patch. I'll
>>> try to split that out and send new patchsets with the RETURNING clause
>>> but without the GUC (see upthread for reasons)
>>>
>>>
>> Here's a patchset with RETURNING for JSON() and JSON_SCALAR() but
>> without the GUC
>>
> Here's a new set with the latest sql/json functions patch and fixes for
> some further node handling  inadequacies.
>
>


I have been wrestling with the docs in these patches, which are somewhat
haphazardly spread across the various patches, and tying myself up in
knots. Fixing them so I don't cause later merge pain is difficult. I'm
therefore going to commit this series (staggered as previously
requested) without documentation and then commit a consolidated
documentation patch for them.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Robert Haas
On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan  wrote:
> We're not dealing
> with adversarial page images here.

I think it's bad that we have to make that assumption, considering
that there's nothing whatever to keep users from supplying arbitrary
page images to pageinspect. But I also agree that if we're unable or
unwilling to make things perfect, it's still good to make them better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-27 Thread Tom Lane
Jacob Champion  writes:
> [ v10-0001-Move-inet_net_pton-to-src-port.patch etc ]

There is something broken about the ssl tests as modified by
this patch.  The cfbot doesn't provide a lot of evidence about
why it's failing, but I applied the patchset locally and what
I see is

...
ok 47 - mismatch between host name and server certificate sslmode=verify-full: matches
Odd number of elements in hash assignment at /home/postgres/pgsql/src/test/ssl/t/SSL/Server.pm line 288.
Use of uninitialized value in concatenation (.) or string at /home/postgres/pgsql/src/test/ssl/t/SSL/Backend/OpenSSL.pm line 178.
Use of uninitialized value in concatenation (.) or string at /home/postgres/pgsql/src/test/ssl/t/SSL/Backend/OpenSSL.pm line 178.
### Restarting node "primary"
# Running: pg_ctl -w -D /home/postgres/pgsql/src/test/ssl/tmp_check/t_001_ssltests_primary_data/pgdata -l /home/postgres/pgsql/src/test/ssl/tmp_check/log/001_ssltests_primary.log restart
waiting for server to shut down done
server stopped
waiting for server to start stopped waiting
pg_ctl: could not start server

The tail end of the server log is

2022-03-27 17:13:11.482 EDT [551720] FATAL:  could not load server certificate file ".crt": No such file or directory

so it seems pretty clear that something's fouling up computation of
a certificate file name.  This may be caused by 9ca234bae or
4a7e964fc.

regards, tom lane




Re: Race conditions in 019_replslot_limit.pl

2022-03-27 Thread Andres Freund
Hi,

On 2022-03-27 15:28:05 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > How about the attached variation, which retries (for 15s, with 100ms sleeps)
> > if there are multiple walsenders, printing the debugging info each time? It'll
> > still fail the test if later iterations find just one walsender, which seems
> > the right behaviour for now.
> 
> We have now four instances of failures with this version of the test,
> and in every case the second iteration succeeded.  Is that enough
> evidence yet?

I still feel like there's something off here. But that's probably not enough
to keep causing failures. I'm inclined to leave the debugging in for a bit
longer, but not fail the test anymore?


The way the temporary slot removal hangs for a while seems just off:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2022-03-27%2017%3A04%3A18
  2022-03-27 13:30:56.993 EDT [2022-03-27 13:30:56 EDT 750695:20] 019_replslot_limit.pl DEBUG:  replication slot drop: pg_basebackup_750695: removed on-disk
  ...
  2022-03-27 13:30:57.456 EDT [2022-03-27 13:30:57 EDT 750759:3] [unknown] LOG:  connection authorized: user=andrew database=postgres application_name=019_replslot_limit.pl
  2022-03-27 13:30:57.466 EDT [2022-03-27 13:30:57 EDT 750759:4] 019_replslot_limit.pl LOG:  statement: SELECT * FROM pg_stat_activity
  ...
  2022-03-27 13:30:57.474 EDT [2022-03-27 13:30:56 EDT 750679:87] DEBUG:  server process (PID 750759) exited with exit code 0
  2022-03-27 13:30:57.507 EDT [2022-03-27 13:30:56 EDT 750695:21] 019_replslot_limit.pl DEBUG:  replication slot drop: pg_basebackup_750695: done

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2022-03-25%2009%3A00%3A09
  2022-03-25 10:13:30.364 CET [4022819][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_4022819: begin
  2022-03-25 10:13:30.364 CET [4022819][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_4022819: removed on-disk
  ...
  2022-03-25 10:13:31.038 CET [4022841][client backend][5/7:0][[unknown]] LOG:  connection authorized: user=bf database=postgres application_name=019_replslot_limit.pl
  2022-03-25 10:13:31.039 CET [4022841][client backend][5/8:0][019_replslot_limit.pl] LOG:  statement: SELECT * FROM pg_stat_activity
  ...
  2022-03-25 10:13:31.045 CET [4022807][postmaster][:0][] DEBUG:  server process (PID 4022841) exited with exit code 0
  2022-03-25 10:13:31.056 CET [4022819][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_4022819: done

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2022-03-25%2008%3A02%3A05
  2022-03-25 09:15:20.558 CET [3730425][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_3730425: removed on-disk
  ...
  2022-03-25 09:15:20.803 CET [3730461][client backend][5/7:0][[unknown]] LOG:  connection authorized: user=bf database=postgres application_name=019_replslot_limit.pl
  2022-03-25 09:15:20.804 CET [3730461][client backend][5/8:0][019_replslot_limit.pl] LOG:  statement: SELECT * FROM pg_stat_activity
  ...
  2022-03-25 09:15:20.834 CET [3730381][postmaster][:0][] DEBUG:  server process (PID 3730461) exited with exit code 0
  2022-03-25 09:15:20.861 CET [3730425][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_3730425: done

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2022-03-25%2003%3A00%3A18
  2022-03-25 04:14:03.025 CET [2674398][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_2674398: removed on-disk
  ...
  2022-03-25 04:14:03.264 CET [2674463][client backend][5/7:0][[unknown]] LOG:  connection authorized: user=bf database=postgres application_name=019_replslot_limit.pl
  2022-03-25 04:14:03.265 CET [2674463][client backend][5/8:0][019_replslot_limit.pl] LOG:  statement: SELECT * FROM pg_stat_activity
  ...
  2022-03-25 04:14:03.270 CET [2674384][postmaster][:0][] DEBUG:  server process (PID 2674463) exited with exit code 0
  ...
  2022-03-25 04:14:03.324 CET [2674398][walsender][4/0:0][019_replslot_limit.pl] DEBUG:  replication slot drop: pg_basebackup_2674398: done

We are able to start / finish several new connections between the two debug
elog()s in ReplicationSlotDropPtr()?


I wonder if there's something odd going on with ConditionVariableBroadcast().

Might be useful to add another debug message before/after
ConditionVariableBroadcast() and rmtree(). If the delay is due to rmtree()
being slow under concurrent tests I'd feel less concerned (although that
machine has dual NVMe drives...).
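
Concretely, something like this inside ReplicationSlotDropPtr(), bracketing
the two suspect steps (a sketch only; the variable names follow the existing
slot-drop code, and the message wording is illustrative):

/* Sketch: extra DEBUG messages around the suspected slow operations */
elog(DEBUG1, "replication slot drop: %s: before broadcast",
	 NameStr(slot->data.name));
ConditionVariableBroadcast(&slot->active_cv);

elog(DEBUG1, "replication slot drop: %s: before rmtree",
	 NameStr(slot->data.name));
if (!rmtree(tmppath, true))
	elog(WARNING, "could not remove directory \"%s\"", tmppath);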


I really wish I could reproduce the failure. I went through a few hundred
cycles of that test in a separate checkout on that machine.


> I'd like to silence this noise so that we can start tracking
> lower-probability failure modes, like say these:
> 
> https://buildfarm.postgresql.org/cgi-b

Re: Race conditions in 019_replslot_limit.pl

2022-03-27 Thread Tom Lane
Andres Freund  writes:
> I still feel like there's something off here. But that's probably not enough
> to keep causing failures. I'm inclined to leave the debugging in for a bit
> longer, but not fail the test anymore?

WFM.

> The way the temporary slot removal hangs for a while seems just off:

Perhaps, but right now we're causing noise in the buildfarm and
learning nothing.

regards, tom lane




Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Peter Geoghegan
On Sun, Mar 27, 2022 at 2:02 PM Robert Haas  wrote:
> On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan  wrote:
> > We're not dealing
> > with adversarial page images here.
>
> I think it's bad that we have to make that assumption, considering
> that there's nothing whatever to keep users from supplying arbitrary
> page images to pageinspect.

Maybe it isn't strictly necessary for bt_page_items(), but making that
level of guarantee is really hard, and not particularly useful. And
that's the easy case for pageinspect: gist_page_items() takes a raw
bytea, and puts it through the underlying types' output functions.

I think that it might actually be fundamentally impossible to
guarantee that that'll be safe, because we have no idea what the
output function might be doing. It's arbitrary user-defined code that
could easily be implemented in C. Combined with an arbitrary page
image.

> But I also agree that if we're unable or
> unwilling to make things perfect, it's still good to make them better.

I think that most of the functions can approach being perfectly
robust, with a little work. In practical terms they can almost
certainly be made so robust that no real user of bt_page_items() will
ever crash the server. Somebody that goes out of their way to do that
*might* find a way (even with the easier cases), but that doesn't
particularly concern me.

-- 
Peter Geoghegan




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-27 Thread Daniel Gustafsson
> On 27 Mar 2022, at 23:19, Tom Lane  wrote:

> This may be caused by 9ca234bae or 4a7e964fc.


I'd say 4a7e964fc is the culprit here.  From a quick skim, the
switch_server_cert() calls need to be changed along the lines of:

  from: switch_server_cert($node, 'server-ip-in-dnsname');
to: switch_server_cert($node, certfile => 'server-ip-in-dnsname');

There might be more changes required; that was the one that stood out.  Unless
someone beats me to it I'll take a look at fixing up the test in this patch
tomorrow.

--
Daniel Gustafsson   https://vmware.com/





Re: SQL/JSON: functions

2022-03-27 Thread Tom Lane
I wrote:
>> Andres Freund  writes:
>>> There's also
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jabiru&dt=2022-03-22%2022%3A25%3A26
>>> that started failing with
>>> ../../preproc/ecpg --regression -I./../../include -I. -o test1.c test1.pgc
>>> test1.pgc:12: ERROR: syntax error at or near "int"
>>> with this commit.

>> Yeah, I was just scratching my head about that.

This problem came back as soon as we de-reverted that patch :-(.
So much for my guess about unused rules.

What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
system here.  So there's something odd about jabiru's build
environment; but what?

regards, tom lane




Re: SQL/JSON: functions

2022-03-27 Thread Andrew Dunstan



> On Mar 27, 2022, at 7:14 PM, Tom Lane  wrote:
> 
> I wrote:
>>> Andres Freund  writes:
 There's also
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jabiru&dt=2022-03-22%2022%3A25%3A26
 that started failing with
 ../../preproc/ecpg --regression -I./../../include -I. -o test1.c test1.pgc
 test1.pgc:12: ERROR: syntax error at or near "int"
 with this commit.
> 
>>> Yeah, I was just scratching my head about that.
> 
> This problem came back as soon as we de-reverted that patch :-(.
> So much for my guess about unused rules.
> 
> What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
> system here.  So there's something odd about jabiru's build
> environment; but what?
> 

Very odd. How did it pick up just this commit and not the following one which 
was pushed at the same time?

Cheers

Andrew



Re: SQL/JSON: functions

2022-03-27 Thread Andres Freund
Hi,

On 2022-03-27 20:21:05 -0400, Andrew Dunstan wrote:
> Very odd. How did it pick up just this commit and not the following one which 
> was pushed at the same time?

The first recent failing run was with
f4fb45d15c Sun Mar 27 21:03:34 2022 UTC  SQL/JSON constructors
f79b803dcc Sun Mar 27 21:03:33 2022 UTC  Common SQL/JSON clauses
b64c3bd62e Sun Mar 27 20:26:40 2022 UTC  Remove more unused module imports from TAP tests

And then a second failure with:
cc7401d5ca Sun Mar 27 22:32:40 2022 UTC  Fix up compiler warnings/errors from f4fb45d15.

Which is what I'd expect?

Greetings,

Andres Freund




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-27 Thread Andres Freund
Hi,

On 2022-03-27 17:19:27 -0400, Tom Lane wrote:
> The cfbot doesn't provide a lot of evidence about
> why it's failing, but I applied the patchset locally and what
> I see is

FWIW - and I agree that's not nice user-interface-wise - just below the cpu /
memory graphs there's a "directory browser", allowing you to navigate to the
saved log files. Navigating to log/src/test/ssl/tmp_check/log allows you to
download
https://api.cirrus-ci.com/v1/artifact/task/5261015175659520/log/src/test/ssl/tmp_check/log/regress_log_001_ssltests
https://api.cirrus-ci.com/v1/artifact/task/5261015175659520/log/src/test/ssl/tmp_check/log/001_ssltests_primary.log

Greetings,

Andres Freund




Re: Corruption during WAL replay

2022-03-27 Thread Kyotaro Horiguchi
At Thu, 24 Mar 2022 15:33:29 -0400, Robert Haas  wrote 
in 
> On Thu, Mar 17, 2022 at 9:21 PM Kyotaro Horiguchi
>  wrote:
> > All versions pass check world.
> 
> Thanks, committed.

(I was overwhelmed by the flood of discussion that followed..)

Anyway, thanks for picking this up and committing!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-27 Thread Kyotaro Horiguchi
At Fri, 25 Mar 2022 13:26:05 +0100, Alvaro Herrera  
wrote in 
> On 2022-Mar-21, Alvaro Herrera wrote:
> 
> > I had a look at this latest version of the patch, and found some things
> > to tweak.  Attached is v21 with three main changes from Kyotaro's v20:
> 
> Pushed this, backpatching to 14 and 13.  It would have been good to
> backpatch further, but there's a (textually trivial) merge conflict
> related to commit e6d8069522c8.  Because that commit conceptually
> touches the same area that this bugfix is about, I'm not sure that
> backpatching further without a lot more thought is wise -- particularly
> so when there's no way to automate the test in branches older than
> master.

Thanks for committing.

> This is quite annoying, considering that the bug was reported shortly
> before 12 went into beta.

Sure.  I'm going to look into that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: SQL/JSON: functions

2022-03-27 Thread Andrew Dunstan


On 3/27/22 20:50, Andres Freund wrote:
> Hi,
>
> On 2022-03-27 20:21:05 -0400, Andrew Dunstan wrote:
>> Very odd. How did it pick up just this commit and not the following one 
>> which was pushed at the same time?
> The first recent failing run was with
> f4fb45d15c Sun Mar 27 21:03:34 2022 UTC  SQL/JSON constructors
> f79b803dcc Sun Mar 27 21:03:33 2022 UTC  Common SQL/JSON clauses
> b64c3bd62e Sun Mar 27 20:26:40 2022 UTC  Remove more unused module imports 
> from TAP tests
>
> And then a second failure with:
> cc7401d5ca Sun Mar 27 22:32:40 2022 UTC  Fix up compiler warnings/errors from 
> f4fb45d15.
>
> Which is what I'd expect?
>


Yes, sorry, the URL was for the previous commit, not today's. My mistake.


I'll look into it tomorrow; too tired tonight to be very productive.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-27 Thread Thomas Munro
On Mon, Mar 28, 2022 at 2:01 PM Kyotaro Horiguchi
 wrote:
> At Fri, 25 Mar 2022 13:26:05 +0100, Alvaro Herrera  
> wrote in
> > Pushed this, backpatching to 14 and 13.  It would have been good to
> > backpatch further, but there's a (textually trivial) merge conflict
> > related to commit e6d8069522c8.  Because that commit conceptually
> > touches the same area that this bugfix is about, I'm not sure that
> > backpatching further without a lot more thought is wise -- particularly
> > so when there's no way to automate the test in branches older than
> > master.

Just a thought:  we could consider back-patching
allow_in_place_tablespaces, after a little while, if we're happy with
how that is working out, if it'd be useful for verifying bug fixes in
back branches.  It's non-end-user-facing testing infrastructure.




RE: Logical replication timeout problem

2022-03-27 Thread kuroda.hay...@fujitsu.com
Dear Wang-san,

Thank you for updating!
...but it also cannot be applied to current HEAD
because of commit 923def9a533.

Your patch seems to conflict with the newly added argument of
logicalrep_write_insert().  That commit allows specifying which columns to
publish by skipping some columns in logicalrep_write_tuple(),
which is called from logicalrep_write_insert() and logicalrep_write_update().

Do we have to consider some special case for that?
I thought a timeout might occur if a user has a huge table and publishes only
a few columns, but that is a corner case.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-27 Thread Yugo NAGATA
On Sun, 27 Mar 2022 15:28:41 +0900 (JST)
Tatsuo Ishii  wrote:

> > This patch has caused the PDF documentation to fail to build cleanly:
> > 
> > [WARN] FOUserAgent - The contents of fo:block line 1 exceed the available 
> > area in the inline-progression direction by more than 50 points. (See 
> > position 125066:375)
> > 
> > It's complaining about this:
> > 
> > interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency { failures | serialization_failures deadlock_failures } [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]
> > 
> > which runs much too wide in HTML format too, even though that toolchain
> > doesn't tell you so.
> 
> Yeah.
> 
> > We could silence the warning by inserting an arbitrary line break or two,
> > or refactoring the syntax description into multiple parts.  Either way
> > seems to create a risk of confusion.
> 
> I think we can fold the line nicely. Here is the rendered image.
> 
> Before:
> interval_start num_transactions sum_latency sum_latency_2 min_latency 
> max_latency { failures | serialization_failures deadlock_failures } [ sum_lag 
> sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]
> 
> After:
> interval_start num_transactions sum_latency sum_latency_2 min_latency 
> max_latency
>   { failures | serialization_failures deadlock_failures } [ sum_lag sum_lag_2 
> min_lag max_lag [ skipped ] ] [ retried retries ]
> 
> Note that before it was like this:
> 
> interval_start num_transactions​ sum_latency sum_latency_2 min_latency 
> max_latency​ [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ]
> 
> So newly added items are "{ failures | serialization_failures 
> deadlock_failures }" and " [ retried retries ]".
> 
> > TBH, I think the *real* problem is that the complexity of this log format
> > has blown past "out of hand".  Can't we simplify it?  Who is really going
> > to use all these numbers?  I pity the poor sucker who tries to write a
> > log analysis tool that will handle all the variants.
> 
> Well, the extra logging items above only appear when the retry feature
> is enabled. For those who do not use the feature, the only new logging
> item is "failures". For those who use the feature, the extra logging
> items are apparently necessary. For example, if we write an application
> using the repeatable read or serializable transaction isolation mode,
> retrying transactions that failed due to a serialization error is an
> essential technique. Also, the retry rate of transactions will deeply
> affect performance, and in such use cases the newly added items will be
> precious information. I would suggest leaving the log items as they are.
> 
> Patch attached.

Even applying this patch, "make postgres-A4.pdf" raises the warning on my
machine. After some investigation, I found that the previous document had a
break after 'num_transactions', but it has been removed due to this commit.
So, I would like to get this back as it was. I attached the patch.

Regards,
Yugo Nagata


-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ebdb4b3f46..4437d5ef53 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -2401,7 +2401,7 @@ END;
format is used for the log files:
 
 
-interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency { failures | serialization_failures deadlock_failures }  sum_lag sum_lag_2 min_lag max_lag  skippedretried retries 
+interval_start num_transactions&zwsp sum_latency sum_latency_2 min_latency max_latency { failures | serialization_failures deadlock_failures }  sum_lag sum_lag_2 min_lag max_lag  skippedretried retries 
 
 
where


Re: pg_stat_get_replication_slot() marked not strict, crashes

2022-03-27 Thread Amit Kapila
On Sun, Mar 27, 2022 at 11:59 AM vignesh C  wrote:
>
> On Sun, Mar 27, 2022 at 2:54 AM Andres Freund  wrote:
> >
> > Hi,
> >
> > I'm working to increase the test coverage of pgstat-related stuff (for
> > the shared memory stats patch, of course).
> >
> > "Accidentally" noticed that
> >   SELECT * FROM pg_stat_get_replication_slot(NULL);
> > crashes.  This is present in HEAD and 14.
> >
> > I guess we'll have to add a code-level check in 14 to deal with this?
>
> This problem is reproducible in both PG14 & HEAD; changing isstrict
> solves the problem. In PG14, should we also add a check in
> pg_stat_get_replication_slot so that it solves the problem for
> existing users who have already installed PG14, or will this be handled
> automatically when upgrading to the new version?
>

I am not sure if for 14 we can make a catalog change as that would
require a catversion bump, so adding a code-level check as suggested by
Andres seems like a better option. Andres/Tom, any better ideas for
this?

Thanks for the patch, but for HEAD we also need handling and a test for
pg_stat_get_subscription_stats. Considering this for HEAD, we can mark
both pg_stat_get_replication_slot and pg_stat_get_subscription_stats
as strict, and in 14 we need to add a code-level check for
pg_stat_get_replication_slot.
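
As a side note, the current marking is visible in the catalogs; a quick
sanity query (not part of the fix itself):

    SELECT proname, proisstrict
    FROM pg_proc
    WHERE proname IN ('pg_stat_get_replication_slot',
                      'pg_stat_get_subscription_stats');

Marking a function strict makes the fmgr return NULL for a NULL input
without calling the C function at all, which is why the crash disappears.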

-- 
With Regards,
Amit Kapila.




Fix pg_waldump documentation about block option

2022-03-27 Thread Japin Li

Hi,

When I read the documentation of pg_waldump, I found that the description of
the -B option is wrong.

   -B block
   --block=block
   Only display records that modify the given block. The relation must 
also be provided with --relation or -l.

Before 52b5568, the -l option was short for --relation; however, it has been
changed to -R, and we forgot to update the documentation.

Here is a patch for it.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 4e86f44c5b..1a05af5d97 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -85,7 +85,7 @@ PostgreSQL documentation

 Only display records that modify the given block.  The relation must
 also be provided with --relation or
--l.
+-R.

   
  


Re: A test for replay of regression tests

2022-03-27 Thread Thomas Munro
cfbot found another source of nondeterminism in the regression tests,
due to the smaller shared_buffers used in this TAP test:

https://cirrus-ci.com/task/4611828654276608
https://api.cirrus-ci.com/v1/artifact/task/4611828654276608/log/src/test/recovery/tmp_check/regression.diffs

It turned out that we had already diagnosed this once before, when the tiny
buildfarm animal chipmunk reported the same thing, but we didn't commit a
fix:

https://www.postgresql.org/message-id/flat/CA%2BhUKGLTK6ZuEkpeJ05-MEmvmgZveCh%2B_w013m7%2ByKWFSmRcDA%40mail.gmail.com




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-27 Thread David G. Johnston
On Sun, Mar 27, 2022 at 11:17 AM James Coleman  wrote:

> Hmm,  I didn't realize that was project policy,


"Guideline" or "rule of thumb" is probably a better concept.


> but I'm a bit
> surprised given that the sentence which 0001 replaces seems like a
> direct violation of that also: "In neither case is a rewrite of the
> table required."
>
>
IMO this is mostly due to the structuring of the paragraphs; something like
the following makes it less problematic (and, as shown below, may be
sufficient to address the purpose of this patch):

"""
[...]
The following alterations of the table require the entire table, and/or its
indexes, to be rewritten, which may take a significant amount of time for a
large table and will temporarily require as much as double the disk space.

Changing the type of an existing column will require the entire table and
its indexes to be rewritten. As an exception, if the USING clause does not
change the column contents and the old type is either binary coercible to
the new type, or an unconstrained domain over the new type, a table rewrite
is not needed; but any indexes on the affected columns must still be
rewritten.

Adding a column with a volatile DEFAULT also requires the entire table and
its indexes to be rewritten.

The reason a non-volatile (or absent) DEFAULT does not require a rewrite of
the table is that the DEFAULT expression (or NULL) is evaluated at the time
of the statement and the result is stored in the table's metadata.

The following alterations of the table require that it be scanned in its
entirety to ensure that no existing values are contrary to the new
constraints placed on the table.  Constraints backed by indexes will scan
the table as a side-effect of populating the new index with data.

Adding a CHECK constraint requires scanning the table to verify that
existing rows meet the constraint.  The same goes for adding a NOT NULL
constraint to an existing column.

A newly attached partition requires scanning the table to verify that
existing rows meet the partition constraint.

A foreign key constraint requires scanning the table to verify that all
existing values exist on the referenced table.

The main reason for providing the option to specify multiple changes in a
single ALTER TABLE is that multiple table scans or rewrites can thereby be
combined into a single pass over the table.

Scanning a large table to verify a new constraint can take a long time, and
other updates to the table are locked out until the ALTER TABLE ADD
CONSTRAINT command is committed. For CHECK and FOREIGN KEY constraints
there is an option, NOT VALID, that reduces the impact of adding a
constraint on concurrent updates. With NOT VALID, the ADD CONSTRAINT
command does not scan the table and can be committed immediately. After
that, a VALIDATE CONSTRAINT command can be issued to verify that existing
rows satisfy the constraint. The validation step does not need to lock out
concurrent updates, since it knows that other transactions will be
enforcing the constraint for rows that they insert or update; only
pre-existing rows need to be checked. Hence, validation acquires only a
SHARE UPDATE EXCLUSIVE lock on the table being altered. (If the constraint
is a foreign key then a ROW SHARE lock is also required on the table
referenced by the constraint.) In addition to improving concurrency, it can
be useful to use NOT VALID and VALIDATE CONSTRAINT in cases where the table
is known to contain pre-existing violations. Once the constraint is in
place, no new violations can be inserted, and the existing problems can be
corrected at leisure until VALIDATE CONSTRAINT finally succeeds.

The DROP COLUMN form does not physically remove the column, but simply
makes it invisible to SQL operations. Subsequent insert and update
operations in the table will store a null value for the column. Thus,
dropping a column is quick but it will not immediately reduce the on-disk
size of your table, as the space occupied by the dropped column is not
reclaimed. The space will be reclaimed over time as existing rows are
updated.

To force immediate reclamation of space occupied by a dropped column, you
can execute one of the forms of ALTER TABLE that performs a rewrite of the
whole table. This results in reconstructing each row with the dropped
column replaced by a null value.

The rewriting forms of ALTER TABLE are not MVCC-safe. After a table
rewrite, the table will appear empty to concurrent transactions, if they
are using a snapshot taken before the rewrite occurred. See Section 13.5
for more details.
[...]
"""

I'm liking the idea of breaking out multiple features into their own
sentences or paragraphs instead of saying:

"Adding a column with a volatile DEFAULT or changing the type of an
existing column"

"Adding a CHECK or NOT NULL constraint"

This latter combination probably doesn't catch my attention except for this
discussion and the fact that there are multiple ways to add these
constraints and we might as well be clear 

Re: Add pg_freespacemap extension sql test

2022-03-27 Thread Michael Paquier
On Sun, Mar 27, 2022 at 01:18:46PM -0400, Tom Lane wrote:
> skink has passed several runs since the commit went in, so it's
> "unstable" not "fails consistently".  I see the test tries to
> disable autovacuum on that table, so that doesn't seem to be
> the problem ... what is?

This is a race condition, not directly related to valgrind but easier to
trigger under it because things get slower.  It took me a dozen tries to
reproduce the failure locally, but I can do so with valgrind enabled.

So, the output of the test is simply telling us that the FSM of the
main table is not getting truncated.  From what I can see, the
difference is in should_attempt_truncation(), where we finish with
nonempty_pages set to 1 rather than 0 on failure.  And it just takes
one autovacuum to run in parallel with the manual VACUUM after the
DELETE to prevent the removal of those tuples, which is what I can see
from the logs on failure:
LOG:  statement: DELETE FROM freespace_tab;
DEBUG: autovacuum: processing database "contrib_regression"
LOG:  statement: VACUUM freespace_tab;

It seems to me here that the snapshot held by autovacuum during the
scan of pg_database to find the relations to process is enough to
prevent the FSM truncation, as the tuples cleaned up by the DELETE
query still need to be visible.  One simple way to keep this test
would be a custom configuration file with autovacuum disabled and
NO_INSTALLCHECK.  Any better ideas?
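
Concretely, that would be a one-line configuration file loaded with
--temp-config, plus NO_INSTALLCHECK in the module's Makefile, similar to
what pg_stat_statements already does (file name illustrative):

    # contrib/pg_freespacemap/pg_freespacemap.conf
    autovacuum = off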
--
Michael


signature.asc
Description: PGP signature


Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

2022-03-27 Thread Julien Rouhaud
On Sun, Mar 27, 2022 at 11:53:58AM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > [ v4-0001-Add-a-pg_get_query_def-wrapper-around-get_query_d.patch ]
>
> This seems about ready to go to me, except for
>
> (1) as an exported API, pg_get_querydef needs a full API spec in its
> header comment.  "Read the code to figure out what to do" is not OK
> in my book.

Fixed.

> (2) I don't think this has been thought out too well:
>
> > I also kept the wrapColumn and startIndent as they
> > can be easily used by callers.
>
> The appropriate values for these are determined by macros that are
> local in ruleutils.c, so it's not that "easy" for outside callers
> to conform to standard practice.  I suppose we could move
> WRAP_COLUMN_DEFAULT etc into ruleutils.h, but is there actually a
> use-case for messing with those?

As far as I can see, the wrapColumn and startIndent are independent of the
pretty flags, and don't really have magic numbers for those
(WRAP_COLUMN_DEFAULT sure exists, but the value isn't really surprising).

> I don't see any other exported
> functions that go beyond offering a "bool pretty" option, so
> I think it might be a mistake for this one to be different.

There's the SQL function pg_get_viewdef_wrap() that accepts a custom wrapColumn.

That being said, I'm totally OK with just exposing a "pretty" parameter and
using WRAP_COLUMN_DEFAULT.  In any case I agree that exposing startIndent doesn't
serve any purpose.

I'm attaching a v5 with hopefully a better comment for the function, and only
the "pretty" parameter.
From 6497eb63cecdccc2669ec4ff31ab24f6f3e1 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Tue, 29 Jun 2021 00:07:04 +0800
Subject: [PATCH v5] Add a pg_get_query_def() wrapper around get_query_def().

This function can be useful for external modules, for instance if they want to
display a statement after the rewrite stage.

In order to emit valid SQL, make sure that any subquery RTE comes with an
alias.  This is always the case for user facing queries, as the parser will
refuse a subquery without an alias, but that may not be the case for a Query
after rewriting for instance.

Catversion is bumped.

Author: Julien Rouhaud
Reviewed-by: Pavel Stehule
Discussion: https://postgr.es/m/20210627041138.zklczwmu3ms4ufnk@nol
---
 src/backend/utils/adt/ruleutils.c | 55 +--
 src/include/utils/ruleutils.h |  1 +
 2 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index df5c486501..60fb167067 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -90,6 +90,8 @@
 #define PRETTYFLAG_INDENT  0x0002
 #define PRETTYFLAG_SCHEMA  0x0004
 
+#define GET_PRETTY_FLAGS(pretty) ((pretty) ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT)
+
 /* Default line length for pretty-print wrapping: 0 means wrap always */
 #define WRAP_COLUMN_DEFAULT0
 
@@ -532,7 +534,7 @@ pg_get_ruledef_ext(PG_FUNCTION_ARGS)
int prettyFlags;
char   *res;
 
-   prettyFlags = pretty ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT;
+   prettyFlags = GET_PRETTY_FLAGS(pretty);
 
res = pg_get_ruledef_worker(ruleoid, prettyFlags);
 
@@ -653,7 +655,7 @@ pg_get_viewdef_ext(PG_FUNCTION_ARGS)
int prettyFlags;
char   *res;
 
-   prettyFlags = pretty ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT;
+   prettyFlags = GET_PRETTY_FLAGS(pretty);
 
res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
 
@@ -673,7 +675,7 @@ pg_get_viewdef_wrap(PG_FUNCTION_ARGS)
char   *res;
 
/* calling this implies we want pretty printing */
-   prettyFlags = PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA;
+   prettyFlags = GET_PRETTY_FLAGS(true);
 
res = pg_get_viewdef_worker(viewoid, prettyFlags, wrap);
 
@@ -719,7 +721,7 @@ pg_get_viewdef_name_ext(PG_FUNCTION_ARGS)
Oid viewoid;
char   *res;
 
-   prettyFlags = pretty ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT;
+   prettyFlags = GET_PRETTY_FLAGS(pretty);
 
/* Look up view name.  Can't lock it - we might not have privileges. */
viewrel = makeRangeVarFromNameList(textToQualifiedNameList(viewname));
@@ -1062,7 +1064,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
context.windowClause = NIL;
context.windowTList = NIL;
context.varprefix = true;
-   context.prettyFlags = pretty ? (PRETTYFLAG_PAREN | PRETTYFLAG_INDENT | PRETTYFLAG_SCHEMA) : PRETTYFLAG_INDENT;
+   context.prettyFlags = GET_PRETTY_FLAGS(pretty);
context.wrapColumn = WRAP_COLUMN_DEFAULT;

Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-27 Thread Tatsuo Ishii
> > Even applying this patch, "make postgres-A4.pdf" raises the warning on my
> > machine. After some investigation, I found that the previous document had a break
> > after 'num_transactions', but it has been removed due to this commit.

Yes, your patch removed "&zwsp;".

> So,
> I would like to get this back as it was. I attached the patch.

This produces errors. Needs ";" postfix?

ref/pgbench.sgml:2404: parser error : EntityRef: expecting ';'
le>interval_start num_transactions&zwsp
   ^
ref/pgbench.sgml:2781: parser error : chunk is not well balanced

^
reference.sgml:251: parser error : Failure to process entity pgbench
   &pgbench;
^
reference.sgml:251: parser error : Entity 'pgbench' not defined
   &pgbench;
^
reference.sgml:296: parser error : chunk is not well balanced

^
postgres.sgml:240: parser error : Failure to process entity reference
 &reference;
^
postgres.sgml:240: parser error : Entity 'reference' not defined
 &reference;
^
make: *** [Makefile:135: html-stamp] エラー 1

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-27 Thread Peter Geoghegan
On Thu, Mar 24, 2022 at 2:40 PM Peter Geoghegan  wrote:
> > > This is absolutely mandatory in the aggressive case, because otherwise
> > > relfrozenxid advancement might be seen as unsafe. My observation is:
> > > Why should we accept the same race in the non-aggressive case? Why not
> > > do essentially the same thing in every VACUUM?
> >
> > Sure, that seems like a good idea. I think I basically agree with the
> > goals of the patch.
>
> Great.

Attached is v12. My current goal is to commit all 3 patches before
feature freeze. Note that this does not include the more complicated
patch included with previous revisions of the patch series (the
page-level freezing work that appeared in versions before v11).

Changes that appear in this new revision, v12:

* Reworking of the commit messages based on feedback from Robert.

* General cleanup of the changes to heapam.c from 0001 (the changes to
heap_prepare_freeze_tuple and related functions).  New and existing
code now fits together a bit better. I also added a couple of new
documenting assertions, to make the flow a bit easier to understand.

* Added new assertions that document
OldestXmin/FreezeLimit/relfrozenxid invariants, right at the point we
update pg_class within vacuumlazy.c.

These assertions would have a decent chance of failing if there were
any bugs in the code.

* Removed patch that made DISABLE_PAGE_SKIPPING not force aggressive
VACUUM, limiting the underlying mechanism to forcing scanning of all
pages in lazy_scan_heap (v11 was the first and last revision that
included this patch).

* Adds a new small patch 0003. This just moves the last piece of
resource allocation that still took place at the top of
lazy_scan_heap() back into its caller, heap_vacuum_rel().

The work in 0003 probably should have happened as part of the patch
that became commit 73f6ec3d -- same idea. It's totally mechanical
stuff. With 0002 and 0003, there is hardly any lazy_scan_heap code
before the main loop that iterates through blocks in rel_pages (and
the code that's still there is obviously related to the loop in a
direct and obvious way). This seems like a big overall improvement in
maintainability.

Didn't see a way to split up 0002, per Robert's suggestion 3 days ago.
As I said at the time, it's possible to split it up, but not in a way
that highlights the underlying issue (since the issue 0002 fixes was
always limited to non-aggressive VACUUMs). The commit message may have
to suffice.

--
Peter Geoghegan


v12-0001-Set-relfrozenxid-to-oldest-extant-XID-seen-by-VA.patch
Description: Binary data


v12-0003-vacuumlazy.c-Move-resource-allocation-to-heap_va.patch
Description: Binary data


v12-0002-Generalize-how-VACUUM-skips-all-frozen-pages.patch
Description: Binary data


Re: Fix pg_waldump documentation about block option

2022-03-27 Thread Thomas Munro
On Mon, Mar 28, 2022 at 4:02 PM Japin Li  wrote:
> Before 52b5568, the -l option is short for --relation, however, it has been
> changed to -R, and we forgot to update the documentation.
>
> Here is a patch for it.

Pushed.  Thanks!




Re: Invalid comment in ParallelQueryMain

2022-03-27 Thread Julien Rouhaud
Hi,

On Sun, Mar 27, 2022 at 04:13:00PM +0900, Michael Paquier wrote:
> On Sat, Mar 26, 2022 at 05:12:05PM +0100, Matthias van de Meent wrote:
> > On Sat, 26 Mar 2022 at 17:01, Julien Rouhaud  wrote:
> >> While reading code around I just noticed that I failed to adapt a comment a
> >> couple of lines above a removed line in 0f61727b75b9.  Patch attached.
> > 
> > +1, seems OK to me.
> 
> Yep.  Will fix.

For the archive's sake, this has been pushed as of
411b91360f2711e36782b68cd0c9bc6de44d3384.

Thanks!




Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

2022-03-27 Thread Yugo NAGATA
On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane  wrote:

> Fabien COELHO  writes:
> >> [...] One way to avoid these errors is to send Parse messages before
> >> pipeline mode starts. I attached a patch to fix this by preparing commands
> >> at the start of a script instead of at the first execution of a command.
> 
> > ISTM that moving prepare out of command execution is a good idea, so I'm 
> > in favor of this approach: the code is simpler and cleaner.
> > ISTM that a minor impact is that the preparation is not counted in the 
> > command performance statistics. I do not think that it is a problem, even 
> > if it would change detailed results under -C -r -M prepared.
> 
> I am not convinced this is a great idea.  The current behavior is that
> a statement is not prepared until it's about to be executed, and I think
> we chose that deliberately to avoid semantic differences between prepared
> and not-prepared mode.  For example, if a script looks like
> 
> CREATE FUNCTION foo(...) ...;
> SELECT foo(...);
> DROP FUNCTION foo;
> 
> trying to prepare the SELECT in advance would lead to failure.
>
> We could perhaps get away with preparing the commands within a pipeline
> just before we start to execute the pipeline, but it looks to me like
> this patch tries to prepare the entire script in advance.
> 
Well, the semantic difference already exists in the current behavior.
Currently, pgbench fails to execute the above script in prepared mode
because it prepares the entire script in advance just before the first
command execution. This patch does not change the semantics.
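
As a concrete instance of the script shape above (function name and body
are illustrative), the following already fails under -M prepared for
exactly that reason:

    CREATE FUNCTION foo(int) RETURNS int AS 'SELECT $1 + 1' LANGUAGE SQL;
    SELECT foo(1);
    DROP FUNCTION foo(int);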

> BTW, the cfbot says the patch is failing to apply anyway ...
> I think it was sideswiped by 4a39f87ac.

I attached the rebased patch.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index acf3e56413..bf8fecf219 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3074,6 +3074,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+		commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -3107,42 +3131,6 @@ sendCommand(CState *st, Command *command)
 		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-PGresult   *res;
-char		name[MAX_PREPARE_NAME];
-
-if (commands[j]->type != SQL_COMMAND)
-	continue;
-preparedStatementName(name, st->use_file, j);
-if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-{
-	res = PQprepare(st->con, name,
-	commands[j]->argv[0], commands[j]->argc - 1, NULL);
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
-		pg_log_error("%s", PQerrorMessage(st->con));
-	PQclear(res);
-}
-else
-{
-	/*
-	 * In pipeline mode, we use asynchronous functions. If a
-	 * server-side error occurs, it will be processed later
-	 * among the other results.
-	 */
-	if (!PQsendPrepare(st->con, name,
-	   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-		pg_log_error("%s", PQerrorMessage(st->con));
-}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
 		getQueryParams(&st->variables, command, params);
 		preparedStatementName(name, st->use_file, st->command);
 
@@ -3619,6 +3607,20 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 	memset(st->prepared, 0, sizeof(st->prepared));
 }
 
+/*
+ * Prepare SQL commands if needed.
+ *
+ * We should send Parse messages before executing meta commands,
+ * especially \startpipeline. If a Parse message is sent in
+ * pipeline mode, a transaction starts before BEGIN is sent, which
+ * could be a problem. For example, if "BEGIN ISOLATION LEVEL
+ * SERIALIZABLE" is sent after a transaction has started, the error
+ * "ERROR:  SET TRANSACTION ISOLATION LEVEL must be called before any query"
+ * occurs.
+ */
+if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+	prepareCommands(st);
+
 /*
  * It is the first try to run this transaction. Remember the
  * random state: maybe it will get an error and we will need to
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench

Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Michael Paquier
On Sun, Mar 27, 2022 at 02:36:54PM -0700, Peter Geoghegan wrote:
> I think that it might actually be fundamentally impossible to
> guarantee that that'll be safe, because we have no idea what the
> output function might be doing. It's arbitrary user-defined code that
> could easily be implemented in C. Combined with an arbitrary page
> image.

My guess is that you'd bring in a lot more safety if we completely cut
the capacity to fetch and pass down raw pages across the SQL calls,
because you could add checks for the wanted AM, replacing all that with
a (regclass,blkno) pair.  I am however ready to buy that this may not
be worth rewriting 10~15 years after the fact.

> I think that most of the functions can approach being perfectly
> robust, with a little work. In practical terms they can almost
> certainly be made so robust that no real user of bt_page_items() will
> ever crash the server. Somebody that goes out of their way to do that
> *might* find a way (even with the easier cases), but that doesn't
> particularly concern me.

That does not concern me either, and my own take is to treat as
realistic things that can be fetched from a get_raw_page() call rather
than a specifically crafted bytea.  Now, I have found myself in cases
where it was useful to see the contents of a page, as long as you can
go through the initial checks, particularly in cases where corruption
involved unaligned contents.  Triggering an error on a sanity check
for a specific block may be fine; now, I have also found myself doing
some full scans to see in one shot the extent of a damaged relation
file using the functions of pageinspect.  Fun times.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-27 Thread Yugo NAGATA
On Mon, 28 Mar 2022 12:17:13 +0900 (JST)
Tatsuo Ishii  wrote:

> > Even applying this patch, "make postgres-A4.pdf" raises the warning on my
> > machine. After some investigation, I found that the previous document had
> > a break after 'num_transactions', but it has been removed due to this
> > commit.
> 
> Yes, your patch removed "&zwsp;".
> 
> > So,
> > I would like to get this back as it was. I attached the patch.
> 
> This produces errors. Needs ";" postfix?

Oops. Yes, it needs ';'. Also, I found another "&zwsp;" dropped.
I attached the fixed patch.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ebdb4b3f46..b16a5b9b7b 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -2401,7 +2401,7 @@ END;
format is used for the log files:
 
 
-interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency { failures | serialization_failures deadlock_failures }  sum_lag sum_lag_2 min_lag max_lag  skippedretried retries 
+interval_start num_transactions&zwsp; sum_latency sum_latency_2 min_latency max_latency&zwsp; { failures | serialization_failures deadlock_failures }  sum_lag sum_lag_2 min_lag max_lag  skippedretried retries 
 
 
where


Re: pg_stat_get_replication_slot() marked not strict, crashes

2022-03-27 Thread Andres Freund
Hi,

On 2022-03-28 08:28:29 +0530, Amit Kapila wrote:
> I am not sure if for 14 we can make a catalog change as that would
> require a catversion bump, so adding a code-level check as suggested by
> Andres seems like a better option. Andres/Tom, any better ideas for
> this?

I think we could do the catalog change too, so that future initdbs are marked
correctly. But we obviously do need the code-level check nevertheless.


> Thanks for the patch but for HEAD, we also need handling and test for
> pg_stat_get_subscription_stats. Considering this for HEAD, we can mark
> both pg_stat_get_replication_slot and pg_stat_get_subscription_stats
> as strict and in 14, we need to add a code-level check for
> pg_stat_get_replication_slot.

FWIW, I have a test for both, I was a bit "stuck" on where to put the
pg_stat_get_subscription_stats(NULL) test. I had put the
pg_stat_get_replication_slot(NULL) in contrib/test_decoding/sql/stats.sql
but pg_stat_get_subscription_stats() doesn't really fit there.  I think I'm
coming down to putting a section of such tests into 
src/test/regress/sql/stats.sql
instead, in the hope of preventing future such occurrences by encouraging
people to copy the test...

Greetings,

Andres Freund




Re: pg_stat_get_replication_slot() marked not strict, crashes

2022-03-27 Thread Tom Lane
Andres Freund  writes:
> On 2022-03-28 08:28:29 +0530, Amit Kapila wrote:
>> I am not sure if for 14 we can make a catalog change as that would
>> require a catversion bump, so adding a code-level check as suggested by
>> Andres seems like a better option. Andres/Tom, any better ideas for
>> this?

> I think we could do the catalog change too, so that future initdbs are marked
> correctly. But we obviously do need the code-level check nevertheless.

Yeah.  We have to install the C-level check, so I don't see any
point in changing the catalogs in back branches.  That'll create
confusion while not saving anything.

regards, tom lane




Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-03-27 Thread Michael Paquier
On Wed, Feb 16, 2022 at 10:24:58PM +0100, Matthias van de Meent wrote:
> A first good reason to do this is preventing further damage when a
> page is corrupted: if I can somehow overwrite pd_special,
> non-assert-enabled builds would start reading and writing at arbitrary
> offsets from the page pointer, quite possibly in subsequent buffers
> (or worse, on the stack, in case of stack-allocated blocks).

Well, pageinspect has proved to be rather careless in this area for a
couple of years, but this just came from the fact that we called
PageGetSpecialPointer() before checking that PageGetSpecialSize() matches
the MAXALIGN'd size of the opaque area, if the related AM has one.

I am not sure that we need to worry about corruption, as data
checksums would offer protection for most cases users usually need to
worry about, with the exception of pages corrupted in memory after
having been read into PG shared buffers from disk, or even in the
OS cache.

> A second reason would be less indirection to get to the opaque
> pointer. This should improve performance a bit in those cases where we
> (initially) only want to access the [Index]PageOpaque struct.

We don't have many cases that do that, do we?

> Lastly, 0002 and 0003 remove some repetitive tasks of dealing with the
> [nbtree, hash] opaques by providing a typed accessor macro similar to
> what is used in the GIN and (SP-)GIST index methods; improving the
> legibility of the code and decreasing the churn.

+#define PageGetSpecialOpaque(page, OpaqueDataTyp) \
+( \
+   AssertMacro(PageGetPageSize(page) == BLCKSZ && \
+   PageGetSpecialSize(page) == MAXALIGN(sizeof(OpaqueDataTyp))), \
+   (OpaqueDataTyp *) ((char *) (page) + (BLCKSZ - MAXALIGN(sizeof(OpaqueDataTyp)))) \
+)
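
If I read it correctly, a caller would then replace the usual two-step
pattern with something like this (nbtree used as an example):

    /* today: */
    BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    /* with the proposed macro: */
    BTPageOpaque opaque = PageGetSpecialOpaque(page, BTPageOpaqueData);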

Did you notice PageValidateSpecialPointer(), which mentions MSVC?  Could
this stuff be a problem if it is not an inline function?

Perhaps you should do a s/OpaqueDataTyp/OpaqueDataType/.

As a whole, this patch set looks like an improvement in terms of
checks and consistency regarding the special area handling across the
different AMs in the backend code, while reducing the MAXALIGN-ism on
all those sizes, which tends to be missed.  Any other opinions?
--
Michael


signature.asc
Description: PGP signature


Re: pg_stat_get_replication_slot() marked not strict, crashes

2022-03-27 Thread Masahiko Sawada
On Sun, Mar 27, 2022 at 6:52 AM Andres Freund  wrote:
>
> Hi,
>
> On 2022-03-26 17:41:53 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > I wonder if we ought to make PG_GETARG_DATUM(n) assert that 
> > > !PG_ARGISNULL(n)?
> > > That'd perhaps make it easier to catch some of these...
> >
> > Don't see the point; such cases will crash just fine without any
> > assert.  The problem is lack of test coverage ...
>
> Not reliably. Byval types typically won't crash, just do something
> bogus. As e.g. in the case of pg_stat_get_subscription_stats(NULL) I found to
> also be wrong upthread.

Right. But it seems like we cannot simply add PG_ARGISNULL() to
PG_GETARG_DATUM(). There is some code, such as array_remove() and
array_replace(), that calls PG_GETARG_DATUM() and PG_ARGISNULL() and
passes these values to functions that do the is-null check themselves.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: pg_stat_get_replication_slot() marked not strict, crashes

2022-03-27 Thread Andres Freund
Hi,

On 2022-03-27 21:09:29 -0700, Andres Freund wrote:
> FWIW, I have a test for both, I was a bit "stuck" on where to put the
> pg_stat_get_subscription_stats(NULL) test. I had put the
> pg_stat_get_replication_slot(NULL) in contrib/test_decoding/sql/stats.sql
> but pg_stat_get_subscription_stats() doesn't really fit there.  I think I'm
> coming down to putting a section of such tests into 
> src/test/regress/sql/stats.sql
> instead, in the hope of preventing future such occurrences by encouraging
> people to copy the test...

Pushed with tests there.

Vignesh, thanks for the patches! I already had something locally, should have
mentioned that...

Greetings,

Andres Freund




Re: logical decoding and replication of sequences

2022-03-27 Thread Amit Kapila
On Sat, Mar 26, 2022 at 3:26 PM Tomas Vondra
 wrote:
>
> On 3/26/22 08:28, Amit Kapila wrote:
> >>
> >> 2) The transaction handling is a bit confusing. The non-transactional
> >> increments won't have any explicit commit later, so we can't just rely
> >> on begin_replication_step/end_replication_step. But I want to try
> >> spending a bit more time on this.
> >>
> >
> > I didn't understand what you want to say in point (2).
> >
>
> My point is that handle_apply_sequence() either needs to use the same
> transaction handling as other apply methods, or start (and commit) a
> separate transaction for the "transactional" case.
>
> Which means we can't use the begin_replication_step/end_replication_step
>

We already call CommitTransactionCommand after end_replication_step in
a few places in that file, so as there is no explicit commit in the
non-transactional case, we can probably call CommitTransactionCommand
for it as well.
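
Roughly like this, as a sketch of the idea only (not the final shape of
the code):

    /* non-transactional sequence change: no remote commit will follow */
    begin_replication_step();
    /* ... apply the sequence update ... */
    end_replication_step();
    CommitTransactionCommand();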

> and the current code seems a bit complex. And I'm not sure it's quite
> correct. So this place needs more work.
>

Agreed.

>
> >> For a while I was thinking that maybe this means we don't need the
> >> transactional behavior at all, but I think we do - we have to handle
> >> ALTER SEQUENCE cases that are transactional.
> >>
> >
> > I need some time to think about this.
>

While thinking about this, I think I see a problem with the
non-transactional handling of sequences. It seems that we will skip
sending a non-transactional sequence change if it occurs before the
decoding has reached a consistent point but the surrounding commit
occurs after a consistent point is reached. In such cases, the
corresponding DMLs like inserts will be sent but sequence changes
won't be sent. For example (this scenario is based on
twophase_snapshot.spec),

Initial setup:
==
Create table t1_seq(c1 int);
Create Sequence seq1;

Test Execution via multiple sessions (this test allows insert in
session-2 to happen before we reach a consistent point and commit
happens after a consistent point):
===

Session-2:
Begin;
SELECT pg_current_xact_id();

Session-1:
SELECT 'init' FROM pg_create_logical_replication_slot('test_slot',
'test_decoding', false, true);

Session-3:
Begin;
SELECT pg_current_xact_id();

Session-2:
Commit;
Begin;
INSERT INTO t1_seq SELECT nextval('seq1') FROM generate_series(1,100);

Session-3:
Commit;

Session-2:
Commit 'foo'

Session-1:
SELECT data  FROM pg_logical_slot_get_changes('test_slot', NULL, NULL,
'include-xids', 'false', 'skip-empty-xacts', '1');

 data
--
 BEGIN
 table public.t1_seq: INSERT: c1[integer]:1
 table public.t1_seq: INSERT: c1[integer]:2
 table public.t1_seq: INSERT: c1[integer]:3
 table public.t1_seq: INSERT: c1[integer]:4
 table public.t1_seq: INSERT: c1[integer]:5
 table public.t1_seq: INSERT: c1[integer]:6


Now, if we normally try to decode such an insert, the result would be
something like:
 data
--
 sequence public.seq1: transactional:0 last_value: 33 log_cnt: 0 is_called:1
 sequence public.seq1: transactional:0 last_value: 66 log_cnt: 0 is_called:1
 sequence public.seq1: transactional:0 last_value: 99 log_cnt: 0 is_called:1
 sequence public.seq1: transactional:0 last_value: 132 log_cnt: 0 is_called:1
 BEGIN
 table public.t1_seq: INSERT: c1[integer]:1
 table public.t1_seq: INSERT: c1[integer]:2
 table public.t1_seq: INSERT: c1[integer]:3
 table public.t1_seq: INSERT: c1[integer]:4
 table public.t1_seq: INSERT: c1[integer]:5
 table public.t1_seq: INSERT: c1[integer]:6

This will create an inconsistent replica as sequence changes won't be
replicated. I thought about changing the snapshot handling of
non-transactional sequence changes to be similar to that of transactional
ones, but that also won't work because it is only at commit time that we
decide whether we can send the changes.

For the transactional case, as we are considering the create sequence
operation as transactional, we would unnecessarily queue such changes
even though that is not required. Basically, they don't need to be
considered transactional and we can simply ignore such messages like
other DDLs. But for that we probably need to distinguish the Alter/Create
cases, which may or may not be straightforward. Now, queuing them is
probably harmless unless it causes the transaction to spill/stream.

I still haven't thought completely through the cases where a mix of
transactional and non-transactional changes occurs in the same
transaction, as I think it somewhat depends on what we want to do about
the above cases.

> > At all places, it is mentioned
> > as creating a sequence for transactional cases which at the very least
> > need some tweak.
> >
>
> Which places?
>

In comments like:
a. When decoding sequences, we differentiate between sequences created
in a (running) transaction and seq

Re: Race conditions in 019_replslot_limit.pl

2022-03-27 Thread Andres Freund
Hi,

On 2022-03-27 17:36:14 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I still feel like there's something off here. But that's probably not enough
> > to keep causing failures. I'm inclined to leave the debugging in for a bit
> > longer, but not fail the test anymore?
> 
> WFM.

I've done so now.

Greetings,

Andres Freund




RE: Failed transaction statistics to measure the logical replication progress

2022-03-27 Thread osumi.takami...@fujitsu.com
Hi

On Friday, March 25, 2022 2:36 PM Masahiko Sawada  wrote:
> On Thu, Mar 24, 2022 at 12:30 PM Amit Kapila 
> wrote:
> >
> > On Thu, Mar 3, 2022 at 8:58 AM osumi.takami...@fujitsu.com
> >  wrote:
> > >
> >
> > This patch introduces two new subscription statistics columns
> > (apply_commit_count and apply_rollback_count) to the
> > pg_stat_subscription_stats view for counting cumulative transactions
> > commits/rollbacks for a particular subscription. Now, users can
> > already see the total number of xacts committed/rolled back in a
> > particular database via pg_stat_database, so this can be considered
> > duplicate information.
> 
> Right.
...
> > I am not sure if it is worth adding this additional information or how
> > useful it will be for users. Does anyone else want to weigh in on
> > this?
> >
> > If nobody else sees value in this then I feel it is better to mark
> > this patch as Returned with feedback or Rejected. We can come back to
> > it later if we see more demand for this.
> 
> Marking as Returned with feedback makes sense to me.
OK. Thank you so much for sharing your opinions, Sawada-san and Amit-san.

I changed the status of this entry to "Returned with feedback" accordingly.



Best Regards,
Takamichi Osumi



RE: Logical replication timeout problem

2022-03-27 Thread wangw.f...@fujitsu.com
On Mon, Mar 28, 2022 at 9:56 AM Kuroda, Hayato/黒田 隼人 
 wrote:
> Dear Wang-san,
Thanks for your comments.

> Thank you for updating!
> ...but it also cannot be applied to current HEAD
> because of commit 923def9a533.
> 
> Your patch seems to conflict with the newly added argument of
> logicalrep_write_insert().  That commit allows specifying which columns to
> publish by skipping some columns in logicalrep_write_tuple(),
> which is called from logicalrep_write_insert() and logicalrep_write_update().
Thanks for your kind reminder.
I have rebased the patch.

> Do we have to consider some special case for that?
> I thought a timeout might occur if a user has a huge table and publishes only
> a few columns, but that is a corner case.
I think maybe we do not need to deal with this use case.
The maximum number of table columns allowed by PG is 1600
(macro MaxHeapAttributeNumber), and after looping through all columns in the
function logicalrep_write_tuple, the function OutputPluginWrite will be invoked
immediately to actually send the data to the subscriber. This refreshes the
time at which the subscriber last received a message.
So I think this loop will not cause timeout issues.

Regards,
Wang wei


v8-0001-Fix-the-logical-replication-timeout-during-large-.patch
Description:  v8-0001-Fix-the-logical-replication-timeout-during-large-.patch


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-27 Thread Dilip Kumar
On Sat, Mar 26, 2022 at 5:55 PM Dilip Kumar  wrote:
>
> On Fri, Mar 25, 2022 at 8:16 PM Dilip Kumar  wrote:
> >
> > > On a quick look, I'm guessing that XLOG_DBASE_CREATE_WAL_LOG will need
> > > to mirror some of the logic that was added to the replay code for the
> > > existing strategy, but I haven't figured out the details.
> > >
> >
> > Yeah, I think I got it, for XLOG_DBASE_CREATE_WAL_LOG now we will have
> > to handle the missing parent directory case, like Alvaro handled for
> > the XLOG_DBASE_CREATE(_FILE_COPY) case.
>
> I have updated the patch so now we skip the XLOG_DBASE_CREATE_WAL_LOG
> as well if the tablespace directory is missing.  But with our new
> wal_log method there will be other follow-up WAL records like
> XLOG_RELMAP_UPDATE, XLOG_SMGR_CREATE and XLOG_FPI.
>
> I have put the similar logic for relmap_update WAL replay as well,

There was a mistake in the last patch: for the relmap update I had
checked for the missing tablespace directory, but I should have checked
for the missing database directory, so I have fixed that.

> Now, is it possible to get the FPI without smgr_create wal in other
> cases?  If it is, then that problem is orthogonal to this patch, but
> anyway I could not find any such scenario.

I have dug further into it and tried manually removing the directory
before XLOG_FPI, but I noticed that during FPI replay
XLogReadBufferExtended() takes care of creating the missing files
using smgrcreate(), and that in turn takes care of missing directory
creation, so I don't think we have any problem here.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From c35ba040770184d3048d9529668e30f8a59f5b75 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Sat, 26 Mar 2022 10:35:44 +0530
Subject: [PATCH v8] Add new block-by-block strategy for CREATE DATABASE.

Because this strategy logs changes on a block-by-block basis, it
avoids the need to checkpoint before and after the operation.
However, because it logs each changed block individually, it might
generate a lot of extra write-ahead logging if the template database
is large. Therefore, the older strategy remains available via a new
STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy
option to createdb.

Somewhat controversially, this patch assembles the list of relations
to be copied to the new database by reading the pg_class relation of
the template database. Cross-database access like this isn't normally
possible, but it can be made to work here because there can't be any
connections to the database being copied, nor can it contain any
in-doubt transactions. Even so, we have to use lower-level interfaces
than normal, since the table scan and relcache interfaces will not
work for a database to which we're not connected. The advantage of
this approach is that we do not need to rely on the filesystem to
determine what ought to be copied, but instead on PostgreSQL's own
knowledge of the database structure. This avoids, for example,
copying stray files that happen to be located in the source database
directory.

Dilip Kumar, with a fairly large number of cosmetic changes by me.
---
 contrib/bloom/blinsert.c |   2 +-
 doc/src/sgml/monitoring.sgml |   4 +
 doc/src/sgml/ref/create_database.sgml|  22 +
 doc/src/sgml/ref/createdb.sgml   |  11 +
 src/backend/access/heap/heapam_handler.c |   6 +-
 src/backend/access/nbtree/nbtree.c   |   2 +-
 src/backend/access/rmgrdesc/dbasedesc.c  |  20 +-
 src/backend/access/transam/xlogutils.c   |   6 +-
 src/backend/catalog/heap.c   |   2 +-
 src/backend/catalog/storage.c|  34 +-
 src/backend/commands/dbcommands.c| 795 ++-
 src/backend/commands/tablecmds.c |   2 +-
 src/backend/storage/buffer/bufmgr.c  | 172 ++-
 src/backend/storage/lmgr/lmgr.c  |  28 ++
 src/backend/utils/activity/wait_event.c  |   3 +
 src/backend/utils/cache/relcache.c   |   2 +-
 src/backend/utils/cache/relmapper.c  |  93 
 src/bin/pg_rewind/parsexlog.c|   9 +-
 src/bin/psql/tab-complete.c  |   4 +-
 src/bin/scripts/createdb.c   |  10 +-
 src/bin/scripts/t/020_createdb.pl|  20 +
 src/include/catalog/storage.h|   4 +-
 src/include/commands/dbcommands_xlog.h   |  25 +-
 src/include/storage/bufmgr.h |   6 +-
 src/include/storage/lmgr.h   |   1 +
 src/include/utils/relmapper.h|   4 +-
 src/include/utils/wait_event.h   |   1 +
 src/tools/pgindent/typedefs.list |   5 +-
 28 files changed, 1136 insertions(+), 157 deletions(-)

diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index c94cf34..82378db 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -173,7 +173,7 @@ blbuildempty(Relation index)
 	 * Write the page and log it.  It might seem that an immediate sync would
 	 * be sufficient to guarantee that the file exists on disk, b

Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-27 Thread Kyotaro Horiguchi
At Mon, 28 Mar 2022 14:34:44 +1300, Thomas Munro  wrote 
in 
> On Mon, Mar 28, 2022 at 2:01 PM Kyotaro Horiguchi
>  wrote:
> > At Fri, 25 Mar 2022 13:26:05 +0100, Alvaro Herrera 
> >  wrote in
> > > Pushed this, backpatching to 14 and 13.  It would have been good to
> > > backpatch further, but there's a (textually trivial) merge conflict
> > > related to commit e6d8069522c8.  Because that commit conceptually
> > > touches the same area that this bugfix is about, I'm not sure that
> > > backpatching further without a lot more thought is wise -- particularly
> > > so when there's no way to automate the test in branches older than
> > > master.
> 
> Just a thought:  we could consider back-patching
> allow_in_place_tablespaces, after a little while, if we're happy with
> how that is working out, if it'd be useful for verifying bug fixes in
> back branches.  It's non-end-user-facing testing infrastructure.

I would appreciate it if we accepted that.  The patch is simple, and it
now has a clear use case for back-patching.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Logical replication timeout problem

2022-03-27 Thread Amit Kapila
On Mon, Mar 28, 2022 at 11:41 AM wangw.f...@fujitsu.com
 wrote:
>
> On Mon, Mar 28, 2022 at 9:56 AM Kuroda, Hayato/黒田 隼人 
>  wrote:
>
> > Do we have to consider some special case for that?
> > I thought a timeout might occur if a user has a huge table and publishes
> > only a few columns, but that is a corner case.
> I think maybe we do not need to deal with this use case.
> The maximum number of table columns allowed by PG is 1600
> (macro MaxHeapAttributeNumber), and after looping through all columns in the
> function logicalrep_write_tuple, the function OutputPluginWrite will be
> invoked immediately to actually send the data to the subscriber. This
> refreshes the time at which the subscriber last received a message.
> So I think this loop will not cause timeout issues.
>

Right, I also don't think it can be a source of timeout.
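
To make the argument concrete, a toy model -- not PostgreSQL source, just
the shape of the control flow described above:

/* Toy model: the per-tuple column loop is bounded by the 1600-column
 * limit, and a write that refreshes the subscriber's last-received
 * timestamp follows immediately, so no unbounded stretch of work runs
 * without a message being sent. */
#include <stdio.h>

#define MAX_ATTRS 1600                  /* MaxHeapAttributeNumber in PG */

static void
output_plugin_write(void)
{
    /* In the real code this sends the buffered change to the
     * subscriber, letting it advance its receive timestamp. */
    puts("message flushed to subscriber");
}

static void
write_tuple(int natts)
{
    for (int i = 0; i < natts; i++)
        ;                               /* serialize one column (bounded) */
    output_plugin_write();              /* called right after the loop */
}

int
main(void)
{
    write_tuple(MAX_ATTRS);
    return 0;
}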

-- 
With Regards,
Amit Kapila.




Re: pg_stat_get_replication_slot() marked not strict, crashes

2022-03-27 Thread Amit Kapila
On Mon, Mar 28, 2022 at 10:24 AM Andres Freund  wrote:
>
> On 2022-03-27 21:09:29 -0700, Andres Freund wrote:
> > FWIW, I have a test for both; I was a bit "stuck" on where to put the
> > pg_stat_get_subscription_stats(NULL) test. I had put the
> > pg_stat_get_replication_slot(NULL) in contrib/test_decoding/sql/stats.sql
> > but pg_stat_get_subscription_stats() doesn't really fit there.  I think I'm
> > coming down to putting a section of such tests into 
> > src/test/regress/sql/stats.sql
> > instead, in the hope of preventing future such occurrences by encouraging
> > people to copy the test...
>
> Pushed with tests there.
>

Thank you.
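
For reference, a minimal sketch of the kind of test being discussed
(assuming it lives in src/test/regress/sql/stats.sql as suggested above):

-- With the functions now marked strict, a NULL argument must yield
-- NULL instead of crashing the backend.
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);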

-- 
With Regards,
Amit Kapila.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2022-03-27 Thread Tatsuo Ishii
>> > Even applying this patch, "make postgres-A4.pdf" raises a warning on my
>> > machine. After some investigation, I found that the previous document
>> > had a break after 'num_transactions', but it has been removed by this
>> > commit.
>> 
>> Yes, your patch removed "&zwsp;".
>> 
>> > So,
>> > I would like to restore this as it was. I attached the patch.
>> 
>> This produces errors. Needs ";" postfix?
> 
> Oops. Yes, it needs ';'. Also, I found another "&zwsp;" dropped.
> I attached the fixed patch.

The basic problem with this patch is that it may solve the issue with PDF
generation, but it does not solve the issue with HTML generation.  The
PDF manual of pgbench has a ridiculously long line, which Tom Lane
complained about too:

interval_start num_transactions​ sum_latency sum_latency_2 min_latency 
max_latency​ { failures | serialization_failures deadlock_failures } [ sum_lag 
sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

Why can't we just use line feeds instead of &zwsp;? Although it's not
a command usage, the SELECT manual already uses line feeds to
nicely break its synopsis into multiple lines.
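
To illustrate the two alternatives (simplified; the real synopsis in
doc/src/sgml/ref/pgbench.sgml wraps each field in <replaceable> tags):

<!-- &zwsp; only marks an allowed break point; a <pre>-rendered HTML
     synopsis never wraps, so the line stays long there: -->
<synopsis>
interval_start num_transactions&zwsp; sum_latency ...
</synopsis>

<!-- A literal line feed, as the SELECT page uses, forces the break in
     both the PDF and the HTML output: -->
<synopsis>
interval_start num_transactions
    sum_latency ...
</synopsis>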

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2022-03-27 Thread Tatsuo Ishii
>>> > Even applying this patch, "make postgres-A4.pdf" raises a warning on my
>>> > machine. After some investigation, I found that the previous document
>>> > had a break after 'num_transactions', but it has been removed by this
>>> > commit.
>>> 
>>> Yes, your patch removed "&zwsp;".
>>> 
>>> > So,
>>> > I would like to restore this as it was. I attached the patch.
>>> 
>>> This produces errors. Needs ";" postfix?
>> 
>> Oops. Yes, it needs ';'. Also, I found another "&zwsp;" dropped.
>> I attached the fixed patch.
> 
> The basic problem with this patch is that it may solve the issue with PDF
> generation, but it does not solve the issue with HTML generation.  The
> PDF manual of pgbench has a ridiculously long line, which Tom Lane
I meant "HTML manual" here.

> complained about too:
> 
> interval_start num_transactions​ sum_latency sum_latency_2 min_latency 
> max_latency​ { failures | serialization_failures deadlock_failures } [ 
> sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]
> 
> Why can't we just use line feeds instead of &zwsp;? Although it's not
> a command usage, the SELECT manual already uses line feeds to
> nicely break its synopsis into multiple lines.
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp