Re: gitmaster access
> At Thu, 12 May 2022 14:44:15 +0900 (JST), Tatsuo Ishii > wrote in >> >> Thank you for the info, but unfortunately it hasn't worked. >> >> I'm going to try a slightly different steps.. >> > >> > And finally I succeeded to clone from git.postgresql.org and to push a >> > commit. >> >> Is it git.postgresql.org, not gitmaster.postgresql.org? Interesting... > > git.postgresql.org. I still receive "Permission denied" from > gitmaster. Ok. I learned that only postgresql.git should be accessed from gitmaster.postgresql.org. All other repos should be accessed from git.postgresql.org. BTW, > I'm going to try a slightly different steps.. Can you please tell me what you actually did? I am afraid of facing a similar problem if I want to add another committer to the pgpool2 repo. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: gitmaster access
On Thu, 12 May 2022 at 08:03, Tatsuo Ishii wrote: > > At Thu, 12 May 2022 14:44:15 +0900 (JST), Tatsuo Ishii < > is...@sraoss.co.jp> wrote in > >> >> Thank you for the info, but unfortunately it hasn't worked. > >> >> I'm going to try a slightly different steps.. > >> > > >> > And finally I succeeded to clone from git.postgresql.org and to push > a > >> > commit. > >> > >> Is it git.postgresql.org, not gitmaster.postgresql.org? Interesting... > > > > git.postgresql.org. I still receive "Permission denied" from > > gitmaster. > > Ok. I learned that only postgresql.git should be accessed from > gitmaster.postgresql.org. All other repos should be accessed from > git.postgresql.org. That is correct for PostgreSQL Committers such as yourself. Anyone else can *only* use git.postgresql.org > > BTW, > > I'm going to try a slightly different steps.. > > Can you please tell me What you actually did? I am afraid of facing > similar problem if I want to add another committer to pgpool2 repo. > > Best reagards, > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese:http://www.sraoss.co.jp > -- -- Dave Page https://pgsnake.blogspot.com EDB Postgres https://www.enterprisedb.com
Re: bogus: logical replication rows/cols combinations
On Thu, May 12, 2022 at 12:15 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 12:55 PM houzj.f...@fujitsu.com > wrote: > > > > On Wednesday, May 11, 2022 11:33 AM Amit Kapila > > wrote: > > > > > > Fair enough, then we should go with a simpler approach to detect it in > > > pgoutput.c (get_rel_sync_entry). > > > > OK, here is the patch that try to check column list in that way. The patch > > also > > check the column list when CREATE SUBSCRIPTION and when starting initial > > copy. > > > > Few comments: > === ... One more point, I think we should update the docs for this. -- With Regards, Amit Kapila.
Re: gitmaster access
At Thu, 12 May 2022 16:03:50 +0900 (JST), Tatsuo Ishii wrote in > > At Thu, 12 May 2022 14:44:15 +0900 (JST), Tatsuo Ishii > > wrote in > BTW, > > I'm going to try a slightly different steps.. > > Can you please tell me what you actually did? I am afraid of facing > a similar problem if I want to add another committer to the pgpool2 repo. Cleared the SSH key in user profile. +Reloaded the adm page. +Waited for 20 minutes. Filled in the SSH key in user profile. Reloaded the adm page. Waited for 20 minutes. Ran git clone and it succeeded \o/ I'm not sure the additional steps made a substantial difference, though. Most of the whole steps look like bibbid-babbid-boo to me in the first place.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit
On Wed, May 11, 2022 at 7:39 PM Etsuro Fujita wrote: > On Fri, May 6, 2022 at 7:08 PM Etsuro Fujita wrote: > > > > On Wed, Apr 20, 2022 at 4:55 AM David Zhang > > > > wrote: > > > >> + * remote server in parallel at (sub)transaction end. > > > I noticed that the comment failed to mention that the > > parallel_commit option is disabled by default. Also, I noticed a > > comment above it: > > > > * It's enough to determine this only when making new connection because > > * all the connections to the foreign server whose keep_connections > > option > > * is changed will be closed and re-made later. > > > > This would apply to the parallel_commit option as well. How about > > updating these like the attached? (I simplified the latter comment > > and moved it to a more appropriate place.) > > I’m planning to commit this as a follow-up patch for commit 04e706d42. Done. Best regards, Etsuro Fujita
Re: First draft of the PG 15 release notes
On Wed, May 11, 2022 at 2:02 AM Bruce Momjian wrote: > > On Tue, May 10, 2022 at 04:17:59PM -0400, Jonathan Katz wrote: > > On 5/10/22 11:44 AM, Bruce Momjian wrote: > > > I have completed the first draft of the PG 15 release notes and you can > > > see the results here: > > > > > > https://momjian.us/pgsql_docs/release-15.html > > > > Thanks for pulling this together. > > > > + Allow logical replication to transfer sequence changes > > > > I believe this was reverted in 2c7ea57e5, unless some other parts of this > > work made it in. > > Yes, sorry, I missed that. Oddly, the unlogged sequence patch was > retained, even though there is no value for it on the primary. I > removed the sentence that mentioned that benefit from the release notes > since it doesn't apply to PG 15 anymore. > + Create unlogged sequences and allow them to be skipped in logical replication Is it right to say the second part of the sentence: "allow them to be skipped in logical replication" when we are not replicating them in the first place? -- With Regards, Amit Kapila.
Re: Make relfile tombstone files conditional on WAL level
Hi Dilip, On Fri, Mar 4, 2022 at 11:07 AM Dilip Kumar wrote: > > On Mon, Feb 21, 2022 at 1:21 PM Dilip Kumar wrote: > > > > On Thu, Jan 6, 2022 at 1:43 PM Dilip Kumar wrote: > > > > 2) GetNewRelFileNode() will not loop for checking the file existence > > > and retry with other relfilenode. > > > > > > Open Issues- there are currently 2 open issues in the patch 1) Issue > > as discussed above about removing the loop, so currently in this patch > > the loop is removed. 2) During upgrade from the previous version we > > need to advance the nextrelfilenode to the current relfilenode we are > > setting for the object in order to avoid the conflict. > > > In this version I have fixed both of these issues. Thanks Robert for > suggesting the solution for both of these problems in our offlist > discussion. Basically, for the first problem we can flush the xlog > immediately because we are actually logging the WAL every time after > we allocate 64 relfilenode so this should not have much impact on the > performance and I have added the same in the comments. And during > pg_upgrade, whenever we are assigning the relfilenode as part of the > upgrade we will set that relfilenode + 1 as nextRelFileNode to be > assigned so that we never generate the conflicting relfilenode. > > The only part I do not like in the patch is that before this patch we > could directly access the buftag->rnode. But since now we are not > having directly relfilenode as part of the buffertag and instead of > that we are keeping individual fields (i.e. dbOid, tbsOid and relNode) > in the buffer tag. So if we have to directly get the relfilenode we > need to generate it. However those changes are very limited to just 1 > or 2 file so maybe not that bad. > The v5 patch needs a rebase. Here are a few comments for 0002 that I found while reading it; hope that helps: +/* Number of RelFileNode to prefetch (preallocate) per XLOG write */ +#define VAR_RFN_PREFETCH 8192 + Should it be 64, as per the comment in XLogPutNextRelFileNode for XLogFlush()? --- + /* +* Check for the wraparound for the relnode counter. +* +* XXX Actually the relnode is 56 bits wide so we don't need to worry about +* the wraparound case. +*/ + if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE) Very rare case, should use unlikely()? --- +/* + * Max value of the relfilnode. Relfilenode will be of 56bits wide for more + * details refer comments atop BufferTag. + */ +#define MAX_RELFILENODE ((((uint64) 1) << 56) - 1) Should there be 57-bit shifts here? Instead, I think we should use INT64CONST(0xFFFFFFFFFFFFFF) to be consistent with the PG_*_MAX declarations, thoughts? --- + /* If we run out of logged for use RelNode then we must log more */ + if (ShmemVariableCache->relnodecount == 0) relnodecount might never go below zero, but to be safer it should check <= 0 instead. --- A few typos: Simmialr Simmilar agains idealy Regards, Amul
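To make the two code-level review comments above concrete, here is a rough sketch of how they might look once applied, reconstructed from the quoted patch hunks. The surrounding error handling is an assumption for illustration, not the actual patch; nextRelNode and ShmemVariableCache come from the patch under review.

    /* 56 bits of 1s; the review's suggested spelling, consistent with the
     * PG_*_MAX declarations (equivalent to ((((uint64) 1) << 56) - 1)). */
    #define MAX_RELFILENODE INT64CONST(0x00FFFFFFFFFFFFFF)

    /* Wraparound is essentially impossible with a 56-bit counter, so the
     * review suggests marking the branch unlikely(). */
    if (unlikely(ShmemVariableCache->nextRelNode > MAX_RELFILENODE))
        elog(ERROR, "relfilenode counter wrapped around");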
Reproducible collisions in jsonb_hash
Hello, I've noticed that jsonb_hash_extended(jsonb_v,0) = jsonb_hash_extended(jsonb_build_array(jsonb_v),0) for any jsonb value jsonb_v. AFAICT it happens because when iterating over a jsonb the hash function makes no distinction between raw scalars and arrays (it doesn't inspect v.val.array.rawScalar) https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/utils/adt/jsonb_op.c#L326 Is this an intended behaviour or a bug? Cheers, Valeriy
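For readers who want to reproduce the report, a minimal demonstration from psql (the literal is arbitrary; any jsonb value shows the same equality):

    -- Both calls hash the same despite one input being a raw scalar and
    -- the other a one-element array wrapping it.
    SELECT jsonb_hash_extended('"tom"'::jsonb, 0)                    AS scalar_hash,
           jsonb_hash_extended(jsonb_build_array('"tom"'::jsonb), 0) AS array_hash;
    -- scalar_hash and array_hash come back equal.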
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 2:25 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 2:02 AM Bruce Momjian wrote: > > > > On Tue, May 10, 2022 at 04:17:59PM -0400, Jonathan Katz wrote: > > > On 5/10/22 11:44 AM, Bruce Momjian wrote: > > > > I have completed the first draft of the PG 15 release notes and you can > > > > see the results here: > > > > > > > > https://momjian.us/pgsql_docs/release-15.html > > > > > > Thanks for pulling this together. > > > > > > + Allow logical replication to transfer sequence changes > > > > > > I believe this was reverted in 2c7ea57e5, unless some other parts of this > > > work made it in. > > > > Yes, sorry, I missed that. Oddly, the unlogged sequence patch was > > retained, even though there is no value for it on the primary. I > > removed the sentence that mentioned that benefit from the release notes > > since it doesn't apply to PG 15 anymore. > > > > + Create unlogged sequences and allow them to be skipped in logical > replication > > Is it right to say the second part of the sentence: "allow them to be > skipped in logical replication" when we are not replicating them in > the first place? > One more point related to logical replication features: > Add SQL functions to monitor the directory contents of replication slots (Bharath Rupireddy) Specifically, the functions are pg_ls_logicalsnapdir(), pg_ls_logicalmapdir(), and pg_ls_replslotdir(). They can be run by members of the predefined pg_monitor role. > This feature is currently under the section "Streaming Replication and Recovery". Shouldn't it be under "Logical Replication"? The function names themselves seem to indicate that they are used for logical replication contents. I think the replication slot-related function would fall under both categories but overall it seems to belong to the "Logical Replication" section. -- With Regards, Amit Kapila.
Re: postgres_fdw "parallel_commit" docs
On Wed, May 11, 2022 at 7:25 PM Etsuro Fujita wrote: > One thing I noticed is this bit: > > -When multiple remote (sub)transactions are involved in a local > -(sub)transaction, by default postgres_fdw commits > -those remote (sub)transactions one by one when the local (sub)transaction > -commits. > -Performance can be improved with the following option: > +When multiple remote transactions or subtransactions are involved in a > +local transaction (or subtransaction) on a foreign server, > +postgres_fdw by default commits those remote > +transactions serially when the local transaction commits. > Performance can be > +improved with the following option: > > I think this might still be a bit confusing. How about rewriting it > to something like this? > > As described in F.38.4. Transaction Management, in postgres_fdw > transactions are managed by creating corresponding remote > transactions, and subtransactions are managed by creating > corresponding remote subtransactions. When multiple remote > transactions are involved in the current local transaction, > postgres_fdw by default commits those remote transactions serially > when the local transaction is committed. When multiple remote > subtransactions are involved in the current local subtransaction, it > by default commits those remote subtransactions serially when the > local subtransaction is committed. Performance can be improved with > the following option: > > It might be a bit redundant to explain the transaction/subtransaction > cases differently, but I think it makes it clear and easy to > understand how they are handled by postgres_fdw by > default. I modified the patch that way. On Wed, May 11, 2022 at 7:29 PM Etsuro Fujita wrote: > On Tue, May 10, 2022 at 12:58 AM Justin Pryzby wrote: > > On Mon, May 09, 2022 at 11:37:35AM -0400, Jonathan S. Katz wrote: > > > - If multiple foreign servers with this option enabled are involved > > > in > > > - a local (sub)transaction, multiple remote (sub)transactions > > > opened on > > > - those foreign servers in the local (sub)transaction are committed > > > in > > > - parallel across those foreign servers when the local > > > (sub)transaction > > > - commits. > > > + If multiple foreign servers with this option enabled have a local > > > + transaction, multiple remote transactions on those foreign > > > servers are > > > + committed in parallel across those foreign servers when the local > > > + transaction is committed. > > > > > > > I think "have a transaction" doesn't sound good, and the old language > > "involved > > in" was better. > > I think so too. I modified the patch to use the old language. Also, I fixed a typo reported by Justin. Attached is an updated patch. I'll commit the patch if no objections. Best regards, Etsuro Fujita parallel-commit-docs-efujita.patch Description: Binary data
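For context while reading the doc changes above, enabling the option looks like the sketch below; the server name is a placeholder.

    -- parallel_commit is a postgres_fdw server-level option, off by default.
    ALTER SERVER foreign_pg OPTIONS (ADD parallel_commit 'true');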
Re: First draft of the PG 15 release notes
On Wed, May 11, 2022 at 1:44 AM Bruce Momjian wrote: > > I have completed the first draft of the PG 15 release notes and you can > see the results here: > > https://momjian.us/pgsql_docs/release-15.html > > The feature count is similar to recent major releases: > > release-10 195 > release-11 185 > release-12 198 > release-13 183 > release-14 229 > --> release-15 186 > > I assume there will be major adjustments in the next few weeks based on > feedback. > I wonder if this is worth mentioning: Skip empty transactions for logical replication. commit d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c https://github.com/postgres/postgres/commit/d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c regards, Ajin Cherian Fujitsu Australia
Re: First draft of the PG 15 release notes
On 5/11/22 10:50 PM, Ian Lawrence Barwick wrote: 2022年5月12日(木) 11:46 Bruce Momjian : On Thu, May 12, 2022 at 11:40:17AM +0900, Ian Lawrence Barwick wrote: Looking at the release notes from the point of view of someone who has maybe not been following the long-running debate on removing exclusive backups: "Important-sounding backup thing is suddenly gone! What was that again? Hmm, can't find anything in the now-current Pg 15 docs [*], do I need to worry about this!?" [*] the backup section has removed all mention of the word "exclusive" https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP versus: "Long-deprecated thing is finally gone, ah OK whatever". I am thinking back here to a point in my working life where the release notes were reviewed (by a team including non-Pg specialists) for potential issues when considering a major upgrade - from experience the more clarity with this kind of change the better so as not to unnecessarily raise alarm bells. Ah, you are right. I thought I had "deprecated" in the text, but I now see I did not, and we do have cases where we mention the deprecated status in previous release notes, so the new text is: Remove long-deprecated exclusive backup mode (David Steele, Nathan Bossart) That works, thanks! A bit late to this conversation, but +1 from me. -- -David da...@pgmasters.net
Re: Privileges on PUBLICATION
On Tue, May 10, 2022, at 5:37 AM, Antonin Houska wrote: > My understanding is that the rows/columns filtering is a way for the > *publisher* to control which data is available to particular replica. From > this point of view, the publication privileges would just make the control > complete. I agree. IMO it is a new feature. We already require high privilege for logical replication. Hence, we expect the replication user to have access to all data. Unfortunately, nobody mentioned this requirement during the row filter / column list development; someone could have written a patch for GRANT ... ON PUBLICATION. I understand your concern. As I said in the last sentence of my previous email: it is fine-grained access control on the publisher. Keep in mind that it will *only* work for non-superusers (REPLICATION attribute). It is not exposing something that we didn't expose before. In this particular case, there is no mechanism to prevent the subscriber from obtaining data provided by the various row filters if they know the publication names. We could probably add a sentence to the "Logical Replication > Security" section: There are no privileges for publications. If you have multiple publications in a database, a subscription can use all publications available. -- Euler Taveira EDB https://www.enterprisedb.com/
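To make the missing feature concrete: the command referred to above might look like the sketch below. This syntax does not exist in PostgreSQL; it only illustrates the kind of patch being discussed, with placeholder names.

    -- Hypothetical syntax, not implemented: fine-grained access to a
    -- publication on the publisher side.
    GRANT USAGE ON PUBLICATION sensitive_pub TO replication_role;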
Re: postgres_fdw "parallel_commit" docs
Hi Etsuro, On 5/12/22 7:26 AM, Etsuro Fujita wrote: I modified the patch to use the old language. Also, I fixed a typo reported by Justin. Attached is an updated patch. I'll commit the patch if no objections. Thanks for reviewing and revising! I think this is much easier to read. I made a few minor copy edits. Please see attached. Thanks, Jonathan diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index b43d0aecba..e35ae4ff72 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -460,10 +460,16 @@ OPTIONS (ADD password_required 'false'); Transaction Management Options -When multiple remote (sub)transactions are involved in a local -(sub)transaction, by default postgres_fdw commits -those remote (sub)transactions one by one when the local (sub)transaction -commits. +As described in the Transaction Management section, in +postgres_fdw transactions are managed by creating +corresponding remote transactions, and subtransactions are managed by +creating corresponding remote subtransactions. When multiple remote +transactions are involved in the current local transaction, by default +postgres_fdw commits those remote transactions +serially when the local transaction is committed. When +multiple remote subtransactions are involved in the current local +subtransaction, by default postgres_fdw commits those +remote subtransactions serially when the local subtransaction is committed. Performance can be improved with the following option: @@ -474,26 +480,24 @@ OPTIONS (ADD password_required 'false'); This option controls whether postgres_fdw commits - remote (sub)transactions opened on a foreign server in a local - (sub)transaction in parallel when the local (sub)transaction commits. - This option can only be specified for foreign servers, not per-table. - The default is false. + in parallel remote transactions opened on a foreign server in a local + transaction when the local transaction is committed. This setting also + applies to remote and local subtransactions. This option can only be + specified for foreign servers, not per-table. The default is + false. - If multiple foreign servers with this option enabled are involved in - a local (sub)transaction, multiple remote (sub)transactions opened on - those foreign servers in the local (sub)transaction are committed in - parallel across those foreign servers when the local (sub)transaction - commits. + If multiple foreign servers with this option enabled are involved in a + local transaction, multiple remote transactions on those foreign + servers are committed in parallel across those foreign servers when + the local transaction is committed. - For a foreign server with this option enabled, if many remote - (sub)transactions are opened on the foreign server in a local - (sub)transaction, this option might increase the remote server's load - when the local (sub)transaction commits, so be careful when using this - option. + When this option is enabled, a foreign server with many remote + transactions may see a negative performance impact when the local + transaction is committed. OpenPGP_signature Description: OpenPGP digital signature
RE: First draft of the PG 15 release notes
On Wednesday, May 11, 2022 12:44 AM Bruce Momjian wrote: > I have completed the first draft of the PG 15 release notes and you can see > the > results here: > > https://momjian.us/pgsql_docs/release-15.html > > The feature count is similar to recent major releases: > > release-10 195 > release-11 185 > release-12 198 > release-13 183 > release-14 229 > --> release-15 186 > > I assume there will be major adjustments in the next few weeks based on > feedback. Hi, I'd like to suggest that we mention a new option for subscription 'disable_on_error'. https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33 Best Regards, Takamichi Osumi
Re: Reproducible collisions in jsonb_hash
On 2022-05-12 Th 07:02, Valeriy Meleshkin wrote: > Hello, > > I've noticed that > > jsonb_hash_extended(jsonb_v,0) = > jsonb_hash_extended(jsonb_build_array(jsonb_v),0) > > for any jsonb value jsonb_v. > > AFAICT it happens because when iterating over a jsonb the hash function makes > no distinction between raw scalars and arrays (it doesn't inspect > v.val.array.rawScalar) > https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/utils/adt/jsonb_op.c#L326 > > Is this an intended behaviour or a bug? > It does look rather like a bug, but I'm unclear about the implications of fixing it. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 02:27:26PM +0900, Amit Langote wrote: > > Yes, this is what I needed to know. The updated text is: > > > > > > > > > > > > Improve foreign key behavior of updates on partitioned tables > > that move rows between partitions (Amit Langote) > > > > > > > > Previously, such updates ran delete actions on the source partition > > and insert actions on the target partition. PostgreSQL will now > > run update actions on the partition root. > > > > > > Looks fine to me. Though I think maybe we should write the last > sentence as "PostgreSQL will now run update actions on the partition > root mentioned in the query" to be less ambiguous about which "root", > because it can also mean the actual root table in the partition tree. > A user may be updating only a particular subtree by mentioning that > subtree's root in the query, for example. Okay, I went with: Previously, such updates ran delete actions on the source partition and insert actions on the target partition. PostgreSQL will now run update actions on the referenced partition root. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
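As a sketch of the behavior the note describes, under assumed table names: before PG 15 a cross-partition row movement fired delete-side foreign key actions, while PG 15 applies update semantics on the referenced root.

    CREATE TABLE parted (id int PRIMARY KEY) PARTITION BY RANGE (id);
    CREATE TABLE parted_lo PARTITION OF parted FOR VALUES FROM (0) TO (10);
    CREATE TABLE parted_hi PARTITION OF parted FOR VALUES FROM (10) TO (20);
    CREATE TABLE referencing (pid int REFERENCES parted ON DELETE CASCADE);
    INSERT INTO parted VALUES (5);
    INSERT INTO referencing VALUES (5);
    -- Moves the row from parted_lo to parted_hi.  Previously this ran as
    -- DELETE + INSERT, so the cascade silently removed the referencing row;
    -- in PG 15 it is treated as an UPDATE on the referenced partition root
    -- (here, failing the default NO ACTION update check instead).
    UPDATE parted SET id = 15 WHERE id = 5;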
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 02:25:36PM +0530, Amit Kapila wrote: > On Wed, May 11, 2022 at 2:02 AM Bruce Momjian wrote: > > Yes, sorry, I missed that. Oddly, the unlogged sequence patch was > > retained, even though there is no value for it on the primary. I > > removed the sentence that mentioned that benefit from the release notes > > since it doesn't apply to PG 15 anymore. > > > > + Create unlogged sequences and allow them to be skipped in logical > replication > > Is it right to say the second part of the sentence: "allow them to be > skipped in logical replication" when we are not replicating them in > the first place? Oops, yeah, that second part was reverted; new text: Allow the creation of unlogged sequences (Peter Eisentraut) -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
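For reference, the PG 15 syntax that remains after the revert, as a minimal sketch with a placeholder name:

    CREATE UNLOGGED SEQUENCE bulk_load_seq;
    -- Existing sequences can be switched either way:
    ALTER SEQUENCE bulk_load_seq SET LOGGED;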
Re: Reproducible collisions in jsonb_hash
Andrew Dunstan writes: > On 2022-05-12 Th 07:02, Valeriy Meleshkin wrote: >> AFAICT it happens because when iterating over a jsonb the hash function >> makes no distinction between raw scalars and arrays (it doesn't inspect >> v.val.array.rawScalar) > It does look rather like a bug, but I'm unclear about the implications > of fixing it. Changing this hash algorithm would break existing hash indexes on jsonb columns. Maybe there aren't any, but even if so I can't get very excited about changing this. Hash algorithms always have collisions, and we have never made any promise that ours are cryptographically strong. regards, tom lane
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 04:40:49PM +0530, Amit Kapila wrote: > One more point related to logical replication features: > > > > Add SQL functions to monitor the directory contents of replication > slots (Bharath Rupireddy) > > Specifically, the functions are pg_ls_logicalsnapdir(), > pg_ls_logicalmapdir(), and pg_ls_replslotdir(). They can be run by > members of the predefined pg_monitor role. > > > > This feature is currently under the section "Streaming Replication and > Recovery". Shouldn't it be under "Logical Replication"? The function > names themselves seem to indicate that they are used for logical > replication contents. I think the replication slot-related function > would fall under both categories but overall it seems to belong to the > "Logical Replication" section. Oh, very good point! I missed that this is logical-slot-only monitoring, so I moved the item to logical replication and changed the description to: Add SQL functions to monitor the directory contents of logical replication slots (Bharath Rupireddy) -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
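Typical usage of the three functions, as a sketch; the slot name is a placeholder, and callers need membership in pg_monitor:

    SELECT * FROM pg_ls_logicalsnapdir();
    SELECT * FROM pg_ls_logicalmapdir();
    SELECT * FROM pg_ls_replslotdir('my_slot');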
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 09:32:48PM +1000, Ajin Cherian wrote: > On Wed, May 11, 2022 at 1:44 AM Bruce Momjian wrote: > > > > I have completed the first draft of the PG 15 release notes and you can > > see the results here: > > > > https://momjian.us/pgsql_docs/release-15.html > > > > The feature count is similar to recent major releases: > > > > release-10 195 > > release-11 185 > > release-12 198 > > release-13 183 > > release-14 229 > > --> release-15 186 > > > > I assume there will be major adjustments in the next few weeks based on > > feedback. > > > > I wonder if this is worth mentioning: > > Skip empty transactions for logical replication. > commit d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c > > https://github.com/postgres/postgres/commit/d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c I looked at that but thought that everyone would already assume we skipped replication of empty transactions, and I didn't see much impact for the user, so I didn't include it. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 01:35:39PM +, osumi.takami...@fujitsu.com wrote: > I'd like to suggest that we mention a new option for subscription > 'disable_on_error'. > > > https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33 Yes, I missed that one, added: Allow subscribers to stop logical replication application on error (Osumi Takamichi, Mark Dilger) This is enabled with the subscriber option "disable_on_error" and avoids possible infinite loops during stream application. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
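A sketch of the new option in use; the connection string and names are placeholders:

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (disable_on_error = true);
    -- On an apply error the subscription is disabled instead of the worker
    -- retrying the failing transaction in a loop.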
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > I looked at that but thought that everyone would already assume we > skipped replication of empty transactions, and I didn't see much impact > for the user, so I didn't include it. It certainly has an impact on heavy workloads that replicate tables with few modifications. The subscriber receives heavy traffic of 'begin' and 'commit' messages that previous Postgres versions have to handle (discard). I would classify it as a performance improvement for logical replication. Don't have a strong opinion if it should be mentioned or not. -- Euler Taveira EDB https://www.enterprisedb.com/
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > I looked at that but thought that everyone would already assume we > skipped replication of empty transactions, and I didn't see much impact > for the user, so I didn't include it. > > It certainly has an impact on heavy workloads that replicate tables with few > modifications. It receives a high traffic of 'begin' and 'commit' messages > that > the previous Postgres versions have to handle (discard). I would classify it > as > a performance improvement for logical replication. Don't have a strong opinion > if it should be mentioned or not. Oh, so your point is that a transaction that only has SELECT would previously send an empty transaction? I thought this was only for apps that create literal empty transactions, which seem rare. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: gitmaster access
On Thu, May 12, 2022 at 01:54:57PM +0900, Kyotaro Horiguchi wrote: > At Thu, 12 May 2022 11:44:33 +0900 (JST), Kyotaro Horiguchi > wrote in > > At Thu, 12 May 2022 10:34:49 +0900 (JST), Tatsuo Ishii > > wrote in > > > Last year we faced a similar problem, namely, a new committer for > > > pgpool.git could not access the git repository (Permission denied > > > (publickey)). Magnus kindly advised following and it worked. Hope this > > > helps. > > > > > > > 1. Log into the git server on https://git.postgresql.org/adm/. It > > > > should be an automatic log in and show the repository. > > > > 2. *then* go back to the main website and delete the ssh key > > > > 3. Now add the ssh key again on the main website > > > > 4. Wait 10-15 minutes and then it should work > > > > Thank you for the info, but unfortunately it hasn't worked. > > I'm going to try a slightly different steps.. > > And finally I succeeded to clone from git.postgresql.org and to push a > commit. Sorry, but this has me confused. When I read this, I thought you were pushing a 'pgsql' core server commit to gitmaster, but that would be impossible for git.postgresql.org, so where are you pushing to? This might be part of the confusion Dave was asking about. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: Reproducible collisions in jsonb_hash
On Thu, May 12, 2022 at 9:57 AM Tom Lane wrote: > Andrew Dunstan writes: > > On 2022-05-12 Th 07:02, Valeriy Meleshkin wrote: > >> AFAICT it happens because when iterating over a jsonb the hash function > >> makes no distinction between raw scalars and arrays (it doesn't inspect > >> v.val.array.rawScalar) > > > It does look rather like a bug, but I'm unclear about the implications > > of fixing it. > > Changing this hash algorithm would break existing hash indexes on jsonb > columns. Maybe there aren't any, but even if so I can't get very excited > about changing this. Hash algorithms always have collisions, and we have > never made any promise that ours are cryptographically strong. I might be missing something, but I don't know why "cryptographically strong" is the relevant concept here. It seems like the question is how likely it is that this would lead to queries having crappy performance. In general we want hash functions to deliver different values for different inputs so that we give ourselves the best possible chance of spreading keys evenly across buckets. If for example we hashed every possible JSON object to the same constant value, everything we tried to do with this hash function would suck, and the problem would be so severe that I think we'd have to fix it, even though it would mean a compatibility break. Or even if we hashed every JSON integer to the same value, that would be horrible, because it's quite likely that you could have a column full of json objects where ignoring the difference between one integer and another results in a ton of duplicate hash values. Here, that doesn't seem too likely. You could have a column that contains 'tom' and ['tom'] and [['tom']] and [[['tom']]] and so forth and they all get mapped onto the same bucket and you're sad. But probably not. So I'd judge that while it was probably a mistake to make the hash function work this way, it's not likely to cause serious problems, and therefore we ought to maybe leave it alone for now, but add a comment so that if we ever break backward-compatibility for any other reason, we remember to fix this too. IOW, I think I mostly agree with your conclusion, but perhaps not entirely with the reasoning. -- Robert Haas EDB: http://www.enterprisedb.com
Re: First draft of the PG 15 release notes
I wonder if this is worth mentioning: Raise a WARNING for missing publications. commit 8f2e2bbf145384784bad07a96d461c6bbd91f597 https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8f2e2bbf145384784bad07a96d461c6bbd91f597 Regards, Vignesh On Thu, May 12, 2022 at 7:52 PM Bruce Momjian wrote: > > On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > > > I looked at that but thought that everyone would already assume we > > skipped replication of empty transactions, and I didn't see much impact > > for the user, so I didn't include it. > > > > It certainly has an impact on heavy workloads that replicate tables with few > > modifications. It receives a high traffic of 'begin' and 'commit' messages > > that > > the previous Postgres versions have to handle (discard). I would classify > > it as > > a performance improvement for logical replication. Don't have a strong > > opinion > > if it should be mentioned or not. > > Oh, so your point is that a transaction that only has SELECT would > previously send an empty transaction? I thought this was only for apps > that create literal empty transactions, which seem rare. > > -- > Bruce Momjian https://momjian.us > EDB https://enterprisedb.com > > Indecision is a decision. Inaction is an action. Mark Batterson > > >
Re: Reproducible collisions in jsonb_hash
Robert Haas writes: > Here, that doesn't seem too likely. You could have a column that > contains 'tom' and ['tom'] and [['tom']] and [[['tom']]] and so forth > and they all get mapped onto the same bucket and you're sad. But > probably not. Yeah, that might be a more useful way to think about it: is this likely to cause performance-critical collisions in practice? I agree that that doesn't seem like a very likely situation, even given that you might be using json for erratically-structured data. regards, tom lane
Re: Mark all GUC variable as PGDLLIMPORT
Hi, On 2022-05-12 15:15:10 +0900, Michael Paquier wrote: > On Tue, May 10, 2022 at 04:09:47PM +0900, Michael Paquier wrote: > > Well, what about the attached then? While looking at all the headers > > in the tree, I have noticed that __pg_log_level is not marked, but > > we'd better paint a PGDLLIMPORT also for it? This is getting used by > > pgbench for some unlikely() business, as one example. > > After an extra look, PGDLLIMPORT missing from __pg_log_level looks > like an imbroglio between 8ec5694, that has added the marking, and > 9a374b77 that has removed it the same day. All that has been fixed in > 5edeb57. It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / variables. What is that supposed to mean? Greetings, Andres Freund
Re: Crash in new pgstats code
Hi, On 2022-05-11 15:46:13 +0900, Michael Paquier wrote: > On Tue, Apr 19, 2022 at 08:31:05PM +1200, Thomas Munro wrote: > > On Tue, Apr 19, 2022 at 2:50 AM Andres Freund wrote: > > > Kestrel won't go that far back even - I set it up 23 days ago... > > > > Here's a ~6 month old example from mylodon (I can't see much further > > back than that with HTTP requests... I guess BF records are purged?): > > > > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mylodon&dt=2021-10-19%2022%3A57%3A54&stg=recovery-check > > Do we have anything remaining on this thread in light of the upcoming > beta1? One fix has been pushed upthread, but it does not seem we are > completely in the clear either. I don't know what else there is to do, tbh. Greetings, Andres Freund
Re: make MaxBackends available in _PG_init
On 12.05.2022 00:01, Nathan Bossart wrote: On Wed, May 11, 2022 at 04:18:10PM -0400, Robert Haas wrote: OK, I have committed 0001 now. Thanks! Maybe remove the first part from the patchset? Now the Patch Tester is giving an apply error for the first part and can't test the other. http://cfbot.cputube.org/patch_38_3614.log With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Mark all GUC variable as PGDLLIMPORT
Andres Freund writes: > On 2022-05-12 15:15:10 +0900, Michael Paquier wrote: >> After an extra look, PGDLLIMPORT missing from __pg_log_level looks >> like an imbroglio between 8ec5694, that has added the marking, and >> 9a374b77 that has removed it the same day. All that has been fixed in >> 5edeb57. > It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / > variables. What is that supposed to mean? Yeah, that's why I removed it in 9a374b77. regards, tom lane
Re: Crash in new pgstats code
Andres Freund writes: > On 2022-05-11 15:46:13 +0900, Michael Paquier wrote: >> Do we have anything remaining on this thread in light of the upcoming >> beta1? One fix has been pushed upthread, but it does not seem we are >> completely in the clear either. > I don't know what else there is to do, tbh. Well, it was mostly you expressing misgivings upthread ;-). But we have not seen any pgstat crashes lately, so I'm content to mark the open item as resolved. regards, tom lane
Re: Crash in new pgstats code
Hi, On 2022-05-12 12:12:59 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2022-05-11 15:46:13 +0900, Michael Paquier wrote: > >> Do we have anything remaining on this thread in light of the upcoming > >> beta1? One fix has been pushed upthread, but it does not seem we are > >> completely in the clear either. > > > I don't know what else there is to do, tbh. > > Well, it was mostly you expressing misgivings upthread ;-). Those mostly were about stuff in 14 as well... I guess it'd be good to figure out precisely how the problem was triggered, but without further information I don't quite see how to figure it out... > But we have not seen any pgstat crashes lately, so I'm content to mark the > open item as resolved. Cool. Greetings, Andres Freund
Declaration fixes
Hi, I was experimenting with specifying symbol visibility for functions explicitly, i.e. adding PGDLLIMPORT markers for them, with the goal of getting rid of src/tools/msvc/gendef.pl (and similar AIX stuff). While doing that I compared the set of exported symbols before / after, leading me to find a few pre-existing "issues". I think the attached patches are all a good idea and trivial enough that I think we should apply them now. The changes are sufficiently obvious and/or explained in the commit messages. Comments? Greetings, Andres Freund >From 33e9aa35d0c0153ef62ce647b8cc748079908b07 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Thu, 12 May 2022 09:17:14 -0700 Subject: [PATCH v1 1/4] Add missing 'extern' to function prototypes. Postgres style is to spell out extern. Noticed while scripting adding PGDLLIMPORT markers to functions. Author: Reviewed-By: Discussion: https://postgr.es/m/ Backpatch: --- src/include/access/rewriteheap.h | 2 +- src/include/port/win32_port.h | 26 - src/include/replication/message.h | 6 +- src/include/replication/origin.h | 6 +- src/include/replication/reorderbuffer.h | 70 +++ src/include/replication/worker_internal.h | 4 +- src/include/storage/s_lock.h | 4 +- src/include/tsearch/dicts/regis.h | 8 +-- src/include/utils/attoptcache.h | 2 +- src/include/utils/numeric.h | 2 +- src/include/utils/pgstat_internal.h | 2 +- src/include/utils/spccache.h | 6 +- 12 files changed, 69 insertions(+), 69 deletions(-) diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h index aa5c48f219a..3e27790b3f0 100644 --- a/src/include/access/rewriteheap.h +++ b/src/include/access/rewriteheap.h @@ -52,6 +52,6 @@ typedef struct LogicalRewriteMappingData * --- */ #define LOGICAL_REWRITE_FORMAT "map-%x-%x-%X_%X-%x-%x" -void CheckPointLogicalRewriteHeap(void); +extern void CheckPointLogicalRewriteHeap(void); #endif /* REWRITE_HEAP_H */ diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h index 5045ced91b7..dbbf88f8e8c 100644 --- a/src/include/port/win32_port.h +++ b/src/include/port/win32_port.h @@ -455,10 +455,10 @@ extern PGDLLIMPORT HANDLE pgwin32_initial_signal_pipe; #define UNBLOCKED_SIGNAL_QUEUE() (pg_signal_queue & ~pg_signal_mask) #define PG_SIGNAL_COUNT 32 -void pgwin32_signal_initialize(void); -HANDLE pgwin32_create_signal_listener(pid_t pid); -void pgwin32_dispatch_queued_signals(void); -void pg_queue_signal(int signum); +extern void pgwin32_signal_initialize(void); +extern HANDLE pgwin32_create_signal_listener(pid_t pid); +extern void pgwin32_dispatch_queued_signals(void); +extern void pg_queue_signal(int signum); /* In src/port/kill.c */ #define kill(pid,sig) pgkill(pid,sig) @@ -475,15 +475,15 @@ extern int pgkill(int pid, int sig); #define recv(s, buf, len, flags) pgwin32_recv(s, buf, len, flags) #define send(s, buf, len, flags) pgwin32_send(s, buf, len, flags) -SOCKET pgwin32_socket(int af, int type, int protocol); -int pgwin32_bind(SOCKET s, struct sockaddr *addr, int addrlen); -int pgwin32_listen(SOCKET s, int backlog); -SOCKET pgwin32_accept(SOCKET s, struct sockaddr *addr, int *addrlen); -int pgwin32_connect(SOCKET s, const struct sockaddr *name, int namelen); -int pgwin32_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timeval *timeout); -int pgwin32_recv(SOCKET s, char *buf, int len, int flags); -int pgwin32_send(SOCKET s, const void *buf, int len, int flags); -int pgwin32_waitforsinglesocket(SOCKET s, int what, int timeout); +extern SOCKET pgwin32_socket(int af, int type, int 
protocol); +extern int pgwin32_bind(SOCKET s, struct sockaddr *addr, int addrlen); +extern int pgwin32_listen(SOCKET s, int backlog); +extern SOCKET pgwin32_accept(SOCKET s, struct sockaddr *addr, int *addrlen); +extern int pgwin32_connect(SOCKET s, const struct sockaddr *name, int namelen); +extern int pgwin32_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timeval *timeout); +extern int pgwin32_recv(SOCKET s, char *buf, int len, int flags); +extern int pgwin32_send(SOCKET s, const void *buf, int len, int flags); +extern int pgwin32_waitforsinglesocket(SOCKET s, int what, int timeout); extern PGDLLIMPORT int pgwin32_noblock; diff --git a/src/include/replication/message.h b/src/include/replication/message.h index b9686c28550..0b396c56693 100644 --- a/src/include/replication/message.h +++ b/src/include/replication/message.h @@ -34,8 +34,8 @@ extern XLogRecPtr LogLogicalMessage(const char *prefix, const char *message, /* RMGR API */ #define XLOG_LOGICAL_MESSAGE 0x00 -void logicalmsg_redo(XLogReaderState *record); -void logicalmsg_desc(StringInfo buf, XLogReaderState *record); -const char *logicalmsg_identify(uint8 info); +extern void logicalmsg_redo(XLogReaderState *record);
Re: Declaration fixes
Andres Freund writes: > I was experimenting with specifying symbol visibility for functions explicitly, > i.e. adding PGDLLIMPORT markers for them, with the goal of getting rid of > src/tools/msvc/gendef.pl (and similar AIX stuff). While doing that I compared > the set of exported symbols before / after, leading me to find a few > pre-existing "issues". > I think the attached patches are all a good idea and trivial enough that I > think we should apply them now. +1 to all of that. Would the changes you're working on result in getting warnings for these sorts of oversights on mainstream compilers? regards, tom lane
Re: Declaration fixes
Hi, On 2022-05-12 13:18:05 -0400, Tom Lane wrote: > Andres Freund writes: > > I was experimenting with specifying symbol visibility for functions > > explicitly, > > i.e. adding PGDLLIMPORT markers for them, with the goal of getting rid of > > src/tools/msvc/gendef.pl (and similar AIX stuff). While doing that I > > compared > > the set of exported symbols before / after, leading me to find a few > > pre-existing "issues". > > > I think the attached patches are all a good idea and trivial enough that I > > think we should apply them now. > > +1 to all of that. Cool. > Would the changes you're working on result in getting > warnings for these sorts of oversights on mainstream compilers? I assume with "mainstream compiler" you basically mean gcc and clang? If so, some oversights would be hard errors, some warnings, some runtime (i.e. symbol not found errors when loading an extension library) but some others unfortunately would continue to only be visible in msvc :(. Greetings, Andres Freund
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 08:05:40PM +0530, vignesh C wrote: > I wonder if this is worth mentioning: > > Raise a WARNING for missing publications. > commit 8f2e2bbf145384784bad07a96d461c6bbd91f597 > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8f2e2bbf145384784bad07a96d461c6bbd91f597 Reading the commit message, it looked like only a warning was being added, which was more of a helpful change rather than something we need to mention. However, if this means you could can now create a subscription for a missing publication that you couldn't do before, it should be added --- I couldn't tell from the patch. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: [RFC] building postgres with meson -v8
Hi, On 2022-05-11 12:18:58 +0200, Peter Eisentraut wrote: > I fixed the Perl detection issue in my macOS environment that I had reported > a while ago. Hm. I wonder if it's right to check with is_file() - perhaps there are platforms that have split off the include directory? > Then I added in support for all configure options that had not been ported > over yet. Some of these are rather trivial. Thanks! Some of these (extra version, krb srvname, ...) I just merged from a colleague. Will look at merging the others. > After that, these configure options don't have an equivalent yet: > > --disable-rpath > --enable-profiling > --disable-thread-safety > --with-libedit-preferred > > The first three overlap with meson built-in functionality, so we would need > to check whether the desired functionality is available somehow. Which builtin functionality does --enable-profiling overlap with? There's a coverage option, perhaps you were thinking of that? I don't think we should add --disable-thread-safety, platforms without it also aren't going to support ninja / meson... IIRC Thomas was planning to submit a patch getting rid of it independently... > The last one we probably want to keep somehow; it would need a bit of fiddly > work. A colleague just finished that bit. Probably can be improved further, but it works now... > From 049b34b6a8dd949f0eb7987cad65f6682a6ec786 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Wed, 11 May 2022 09:06:13 +0200 > Subject: [PATCH 3/9] meson: prereq: Refactor dtrace postprocessing make rules > > Move the dtrace postprocessing sed commands into a separate file so > that it can be shared by meson. Also split the rule into two for > proper dependency declaration. Hm. Using sed may be problematic on windows... > From fad02f1fb71ec8c64e47e5031726ffbee4a1dd84 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Wed, 11 May 2022 09:53:01 +0200 > Subject: [PATCH 7/9] meson: Add system-tzdata option > > --- > meson.build | 3 +++ > meson_options.txt | 3 +++ > 2 files changed, 6 insertions(+) > > diff --git a/meson.build b/meson.build > index 7c9c6e7f23..b33a51a35d 100644 > --- a/meson.build > +++ b/meson.build > @@ -246,6 +246,9 @@ cdata.set('RELSEG_SIZE', get_option('segsize') * 131072) > cdata.set('DEF_PGPORT', get_option('pgport')) > cdata.set_quoted('DEF_PGPORT_STR', get_option('pgport')) > cdata.set_quoted('PG_KRB_SRVNAM', 'postgres') > +if get_option('system-tzdata') != '' > + cdata.set_quoted('SYSTEMTZDIR', get_option('system-tzdata')) > +endif This doesn't quite seem sufficient - we also need to prevent building & installing our own timezone data... Greetings, Andres Freund
Re: Declaration fixes
On 2022-05-12 11:38:39 -0700, Andres Freund wrote: > On 2022-05-12 13:18:05 -0400, Tom Lane wrote: > > Andres Freund writes: > > > I think the attached patches are all a good idea and trivial enought that > > > I > > > think we should apply them now. > > > > +1 to all of that. > > Cool. Pushed them.
typedefs.list glitches
I just completed the v15 pre-beta pgindent run. It went reasonably smoothly, but I had to hack up typedefs.list a little bit compared to the version downloaded from the buildfarm. * The buildfarm's list is missing pg_md5_ctx pg_sha1_ctx pg_sha224_ctx pg_sha256_ctx pg_sha384_ctx pg_sha512_ctx which are certainly used, but only in some src/common files that are built only in non-OpenSSL builds. So evidently, every buildfarm member that's contributing to the typedefs list builds with OpenSSL. That wouldn't surprise me, except that my own animal sifaka should be filling that gap. Looking at its latest attempt[1], it seems to be generating an empty list, which I guess means that our recipe for extracting typedefs doesn't work on macOS/arm64. I shall investigate. * The buildfarm's list includes "value_type", which is surely not typedef'd anywhere in our code, and that is messing up some formatting involving JsonIsPredicate.value_type. I suppose that is coming from some system header where it is a typedef on some machines (komodoensis and lorikeet report it, which seems like an odd pairing). I think the best thing to do here is rename that field while we still can, perhaps to item_type. Thoughts? regards, tom lane [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2022-05-11%2020%3A21%3A15
Re: typedefs.list glitches
I wrote: > every buildfarm member that's contributing to the typedefs list > builds with OpenSSL. That wouldn't surprise me, except that > my own animal sifaka should be filling that gap. Looking at > its latest attempt[1], it seems to be generating an empty list, > which I guess means that our recipe for extracting typedefs > doesn't work on macOS/arm64. I shall investigate. Found it. Current macOS produces $ objdump -W /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/objdump: error: unknown argument '-W' where last year's vintage produced $ objdump -W objdump: Unknown command line argument '-W'. Try: '/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/objdump --help' objdump: Did you mean '-C'? This confuses run_build.pl into taking the "Linux and sometimes windows" code path instead of the $using_osx one. I think simplest fix is to move the $using_osx branch ahead of the heuristic ones, as attached. regards, tom lane --- run_build.pl.orig 2022-01-29 10:18:03.0 -0500 +++ run_build.pl 2022-05-12 16:59:58.0 -0400 @@ -2163,7 +2163,32 @@ sub find_typedefs next if $bin =~ m!bin/(ipcclean|pltcl_)!; next unless -f $bin; next if -l $bin;# ignore symlinks to plain files (e.g. postmaster) - if (@err == 1) # Linux and sometimes windows + if ($using_osx) + { + # no run_log due to redirections. + @dumpout = + `dwarfdump $bin 2>/dev/null | egrep -A2 TAG_typedef 2>/dev/null`; + foreach (@dumpout) + { +## no critic (RegularExpressions::ProhibitCaptureWithoutTest) +@flds = split; +if (@flds == 3) +{ + # old format + next unless ($flds[0] eq "AT_name("); + next unless ($flds[1] =~ m/^"(.*)"$/); + $syms{$1} = 1; +} +elsif (@flds == 2) +{ + # new format + next unless ($flds[0] eq "DW_AT_name"); + next unless ($flds[1] =~ m/^\("(.*)"\)$/); + $syms{$1} = 1; +} + } + } + elsif (@err == 1) # Linux and sometimes windows { my $cmd = "$objdump -Wi $bin 2>/dev/null | " . "egrep -A3 DW_TAG_typedef 2>/dev/null"; @@ -2194,31 +2219,6 @@ sub find_typedefs $syms{ $flds[-1] } = 1; } } - elsif ($using_osx) - { - # no run_log due to redirections. - @dumpout = - `dwarfdump $bin 2>/dev/null | egrep -A2 TAG_typedef 2>/dev/null`; - foreach (@dumpout) - { -## no critic (RegularExpressions::ProhibitCaptureWithoutTest) -@flds = split; -if (@flds == 3) -{ - # old format - next unless ($flds[0] eq "AT_name("); - next unless ($flds[1] =~ m/^"(.*)"$/); - $syms{$1} = 1; -} -elsif (@flds == 2) -{ - # new format - next unless ($flds[0] eq "DW_AT_name"); - next unless ($flds[1] =~ m/^\("(.*)"\)$/); - $syms{$1} = 1; -} - } - } else { # no run_log due to redirections.
Re: Comments on Custom RMGRs
On Thu, 12 May 2022 at 04:40, Andres Freund wrote: > > On 2022-05-11 09:39:48 -0700, Jeff Davis wrote: > > On Wed, 2022-05-11 at 15:24 +0100, Simon Riggs wrote: > > > [PATCH: rmgr_001.v1.patch] > > > > > > [PATCH: rmgr_002.v1.patch] > > > > Thank you. Both of these look like good ideas, and I will commit them > > in a few days assuming that nobody else sees a problem. > > What exactly is the use case here? Without passing in information about > whether recovery will be performed etc, it's not at all clear how callbacks > could do something useful? Sure, happy to do it that way. [PATCH: rmgr_002.v2.patch] > I don't think we should allocate a bunch of memory contexts to just free them > immediately after? Didn't seem a problem, but I've added code to use the flag requested above. > > > It occurs to me that any use of WAL presumes that Checkpoints exist > > > and do something useful. However, the custom rmgr interface doesn't > > > allow you to specify any actions on checkpoint, so ends up being > > > limited in scope. So I think we also need an rm_checkpoint() call - > > > which would be a no-op for existing rmgrs. > > > [PATCH: rmgr_003.v1.patch] > > > > I also like this idea, but can you describe the intended use case? I > > looked through CheckPointGuts() and I'm not sure what else a custom AM > > might want to do. Maybe sync special files in a way that's not handled > > with RegisterSyncRequest()? > > I'm not happy with the idea of random code being executed in the middle of > CheckPointGuts(), without any documentation of what is legal to do at that > point. The "I'm not happy.." ship has already sailed with pluggable rmgrs. Checkpoints exist for one purpose - as the starting place for recovery. Why would we allow pluggable recovery without *also* allowing pluggable checkpoints? >To actually be useful we'd likely need multiple calls to such an rmgr > callback, with a parameter where in CheckPointGuts() we are. Right now the > sequencing is explicit in CheckPointGuts(), but with the proposed callback, > that'd not be the case anymore. It is useful without the extra complexity you mention. I see multiple uses for the proposed rm_checkpoint() hook and I've been asked multiple times for a checkpoint hook. Any rmgr that services crash recovery for a non-smgr based storage system would need this because the current checkpoint code only handles flushing to disk for smgr-based approaches. That is orthogonal to other code during checkpoint, so it stands alone quite well. -- Simon Riggshttp://www.EnterpriseDB.com/ rmgr_002.v2.patch Description: Binary data
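To visualize what is being proposed: a rough sketch of a custom rmgr wiring in such a callback. The rm_checkpoint field and its flags signature are assumptions taken from this thread's patches, not a released API; the redo/desc/identify stubs are placeholders.

    #include "postgres.h"
    #include "access/xlog_internal.h"   /* RmgrData */

    static void my_redo(XLogReaderState *record) { /* replay my WAL records */ }
    static void my_desc(StringInfo buf, XLogReaderState *record) { }
    static const char *my_identify(uint8 info) { return "MY_RECORD"; }

    /* Hypothetical callback: flush this AM's non-smgr storage so the
     * checkpoint being written is a valid starting point for recovery. */
    static void
    my_checkpoint(int flags)
    {
        /* e.g. fsync external files, mirroring what CheckPointGuts()
         * already does for smgr-based storage */
    }

    static const RmgrData my_rmgr = {
        .rm_name = "my_rmgr",
        .rm_redo = my_redo,
        .rm_desc = my_desc,
        .rm_identify = my_identify,
        .rm_checkpoint = my_checkpoint,   /* proposed field, not in PG 15 */
    };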
Re: [PATCH] Log details for client certificate failures
On Thu, 2022-05-05 at 15:12 +, Jacob Champion wrote: > On Wed, 2022-05-04 at 15:53 +0200, Peter Eisentraut wrote: > > In terms of aligning what is printed, I meant that pg_stat_ssl uses the > > issuer plus serial number to identify the certificate unambiguously. > > Oh, that's a great idea. I'll do that too. v2 limits the maximum subject length and adds the serial number to the logs. Thanks! --Jacob commit eb5ddc1d8909fe322ddeb1ec4bf9118bb9544667 Author: Jacob Champion Date: Wed May 11 10:33:33 2022 -0700 squash! Log details for client certificate failures - Add a maximum Subject length to prevent malicious client certs from spamming the logs. - Add the certificate serial number to the output, for disambiguation if the Subject gets truncated unhelpfully. diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 1683f9cc9e..3ccc23ba62 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -1094,6 +1094,8 @@ verify_cb(int ok, X509_STORE_CTX *ctx) const char *errstring; X509 *cert; char *certname = NULL; + char *truncated = NULL; + char *serialno = NULL; if (ok) { @@ -1108,14 +1110,56 @@ verify_cb(int ok, X509_STORE_CTX *ctx) cert = X509_STORE_CTX_get_current_cert(ctx); if (cert) + { + size_t namelen; + ASN1_INTEGER *sn; + BIGNUM *b; + + /* +* Get the Subject for logging, but don't let maliciously huge certs +* flood the logs. +* +* Common Names are 64 chars max, so for a common case where the CN is +* the last field, we can still print the longest possible CN with a +* 7-character prefix (".../CN=[64 chars]"), for a reasonable limit of +* 71 characters. +*/ +#define MAXLEN 71 certname = X509_NAME_to_cstring(X509_get_subject_name(cert)); + namelen = strlen(certname); + + if (namelen > MAXLEN) + { + /* +* Keep the end of the Subject, not the beginning, since the most +* specific field is likely to give users the most information. +*/ + truncated = certname + namelen - MAXLEN; + truncated[0] = truncated[1] = truncated[2] = '.'; + } +#undef MAXLEN + + /* +* Pull the serial number, too, in case a Subject is still ambiguous. +* This mirrors be_tls_get_peer_serial(). +*/ + sn = X509_get_serialNumber(cert); + b = ASN1_INTEGER_to_BN(sn, NULL); + serialno = BN_bn2dec(b); + + BN_free(b); + } ereport(COMMERROR, (errmsg("client certificate verification failed at depth %d: %s", depth, errstring), /* only print detail if we have a certificate to print */ -certname && errdetail("failed certificate's subject: %s", certname))); +certname && errdetail("failed certificate had subject '%s', serial number %s", + truncated ? truncated : certname, + serialno ? serialno : _("unknown"; + if (serialno) + OPENSSL_free(serialno); if (certname) pfree(certname); diff --git a/src/test/ssl/conf/client-long.config b/src/test/ssl/conf/client-long.config new file mode 100644 index 00..0e92a8fbfe --- /dev/null +++ b/src/test/ssl/conf/client-long.config @@ -0,0 +1,14 @@ +# An OpenSSL format CSR config file for creating a client certificate with a +# long Subject. 
+ +[ req ] +distinguished_name = req_distinguished_name +prompt = no + +[ req_distinguished_name ] +# Common Names are 64 characters max +CN = ssl-123456789012345678901234567890123456789012345678901234567890 +OU = Some Organizational Unit +O = PostgreSQL Global Development Group + +# no extensions in client certs diff --git a/src/test/ssl/ssl/client-long.crt b/src/test/ssl/ssl/client-long.crt new file mode 100644 index 00..a1db55b5c3 --- /dev/null +++ b/src/test/ssl/ssl/client-long.crt @@ -0,0 +1,69 @@ +Certificate: +Data: +Version: 1 (0x0) +Serial Number: 2315418733629425152 (0x202205121600) +Signature Algorithm: sha256WithRSAEncryption +Issuer: CN = Test CA for PostgreSQL SSL regression test client certs +Validity +Not Before: May 12 21:44:47 202
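For a sense of the resulting server output: given the errmsg/errdetail format strings in the patch above, a rejected certificate would be reported roughly as follows. This is only an illustration — the depth, the OpenSSL error string, and the truncation point are invented, while the serial number is the one from the test certificate:

    LOG:  client certificate verification failed at depth 0: certificate has expired
    DETAIL:  failed certificate had subject '.../CN=ssl-123456789012345678901234567890123456789012345678901234567890', serial number 2315418733629425152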
Re: Comments on Custom RMGRs
Hi, On 2022-05-12 22:26:51 +0100, Simon Riggs wrote: > On Thu, 12 May 2022 at 04:40, Andres Freund wrote: > > I'm not happy with the idea of random code being executed in the middle of > > CheckPointGuts(), without any documentation of what is legal to do at that > > point. > > The "I'm not happy.." ship has already sailed with pluggable rmgrs. I don't agree. The ordering within a checkpoint is a lot more fragile than what you do in an individual redo routine. > Checkpoints exist for one purpose - as the starting place for recovery. > > Why would we allow pluggable recovery without *also* allowing > pluggable checkpoints? Because one can do a lot of stuff with just pluggable WAL records, without integrating into checkpoints? Note that I'm *not* against making checkpoint extensible - I just think it needs a good bit of design work around when the hook is called etc. I definitely think it's too late in the cycle to add checkpoint extensibility now. > > To actually be useful we'd likely need multiple calls to such an rmgr > > callback, with a parameter where in CheckPointGuts() we are. Right now the > > sequencing is explicit in CheckPointGuts(), but with the proposed callback, > > that'd not be the case anymore. > > It is useful without the extra complexity you mention. Shrug. The documentation work definitely is needed. Perhaps we can get away without multiple callbacks within a checkpoint, I think it'll become more apparent when writing information about the precise point in time the checkpoint callback is called. > I see multiple uses for the rm_checkpoint() point proposed and I've > been asked multiple times for a checkpoint hook. Any rmgr that > services crash recovery for a non-smgr based storage system would need > this because the current checkpoint code only handles flushing to disk > for smgr-based approaches. That is orthogonal to other code during > checkpoint, so it stands alone quite well. FWIW, for that there are much bigger problems than checkpoint extensibility. Most importantly there's currently no good way to integrate relation creation / drop with the commit / abort infrastructure... Greetings, Andres Freund
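For readers following this subthread, here is a minimal sketch of the extension side, assuming only the custom-rmgr API that already exists on HEAD (RegisterCustomRmgr() in access/xlog_internal.h). The rm_checkpoint field shown commented out is the rmgr_003 proposal under debate, not a committed interface, and all "my_*" names are placeholders:

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/rmgr.h"
    #include "access/xlog_internal.h"
    #include "lib/stringinfo.h"

    PG_MODULE_MAGIC;

    /* real modules should reserve a custom rmgr ID on the wiki */
    #define MY_RMGR_ID		RM_MIN_CUSTOM_RMGR_ID

    static void
    my_rmgr_redo(XLogReaderState *record)
    {
    	/* replay one of this module's WAL records during crash recovery */
    }

    static void
    my_rmgr_desc(StringInfo buf, XLogReaderState *record)
    {
    	/* append a human-readable description, e.g. for pg_waldump */
    }

    static const char *
    my_rmgr_identify(uint8 info)
    {
    	return "MY_OP";
    }

    static const RmgrData my_rmgr = {
    	.rm_name = "my_rmgr",
    	.rm_redo = my_rmgr_redo,
    	.rm_desc = my_rmgr_desc,
    	.rm_identify = my_rmgr_identify,
    	/* .rm_checkpoint = my_rmgr_checkpoint,	proposed above, not committed */
    };

    void
    _PG_init(void)
    {
    	RegisterCustomRmgr(MY_RMGR_ID, &my_rmgr);
    }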
Re: Crash in new pgstats code
On Thu, May 12, 2022 at 09:33:05AM -0700, Andres Freund wrote: > On 2022-05-12 12:12:59 -0400, Tom Lane wrote: >> But we have not seen any pgstat crashes lately, so I'm content to mark the >> open item as resolved. > > Cool. Okay, thanks for the feedback. I have marked the item as resolved for the time being. Let's revisit it later if necessary. -- Michael signature.asc Description: PGP signature
Re: Mark all GUC variable as PGDLLIMPORT
On Thu, May 12, 2022 at 11:59:49AM -0400, Tom Lane wrote: >> It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / >> variables. What is that supposed to mean? > > Yeah, that's why I removed it in 9a374b77. Perhaps we should try to remove it from the header itself in the long run, even if that's used in a couple of macros? pgbench relies on it to avoid building a debug string for a meta-command, and logging.h has it in those compat macros.. I won't fight on that, though. Anyway, I'll go remove the marking. My apologies for the noise. -- Michael signature.asc Description: PGP signature
Re: make MaxBackends available in _PG_init
On Thu, May 12, 2022 at 06:51:59PM +0300, Anton A. Melnikov wrote: > Maybe remove the first part from the patchset? > Because now the Patch Tester is giving apply error for the first part and > can't test the other. > http://cfbot.cputube.org/patch_38_3614.log Yep. As simple as the attached. -- Michael From d69844e34049d8c38b1e05fb75104469b2d123e1 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 18 Apr 2022 15:25:37 -0700 Subject: [PATCH v9] Add a new shmem_request_hook hook. Currently, preloaded libraries are expected to request additional shared memory and LWLocks in _PG_init(). However, it is not unusual for such requests to depend on MaxBackends, which won't be initialized at that time. Such requests could also depend on GUCs that other modules might change. This introduces a new hook where modules can safely use MaxBackends and GUCs to request additional shared memory and LWLocks. Furthermore, this change restricts requests for shared memory and LWLocks to this hook. Previously, libraries could make requests until the size of the main shared memory segment was calculated. Unlike before, we no longer silently ignore requests received at invalid times. Instead, we FATAL if someone tries to request additional shared memory or LWLocks outside of the hook. Authors: Julien Rouhaud, Nathan Bossart Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13 --- src/include/miscadmin.h | 5 src/backend/postmaster/postmaster.c | 5 src/backend/storage/ipc/ipci.c| 20 +- src/backend/storage/lmgr/lwlock.c | 18 - src/backend/utils/init/miscinit.c | 15 +++ doc/src/sgml/xfunc.sgml | 12 +++-- contrib/pg_prewarm/autoprewarm.c | 17 +++- .../pg_stat_statements/pg_stat_statements.c | 26 +-- src/tools/pgindent/typedefs.list | 1 + 9 files changed, 81 insertions(+), 38 deletions(-) diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h index 53fd168d93..0af130fbc5 100644 --- a/src/include/miscadmin.h +++ b/src/include/miscadmin.h @@ -465,6 +465,7 @@ extern void BaseInit(void); extern PGDLLIMPORT bool IgnoreSystemIndexes; extern PGDLLIMPORT bool process_shared_preload_libraries_in_progress; extern PGDLLIMPORT bool process_shared_preload_libraries_done; +extern PGDLLIMPORT bool process_shmem_requests_in_progress; extern PGDLLIMPORT char *session_preload_libraries_string; extern PGDLLIMPORT char *shared_preload_libraries_string; extern PGDLLIMPORT char *local_preload_libraries_string; @@ -478,9 +479,13 @@ extern bool RecheckDataDirLockFile(void); extern void ValidatePgVersion(const char *path); extern void process_shared_preload_libraries(void); extern void process_session_preload_libraries(void); +extern void process_shmem_requests(void); extern void pg_bindtextdomain(const char *domain); extern bool has_rolreplication(Oid roleid); +typedef void (*shmem_request_hook_type) (void); +extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook; + /* in executor/nodeHash.c */ extern size_t get_hash_memory_limit(void); diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index bf591f048d..3b73e26956 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -1042,6 +1042,11 @@ PostmasterMain(int argc, char *argv[]) */ InitializeMaxBackends(); + /* + * Give preloaded libraries a chance to request additional shared memory.
+ */ + process_shmem_requests(); + /* * Now that loadable modules have had their chance to request additional * shared memory, determine the value of any runtime-computed GUCs that diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c index 75e456360b..26372d95b3 100644 --- a/src/backend/storage/ipc/ipci.c +++ b/src/backend/storage/ipc/ipci.c @@ -55,25 +55,21 @@ int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE; shmem_startup_hook_type shmem_startup_hook = NULL; static Size total_addin_request = 0; -static bool addin_request_allowed = true; - /* * RequestAddinShmemSpace * Request that extra shmem space be allocated for use by * a loadable module. * - * This is only useful if called from the _PG_init hook of a library that - * is loaded into the postmaster via shared_preload_libraries. Once - * shared memory has been allocated, calls will be ignored. (We could - * raise an error, but it seems better to make it a no-op, so that - * libraries containing such calls can be reloaded if needed.) + * This may only be called via the shmem_request_hook of a library that is + * loaded into the postmaster via shared_preload_libraries. Calls from + * elsewhere will fail. */ void RequestAddinShmemSpace(Size size) { - if (IsUnderPostmaster || !addin_request_allowed) - return; /* too late */ + if (!process_shmem_requests_in_progress) +
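For extension authors, the migration this patch implies looks roughly like the following sketch (hook function name, tranche name, and sizes are placeholders; the pattern mirrors what the patch does to pg_stat_statements):

    #include "postgres.h"
    #include "fmgr.h"
    #include "miscadmin.h"
    #include "storage/ipc.h"
    #include "storage/lwlock.h"
    #include "storage/shmem.h"

    PG_MODULE_MAGIC;

    static shmem_request_hook_type prev_shmem_request_hook = NULL;

    static void
    my_shmem_request(void)
    {
    	if (prev_shmem_request_hook)
    		prev_shmem_request_hook();

    	/* MaxBackends is initialized by now, unlike in _PG_init() */
    	RequestAddinShmemSpace(mul_size(MaxBackends, sizeof(int32)));
    	RequestNamedLWLockTranche("my_extension", 1);
    }

    void
    _PG_init(void)
    {
    	if (!process_shared_preload_libraries_in_progress)
    		return;

    	prev_shmem_request_hook = shmem_request_hook;
    	shmem_request_hook = my_shmem_request;
    }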
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022, at 11:22 AM, Bruce Momjian wrote: > On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > > > I looked at that but thought that everyone would already assume we > > skipped replication of empty transactions, and I didn't see much impact > > for the user, so I didn't include it. > > > > It certainly has an impact on heavy workloads that replicate tables with few > > modifications. It receives a high traffic of 'begin' and 'commit' messages > > that > > the previous Postgres versions have to handle (discard). I would classify > > it as > > a performance improvement for logical replication. Don't have a strong > > opinion > > if it should be mentioned or not. > > Oh, so your point is that a transaction that only has SELECT would > previously send an empty transaction? I thought this was only for apps > that create literal empty transactions, which seem rare. No. It should be a write transaction. If you have a replication setup that publish only table foo (that isn't modified often) and most of your workload does not contain table foo, Postgres sends 'begin' and 'commit' messages to subscriber even if there is no change to replicate. Let me show you an example: postgres=# CREATE TABLE foo (a integer primary key, b text); CREATE TABLE postgres=# CREATE TABLE bar (c integer primary key, d text); CREATE TABLE postgres=# CREATE TABLE baz (e integer primary key, f text); CREATE TABLE postgres=# CREATE PUBLICATION pubfoo FOR TABLE foo; CREATE PUBLICATION postgres=# SELECT pg_create_logical_replication_slot('slotfoo', 'pgoutput'); pg_create_logical_replication_slot (slotfoo,0/E709AC50) (1 row) Let's create a transaction without table foo: postgres=# BEGIN; BEGIN postgres=*# INSERT INTO bar (c, d) VALUES(1, 'blah'); INSERT 0 1 postgres=*# INSERT INTO baz (e, f) VALUES(2, 'xpto'); INSERT 0 1 postgres=*# COMMIT; COMMIT As you can see, the replication slot contains messages for that transaction. Although, table bar and baz are NOT published, the begin (B) and commit (C) messages that refers to this transaction are sent to subscriber. postgres=# SELECT chr(get_byte(data, 0)) FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 'proto_version', '1', 'publication_names', 'pubfoo'); chr - B C (2 rows) If you execute another transaction without table foo, there will be another B/C pair. postgres=# DELETE FROM baz WHERE e = 2; DELETE 1 postgres=# SELECT chr(get_byte(data, 0)) FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 'proto_version', '1', 'publication_names', 'pubfoo'); chr - B C B C (4 rows) Let's create a transaction that uses table foo but also table bar: postgres=# BEGIN; BEGIN postgres=*# INSERT INTO foo (a, b) VALUES(100, 'asdf'); INSERT 0 1 postgres=*# INSERT INTO bar (c, d) VALUES(200, 'qwert'); INSERT 0 1 postgres=*# COMMIT; COMMIT In this case, there will be other messages since the publication pubfoo publishes table foo. ('I' means there is an INSERT for table foo). postgres=# SELECT chr(get_byte(data, 0)), length(data) FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 'proto_version', '1', 'publication_names', 'pubfoo'); chr | length -+ B | 21 C | 26 B | 21 C | 26 B | 21 R | 41 I | 25 C | 26 (8 rows) In summary, a logical replication setup sends 47 bytes per skipped transaction. v15 won't send the first 2 B/C pairs. Discussion started here [1]. 
[1] https://postgr.es/m/CAMkU=1yohp9-dv48flosprmqyeyys5zwkazgd41rjr10xin...@mail.gmail.com -- Euler Taveira EDB https://www.enterprisedb.com/
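As a quick way to verify those numbers, the per-message overhead can be aggregated straight from the slot with the same function used above:

    SELECT chr(get_byte(data, 0)) AS msg, count(*) AS n, sum(length(data)) AS bytes
      FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL,
             'proto_version', '1', 'publication_names', 'pubfoo')
     GROUP BY 1;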
Re: gitmaster access
At Thu, 12 May 2022 10:25:03 -0400, Bruce Momjian wrote in > On Thu, May 12, 2022 at 01:54:57PM +0900, Kyotaro Horiguchi wrote: > > At Thu, 12 May 2022 11:44:33 +0900 (JST), Kyotaro Horiguchi > > wrote in > > > At Thu, 12 May 2022 10:34:49 +0900 (JST), Tatsuo Ishii > > > wrote in > > > > Last year we faced a similar problem, namely, a new committer for > > > > pgpool.git could not access the git repository (Permission denied > > > > (publickey)). Magnus kindly advised following and it worked. Hope this > > > > helps. > > > > > > > > > 1. Log into the git server on https://git.postgresql.org/adm/. It > > > > > should be an automatic log in and show the repository. > > > > > 2. *then* go back to the main website and delete the ssh key > > > > > 3. Now add the ssh key again on the main website > > > > > 4. Wait 10-15 minutes and then it should work > > > > > > Thank you for the info, but unfortunately it hasn't worked. > > > I'm going to try a slightly different steps.. > > > > And finally I succeeded to clone from git.postgresql.org and to push a > > commit. > > Sorry, but this has me confused. When I read this, I thought you were > pushing a 'pgsql' core server commit to gitmaster, but that would be > impossible for git.postgresql.org, so where are you pushing to? This > might be part of the confusion Dave was asking about. The repo I mention here is pgtranslate. Since I didn't find clear instructions about how to push to repos other than core, after failing with "git.postgresql.org", I tried "gitmaster.postgresql.org" following the wiki page[1]. I think Dave's first suggestion (use git.postgresql.org) had a point in that gitmaster is dedicated to core committers. But I came to understand that clearly from the later conversations. [1] https://wiki.postgresql.org/wiki/Committing_with_Git regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: First draft of the PG 15 release notes
On Thursday, May 12, 2022 11:10 PM 'Bruce Momjian' wrote: > On Thu, May 12, 2022 at 01:35:39PM +0000, osumi.takami...@fujitsu.com > wrote: > > I'd like to suggest that we mention a new option for subscription > 'disable_on_error'. > > > > > > > https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20 > b5f5ffd6ffd9e33 > > Yes, I missed that one, added: > > > > > > Allow subscribers to stop logical replication application on error > (Osumi Takamichi, Mark Dilger) > > > > This is enabled with the subscriber option "disable_on_error" > and avoids possible infinite loops during stream application. > > Thank you! In this last paragraph, how about replacing "infinite loops" with "infinite error loops"? I think it makes the situation somewhat clearer for readers. Best Regards, Takamichi Osumi
recovery test failure on morepork with timestamp mystery
Andres drew my attention to this [1] build farm failure. Looks like a test I wrote resetting pg_stat_replication_slots is failing: # Failed test 'Check that reset timestamp is later after resetting stats for slot 'test_slot' again.' # at t/006_logical_decoding.pl line 261. # got: 'f' # expected: 't' # Looks like you failed 1 test of 20. [19:59:58] t/006_logical_decoding.pl This is the test code itself: is( $node_primary->safe_psql( 'postgres', qq(SELECT stats_reset > '$reset1'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1') ), qq(t), qq(Check that reset timestamp is later after resetting stats for slot '$stats_test_slot1' again.) ); This is the relevant SQL statement: SELECT stats_reset > '$reset1'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1' When this statement is executed, reset1 is as shown: 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset > '2022-05-12 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = 'test_slot' Note the timestamp of this execution. The stats reset occurred in the past, and as such *must* have come before '2022-05-12 19:59:58.402808+02'::timestamptz. The starred line is where `reset1` is fetched: 2022-05-12 19:59:58.305 CEST [86784:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl * 2022-05-12 19:59:58.306 CEST [86784:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset FROM pg_stat_replication_slots WHERE slot_name = 'test_slot' 2022-05-12 19:59:58.308 CEST [86784:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.003 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.315 CEST [18214:1] [unknown] LOG: connection received: host=[local] 2022-05-12 19:59:58.316 CEST [18214:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl 2022-05-12 19:59:58.317 CEST [18214:3] 006_logical_decoding.pl LOG: statement: SELECT pg_stat_reset_replication_slot(NULL) 2022-05-12 19:59:58.322 CEST [18214:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.007 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.329 CEST [45967:1] [unknown] LOG: connection received: host=[local] 2022-05-12 19:59:58.330 CEST [45967:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl 2022-05-12 19:59:58.331 CEST [45967:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset IS NOT NULL FROM pg_stat_replication_slots WHERE slot_name = 'logical_slot' 2022-05-12 19:59:58.333 CEST [45967:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.003 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.341 CEST [88829:1] [unknown] LOG: connection received: host=[local] 2022-05-12 19:59:58.341 CEST [88829:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset > '2022-05-12 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = 'test_slot' 2022-05-12 19:59:58.344 CEST [88829:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.003 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.350 CEST [50055:4] LOG: received fast shutdown request 2022-05-12 19:59:58.350 CEST [50055:5] LOG: aborting any active transactions 
2022-05-12 19:59:58.352 CEST [50055:6] LOG: background worker "logical replication launcher" (PID 89924) exited with exit code 1 2022-05-12 19:59:58.352 CEST [56213:1] LOG: shutting down 2022-05-12 19:59:58.352 CEST [56213:2] LOG: checkpoint starting: shutdown immediate 2022-05-12 19:59:58.353 CEST [56213:3] LOG: checkpoint complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB 2022-05-12 19:59:58.355 CEST [50055:7] LOG: database system is shut down stats_reset was set in the past, so `reset1` shouldn't be after '2022-05-12 19:59:58.306 CEST'. It looks like the timestamp appearing in the test query would correspond to a time after the database is shut down. - melanie [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=morepork&dt=2022-05-12%2017%3A50%3A47
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 10:52 PM Bruce Momjian wrote: > Okay, I went with: > > Previously, such updates ran delete actions on the source > partition and insert actions on the target partition. PostgreSQL will > now run update actions on the referenced partition root. WFM, thanks. Btw, perhaps the following should be listed under E.1.3.2.1. Logical Replication, not E.1.3.1.1. Partitioning? Remove incorrect duplicate partitions in system view pg_publication_tables (Hou Zhijie) Attached a patch to do so. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Re: recovery test failure on morepork with timestamp mystery
On Fri, May 13, 2022 at 11:43 AM Melanie Plageman wrote: > > Andres drew my attention to this [1] build farm failure. > > Looks like a test I wrote resetting pg_stat_replication_slots is > failing: > > # Failed test 'Check that reset timestamp is later after resetting > stats for slot 'test_slot' again.' > # at t/006_logical_decoding.pl line 261. > # got: 'f' > # expected: 't' > # Looks like you failed 1 test of 20. > [19:59:58] t/006_logical_decoding.pl > > This is the test code itself: > > is( $node_primary->safe_psql( > 'postgres', > qq(SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1') > ), > qq(t), > qq(Check that reset timestamp is later after resetting stats for > slot '$stats_test_slot1' again.) > ); > > This is the relevant SQL statement: > > SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1' > > When this statement is executed, reset1 is as shown: > > 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: > statement: SELECT stats_reset > '2022-05-12 > 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE > slot_name = 'test_slot' > I don't know if this is related, but I noticed that the log timestamp (19:59:58.342) is reporting the $reset1 value (19:59:58.402808). I did not understand how a timestamp saved from the past could be ahead of the timestamp of the log. -- Kind Regards, Peter Smith. Fujitsu Australia
RE: Data is copied twice when specifying both child and parent table in publication
On Thu, May 12, 2022 9:48 AM osumi.takami...@fujitsu.com wrote: > On Wednesday, May 11, 2022 11:33 AM I wrote: > > On Monday, May 9, 2022 10:51 AM wangw.f...@fujitsu.com > > wrote: > > > Attach new patches. > > > The patch for HEAD: > > > 1. Modify the approach. Enhance the API of function > > > pg_get_publication_tables to handle one publication or an array of > > > publications. > > > The patch for REL14: > > > 1. Improve the table sync SQL. [suggestions by Shi yu] > > Hi, thank you for updating the patch ! > > > > Minor comments on your patch for HEAD v2. Thanks for your comments. > > (1) commit message sentence > > > > I suggest below sentence. > > > > Kindly change from > > "... when subscribing to both publications using one subscription, the data > > is > > replicated twice in inital copy" > > to "subscribing to both publications from one subscription causes initial > > copy > > twice". Improved it according to your suggestion. > > (2) unused variable > > > > pg_publication.c: In function ‘pg_get_publication_tables’: > > pg_publication.c:1091:11: warning: unused variable ‘pubname’ > > [-Wunused-variable] > > char*pubname; > > > > We can remove this. Fixed it. > > (3) free of allocated memory > > > > In the pg_get_publication_tables(), > > we don't free 'elems'. Don't we need it ? Improved it according to your suggestion. Freed 'elems'. > > (4) some coding alignments > > > > 4-1. > > > > + List *tables_viaroot = NIL, > > ... > > + *current_table = NIL; > > > > I suggest we can put some variables > > into the condition for the first time call of this function, like > > tables_viaroot and > > current_table. > > When you agree, kindly change it. Improved these according to your suggestions. Also, I put the code for getting publication(s) into the condition for the first time call of this function. > > 4-2. > > > > + /* > > +* Publications support partitioned tables, although > > all changes > > +* are replicated using leaf partition identity and > > schema, so we > > +* only need those. > > +*/ > > + if (publication->alltables) > > + { > > + current_table = > > GetAllTablesPublicationRelations(publication->pubviaroot); > > + } > > > > This is not related to the change itself and now we are inheriting the > > previous > > curly brackets, but I think there's no harm in removing it, since it's only > > for one > > statement. Improved these according to your suggestions. > Hi, > One more thing I'd like to add is that > we don't hit the below code by tests. > In the HEAD v2, we add a new filtering logic in pg_get_publication_tables. > Although I'm not sure if this is related to the bug fix itself, > when we want to include it in this patch, then > I feel it's better to add some simple test for this part, > to cover all the new main paths and check if > new logic works correctly. > > > + /* > +* If a partition table is published in a publication with > viaroot, > +* and its parent or child table is published in another > publication > +* without viaroot. Then we need to move the parent or child > table > +* from tables to tables_viaroot. > +* > +* If all publication(s)'s viaroot are the same, then skip > this part. > +*/ > >if (ancestor_viaroot == ancestor) > + { + tables = > foreach_delete_current(tables, lc2); + change_tables = > list_append_unique_oid(change_tables, + > relid); + } Yes, I agree. But when I was adding the test, I found we could improve this part. So, I removed this part of the code. I also rebased the patch because of a change in HEAD (23e7b38).
Attached the patches (only the patch for HEAD changed): 1. Improve the commit message. [suggestion by Osumi-san] 2. Improve coding alignments and the usage of SRFs. [suggestions by Osumi-san and me] 3. Simplify the modifications in function pg_get_publication_tables. Regards, Wang wei HEAD_v3-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch Description: HEAD_v3-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch REL14_v4-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch Description: REL14_v4-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch
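Going by the HEAD patch's description (and the query fragment quoted elsewhere in this thread), the enhanced function is meant to be called with an array of publication names, along these lines — a sketch of the proposed, not yet committed, interface:

    SELECT * FROM pg_get_publication_tables(array['pub_parent', 'pub_leaf']);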
Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508
Hi, I finally pushed the fix for this. Erik, thanks for the report! And thanks Michael for the ping... On 2022-05-11 20:32:14 -0700, Andres Freund wrote: > On 2022-05-11 15:48:40 +0900, Michael Paquier wrote: > > Ping. It looks annoying to release beta1 with that, as assertions are > > likely going to be enabled in a lot of test builds. FWIW, it's somewhat hard to hit (basically the sender needs to stall while sending out a transaction / network being really slow), so it'd not have been likely to be hit by all that many people. Greetings, Andres Freund
Re: recovery test failure on morepork with timestamp mystery
On Fri, May 13, 2022 at 12:01:09PM +1000, Peter Smith wrote: > I don't know if this is related, but I noticed that the log timestamp > (19:59:58.342) is reporting the $reset1 value (19:59:58.402808). > > I did not understand how a timestamp saved from the past could be > ahead of the timestamp of the log. morepork is not completely in the clear in this area. See the following thread: https://www.postgresql.org/message-id/x+r2vufkzdkcf...@paquier.xyz -- Michael signature.asc Description: PGP signature
Re: recovery test failure on morepork with timestamp mystery
Hi, On 2022-05-12 21:42:43 -0400, Melanie Plageman wrote: > Andres drew my attention to this [1] build farm failure. > > Looks like a test I wrote resetting pg_stat_replication_slots is > failing: > > # Failed test 'Check that reset timestamp is later after resetting > stats for slot 'test_slot' again.' > # at t/006_logical_decoding.pl line 261. > # got: 'f' > # expected: 't' > # Looks like you failed 1 test of 20. > [19:59:58] t/006_logical_decoding.pl > > This is the test code itself: > > is( $node_primary->safe_psql( > 'postgres', > qq(SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1') > ), > qq(t), > qq(Check that reset timestamp is later after resetting stats for > slot '$stats_test_slot1' again.) > ); > > This is the relevant SQL statement: > > SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1' > > When this statement is executed, reset1 is as shown: > > 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: > statement: SELECT stats_reset > '2022-05-12 > 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE > slot_name = 'test_slot' > > Note the timestamp of this execution. The stats reset occurred in the > past, and as such *must* have come before '2022-05-12 > 19:59:58.402808+02'::timestamptz. The timestamp is computed during: > 2022-05-12 19:59:58.317 CEST [18214:3] 006_logical_decoding.pl LOG: > statement: SELECT pg_stat_reset_replication_slot(NULL) One interesting tidbit is that the log timestamps are computed differently (with elog.c:get_formatted_log_time()) than the reset timestamp (with GetCurrentTimestamp()). Both use gettimeofday() internally. I wonder if there's a chance that somehow openbsd ends up with more usecs than "fit" in a second in the result of gettimeofday()? The elog.c case would truncate everything above a second away afaics: /* 'paste' milliseconds into place... */ sprintf(msbuf, ".%03d", (int) (saved_timeval.tv_usec / 1000)); memcpy(formatted_log_time + 19, msbuf, 4); whereas GetCurrentTimestamp() would add them to the timestamp: result = (TimestampTz) tp.tv_sec - ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY); result = (result * USECS_PER_SEC) + tp.tv_usec; Greetings, Andres Freund
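To spell the hypothesis out with concrete (invented) numbers: if gettimeofday() handed back a reading whose tv_usec carries a whole extra second, the two consumers would place it in different seconds —

    /* hypothetical overflowed reading; 1652378397 is 2022-05-12 19:59:57 CEST */
    struct timeval tp = {.tv_sec = 1652378397, .tv_usec = 1402808};

    /* GetCurrentTimestamp() folds the excess microseconds forward: */
    TimestampTz result = (TimestampTz) tp.tv_sec -
    	((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY);
    result = (result * USECS_PER_SEC) + tp.tv_usec;	/* -> 19:59:58.402808 */

    /*
     * get_formatted_log_time() instead formats tv_sec (still second :57)
     * and only pastes truncated milliseconds over it, so the very same
     * reading would be logged inside second 19:59:57.
     */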
Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes
On Thu, May 12, 2022 at 4:57 PM Thomas Munro wrote: > On Thu, May 12, 2022 at 3:13 PM Thomas Munro wrote: > > error running SQL: 'psql::1: ERROR: source database > > "conflict_db_template" is being accessed by other users > > DETAIL: There is 1 other session using the database.' > > Oh, for this one I think it may just be that the autovacuum worker > with PID 23757 took longer to exit than the 5 seconds > CountOtherDBBackends() is prepared to wait, after sending it SIGTERM. In this test, autovacuum_naptime is set to 1s (per Andres, AV was implicated when he first saw the problem with pg_upgrade, hence desire to crank it up). That's not necessary: commenting out the active line in ProcessBarrierSmgrRelease() shows that the tests reliably reproduce data corruption without it. Let's just take that out. As for skink failing, the timeout was hard coded 300s for the whole test, but apparently that wasn't enough under valgrind. Let's use the standard PostgreSQL::Test::Utils::timeout_default (180s usually), but reset it for each query we send. See attached. From ffcb61004fd06c9b2db56c7fe045c7c726d67a72 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 13 May 2022 13:40:03 +1200 Subject: [PATCH] Fix slow animal timeouts in 032_relfilenode_reuse.pl. Per BF animal chipmunk: CREATE DATABASE could apparently fail due to an AV process being in the template database and not quitting fast enough for the 5 second timeout in CountOtherDBBackends(). The test script had autovacuum_naptime=1s to encourage more activity that opens fds, but that wasn't strictly necessary for this test. Take it out. Per BF animal skink: the test had a global 300s timeout, but apparently that was not enough under valgrind. Use the standard timeout PostgreSQL::Test::Utils::timeout_default, but reset it for each query we run. Discussion: --- src/test/recovery/t/032_relfilenode_reuse.pl | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/src/test/recovery/t/032_relfilenode_reuse.pl b/src/test/recovery/t/032_relfilenode_reuse.pl index ac9340b7dd..5a6a759aa5 100644 --- a/src/test/recovery/t/032_relfilenode_reuse.pl +++ b/src/test/recovery/t/032_relfilenode_reuse.pl @@ -14,7 +14,6 @@ log_connections=on # to avoid "repairing" corruption full_page_writes=off log_min_messages=debug2 -autovacuum_naptime=1s shared_buffers=1MB ]); $node_primary->start; @@ -28,11 +27,8 @@ $node_standby->init_from_backup($node_primary, $backup_name, has_streaming => 1); $node_standby->start; -# To avoid hanging while expecting some specific input from a psql -# instance being driven by us, add a timeout high enough that it -# should never trigger even on very slow machines, unless something -# is really wrong. -my $psql_timeout = IPC::Run::timer(300); +# We'll reset this timeout for each individual query we run. +my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default); my %psql_primary = (stdin => '', stdout => '', stderr => ''); $psql_primary{run} = IPC::Run::start( @@ -202,6 +198,9 @@ sub send_query_and_wait my ($psql, $query, $untl) = @_; my $ret; + $psql_timeout->reset(); + $psql_timeout->start(); + # send query $$psql{stdin} .= $query; $$psql{stdin} .= "\n"; -- 2.36.0
Re: First draft of the PG 15 release notes
On Fri, May 13, 2022 at 7:19 AM Amit Langote wrote: > > On Thu, May 12, 2022 at 10:52 PM Bruce Momjian wrote: > > Okay, I went with: > > > > Previously, such updates ran delete actions on the source > > partition and insert actions on the target partition. PostgreSQL > > will > > now run update actions on the referenced partition root. > > WFM, thanks. > > Btw, perhaps the following should be listed under E.1.3.2.1. Logical > Replication, not E.1.3.1.1. Partitioning? > Right. > > > > > Remove incorrect duplicate partitions in system view > pg_publication_tables (Hou Zhijie) > > > > Attached a patch to do so. > I don't see any attachment. -- With Regards, Amit Kapila.
Re: First draft of the PG 15 release notes
On Fri, May 13, 2022 at 11:44 AM Amit Kapila wrote: > On Fri, May 13, 2022 at 7:19 AM Amit Langote wrote: > > > > On Thu, May 12, 2022 at 10:52 PM Bruce Momjian wrote: > > > Okay, I went with: > > > > > > Previously, such updates ran delete actions on the source > > > partition and insert actions on the target partition. PostgreSQL > > > will > > > now run update actions on the referenced partition root. > > > > WFM, thanks. > > > > Btw, perhaps the following should be listed under E.1.3.2.1. Logical > > Replication, not E.1.3.1.1. Partitioning? > > > > Right. > > > > > > > > > > > Remove incorrect duplicate partitions in system view > > pg_publication_tables (Hou Zhijie) > > > > > > > > Attached a patch to do so. > > > > I don't see any attachment. Oops, attached this time. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com move-logirep-item.diff Description: Binary data
Re: First draft of the PG 15 release notes
On Fri, May 13, 2022 at 6:02 AM Euler Taveira wrote: > > On Thu, May 12, 2022, at 11:22 AM, Bruce Momjian wrote: > > On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > > > I looked at that but thought that everyone would already assume we > > skipped replication of empty transactions, and I didn't see much impact > > for the user, so I didn't include it. > > > > It certainly has an impact on heavy workloads that replicate tables with few > > modifications. It receives a high traffic of 'begin' and 'commit' messages > > that > > the previous Postgres versions have to handle (discard). I would classify > > it as > > a performance improvement for logical replication. Don't have a strong > > opinion > > if it should be mentioned or not. > > Oh, so your point is that a transaction that only has SELECT would > previously send an empty transaction? I thought this was only for apps > that create literal empty transactions, which seem rare. > > No. It should be a write transaction. If you have a replication setup that > publish only table foo (that isn't modified often) and most of your > workload does not contain table foo, Postgres sends 'begin' and 'commit' > messages to subscriber even if there is no change to replicate. > It reduces network traffic and improves performance by 3-14% on simple tests [1] like the one shown by Euler. I see a value in adding this as for the workloads where it hits, it seems more than 99% of network traffic [2] is due to these empty messages. [1] - https://www.postgresql.org/message-id/OSZPR01MB63105A71CFAA46F5BD7C9D7CFD1E9%40OSZPR01MB6310.jpnprd01.prod.outlook.com [2] - https://www.postgresql.org/message-id/CAMkU=1yohp9-dv48flosprmqyeyys5zwkazgd41rjr10xin...@mail.gmail.com -- With Regards, Amit Kapila.
Re: recovery test failure on morepork with timestamp mystery
Hi, On 2022-05-12 19:14:13 -0700, Andres Freund wrote: > On 2022-05-12 21:42:43 -0400, Melanie Plageman wrote: > > Andres drew my attention to this [1] build farm failure. I just saw that there's another recent timestamp related failure: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gombessa&dt=2022-05-13%2002%3A58%3A52 It's pretty odd that we have two timestamp related failures in stats code that hasn't changed in >30 days, both only on openbsd within the last ~10h. There's not been a similar isolationtest failure before. Greetings, Andres Freund
Re: Skipping schema changes in publication
On Thu, May 12, 2022 at 2:24 PM vignesh C wrote: > ... > The attached patch has the implementation for "ALTER PUBLICATION > pubname RESET". This command will reset the publication to default > state which includes resetting the publication options, setting ALL > TABLES option to false and dropping the relations and schemas that are > associated with the publication. > Please see below my review comments for the v1-0001 (RESET) patch == 1. Commit message This patch adds a new RESET option to ALTER PUBLICATION which Wording: "RESET option" -> "RESET clause" ~~~ 2. doc/src/sgml/ref/alter_publication.sgml + + The RESET clause will reset the publication to default + state which includes resetting the publication options, setting + ALL TABLES option to false and drop the + relations and schemas that are associated with the publication. 2a. Wording: "to default state" -> "to the default state" 2b. Wording: "and drop the relations..." -> "and dropping all relations..." ~~~ 3. doc/src/sgml/ref/alter_publication.sgml + invoking user to be a superuser. RESET of publication + requires invoking user to be a superuser. To alter the owner, you must also Wording: "requires invoking user" -> "requires the invoking user" ~~~ 4. doc/src/sgml/ref/alter_publication.sgml - Example @@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales; production_publication: ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production; + + + + Resetting the publication production_publication: + +ALTER PUBLICATION production_publication RESET; Wording: "Resetting the publication" -> "Reset the publication" ~~~ 5. src/backend/commands/publicationcmds.c + /* Check and reset the options */ IMO the code can just reset all these options unconditionally. I did not see the point to check for existing option values first. I feel the simpler code outweighs any negligible performance difference in this case. ~~~ 6. src/backend/commands/publicationcmds.c + /* Check and reset the options */ Somehow it seemed a pity having to hardcode all these default values true/false in multiple places; e.g. the same is already hardcoded in the parse_publication_options function. To avoid multiple hard coded bools you could just call the parse_publication_options with an empty options list. That would set the defaults which you can then use: values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert); Alternatively, maybe there should be #defines to use instead of having the scattered hardcoded bool defaults: #define PUBACTION_DEFAULT_INSERT true #define PUBACTION_DEFAULT_UPDATE true etc ~~~ 7. src/include/nodes/parsenodes.h @@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction { AP_AddObjects, /* add objects to publication */ AP_DropObjects, /* remove objects from publication */ - AP_SetObjects /* set list of objects */ + AP_SetObjects, /* set list of objects */ + AP_ReSetPublication /* reset the publication */ } AlterPublicationAction; Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication" ~~~ 8. src/test/regress/sql/publication.sql 8a. +-- Test for RESET PUBLICATION SUGGESTED +-- Tests for ALTER PUBLICATION ... RESET 8b. +-- Verify that 'ALL TABLES' option is reset SUGGESTED: +-- Verify that 'ALL TABLES' flag is reset 8c. +-- Verify that publish option and publish via root option is reset SUGGESTED: +-- Verify that publish options and publish_via_partition_root option are reset 8d. 
+-- Verify that only superuser can execute RESET publication SUGGESTED +-- Verify that only superuser can reset a publication -- Kind Regards, Peter Smith. Fujitsu Australia
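To make the behavior under review concrete, the proposed (v1, uncommitted) syntax would work like this sketch:

    CREATE PUBLICATION p1 FOR TABLE t1
        WITH (publish = 'insert', publish_via_partition_root = true);
    ALTER PUBLICATION p1 RESET;
    -- p1 now has no tables or schemas, ALL TABLES is false, and the
    -- options are back to defaults: publish = 'insert, update, delete,
    -- truncate', publish_via_partition_root = false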
Re: Comments on Custom RMGRs
On Thu, 2022-05-12 at 22:26 +0100, Simon Riggs wrote: > I see multiple uses for the rm_checkpoint() point proposed and I've > been asked multiple times for a checkpoint hook. Can you elaborate and/or link to a discussion? Regards, Jeff Davis
Correct comment in ProcedureCreate() for pgstat_create_function() call.
Hi, PFA, attached patch to $SUBJECT. -- Regards, Amul Sul EDB: http://www.enterprisedb.com
Re: Correct comment in ProcedureCreate() for pgstat_create_function() call.
Sorry, hit the send button too early :| Attached here !! On Fri, May 13, 2022 at 10:20 AM Amul Sul wrote: > > Hi, > > PFA, attached patch to $SUBJECT. > > -- > Regards, > Amul Sul > EDB: http://www.enterprisedb.com code_comment.patch Description: Binary data
Re: Data is copied twice when specifying both child and parent table in publication
On Fri, May 13, 2022 at 7:32 AM wangw.f...@fujitsu.com wrote: > > Attached the patches (only the patch for HEAD changed): > Few comments: ============= 1. @@ -1135,6 +1172,15 @@ pg_get_publication_tables(PG_FUNCTION_ARGS) if (publication->pubviaroot) tables = filter_partitions(tables); } + pfree(elems); + + /* + * We need an additional filter for this case : A partition table is + * published in a publication with viaroot, and its parent or child + * table is published in another publication without viaroot. In this + * case, we should publish only parent table. + */ + tables = filter_partitions(tables); Do we need to filter partitions twice? Can't we check if any of the publications has 'pubviaroot' option set, if so, call filter_partitions at the end? 2. " FROM pg_class c JOIN pg_namespace n" + " ON n.oid = c.relnamespace," + " LATERAL pg_get_publication_tables(array[ %s ]) gst" Here, it would be better to use gpt as the alias name. 3. } + pfree(elems); + An extra line between these two lines makes it look slightly better. 4. Not able to apply patch cleanly. patching file src/test/subscription/t/013_partition.pl Hunk #1 FAILED at 477. Hunk #2 FAILED at 556. Hunk #3 FAILED at 584. 3 out of 3 hunks FAILED -- saving rejects to file src/test/subscription/t/013_partition.pl.rej patching file src/test/subscription/t/028_row_filter.pl Hunk #1 succeeded at 394 (offset 1 line). Hunk #2 FAILED at 722. 1 out of 2 hunks FAILED -- saving rejects to file src/test/subscription/t/028_row_filter.pl.rej patching file src/test/subscription/t/031_column_list.pl Hunk #1 succeeded at 948 (offset -92 lines). Hunk #2 FAILED at 1050. 1 out of 2 hunks FAILED -- saving rejects to file src/test/subscription/t/031_column_list.pl.rej -- With Regards, Amit Kapila.
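For comment 1, the shape of the suggestion is roughly this fragment (the list name "publications" is illustrative):

    /*
     * Record once whether any publication uses pubviaroot, and call
     * filter_partitions() a single time at the end.
     */
    bool		need_filter = false;
    ListCell   *lc;

    foreach(lc, publications)
    {
    	Publication *pub = (Publication *) lfirst(lc);

    	if (pub->pubviaroot)
    		need_filter = true;
    }

    if (need_filter)
    	tables = filter_partitions(tables);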
RE: bogus: logical replication rows/cols combinations
On Thursday, May 12, 2022 2:45 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 12:55 PM houzj.f...@fujitsu.com > wrote: > > > > On Wednesday, May 11, 2022 11:33 AM Amit Kapila > wrote: > > > > > > Fair enough, then we should go with a simpler approach to detect it > > > in pgoutput.c (get_rel_sync_entry). > > > > OK, here is the patch that try to check column list in that way. The > > patch also check the column list when CREATE SUBSCRIPTION and when > starting initial copy. > > > > Few comments: > === > 1. > initStringInfo(&cmd); > - appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n" > -" FROM pg_catalog.pg_publication_tables t\n" > + appendStringInfoString(&cmd, > +"SELECT DISTINCT t.schemaname,\n" > +"t.tablename,\n" > +"(CASE WHEN (array_length(pr.prattrs, 1) = t.relnatts)\n" > +"THEN NULL ELSE pr.prattrs END)\n" > +" FROM (SELECT P.pubname AS pubname,\n" > +" N.nspname AS schemaname,\n" > +" C.relname AS tablename,\n" > +" P.oid AS pubid,\n" > +" C.oid AS reloid,\n" > +" C.relnatts\n" > +" FROM pg_publication P,\n" > +" LATERAL pg_get_publication_tables(P.pubname) GPT,\n" > +" pg_class C JOIN pg_namespace N\n" > +" ON (N.oid = C.relnamespace)\n" > +" WHERE C.oid = GPT.relid) t\n" > +" LEFT OUTER JOIN pg_publication_rel pr\n" > +" ON (t.pubid = pr.prpubid AND\n" > +"pr.prrelid = reloid)\n" > > Can we modify pg_publication_tables to get the row filter and column list and > then use it directly instead of constructing this query? Agreed. If we can get columnlist and rowfilter from pg_publication_tables, it will be more convenient. And I think users that want to fetch the columnlist and rowfilter of a table can also benefit from it. > 2. > + if (list_member(tablelist, rv)) > + ereport(WARNING, > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot use different column lists for table \"%s.%s\" in > different publications", > +nspname, relname)); > + else > > Can we write comments to explain why we are using WARNING here instead of > ERROR? > > 3. > static void > -pgoutput_ensure_entry_cxt(PGOutputData *data, RelationSyncEntry *entry) > +pgoutput_ensure_entry_cxt(PGOutputData *data, RelationSyncEntry *entry, > + Relation relation) > > What is the need to change this interface as part of this patch? Attached the new version patch, which addresses these comments and updates the document. The 0001 patch extends the view and the 0002 patch adds a restriction for the column list. Best regards, Hou zj 0002-Disallow-combining-publication-when-column-list-is-d.patch Description: 0002-Disallow-combining-publication-when-column-list-is-d.patch 0001-extent-pg_publication_tables.patch Description: 0001-extent-pg_publication_tables.patch
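If pg_publication_tables is extended as agreed, the subscriber-side query could collapse to roughly the following sketch (the attnames/rowfilter column names are assumptions about the 0001 patch):

    SELECT DISTINCT t.schemaname, t.tablename, t.attnames, t.rowfilter
      FROM pg_publication_tables t
     WHERE t.pubname IN ('pub1', 'pub2');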
Re: Privileges on PUBLICATION
Euler Taveira wrote: > On Tue, May 10, 2022, at 5:37 AM, Antonin Houska wrote: > > My understanding is that the rows/columns filtering is a way for the > *publisher* to control which data is available to a particular replica. From > this point of view, the publication privileges would just make the control > complete. > > I agree. IMO it is a new feature. We already require high privilege for > logical > replication. Hence, we expect the replication user to have access to all data. > Unfortunately, nobody mentioned this requirement during the row filter / > column list development; someone could have written a patch for GRANT ... ON > PUBLICATION. I can try that for PG 16, unless someone is already working on it. > I understand your concern. Like I said in my last sentence in the previous > email: it is a fine-grained access control on the publisher. Keep in mind that > it will *only* work for non-superusers (REPLICATION attribute). It is not > exposing something that we didn't expose before. In this particular case, > there > is no mechanism to prevent the subscriber from obtaining data provided by the > various row filters if they know the publication names. We could probably add > a > sentence to "Logical Replication > Security" section: > > There are no privileges for publications. If you have multiple publications in > a > database, a subscription can use all publications available. Attached is my proposal. It tries to be more specific and does not mention the absence of the privileges explicitly. -- Antonin Houska Web: https://www.cybertec-postgresql.com diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml index 1a828e8d2ff..b74ba625649 100644 --- a/doc/src/sgml/ref/create_publication.sgml +++ b/doc/src/sgml/ref/create_publication.sgml @@ -94,6 +94,16 @@ CREATE PUBLICATION name list is specified, it must include the replica identity columns. + + + If you are using the WHERE clause or the column list + to omit some table data from replication for security reasons, + please make sure that the same data is not exposed via other + publications which contain the same table and have a different (or + no) WHERE clause or column list. + + + Only persistent base tables and partitioned tables can be part of a publication. Temporary tables, unlogged tables, foreign tables,
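The exposure the proposed paragraph warns about is easy to construct (all names invented):

    -- the same table behind two publications with different row filters
    CREATE PUBLICATION pub_masked FOR TABLE accounts WHERE (is_public);
    CREATE PUBLICATION pub_full FOR TABLE accounts;

    -- nothing prevents a subscriber that knows the name from using pub_full
    CREATE SUBSCRIPTION sub1
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION pub_full;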
Re: Privileges on PUBLICATION
Peter Eisentraut wrote: > On 10.05.22 10:37, Antonin Houska wrote: > > My understanding is that the rows/columns filtering is a way for the > > *publisher* to control which data is available to particular replica. From > > this point of view, the publication privileges would just make the control > > complete. > > I think privileges on publications would eventually be useful. But are you > arguing that we need them for PG15 to make the new features usable safely? I didn't think that far, but user should be aware of the problem. My proposal of documentation is in https://www.postgresql.org/message-id/5859.1652423797%40antos -- Antonin Houska Web: https://www.cybertec-postgresql.com