Re: gitmaster access
> At Thu, 12 May 2022 14:44:15 +0900 (JST), Tatsuo Ishii > wrote in >> >> Thank you for the info, but unfortunately it hasn't worked. >> >> I'm going to try a slightly different steps.. >> > >> > And finally I succeeded to clone from git.postgresql.org and to push a >> > commit. >> >> Is it git.postgresql.org, not gitmaster.postgresql.org? Interesting... > > git.postgresql.org. I still receive "Permission denied" from > gitmaster. Ok. I learned that only postgresql.git should be accessed from gitmaster.postgresql.org. All other repos should be accessed from git.postgresql.org. BTW, > I'm going to try a slightly different steps.. Can you please tell me what you actually did? I am afraid of facing a similar problem if I want to add another committer to the pgpool2 repo. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: gitmaster access
On Thu, 12 May 2022 at 08:03, Tatsuo Ishii wrote: > > At Thu, 12 May 2022 14:44:15 +0900 (JST), Tatsuo Ishii < > is...@sraoss.co.jp> wrote in > >> >> Thank you for the info, but unfortunately it hasn't worked. > >> >> I'm going to try a slightly different steps.. > >> > > >> > And finally I succeeded to clone from git.postgresql.org and to push > a > >> > commit. > >> > >> Is it git.postgresql.org, not gitmaster.postgresql.org? Interesting... > > > > git.postgresql.org. I still receive "Permission denied" from > > gitmaster. > > Ok. I learned that only postgresql.git should be accessed from > gitmaster.postgresql.org. All other repos should be accessed from > git.postgresql.org. That is correct for PostgreSQL Committers such as yourself. Anyone else can *only* use git.postgresql.org > > BTW, > > I'm going to try a slightly different steps.. > > Can you please tell me What you actually did? I am afraid of facing > similar problem if I want to add another committer to pgpool2 repo. > > Best reagards, > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese:http://www.sraoss.co.jp > -- -- Dave Page https://pgsnake.blogspot.com EDB Postgres https://www.enterprisedb.com
Re: bogus: logical replication rows/cols combinations
On Thu, May 12, 2022 at 12:15 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 12:55 PM houzj.f...@fujitsu.com > wrote: > > > > On Wednesday, May 11, 2022 11:33 AM Amit Kapila > > wrote: > > > > > > Fair enough, then we should go with a simpler approach to detect it in > > > pgoutput.c (get_rel_sync_entry). > > > > OK, here is the patch that try to check column list in that way. The patch > > also > > check the column list when CREATE SUBSCRIPTION and when starting initial > > copy. > > > > Few comments: > === ... One more point, I think we should update the docs for this. -- With Regards, Amit Kapila.
Re: gitmaster access
At Thu, 12 May 2022 16:03:50 +0900 (JST), Tatsuo Ishii wrote in > > At Thu, 12 May 2022 14:44:15 +0900 (JST), Tatsuo Ishii > > wrote in > BTW, > > I'm going to try a slightly different steps.. > > Can you please tell me what you actually did? I am afraid of facing > a similar problem if I want to add another committer to the pgpool2 repo. Cleared the SSH key in user profile. +Reloaded the adm page. +Waited for 20 minutes. Filled in the SSH key in user profile. Reloaded the adm page. Waited for 20 minutes. Ran git clone and it succeeded \o/ I'm not sure the additional steps made a substantial difference, though. Most of the whole steps look like bibbid-babbid-boo to me in the first place.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit
On Wed, May 11, 2022 at 7:39 PM Etsuro Fujita wrote: > On Fri, May 6, 2022 at 7:08 PM Etsuro Fujita wrote: > > > > On Wed, Apr 20, 2022 at 4:55 AM David Zhang > > > > wrote: > > > >> + * remote server in parallel at (sub)transaction end. > > > I noticed that the comment failed to mention that the > > parallel_commit option is disabled by default. Also, I noticed a > > comment above it: > > > > * It's enough to determine this only when making new connection because > > * all the connections to the foreign server whose keep_connections > > option > > * is changed will be closed and re-made later. > > > > This would apply to the parallel_commit option as well. How about > > updating these like the attached? (I simplified the latter comment > > and moved it to a more appropriate place.) > > I’m planning to commit this as a follow-up patch for commit 04e706d42. Done. Best regards, Etsuro Fujita
Re: First draft of the PG 15 release notes
On Wed, May 11, 2022 at 2:02 AM Bruce Momjian wrote: > > On Tue, May 10, 2022 at 04:17:59PM -0400, Jonathan Katz wrote: > > On 5/10/22 11:44 AM, Bruce Momjian wrote: > > > I have completed the first draft of the PG 15 release notes and you can > > > see the results here: > > > > > > https://momjian.us/pgsql_docs/release-15.html > > > > Thanks for pulling this together. > > > > + Allow logical replication to transfer sequence changes > > > > I believe this was reverted in 2c7ea57e5, unless some other parts of this > > work made it in. > > Yes, sorry, I missed that. Oddly, the unlogged sequence patch was > retained, even though there is no value for it on the primary. I > removed the sentence that mentioned that benefit from the release notes > since it doesn't apply to PG 15 anymore. > + Create unlogged sequences and allow them to be skipped in logical replication Is it right to say the second part of the sentence: "allow them to be skipped in logical replication" when we are not replicating them in the first place? -- With Regards, Amit Kapila.
Re: Make relfile tombstone files conditional on WAL level
Hi Dilip, On Fri, Mar 4, 2022 at 11:07 AM Dilip Kumar wrote: > > On Mon, Feb 21, 2022 at 1:21 PM Dilip Kumar wrote: > > > > On Thu, Jan 6, 2022 at 1:43 PM Dilip Kumar wrote: > > > > 2) GetNewRelFileNode() will not loop for checking the file existence > > > and retry with other relfilenode. > > > > > > Open Issues- there are currently 2 open issues in the patch 1) Issue > > as discussed above about removing the loop, so currently in this patch > > the loop is removed. 2) During upgrade from the previous version we > > need to advance the nextrelfilenode to the current relfilenode we are > > setting for the object in order to avoid the conflict. > > > In this version I have fixed both of these issues. Thanks Robert for > suggesting the solution for both of these problems in our offlist > discussion. Basically, for the first problem we can flush the xlog > immediately because we are actually logging the WAL every time after > we allocate 64 relfilenode so this should not have much impact on the > performance and I have added the same in the comments. And during > pg_upgrade, whenever we are assigning the relfilenode as part of the > upgrade we will set that relfilenode + 1 as nextRelFileNode to be > assigned so that we never generate the conflicting relfilenode. > > The only part I do not like in the patch is that before this patch we > could directly access the buftag->rnode. But since now we are not > having directly relfilenode as part of the buffertag and instead of > that we are keeping individual fields (i.e. dbOid, tbsOid and relNode) > in the buffer tag. So if we have to directly get the relfilenode we > need to generate it. However those changes are very limited to just 1 > or 2 file so maybe not that bad. > The v5 patch needs a rebase. Here are a few comments for 0002 that I found while reading it; hope that helps: +/* Number of RelFileNode to prefetch (preallocate) per XLOG write */ +#define VAR_RFN_PREFETCH 8192 + Should it be 64, as per the comment in XLogPutNextRelFileNode for XLogFlush()? --- + /* +* Check for the wraparound for the relnode counter. +* +* XXX Actually the relnode is 56 bits wide so we don't need to worry about +* the wraparound case. +*/ + if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE) Very rare case, should use unlikely()? --- +/* + * Max value of the relfilnode. Relfilenode will be of 56bits wide for more + * details refer comments atop BufferTag. + */ +#define MAX_RELFILENODE ((((uint64) 1) << 56) - 1) Should there be 57-bit shifts here? Instead, I think we should use INT64CONST(0xFFFFFFFFFFFFFF) to be consistent with the PG_*_MAX declarations, thoughts? --- + /* If we run out of logged for use RelNode then we must log more */ + if (ShmemVariableCache->relnodecount == 0) relnodecount might never go below zero, but to be safer it should check <= 0 instead. --- A few typos: Simmialr Simmilar agains idealy Regards, Amul
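To make the two code-level review comments above concrete, here is a rough sketch of how they might look once applied, reconstructed from the quoted patch hunks. The surrounding error handling is an assumption for illustration, not the actual patch; nextRelNode and ShmemVariableCache come from the patch under review.

    /* 56 bits of 1s; the review's suggested spelling, consistent with the
     * PG_*_MAX declarations (equivalent to ((((uint64) 1) << 56) - 1)). */
    #define MAX_RELFILENODE INT64CONST(0x00FFFFFFFFFFFFFF)

    /* Wraparound is essentially impossible with a 56-bit counter, so the
     * review suggests marking the branch unlikely(). */
    if (unlikely(ShmemVariableCache->nextRelNode > MAX_RELFILENODE))
        elog(ERROR, "relfilenode counter wrapped around");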
Reproducible collisions in jsonb_hash
Hello, I've noticed that jsonb_hash_extended(jsonb_v,0) = jsonb_hash_extended(jsonb_build_array(jsonb_v),0) for any jsonb value jsonb_v. AFAICT it happens because when iterating over a jsonb the hash function makes no distinction between raw scalars and arrays (it doesn't inspect v.val.array.rawScalar) https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/utils/adt/jsonb_op.c#L326 Is this an intended behaviour or a bug? Cheers, Valeriy
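For readers who want to reproduce the report, a minimal demonstration from psql (the literal is arbitrary; any jsonb value shows the same equality):

    -- Both calls hash the same despite one input being a raw scalar and
    -- the other a one-element array wrapping it.
    SELECT jsonb_hash_extended('"tom"'::jsonb, 0)                    AS scalar_hash,
           jsonb_hash_extended(jsonb_build_array('"tom"'::jsonb), 0) AS array_hash;
    -- scalar_hash and array_hash come back equal.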
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 2:25 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 2:02 AM Bruce Momjian wrote: > > > > On Tue, May 10, 2022 at 04:17:59PM -0400, Jonathan Katz wrote: > > > On 5/10/22 11:44 AM, Bruce Momjian wrote: > > > > I have completed the first draft of the PG 15 release notes and you can > > > > see the results here: > > > > > > > > https://momjian.us/pgsql_docs/release-15.html > > > > > > Thanks for pulling this together. > > > > > > + Allow logical replication to transfer sequence changes > > > > > > I believe this was reverted in 2c7ea57e5, unless some other parts of this > > > work made it in. > > > > Yes, sorry, I missed that. Oddly, the unlogged sequence patch was > > retained, even though there is no value for it on the primary. I > > removed the sentence that mentioned that benefit from the release notes > > since it doesn't apply to PG 15 anymore. > > > > + Create unlogged sequences and allow them to be skipped in logical > replication > > Is it right to say the second part of the sentence: "allow them to be > skipped in logical replication" when we are not replicating them in > the first place? > One more point related to logical replication features: > Add SQL functions to monitor the directory contents of replication slots (Bharath Rupireddy) Specifically, the functions are pg_ls_logicalsnapdir(), pg_ls_logicalmapdir(), and pg_ls_replslotdir(). They can be run by members of the predefined pg_monitor role. > This feature is currently under the section "Streaming Replication and Recovery". Shouldn't it be under "Logical Replication"? The function names themselves seem to indicate that they are used for logical replication contents. I think the replication slot-related function would fall under both categories but overall it seems to belong to the "Logical Replication" section. -- With Regards, Amit Kapila.
Re: postgres_fdw "parallel_commit" docs
On Wed, May 11, 2022 at 7:25 PM Etsuro Fujita wrote: > One thing I noticed is this bit: > > -When multiple remote (sub)transactions are involved in a local > -(sub)transaction, by default postgres_fdw commits > -those remote (sub)transactions one by one when the local (sub)transaction > -commits. > -Performance can be improved with the following option: > +When multiple remote transactions or subtransactions are involved in a > +local transaction (or subtransaction) on a foreign server, > +postgres_fdw by default commits those remote > +transactions serially when the local transaction commits. > Performance can be > +improved with the following option: > > I think this might still be a bit confusing. How about rewriting it > to something like this? > > As described in F.38.4. Transaction Management, in postgres_fdw > transactions are managed by creating corresponding remote > transactions, and subtransactions are managed by creating > corresponding remote subtransactions. When multiple remote > transactions are involved in the current local transaction, > postgres_fdw by default commits those remote transactions serially > when the local transaction is committed. When multiple remote > subtransactions are involved in the current local subtransaction, it > by default commits those remote subtransactions serially when the > local subtransaction is committed. Performance can be improved with > the following option: > > It might be a bit redundant to explain the transaction/subtransaction > cases differently, but I think it makes it clear and easy to > understand how they are handled by postgres_fdw by > default. I modified the patch that way. On Wed, May 11, 2022 at 7:29 PM Etsuro Fujita wrote: > On Tue, May 10, 2022 at 12:58 AM Justin Pryzby wrote: > > On Mon, May 09, 2022 at 11:37:35AM -0400, Jonathan S. Katz wrote: > > > - If multiple foreign servers with this option enabled are involved > > > in > > > - a local (sub)transaction, multiple remote (sub)transactions > > > opened on > > > - those foreign servers in the local (sub)transaction are committed > > > in > > > - parallel across those foreign servers when the local > > > (sub)transaction > > > - commits. > > > + If multiple foreign servers with this option enabled have a local > > > + transaction, multiple remote transactions on those foreign > > > servers are > > > + committed in parallel across those foreign servers when the local > > > + transaction is committed. > > > > > > > I think "have a transaction" doesn't sound good, and the old language > > "involved > > in" was better. > > I think so too. I modified the patch to use the old language. Also, I fixed a typo reported by Justin. Attached is an updated patch. I'll commit the patch if no objections. Best regards, Etsuro Fujita parallel-commit-docs-efujita.patch Description: Binary data
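For context while reading the doc changes above, enabling the option looks like the sketch below; the server name is a placeholder.

    -- parallel_commit is a postgres_fdw server-level option, off by default.
    ALTER SERVER foreign_pg OPTIONS (ADD parallel_commit 'true');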
Re: First draft of the PG 15 release notes
On Wed, May 11, 2022 at 1:44 AM Bruce Momjian wrote: > > I have completed the first draft of the PG 15 release notes and you can > see the results here: > > https://momjian.us/pgsql_docs/release-15.html > > The feature count is similar to recent major releases: > > release-10 195 > release-11 185 > release-12 198 > release-13 183 > release-14 229 > --> release-15 186 > > I assume there will be major adjustments in the next few weeks based on > feedback. > I wonder if this is worth mentioning: Skip empty transactions for logical replication. commit d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c https://github.com/postgres/postgres/commit/d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c regards, Ajin Cherian Fujitsu Australia
Re: First draft of the PG 15 release notes
On 5/11/22 10:50 PM, Ian Lawrence Barwick wrote: 2022年5月12日(木) 11:46 Bruce Momjian : On Thu, May 12, 2022 at 11:40:17AM +0900, Ian Lawrence Barwick wrote: Looking at the release notes from the point of view of someone who has maybe not been following the long-running debate on removing exclusive backups: "Important-sounding backup thing is suddenly gone! What was that again? Hmm, can't find anything in the now-current Pg 15 docs [*], do I need to worry about this!?" [*] the backup section has removed all mention of the word "exclusive" https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP versus: "Long-deprecated thing is finally gone, ah OK whatever". I am thinking back here to a point in my working life where the release notes were reviewed (by a team including non-Pg specialists) for potential issues when considering a major upgrade - from experience the more clarity with this kind of change the better so as not to unnecessarily raise alarm bells. Ah, you are right. I thought I had "deprecated" in the text, but I now see I did not, and we do have cases where we mention the deprecated status in previous release notes, so the new text is: Remove long-deprecated exclusive backup mode (David Steele, Nathan Bossart) That works, thanks! A bit late to this conversation, but +1 from me. -- -David da...@pgmasters.net
Re: Privileges on PUBLICATION
On Tue, May 10, 2022, at 5:37 AM, Antonin Houska wrote: > My understanding is that the rows/columns filtering is a way for the > *publisher* to control which data is available to particular replica. From > this point of view, the publication privileges would just make the control > complete. I agree. IMO it is a new feature. We already require high privilege for logical replication. Hence, we expect the replication user to have access to all data. Unfortunately, nobody mentioned this requirement during the row filter / column list development; someone could have written a patch for GRANT ... ON PUBLICATION. I understand your concern. As I said in the last sentence of my previous email: it is fine-grained access control on the publisher. Keep in mind that it will *only* work for non-superusers (REPLICATION attribute). It is not exposing something that we didn't expose before. In this particular case, there is no mechanism to prevent the subscriber from obtaining data provided by the various row filters if they know the publication names. We could probably add a sentence to the "Logical Replication > Security" section: There are no privileges for publications. If you have multiple publications in a database, a subscription can use all publications available. -- Euler Taveira EDB https://www.enterprisedb.com/
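To make the missing feature concrete: the command referred to above might look like the sketch below. This syntax does not exist in PostgreSQL; it only illustrates the kind of patch being discussed, with placeholder names.

    -- Hypothetical syntax, not implemented: fine-grained access to a
    -- publication on the publisher side.
    GRANT USAGE ON PUBLICATION sensitive_pub TO replication_role;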
Re: postgres_fdw "parallel_commit" docs
Hi Etsuro, On 5/12/22 7:26 AM, Etsuro Fujita wrote: I modified the patch to use the old language. Also, I fixed a typo reported by Justin. Attached is an updated patch. I'll commit the patch if no objections. Thanks for reviewing and revising! I think this is much easier to read. I made a few minor copy edits. Please see attached. Thanks, Jonathan diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index b43d0aecba..e35ae4ff72 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -460,10 +460,16 @@ OPTIONS (ADD password_required 'false'); Transaction Management Options -When multiple remote (sub)transactions are involved in a local -(sub)transaction, by default postgres_fdw commits -those remote (sub)transactions one by one when the local (sub)transaction -commits. +As described in the Transaction Management section, in +postgres_fdw transactions are managed by creating +corresponding remote transactions, and subtransactions are managed by +creating corresponding remote subtransactions. When multiple remote +transactions are involved in the current local transaction, by default +postgres_fdw commits those remote transactions +serially when the local transaction is committed. When +multiple remote subtransactions are involved in the current local +subtransaction, by default postgres_fdw commits those +remote subtransactions serially when the local subtransaction is committed. Performance can be improved with the following option: @@ -474,26 +480,24 @@ OPTIONS (ADD password_required 'false'); This option controls whether postgres_fdw commits - remote (sub)transactions opened on a foreign server in a local - (sub)transaction in parallel when the local (sub)transaction commits. - This option can only be specified for foreign servers, not per-table. - The default is false. + in parallel remote transactions opened on a foreign server in a local + transaction when the local transaction is committed. This setting also + applies to remote and local subtransactions. This option can only be + specified for foreign servers, not per-table. The default is + false. - If multiple foreign servers with this option enabled are involved in - a local (sub)transaction, multiple remote (sub)transactions opened on - those foreign servers in the local (sub)transaction are committed in - parallel across those foreign servers when the local (sub)transaction - commits. + If multiple foreign servers with this option enabled are involved in a + local transaction, multiple remote transactions on those foreign + servers are committed in parallel across those foreign servers when + the local transaction is committed. - For a foreign server with this option enabled, if many remote - (sub)transactions are opened on the foreign server in a local - (sub)transaction, this option might increase the remote server's load - when the local (sub)transaction commits, so be careful when using this - option. + When this option is enabled, a foreign server with many remote + transactions may see a negative performance impact when the local + transaction is committed. OpenPGP_signature Description: OpenPGP digital signature
RE: First draft of the PG 15 release notes
On Wednesday, May 11, 2022 12:44 AM Bruce Momjian wrote: > I have completed the first draft of the PG 15 release notes and you can see > the > results here: > > https://momjian.us/pgsql_docs/release-15.html > > The feature count is similar to recent major releases: > > release-10 195 > release-11 185 > release-12 198 > release-13 183 > release-14 229 > --> release-15 186 > > I assume there will be major adjustments in the next few weeks based on > feedback. Hi, I'd like to suggest that we mention a new option for subscription 'disable_on_error'. https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33 Best Regards, Takamichi Osumi
Re: Reproducible collisions in jsonb_hash
On 2022-05-12 Th 07:02, Valeriy Meleshkin wrote: > Hello, > > I've noticed that > > jsonb_hash_extended(jsonb_v,0) = > jsonb_hash_extended(jsonb_build_array(jsonb_v),0) > > for any jsonb value jsonb_v. > > AFAICT it happens because when iterating over a jsonb the hash function makes > no distinction between raw scalars and arrays (it doesn't inspect > v.val.array.rawScalar) > https://github.com/postgres/postgres/blob/27b77ecf9f4d5be211900eda54d8155ada50d696/src/backend/utils/adt/jsonb_op.c#L326 > > Is this an intended behaviour or a bug? > It does look rather like a bug, but I'm unclear about the implications of fixing it. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 02:27:26PM +0900, Amit Langote wrote: > > Yes, this is what I needed to know. The updated text is: > > > > > > > > > > > > Improve foreign key behavior of updates on partitioned tables > > that move rows between partitions (Amit Langote) > > > > > > > > Previously, such updates ran delete actions on the source partition > > and insert actions on the target partition. PostgreSQL will now > > run update actions on the partition root. > > > > > > Looks fine to me. Though I think maybe we should write the last > sentence as "PostgreSQL will now run update actions on the partition > root mentioned in the query" to be less ambiguous about which "root", > because it can also mean the actual root table in the partition tree. > A user may be updating only a particular subtree by mentioning that > subtree's root in the query, for example. Okay, I went with: Previously, such updates ran delete actions on the source partition and insert actions on the target partition. PostgreSQL will now run update actions on the referenced partition root. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
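As a sketch of the behavior the note describes, under assumed table names: before PG 15 a cross-partition row movement fired delete-side foreign key actions, while PG 15 applies update semantics on the referenced root.

    CREATE TABLE parted (id int PRIMARY KEY) PARTITION BY RANGE (id);
    CREATE TABLE parted_lo PARTITION OF parted FOR VALUES FROM (0) TO (10);
    CREATE TABLE parted_hi PARTITION OF parted FOR VALUES FROM (10) TO (20);
    CREATE TABLE referencing (pid int REFERENCES parted ON DELETE CASCADE);
    INSERT INTO parted VALUES (5);
    INSERT INTO referencing VALUES (5);
    -- Moves the row from parted_lo to parted_hi.  Previously this ran as
    -- DELETE + INSERT, so the cascade silently removed the referencing row;
    -- in PG 15 it is treated as an UPDATE on the referenced partition root
    -- (here, failing the default NO ACTION update check instead).
    UPDATE parted SET id = 15 WHERE id = 5;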
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 02:25:36PM +0530, Amit Kapila wrote: > On Wed, May 11, 2022 at 2:02 AM Bruce Momjian wrote: > > Yes, sorry, I missed that. Oddly, the unlogged sequence patch was > > retained, even though there is no value for it on the primary. I > > removed the sentence that mentioned that benefit from the release notes > > since it doesn't apply to PG 15 anymore. > > > > + Create unlogged sequences and allow them to be skipped in logical > replication > > Is it right to say the second part of the sentence: "allow them to be > skipped in logical replication" when we are not replicating them in > the first place? Oops, yeah, that second part was reverted; new text: Allow the creation of unlogged sequences (Peter Eisentraut) -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
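For reference, the PG 15 syntax that remains after the revert, as a minimal sketch with a placeholder name:

    CREATE UNLOGGED SEQUENCE bulk_load_seq;
    -- Existing sequences can be switched either way:
    ALTER SEQUENCE bulk_load_seq SET LOGGED;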
Re: Reproducible collisions in jsonb_hash
Andrew Dunstan writes: > On 2022-05-12 Th 07:02, Valeriy Meleshkin wrote: >> AFAICT it happens because when iterating over a jsonb the hash function >> makes no distinction between raw scalars and arrays (it doesn't inspect >> v.val.array.rawScalar) > It does look rather like a bug, but I'm unclear about the implications > of fixing it. Changing this hash algorithm would break existing hash indexes on jsonb columns. Maybe there aren't any, but even if so I can't get very excited about changing this. Hash algorithms always have collisions, and we have never made any promise that ours are cryptographically strong. regards, tom lane
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 04:40:49PM +0530, Amit Kapila wrote: > One more point related to logical replication features: > > > > Add SQL functions to monitor the directory contents of replication > slots (Bharath Rupireddy) > > Specifically, the functions are pg_ls_logicalsnapdir(), > pg_ls_logicalmapdir(), and pg_ls_replslotdir(). They can be run by > members of the predefined pg_monitor role. > > > > This feature is currently under the section "Streaming Replication and > Recovery". Shouldn't it be under "Logical Replication"? The function > names themselves seem to indicate that they are used for logical > replication contents. I think the replication slot-related function > would fall under both categories but overall it seems to belong to the > "Logical Replication" section. Oh, very good point! I missed that this is logical-slot-only monitoring, so I moved the item to logical replication and changed the description to: Add SQL functions to monitor the directory contents of logical replication slots (Bharath Rupireddy) -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
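Typical usage of the three functions, as a sketch; the slot name is a placeholder, and callers need membership in pg_monitor:

    SELECT * FROM pg_ls_logicalsnapdir();
    SELECT * FROM pg_ls_logicalmapdir();
    SELECT * FROM pg_ls_replslotdir('my_slot');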
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 09:32:48PM +1000, Ajin Cherian wrote: > On Wed, May 11, 2022 at 1:44 AM Bruce Momjian wrote: > > > > I have completed the first draft of the PG 15 release notes and you can > > see the results here: > > > > https://momjian.us/pgsql_docs/release-15.html > > > > The feature count is similar to recent major releases: > > > > release-10 195 > > release-11 185 > > release-12 198 > > release-13 183 > > release-14 229 > > --> release-15 186 > > > > I assume there will be major adjustments in the next few weeks based on > > feedback. > > > > I wonder if this is worth mentioning: > > Skip empty transactions for logical replication. > commit d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c > > https://github.com/postgres/postgres/commit/d5a9d86d8ffcadc52ff3729cd00fbd83bc38643c I looked at that but thought that everyone would already assume we skipped replication of empty transactions, and I didn't see much impact for the user, so I didn't include it. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 01:35:39PM +, osumi.takami...@fujitsu.com wrote: > I'd like to suggest that we mention a new option for subscription > 'disable_on_error'. > > > https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33 Yes, I missed that one, added: Allow subscribers to stop logical replication application on error (Osumi Takamichi, Mark Dilger) This is enabled with the subscriber option "disable_on_error" and avoids possible infinite loops during stream application. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
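A sketch of the new option in use; the connection string and names are placeholders:

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (disable_on_error = true);
    -- On an apply error the subscription is disabled instead of the worker
    -- retrying the failing transaction in a loop.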
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > I looked at that but thought that everyone would already assume we > skipped replication of empty transactions, and I didn't see much impact > for the user, so I didn't include it. It certainly has an impact on heavy workloads that replicate tables with few modifications. The subscriber receives heavy traffic of 'begin' and 'commit' messages that previous Postgres versions have to handle (discard). I would classify it as a performance improvement for logical replication. Don't have a strong opinion if it should be mentioned or not. -- Euler Taveira EDB https://www.enterprisedb.com/
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > I looked at that but thought that everyone would already assume we > skipped replication of empty transactions, and I didn't see much impact > for the user, so I didn't include it. > > It certainly has an impact on heavy workloads that replicate tables with few > modifications. It receives a high traffic of 'begin' and 'commit' messages > that > the previous Postgres versions have to handle (discard). I would classify it > as > a performance improvement for logical replication. Don't have a strong opinion > if it should be mentioned or not. Oh, so your point is that a transaction that only has SELECT would previously send an empty transaction? I thought this was only for apps that create literal empty transactions, which seem rare. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: gitmaster access
On Thu, May 12, 2022 at 01:54:57PM +0900, Kyotaro Horiguchi wrote: > At Thu, 12 May 2022 11:44:33 +0900 (JST), Kyotaro Horiguchi > wrote in > > At Thu, 12 May 2022 10:34:49 +0900 (JST), Tatsuo Ishii > > wrote in > > > Last year we faced a similar problem, namely, a new committer for > > > pgpool.git could not access the git repository (Permission denied > > > (publickey)). Magnus kindly advised following and it worked. Hope this > > > helps. > > > > > > > 1. Log into the git server on https://git.postgresql.org/adm/. It > > > > should be an automatic log in and show the repository. > > > > 2. *then* go back to the main website and delete the ssh key > > > > 3. Now add the ssh key again on the main website > > > > 4. Wait 10-15 minutes and then it should work > > > > Thank you for the info, but unfortunately it hasn't worked. > > I'm going to try a slightly different steps.. > > And finally I succeeded to clone from git.postgresql.org and to push a > commit. Sorry, but this has me confused. When I read this, I thought you were pushing a 'pgsql' core server commit to gitmaster, but that would be impossible for git.postgresql.org, so where are you pushing to? This might be part of the confusion Dave was asking about. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: Reproducible collisions in jsonb_hash
On Thu, May 12, 2022 at 9:57 AM Tom Lane wrote: > Andrew Dunstan writes: > > On 2022-05-12 Th 07:02, Valeriy Meleshkin wrote: > >> AFAICT it happens because when iterating over a jsonb the hash function > >> makes no distinction between raw scalars and arrays (it doesn't inspect > >> v.val.array.rawScalar) > > > It does look rather like a bug, but I'm unclear about the implications > > of fixing it. > > Changing this hash algorithm would break existing hash indexes on jsonb > columns. Maybe there aren't any, but even if so I can't get very excited > about changing this. Hash algorithms always have collisions, and we have > never made any promise that ours are cryptographically strong. I might be missing something, but I don't know why "cryptographically strong" is the relevant concept here. It seems like the question is how likely it is that this would lead to queries having crappy performance. In general we want hash functions to deliver different values for different inputs so that we give ourselves the best possible chance of spreading keys evenly across buckets. If for example we hashed every possible JSON object to the same constant value, everything we tried to do with this hash function would suck, and the problem would be so severe that I think we'd have to fix it, even though it would mean a compatibility break. Or even if we hashed every JSON integer to the same value, that would be horrible, because it's quite likely that you could have a column full of json objects where ignoring the difference between one integer and another results in a ton of duplicate hash values. Here, that doesn't seem too likely. You could have a column that contains 'tom' and ['tom'] and [['tom']] and [[['tom']]] and so forth and they all get mapped onto the same bucket and you're sad. But probably not. So I'd judge that while it was probably a mistake to make the hash function work this way, it's not likely to cause serious problems, and therefore we ought to maybe leave it alone for now, but add a comment so that if we ever break backward-compatibility for any other reason, we remember to fix this too. IOW, I think I mostly agree with your conclusion, but perhaps not entirely with the reasoning. -- Robert Haas EDB: http://www.enterprisedb.com
Re: First draft of the PG 15 release notes
I wonder if this is worth mentioning: Raise a WARNING for missing publications. commit 8f2e2bbf145384784bad07a96d461c6bbd91f597 https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8f2e2bbf145384784bad07a96d461c6bbd91f597 Regards, Vignesh On Thu, May 12, 2022 at 7:52 PM Bruce Momjian wrote: > > On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > > > I looked at that but thought that everyone would already assume we > > skipped replication of empty transactions, and I didn't see much impact > > for the user, so I didn't include it. > > > > It certainly has an impact on heavy workloads that replicate tables with few > > modifications. It receives a high traffic of 'begin' and 'commit' messages > > that > > the previous Postgres versions have to handle (discard). I would classify > > it as > > a performance improvement for logical replication. Don't have a strong > > opinion > > if it should be mentioned or not. > > Oh, so your point is that a transaction that only has SELECT would > previously send an empty transaction? I thought this was only for apps > that create literal empty transactions, which seem rare. > > -- > Bruce Momjian https://momjian.us > EDB https://enterprisedb.com > > Indecision is a decision. Inaction is an action. Mark Batterson > > >
Re: Reproducible collisions in jsonb_hash
Robert Haas writes: > Here, that doesn't seem too likely. You could have a column that > contains 'tom' and ['tom'] and [['tom']] and [[['tom']]] and so forth > and they all get mapped onto the same bucket and you're sad. But > probably not. Yeah, that might be a more useful way to think about it: is this likely to cause performance-critical collisions in practice? I agree that that doesn't seem like a very likely situation, even given that you might be using json for erratically-structured data. regards, tom lane
Re: Mark all GUC variable as PGDLLIMPORT
Hi, On 2022-05-12 15:15:10 +0900, Michael Paquier wrote: > On Tue, May 10, 2022 at 04:09:47PM +0900, Michael Paquier wrote: > > Well, what about the attached then? While looking at all the headers > > in the tree, I have noticed that __pg_log_level is not marked, but > > we'd better paint a PGDLLIMPORT also for it? This is getting used by > > pgbench for some unlikely() business, as one example. > > After an extra look, PGDLLIMPORT missing from __pg_log_level looks > like an imbroglio between 8ec5694, that has added the marking, and > 9a374b77 that has removed it the same day. All that has been fixed in > 5edeb57. It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / variables. What is that supposed to mean? Greetings, Andres Freund
Re: Crash in new pgstats code
Hi, On 2022-05-11 15:46:13 +0900, Michael Paquier wrote: > On Tue, Apr 19, 2022 at 08:31:05PM +1200, Thomas Munro wrote: > > On Tue, Apr 19, 2022 at 2:50 AM Andres Freund wrote: > > > Kestrel won't go that far back even - I set it up 23 days ago... > > > > Here's a ~6 month old example from mylodon (I can't see much further > > back than that with HTTP requests... I guess BF records are purged?): > > > > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mylodon&dt=2021-10-19%2022%3A57%3A54&stg=recovery-check > > Do we have anything remaining on this thread in light of the upcoming > beta1? One fix has been pushed upthread, but it does not seem we are > completely in the clear either. I don't know what else there is to do, tbh. Greetings, Andres Freund
Re: make MaxBackends available in _PG_init
On 12.05.2022 00:01, Nathan Bossart wrote: On Wed, May 11, 2022 at 04:18:10PM -0400, Robert Haas wrote: OK, I have committed 0001 now. Thanks! Maybe remove the first part from the patchset? Now the Patch Tester is giving an apply error for the first part and can't test the other. http://cfbot.cputube.org/patch_38_3614.log With best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Mark all GUC variable as PGDLLIMPORT
Andres Freund writes: > On 2022-05-12 15:15:10 +0900, Michael Paquier wrote: >> After an extra look, PGDLLIMPORT missing from __pg_log_level looks >> like an imbroglio between 8ec5694, that has added the marking, and >> 9a374b77 that has removed it the same day. All that has been fixed in >> 5edeb57. > It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / > variables. What is that supposed to mean? Yeah, that's why I removed it in 9a374b77. regards, tom lane
Re: Crash in new pgstats code
Andres Freund writes: > On 2022-05-11 15:46:13 +0900, Michael Paquier wrote: >> Do we have anything remaining on this thread in light of the upcoming >> beta1? One fix has been pushed upthread, but it does not seem we are >> completely in the clear either. > I don't know what else there is to do, tbh. Well, it was mostly you expressing misgivings upthread ;-). But we have not seen any pgstat crashes lately, so I'm content to mark the open item as resolved. regards, tom lane
Re: Crash in new pgstats code
Hi, On 2022-05-12 12:12:59 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2022-05-11 15:46:13 +0900, Michael Paquier wrote: > >> Do we have anything remaining on this thread in light of the upcoming > >> beta1? One fix has been pushed upthread, but it does not seem we are > >> completely in the clear either. > > > I don't know what else there is to do, tbh. > > Well, it was mostly you expressing misgivings upthread ;-). Those mostly were about stuff in 14 as well... I guess it'd be good to figure out precisely how the problem was triggered, but without further information I don't quite see how to figure it out... > But we have not seen any pgstat crashes lately, so I'm content to mark the > open item as resolved. Cool. Greetings, Andres Freund
Declaration fixes
Hi, I was experimenting with specifying symbol visibility for functions explicitly, i.e. adding PGDLLIMPORT markers for them, with the goal of getting rid of src/tools/msvc/gendef.pl (and similar AIX stuff). While doing that I compared the set of exported symbols before / after, leading me to find a few pre-existing "issues". I think the attached patches are all a good idea and trivial enough that I think we should apply them now. The changes are sufficiently obvious and/or explained in the commit messages. Comments? Greetings, Andres Freund >From 33e9aa35d0c0153ef62ce647b8cc748079908b07 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Thu, 12 May 2022 09:17:14 -0700 Subject: [PATCH v1 1/4] Add missing 'extern' to function prototypes. Postgres style is to spell out extern. Noticed while scripting adding PGDLLIMPORT markers to functions. Author: Reviewed-By: Discussion: https://postgr.es/m/ Backpatch: --- src/include/access/rewriteheap.h | 2 +- src/include/port/win32_port.h | 26 - src/include/replication/message.h | 6 +- src/include/replication/origin.h | 6 +- src/include/replication/reorderbuffer.h | 70 +++ src/include/replication/worker_internal.h | 4 +- src/include/storage/s_lock.h | 4 +- src/include/tsearch/dicts/regis.h | 8 +-- src/include/utils/attoptcache.h | 2 +- src/include/utils/numeric.h | 2 +- src/include/utils/pgstat_internal.h | 2 +- src/include/utils/spccache.h | 6 +- 12 files changed, 69 insertions(+), 69 deletions(-) diff --git a/src/include/access/rewriteheap.h b/src/include/access/rewriteheap.h index aa5c48f219a..3e27790b3f0 100644 --- a/src/include/access/rewriteheap.h +++ b/src/include/access/rewriteheap.h @@ -52,6 +52,6 @@ typedef struct LogicalRewriteMappingData * --- */ #define LOGICAL_REWRITE_FORMAT "map-%x-%x-%X_%X-%x-%x" -void CheckPointLogicalRewriteHeap(void); +extern void CheckPointLogicalRewriteHeap(void); #endif /* REWRITE_HEAP_H */ diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h index 5045ced91b7..dbbf88f8e8c 100644 --- a/src/include/port/win32_port.h +++ b/src/include/port/win32_port.h @@ -455,10 +455,10 @@ extern PGDLLIMPORT HANDLE pgwin32_initial_signal_pipe; #define UNBLOCKED_SIGNAL_QUEUE() (pg_signal_queue & ~pg_signal_mask) #define PG_SIGNAL_COUNT 32 -void pgwin32_signal_initialize(void); -HANDLE pgwin32_create_signal_listener(pid_t pid); -void pgwin32_dispatch_queued_signals(void); -void pg_queue_signal(int signum); +extern void pgwin32_signal_initialize(void); +extern HANDLE pgwin32_create_signal_listener(pid_t pid); +extern void pgwin32_dispatch_queued_signals(void); +extern void pg_queue_signal(int signum); /* In src/port/kill.c */ #define kill(pid,sig) pgkill(pid,sig) @@ -475,15 +475,15 @@ extern int pgkill(int pid, int sig); #define recv(s, buf, len, flags) pgwin32_recv(s, buf, len, flags) #define send(s, buf, len, flags) pgwin32_send(s, buf, len, flags) -SOCKET pgwin32_socket(int af, int type, int protocol); -int pgwin32_bind(SOCKET s, struct sockaddr *addr, int addrlen); -int pgwin32_listen(SOCKET s, int backlog); -SOCKET pgwin32_accept(SOCKET s, struct sockaddr *addr, int *addrlen); -int pgwin32_connect(SOCKET s, const struct sockaddr *name, int namelen); -int pgwin32_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timeval *timeout); -int pgwin32_recv(SOCKET s, char *buf, int len, int flags); -int pgwin32_send(SOCKET s, const void *buf, int len, int flags); -int pgwin32_waitforsinglesocket(SOCKET s, int what, int timeout); +extern SOCKET pgwin32_socket(int af, int type, int 
protocol); +extern int pgwin32_bind(SOCKET s, struct sockaddr *addr, int addrlen); +extern int pgwin32_listen(SOCKET s, int backlog); +extern SOCKET pgwin32_accept(SOCKET s, struct sockaddr *addr, int *addrlen); +extern int pgwin32_connect(SOCKET s, const struct sockaddr *name, int namelen); +extern int pgwin32_select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timeval *timeout); +extern int pgwin32_recv(SOCKET s, char *buf, int len, int flags); +extern int pgwin32_send(SOCKET s, const void *buf, int len, int flags); +extern int pgwin32_waitforsinglesocket(SOCKET s, int what, int timeout); extern PGDLLIMPORT int pgwin32_noblock; diff --git a/src/include/replication/message.h b/src/include/replication/message.h index b9686c28550..0b396c56693 100644 --- a/src/include/replication/message.h +++ b/src/include/replication/message.h @@ -34,8 +34,8 @@ extern XLogRecPtr LogLogicalMessage(const char *prefix, const char *message, /* RMGR API */ #define XLOG_LOGICAL_MESSAGE 0x00 -void logicalmsg_redo(XLogReaderState *record); -void logicalmsg_desc(StringInfo buf, XLogReaderState *record); -const char *logicalmsg_identify(uint8 info); +extern void logicalmsg_redo(XLogReaderState *record);
Re: Declaration fixes
Andres Freund writes: > I was experimenting with specifying symbol visibility for functions explicitly, > i.e. adding PGDLLIMPORT markers for them, with the goal of getting rid of > src/tools/msvc/gendef.pl (and similar AIX stuff). While doing that I compared > the set of exported symbols before / after, leading me to find a few > pre-existing "issues". > I think the attached patches are all a good idea and trivial enough that I > think we should apply them now. +1 to all of that. Would the changes you're working on result in getting warnings for these sorts of oversights on mainstream compilers? regards, tom lane
Re: Declaration fixes
Hi, On 2022-05-12 13:18:05 -0400, Tom Lane wrote: > Andres Freund writes: > > I was experimenting with specifying symbol visibility for functions > > explicitly, > > i.e. adding PGDLLIMPORT markers for them, with the goal of getting rid of > > src/tools/msvc/gendef.pl (and similar AIX stuff). While doing that I > > compared > > the set of exported symbols before / after, leading me to find a few > > pre-existing "issues". > > > I think the attached patches are all a good idea and trivial enough that I > > think we should apply them now. > > +1 to all of that. Cool. > Would the changes you're working on result in getting > warnings for these sorts of oversights on mainstream compilers? I assume with "mainstream compiler" you basically mean gcc and clang? If so, some oversights would be hard errors, some warnings, some runtime (i.e. symbol not found errors when loading an extension library) but some others unfortunately would continue to only be visible in msvc :(. Greetings, Andres Freund
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 08:05:40PM +0530, vignesh C wrote: > I wonder if this is worth mentioning: > > Raise a WARNING for missing publications. > commit 8f2e2bbf145384784bad07a96d461c6bbd91f597 > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8f2e2bbf145384784bad07a96d461c6bbd91f597 Reading the commit message, it looked like only a warning was being added, which was more of a helpful change rather than something we need to mention. However, if this means you could can now create a subscription for a missing publication that you couldn't do before, it should be added --- I couldn't tell from the patch. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: [RFC] building postgres with meson -v8
Hi, On 2022-05-11 12:18:58 +0200, Peter Eisentraut wrote: > I fixed the Perl detection issue in my macOS environment that I had reported > a while ago. Hm. I wonder if it's right to check with is_file() - perhaps there are platforms that have split off the include directory? > Then I added in support for all configure options that had not been ported > over yet. Some of these are rather trivial. Thanks! Some of these (extra version, krb srvname, ...) I just merged from a colleague. Will look at merging the others. > After that, these configure options don't have an equivalent yet: > > --disable-rpath > --enable-profiling > --disable-thread-safety > --with-libedit-preferred > > The first three overlap with meson built-in functionality, so we would need > to check whether the desired functionality is available somehow. Which builtin functionality does --enable-profiling overlap with? There's a coverage option, perhaps you were thinking of that? I don't think we should add --disable-thread-safety, platforms without it also aren't going to support ninja / meson... IIRC Thomas was planning to submit a patch getting rid of it independently... > The last one we probably want to keep somehow; it would need a bit of fiddly > work. A colleague just finished that bit. Probably can be improved further, but it works now... > From 049b34b6a8dd949f0eb7987cad65f6682a6ec786 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Wed, 11 May 2022 09:06:13 +0200 > Subject: [PATCH 3/9] meson: prereq: Refactor dtrace postprocessing make rules > > Move the dtrace postprocessing sed commands into a separate file so > that it can be shared by meson. Also split the rule into two for > proper dependency declaration. Hm. Using sed may be problematic on windows... > From fad02f1fb71ec8c64e47e5031726ffbee4a1dd84 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Wed, 11 May 2022 09:53:01 +0200 > Subject: [PATCH 7/9] meson: Add system-tzdata option > > --- > meson.build | 3 +++ > meson_options.txt | 3 +++ > 2 files changed, 6 insertions(+) > > diff --git a/meson.build b/meson.build > index 7c9c6e7f23..b33a51a35d 100644 > --- a/meson.build > +++ b/meson.build > @@ -246,6 +246,9 @@ cdata.set('RELSEG_SIZE', get_option('segsize') * 131072) > cdata.set('DEF_PGPORT', get_option('pgport')) > cdata.set_quoted('DEF_PGPORT_STR', get_option('pgport')) > cdata.set_quoted('PG_KRB_SRVNAM', 'postgres') > +if get_option('system-tzdata') != '' > + cdata.set_quoted('SYSTEMTZDIR', get_option('system-tzdata')) > +endif This doesn't quite seem sufficient - we also need to prevent building & installing our own timezone data... Greetings, Andres Freund
Re: Declaration fixes
On 2022-05-12 11:38:39 -0700, Andres Freund wrote: > On 2022-05-12 13:18:05 -0400, Tom Lane wrote: > > Andres Freund writes: > > > I think the attached patches are all a good idea and trivial enought that > > > I > > > think we should apply them now. > > > > +1 to all of that. > > Cool. Pushed them.
typedefs.list glitches
I just completed the v15 pre-beta pgindent run. It went reasonably smoothly, but I had to hack up typedefs.list a little bit compared to the version downloaded from the buildfarm. * The buildfarm's list is missing pg_md5_ctx pg_sha1_ctx pg_sha224_ctx pg_sha256_ctx pg_sha384_ctx pg_sha512_ctx which are certainly used, but only in some src/common files that are built only in non-OpenSSL builds. So evidently, every buildfarm member that's contributing to the typedefs list builds with OpenSSL. That wouldn't surprise me, except that my own animal sifaka should be filling that gap. Looking at its latest attempt[1], it seems to be generating an empty list, which I guess means that our recipe for extracting typedefs doesn't work on macOS/arm64. I shall investigate. * The buildfarm's list includes "value_type", which is surely not typedef'd anywhere in our code, and that is messing up some formatting involving JsonIsPredicate.value_type. I suppose that is coming from some system header where it is a typedef on some machines (komodoensis and lorikeet report it, which seems like an odd pairing). I think the best thing to do here is rename that field while we still can, perhaps to item_type. Thoughts? regards, tom lane [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2022-05-11%2020%3A21%3A15
Re: typedefs.list glitches
I wrote: > every buildfarm member that's contributing to the typedefs list > builds with OpenSSL. That wouldn't surprise me, except that > my own animal sifaka should be filling that gap. Looking at > its latest attempt[1], it seems to be generating an empty list, > which I guess means that our recipe for extracting typedefs > doesn't work on macOS/arm64. I shall investigate. Found it. Current macOS produces $ objdump -W /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/objdump: error: unknown argument '-W' where last year's vintage produced $ objdump -W objdump: Unknown command line argument '-W'. Try: '/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/objdump --help' objdump: Did you mean '-C'? This confuses run_build.pl into taking the "Linux and sometimes windows" code path instead of the $using_osx one. I think simplest fix is to move the $using_osx branch ahead of the heuristic ones, as attached. regards, tom lane --- run_build.pl.orig 2022-01-29 10:18:03.0 -0500 +++ run_build.pl 2022-05-12 16:59:58.0 -0400 @@ -2163,7 +2163,32 @@ sub find_typedefs next if $bin =~ m!bin/(ipcclean|pltcl_)!; next unless -f $bin; next if -l $bin;# ignore symlinks to plain files (e.g. postmaster) - if (@err == 1) # Linux and sometimes windows + if ($using_osx) + { + # no run_log due to redirections. + @dumpout = + `dwarfdump $bin 2>/dev/null | egrep -A2 TAG_typedef 2>/dev/null`; + foreach (@dumpout) + { +## no critic (RegularExpressions::ProhibitCaptureWithoutTest) +@flds = split; +if (@flds == 3) +{ + # old format + next unless ($flds[0] eq "AT_name("); + next unless ($flds[1] =~ m/^"(.*)"$/); + $syms{$1} = 1; +} +elsif (@flds == 2) +{ + # new format + next unless ($flds[0] eq "DW_AT_name"); + next unless ($flds[1] =~ m/^\("(.*)"\)$/); + $syms{$1} = 1; +} + } + } + elsif (@err == 1) # Linux and sometimes windows { my $cmd = "$objdump -Wi $bin 2>/dev/null | " . "egrep -A3 DW_TAG_typedef 2>/dev/null"; @@ -2194,31 +2219,6 @@ sub find_typedefs $syms{ $flds[-1] } = 1; } } - elsif ($using_osx) - { - # no run_log due to redirections. - @dumpout = - `dwarfdump $bin 2>/dev/null | egrep -A2 TAG_typedef 2>/dev/null`; - foreach (@dumpout) - { -## no critic (RegularExpressions::ProhibitCaptureWithoutTest) -@flds = split; -if (@flds == 3) -{ - # old format - next unless ($flds[0] eq "AT_name("); - next unless ($flds[1] =~ m/^"(.*)"$/); - $syms{$1} = 1; -} -elsif (@flds == 2) -{ - # new format - next unless ($flds[0] eq "DW_AT_name"); - next unless ($flds[1] =~ m/^\("(.*)"\)$/); - $syms{$1} = 1; -} - } - } else { # no run_log due to redirections.
Re: Comments on Custom RMGRs
On Thu, 12 May 2022 at 04:40, Andres Freund wrote: > > On 2022-05-11 09:39:48 -0700, Jeff Davis wrote: > > On Wed, 2022-05-11 at 15:24 +0100, Simon Riggs wrote: > > > [PATCH: rmgr_001.v1.patch] > > > > > > [PATCH: rmgr_002.v1.patch] > > > > Thank you. Both of these look like good ideas, and I will commit them > > in a few days assuming that nobody else sees a problem. > > What exactly is the use case here? Without passing in information about > whether recovery will be performed etc, it's not at all clear how callbacks > could do something useful? Sure, happy to do it that way. [PATCH: rmgr_002.v2.patch] > I don't think we should allocate a bunch of memory contexts to just free them > immediately after? Didn't seem a problem, but I've added code to use the flag requested above. > > > It occurs to me that any use of WAL presumes that Checkpoints exist > > > and do something useful. However, the custom rmgr interface doesn't > > > allow you to specify any actions on checkpoint, so ends up being > > > limited in scope. So I think we also need an rm_checkpoint() call - > > > which would be a no-op for existing rmgrs. > > > [PATCH: rmgr_003.v1.patch] > > > > I also like this idea, but can you describe the intended use case? I > > looked through CheckPointGuts() and I'm not sure what else a custom AM > > might want to do. Maybe sync special files in a way that's not handled > > with RegisterSyncRequest()? > > I'm not happy with the idea of random code being executed in the middle of > CheckPointGuts(), without any documentation of what is legal to do at that > point. The "I'm not happy.." ship has already sailed with pluggable rmgrs. Checkpoints exist for one purpose - as the starting place for recovery. Why would we allow pluggable recovery without *also* allowing pluggable checkpoints? >To actually be useful we'd likely need multiple calls to such an rmgr > callback, with a parameter where in CheckPointGuts() we are. Right now the > sequencing is explicit in CheckPointGuts(), but with the proposed callback, > that'd not be the case anymore. It is useful without the extra complexity you mention. I see multiple uses for the proposed rm_checkpoint() hook and I've been asked multiple times for a checkpoint hook. Any rmgr that services crash recovery for a non-smgr based storage system would need this because the current checkpoint code only handles flushing to disk for smgr-based approaches. That is orthogonal to other code during checkpoint, so it stands alone quite well. -- Simon Riggshttp://www.EnterpriseDB.com/ rmgr_002.v2.patch Description: Binary data
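To visualize what is being proposed: a rough sketch of a custom rmgr wiring in such a callback. The rm_checkpoint field and its flags signature are assumptions taken from this thread's patches, not a released API; the redo/desc/identify stubs are placeholders.

    #include "postgres.h"
    #include "access/xlog_internal.h"   /* RmgrData */

    static void my_redo(XLogReaderState *record) { /* replay my WAL records */ }
    static void my_desc(StringInfo buf, XLogReaderState *record) { }
    static const char *my_identify(uint8 info) { return "MY_RECORD"; }

    /* Hypothetical callback: flush this AM's non-smgr storage so the
     * checkpoint being written is a valid starting point for recovery. */
    static void
    my_checkpoint(int flags)
    {
        /* e.g. fsync external files, mirroring what CheckPointGuts()
         * already does for smgr-based storage */
    }

    static const RmgrData my_rmgr = {
        .rm_name = "my_rmgr",
        .rm_redo = my_redo,
        .rm_desc = my_desc,
        .rm_identify = my_identify,
        .rm_checkpoint = my_checkpoint,   /* proposed field, not in PG 15 */
    };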
Re: [PATCH] Log details for client certificate failures
On Thu, 2022-05-05 at 15:12 +, Jacob Champion wrote: > On Wed, 2022-05-04 at 15:53 +0200, Peter Eisentraut wrote: > > In terms of aligning what is printed, I meant that pg_stat_ssl uses the > > issuer plus serial number to identify the certificate unambiguously. > > Oh, that's a great idea. I'll do that too. v2 limits the maximum subject length and adds the serial number to the logs. Thanks! --Jacob commit eb5ddc1d8909fe322ddeb1ec4bf9118bb9544667 Author: Jacob Champion Date: Wed May 11 10:33:33 2022 -0700 squash! Log details for client certificate failures - Add a maximum Subject length to prevent malicious client certs from spamming the logs. - Add the certificate serial number to the output, for disambiguation if the Subject gets truncated unhelpfully. diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 1683f9cc9e..3ccc23ba62 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -1094,6 +1094,8 @@ verify_cb(int ok, X509_STORE_CTX *ctx) const char *errstring; X509 *cert; char *certname = NULL; + char *truncated = NULL; + char *serialno = NULL; if (ok) { @@ -1108,14 +1110,56 @@ verify_cb(int ok, X509_STORE_CTX *ctx) cert = X509_STORE_CTX_get_current_cert(ctx); if (cert) + { + size_t namelen; + ASN1_INTEGER *sn; + BIGNUM *b; + + /* +* Get the Subject for logging, but don't let maliciously huge certs +* flood the logs. +* +* Common Names are 64 chars max, so for a common case where the CN is +* the last field, we can still print the longest possible CN with a +* 7-character prefix (".../CN=[64 chars]"), for a reasonable limit of +* 71 characters. +*/ +#define MAXLEN 71 certname = X509_NAME_to_cstring(X509_get_subject_name(cert)); + namelen = strlen(certname); + + if (namelen > MAXLEN) + { + /* +* Keep the end of the Subject, not the beginning, since the most +* specific field is likely to give users the most information. +*/ + truncated = certname + namelen - MAXLEN; + truncated[0] = truncated[1] = truncated[2] = '.'; + } +#undef MAXLEN + + /* +* Pull the serial number, too, in case a Subject is still ambiguous. +* This mirrors be_tls_get_peer_serial(). +*/ + sn = X509_get_serialNumber(cert); + b = ASN1_INTEGER_to_BN(sn, NULL); + serialno = BN_bn2dec(b); + + BN_free(b); + } ereport(COMMERROR, (errmsg("client certificate verification failed at depth %d: %s", depth, errstring), /* only print detail if we have a certificate to print */ -certname && errdetail("failed certificate's subject: %s", certname))); +certname && errdetail("failed certificate had subject '%s', serial number %s", + truncated ? truncated : certname, + serialno ? serialno : _("unknown"; + if (serialno) + OPENSSL_free(serialno); if (certname) pfree(certname); diff --git a/src/test/ssl/conf/client-long.config b/src/test/ssl/conf/client-long.config new file mode 100644 index 00..0e92a8fbfe --- /dev/null +++ b/src/test/ssl/conf/client-long.config @@ -0,0 +1,14 @@ +# An OpenSSL format CSR config file for creating a client certificate with a +# long Subject. 
+ +[ req ] +distinguished_name = req_distinguished_name +prompt = no + +[ req_distinguished_name ] +# Common Names are 64 characters max +CN = ssl-123456789012345678901234567890123456789012345678901234567890 +OU = Some Organizational Unit +O = PostgreSQL Global Development Group + +# no extensions in client certs diff --git a/src/test/ssl/ssl/client-long.crt b/src/test/ssl/ssl/client-long.crt new file mode 100644 index 00..a1db55b5c3 --- /dev/null +++ b/src/test/ssl/ssl/client-long.crt @@ -0,0 +1,69 @@ +Certificate: +Data: +Version: 1 (0x0) +Serial Number: 2315418733629425152 (0x202205121600) +Signature Algorithm: sha256WithRSAEncryption +Issuer: CN = Test CA for PostgreSQL SSL regression test client certs +Validity +Not Before: May 12 21:44:47 202
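For a sense of the resulting server output: given the errmsg/errdetail format strings in the patch above, a rejected certificate would be reported roughly as follows. This is only an illustration — the depth, the OpenSSL error string, and the truncation point are invented, while the serial number is the one from the test certificate:

    LOG:  client certificate verification failed at depth 0: certificate has expired
    DETAIL:  failed certificate had subject '.../CN=ssl-123456789012345678901234567890123456789012345678901234567890', serial number 2315418733629425152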
Re: Comments on Custom RMGRs
Hi, On 2022-05-12 22:26:51 +0100, Simon Riggs wrote: > On Thu, 12 May 2022 at 04:40, Andres Freund wrote: > > I'm not happy with the idea of random code being executed in the middle of > > CheckPointGuts(), without any documentation of what is legal to do at that > > point. > > The "I'm not happy.." ship has already sailed with pluggable rmgrs. I don't agree. The ordering within a checkpoint is a lot more fragile than what you do in an individual redo routine. > Checkpoints exist for one purpose - as the starting place for recovery. > > Why would we allow pluggable recovery without *also* allowing > pluggable checkpoints? Because one can do a lot of stuff with just pluggable WAL records, without integrating into checkpoints? Note that I'm *not* against making checkpoint extensible - I just think it needs a good bit of design work around when the hook is called etc. I definitely think it's too late in the cycle to add checkpoint extensibility now. > > To actually be useful we'd likely need multiple calls to such an rmgr > > callback, with a parameter where in CheckPointGuts() we are. Right now the > > sequencing is explicit in CheckPointGuts(), but with the proposed callback, > > that'd not be the case anymore. > > It is useful without the extra complexity you mention. Shrug. The documentation work definitely is needed. Perhaps we can get away without multiple callbacks within a checkpoint, I think it'll become more apparent when writing information about the precise point in time the checkpoint callback is called. > I see multiple uses for the rm_checkpoint() point proposed and I've > been asked multiple times for a checkpoint hook. Any rmgr that > services crash recovery for a non-smgr based storage system would need > this because the current checkpoint code only handles flushing to disk > for smgr-based approaches. That is orthogonal to other code during > checkpoint, so it stands alone quite well. FWIW, for that there are much bigger problems than checkpoint extensibility. Most importantly there's currently no good way to integrate relation creation / drop with the commit / abort infrastructure... Greetings, Andres Freund
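For readers following this subthread, here is a minimal sketch of the extension side, assuming only the custom-rmgr API that already exists on HEAD (RegisterCustomRmgr() in access/xlog_internal.h). The rm_checkpoint field shown commented out is the rmgr_003 proposal under debate, not a committed interface, and all "my_*" names are placeholders:

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/rmgr.h"
    #include "access/xlog_internal.h"
    #include "lib/stringinfo.h"

    PG_MODULE_MAGIC;

    /* real modules should reserve a custom rmgr ID on the wiki */
    #define MY_RMGR_ID		RM_MIN_CUSTOM_RMGR_ID

    static void
    my_rmgr_redo(XLogReaderState *record)
    {
    	/* replay one of this module's WAL records during crash recovery */
    }

    static void
    my_rmgr_desc(StringInfo buf, XLogReaderState *record)
    {
    	/* append a human-readable description, e.g. for pg_waldump */
    }

    static const char *
    my_rmgr_identify(uint8 info)
    {
    	return "MY_OP";
    }

    static const RmgrData my_rmgr = {
    	.rm_name = "my_rmgr",
    	.rm_redo = my_rmgr_redo,
    	.rm_desc = my_rmgr_desc,
    	.rm_identify = my_rmgr_identify,
    	/* .rm_checkpoint = my_rmgr_checkpoint,	proposed above, not committed */
    };

    void
    _PG_init(void)
    {
    	RegisterCustomRmgr(MY_RMGR_ID, &my_rmgr);
    }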
Re: Crash in new pgstats code
On Thu, May 12, 2022 at 09:33:05AM -0700, Andres Freund wrote: > On 2022-05-12 12:12:59 -0400, Tom Lane wrote: >> But we have not seen any pgstat crashes lately, so I'm content to mark the >> open item as resolved. > > Cool. Okay, thanks for the feedback. I have marked the item as resolved for the time being. Let's revisit it later if necessary. -- Michael signature.asc Description: PGP signature
Re: Mark all GUC variable as PGDLLIMPORT
On Thu, May 12, 2022 at 11:59:49AM -0400, Tom Lane wrote: >> It seems pretty nonsensical to add PGDLLIMPORT to frontend only headers / >> variables. What is that supposed to mean? > > Yeah, that's why I removed it in 9a374b77. Perhaps we should try to remove it from the header itself in the long run, even if that's used in a couple of macros? pgbench relies on it to avoid building a debug string for a meta-command, and logging.h has it in those compat macros.. I won't fight on that, though. Anyway, I'll go remove the marking. My apologies for the noise. -- Michael signature.asc Description: PGP signature
Re: make MaxBackends available in _PG_init
On Thu, May 12, 2022 at 06:51:59PM +0300, Anton A. Melnikov wrote: > Maybe remove the first part from the patchset? > Because now the Patch Tester is giving apply error for the first part and > can't test the other. > http://cfbot.cputube.org/patch_38_3614.log Yep. As simple as the attached. -- Michael From d69844e34049d8c38b1e05fb75104469b2d123e1 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 18 Apr 2022 15:25:37 -0700 Subject: [PATCH v9] Add a new shmem_request_hook hook. Currently, preloaded libraries are expected to request additional shared memory and LWLocks in _PG_init(). However, it is not unusual for such requests to depend on MaxBackends, which won't be initialized at that time. Such requests could also depend on GUCs that other modules might change. This introduces a new hook where modules can safely use MaxBackends and GUCs to request additional shared memory and LWLocks. Furthermore, this change restricts requests for shared memory and LWLocks to this hook. Previously, libraries could make requests until the size of the main shared memory segment was calculated. Unlike before, we no longer silently ignore requests received at invalid times. Instead, we FATAL if someone tries to request additional shared memory or LWLocks outside of the hook. Authors: Julien Rouhaud, Nathan Bossart Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13 --- src/include/miscadmin.h | 5 src/backend/postmaster/postmaster.c | 5 src/backend/storage/ipc/ipci.c| 20 +- src/backend/storage/lmgr/lwlock.c | 18 - src/backend/utils/init/miscinit.c | 15 +++ doc/src/sgml/xfunc.sgml | 12 +++-- contrib/pg_prewarm/autoprewarm.c | 17 +++- .../pg_stat_statements/pg_stat_statements.c | 26 +-- src/tools/pgindent/typedefs.list | 1 + 9 files changed, 81 insertions(+), 38 deletions(-) diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h index 53fd168d93..0af130fbc5 100644 --- a/src/include/miscadmin.h +++ b/src/include/miscadmin.h @@ -465,6 +465,7 @@ extern void BaseInit(void); extern PGDLLIMPORT bool IgnoreSystemIndexes; extern PGDLLIMPORT bool process_shared_preload_libraries_in_progress; extern PGDLLIMPORT bool process_shared_preload_libraries_done; +extern PGDLLIMPORT bool process_shmem_requests_in_progress; extern PGDLLIMPORT char *session_preload_libraries_string; extern PGDLLIMPORT char *shared_preload_libraries_string; extern PGDLLIMPORT char *local_preload_libraries_string; @@ -478,9 +479,13 @@ extern bool RecheckDataDirLockFile(void); extern void ValidatePgVersion(const char *path); extern void process_shared_preload_libraries(void); extern void process_session_preload_libraries(void); +extern void process_shmem_requests(void); extern void pg_bindtextdomain(const char *domain); extern bool has_rolreplication(Oid roleid); +typedef void (*shmem_request_hook_type) (void); +extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook; + /* in executor/nodeHash.c */ extern size_t get_hash_memory_limit(void); diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index bf591f048d..3b73e26956 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -1042,6 +1042,11 @@ PostmasterMain(int argc, char *argv[]) */ InitializeMaxBackends(); + /* + * Give preloaded libraries a chance to request additional shared memory.
+ */ + process_shmem_requests(); + /* * Now that loadable modules have had their chance to request additional * shared memory, determine the value of any runtime-computed GUCs that diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c index 75e456360b..26372d95b3 100644 --- a/src/backend/storage/ipc/ipci.c +++ b/src/backend/storage/ipc/ipci.c @@ -55,25 +55,21 @@ int shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE; shmem_startup_hook_type shmem_startup_hook = NULL; static Size total_addin_request = 0; -static bool addin_request_allowed = true; - /* * RequestAddinShmemSpace * Request that extra shmem space be allocated for use by * a loadable module. * - * This is only useful if called from the _PG_init hook of a library that - * is loaded into the postmaster via shared_preload_libraries. Once - * shared memory has been allocated, calls will be ignored. (We could - * raise an error, but it seems better to make it a no-op, so that - * libraries containing such calls can be reloaded if needed.) + * This may only be called via the shmem_request_hook of a library that is + * loaded into the postmaster via shared_preload_libraries. Calls from + * elsewhere will fail. */ void RequestAddinShmemSpace(Size size) { - if (IsUnderPostmaster || !addin_request_allowed) - return; /* too late */ + if (!process_shmem_requests_in_progress) +
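For extension authors, the migration this patch implies looks roughly like the following sketch (hook function name, tranche name, and sizes are placeholders; the pattern mirrors what the patch does to pg_stat_statements):

    #include "postgres.h"
    #include "fmgr.h"
    #include "miscadmin.h"
    #include "storage/ipc.h"
    #include "storage/lwlock.h"
    #include "storage/shmem.h"

    PG_MODULE_MAGIC;

    static shmem_request_hook_type prev_shmem_request_hook = NULL;

    static void
    my_shmem_request(void)
    {
    	if (prev_shmem_request_hook)
    		prev_shmem_request_hook();

    	/* MaxBackends is initialized by now, unlike in _PG_init() */
    	RequestAddinShmemSpace(mul_size(MaxBackends, sizeof(int32)));
    	RequestNamedLWLockTranche("my_extension", 1);
    }

    void
    _PG_init(void)
    {
    	if (!process_shared_preload_libraries_in_progress)
    		return;

    	prev_shmem_request_hook = shmem_request_hook;
    	shmem_request_hook = my_shmem_request;
    }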
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022, at 11:22 AM, Bruce Momjian wrote: > On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > > > I looked at that but thought that everyone would already assume we > > skipped replication of empty transactions, and I didn't see much impact > > for the user, so I didn't include it. > > > > It certainly has an impact on heavy workloads that replicate tables with few > > modifications. It receives a high traffic of 'begin' and 'commit' messages > > that > > the previous Postgres versions have to handle (discard). I would classify > > it as > > a performance improvement for logical replication. Don't have a strong > > opinion > > if it should be mentioned or not. > > Oh, so your point is that a transaction that only has SELECT would > previously send an empty transaction? I thought this was only for apps > that create literal empty transactions, which seem rare. No. It should be a write transaction. If you have a replication setup that publish only table foo (that isn't modified often) and most of your workload does not contain table foo, Postgres sends 'begin' and 'commit' messages to subscriber even if there is no change to replicate. Let me show you an example: postgres=# CREATE TABLE foo (a integer primary key, b text); CREATE TABLE postgres=# CREATE TABLE bar (c integer primary key, d text); CREATE TABLE postgres=# CREATE TABLE baz (e integer primary key, f text); CREATE TABLE postgres=# CREATE PUBLICATION pubfoo FOR TABLE foo; CREATE PUBLICATION postgres=# SELECT pg_create_logical_replication_slot('slotfoo', 'pgoutput'); pg_create_logical_replication_slot (slotfoo,0/E709AC50) (1 row) Let's create a transaction without table foo: postgres=# BEGIN; BEGIN postgres=*# INSERT INTO bar (c, d) VALUES(1, 'blah'); INSERT 0 1 postgres=*# INSERT INTO baz (e, f) VALUES(2, 'xpto'); INSERT 0 1 postgres=*# COMMIT; COMMIT As you can see, the replication slot contains messages for that transaction. Although, table bar and baz are NOT published, the begin (B) and commit (C) messages that refers to this transaction are sent to subscriber. postgres=# SELECT chr(get_byte(data, 0)) FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 'proto_version', '1', 'publication_names', 'pubfoo'); chr - B C (2 rows) If you execute another transaction without table foo, there will be another B/C pair. postgres=# DELETE FROM baz WHERE e = 2; DELETE 1 postgres=# SELECT chr(get_byte(data, 0)) FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 'proto_version', '1', 'publication_names', 'pubfoo'); chr - B C B C (4 rows) Let's create a transaction that uses table foo but also table bar: postgres=# BEGIN; BEGIN postgres=*# INSERT INTO foo (a, b) VALUES(100, 'asdf'); INSERT 0 1 postgres=*# INSERT INTO bar (c, d) VALUES(200, 'qwert'); INSERT 0 1 postgres=*# COMMIT; COMMIT In this case, there will be other messages since the publication pubfoo publishes table foo. ('I' means there is an INSERT for table foo). postgres=# SELECT chr(get_byte(data, 0)), length(data) FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL, 'proto_version', '1', 'publication_names', 'pubfoo'); chr | length -+ B | 21 C | 26 B | 21 C | 26 B | 21 R | 41 I | 25 C | 26 (8 rows) In summary, a logical replication setup sends 47 bytes per skipped transaction. v15 won't send the first 2 B/C pairs. Discussion started here [1]. 
[1] https://postgr.es/m/CAMkU=1yohp9-dv48flosprmqyeyys5zwkazgd41rjr10xin...@mail.gmail.com -- Euler Taveira EDB https://www.enterprisedb.com/
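As a quick way to verify those numbers, the per-message overhead can be aggregated straight from the slot with the same function used above:

    SELECT chr(get_byte(data, 0)) AS msg, count(*) AS n, sum(length(data)) AS bytes
      FROM pg_logical_slot_peek_binary_changes('slotfoo', NULL, NULL,
             'proto_version', '1', 'publication_names', 'pubfoo')
     GROUP BY 1;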
Re: gitmaster access
At Thu, 12 May 2022 10:25:03 -0400, Bruce Momjian wrote in > On Thu, May 12, 2022 at 01:54:57PM +0900, Kyotaro Horiguchi wrote: > > At Thu, 12 May 2022 11:44:33 +0900 (JST), Kyotaro Horiguchi > > wrote in > > > At Thu, 12 May 2022 10:34:49 +0900 (JST), Tatsuo Ishii > > > wrote in > > > > Last year we faced a similar problem, namely, a new committer for > > > > pgpool.git could not access the git repository (Permission denied > > > > (publickey)). Magnus kindly advised following and it worked. Hope this > > > > helps. > > > > > > > > > 1. Log into the git server on https://git.postgresql.org/adm/. It > > > > > should be an automatic log in and show the repository. > > > > > 2. *then* go back to the main website and delete the ssh key > > > > > 3. Now add the ssh key again on the main website > > > > > 4. Wait 10-15 minutes and then it should work > > > > > > Thank you for the info, but unfortunately it hasn't worked. > > > I'm going to try a slightly different steps.. > > > > And finally I succeeded to clone from git.postgresql.org and to push a > > commit. > > Sorry, but this has me confused. When I read this, I thought you were > pushing a 'pgsql' core server commit to gitmaster, but that would be > impossible for git.postgresql.org, so where are you pushing to? This > might be part of the confusion Dave was asking about. The repo I mention here is pgtranslate. Since I didn't find clear instructions about how to push to repos other than core, after failing with "git.postgresql.org", I tried "gitmaster.postgresql.org" following the wiki page[1]. I think Dave's first suggestion (use git.postgresql.org) had a point in that gitmaster is dedicated to core committers. But I came to understand that clearly from the later conversations. [1] https://wiki.postgresql.org/wiki/Committing_with_Git regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: First draft of the PG 15 release notes
On Thursday, May 12, 2022 11:10 PM 'Bruce Momjian' wrote: > On Thu, May 12, 2022 at 01:35:39PM +0000, osumi.takami...@fujitsu.com > wrote: > > I'd like to suggest that we mention a new option for subscription > 'disable_on_error'. > > > > > > > https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20 > b5f5ffd6ffd9e33 > > Yes, I missed that one, added: > > > > > > Allow subscribers to stop logical replication application on error > (Osumi Takamichi, Mark Dilger) > > > > This is enabled with the subscriber option "disable_on_error" > and avoids possible infinite loops during stream application. > > Thank you! In this last paragraph, how about replacing "infinite loops" with "infinite error loops"? I think it makes the situation somewhat clearer for readers. Best Regards, Takamichi Osumi
recovery test failure on morepork with timestamp mystery
Andres drew my attention to this [1] build farm failure. Looks like a test I wrote resetting pg_stat_replication_slots is failing: # Failed test 'Check that reset timestamp is later after resetting stats for slot 'test_slot' again.' # at t/006_logical_decoding.pl line 261. # got: 'f' # expected: 't' # Looks like you failed 1 test of 20. [19:59:58] t/006_logical_decoding.pl This is the test code itself: is( $node_primary->safe_psql( 'postgres', qq(SELECT stats_reset > '$reset1'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1') ), qq(t), qq(Check that reset timestamp is later after resetting stats for slot '$stats_test_slot1' again.) ); This is the relevant SQL statement: SELECT stats_reset > '$reset1'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1' When this statement is executed, reset1 is as shown: 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset > '2022-05-12 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = 'test_slot' Note the timestamp of this execution. The stats reset occurred in the past, and as such *must* have come before '2022-05-12 19:59:58.402808+02'::timestamptz. The starred line is where `reset1` is fetched: 2022-05-12 19:59:58.305 CEST [86784:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl * 2022-05-12 19:59:58.306 CEST [86784:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset FROM pg_stat_replication_slots WHERE slot_name = 'test_slot' 2022-05-12 19:59:58.308 CEST [86784:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.003 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.315 CEST [18214:1] [unknown] LOG: connection received: host=[local] 2022-05-12 19:59:58.316 CEST [18214:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl 2022-05-12 19:59:58.317 CEST [18214:3] 006_logical_decoding.pl LOG: statement: SELECT pg_stat_reset_replication_slot(NULL) 2022-05-12 19:59:58.322 CEST [18214:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.007 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.329 CEST [45967:1] [unknown] LOG: connection received: host=[local] 2022-05-12 19:59:58.330 CEST [45967:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl 2022-05-12 19:59:58.331 CEST [45967:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset IS NOT NULL FROM pg_stat_replication_slots WHERE slot_name = 'logical_slot' 2022-05-12 19:59:58.333 CEST [45967:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.003 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.341 CEST [88829:1] [unknown] LOG: connection received: host=[local] 2022-05-12 19:59:58.341 CEST [88829:2] [unknown] LOG: connection authorized: user=pgbf database=postgres application_name=006_logical_decoding.pl 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: statement: SELECT stats_reset > '2022-05-12 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE slot_name = 'test_slot' 2022-05-12 19:59:58.344 CEST [88829:4] 006_logical_decoding.pl LOG: disconnection: session time: 0:00:00.003 user=pgbf database=postgres host=[local] 2022-05-12 19:59:58.350 CEST [50055:4] LOG: received fast shutdown request 2022-05-12 19:59:58.350 CEST [50055:5] LOG: aborting any active transactions 
2022-05-12 19:59:58.352 CEST [50055:6] LOG: background worker "logical replication launcher" (PID 89924) exited with exit code 1 2022-05-12 19:59:58.352 CEST [56213:1] LOG: shutting down 2022-05-12 19:59:58.352 CEST [56213:2] LOG: checkpoint starting: shutdown immediate 2022-05-12 19:59:58.353 CEST [56213:3] LOG: checkpoint complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB 2022-05-12 19:59:58.355 CEST [50055:7] LOG: database system is shut down stats_reset was set in the past, so `reset1` shouldn't be after '2022-05-12 19:59:58.306 CEST'. It looks like the timestamp appearing in the test query would correspond to a time after the database is shut down. - melanie [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=morepork&dt=2022-05-12%2017%3A50%3A47
Re: First draft of the PG 15 release notes
On Thu, May 12, 2022 at 10:52 PM Bruce Momjian wrote: > Okay, I went with: > > Previously, such updates ran delete actions on the source > partition and insert actions on the target partition. PostgreSQL will > now run update actions on the referenced partition root. WFM, thanks. Btw, perhaps the following should be listed under E.1.3.2.1. Logical Replication, not E.1.3.1.1. Partitioning? Remove incorrect duplicate partitions in system view pg_publication_tables (Hou Zhijie) Attached a patch to do so. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Re: recovery test failure on morepork with timestamp mystery
On Fri, May 13, 2022 at 11:43 AM Melanie Plageman wrote: > > Andres drew my attention to this [1] build farm failure. > > Looks like a test I wrote resetting pg_stat_replication_slots is > failing: > > # Failed test 'Check that reset timestamp is later after resetting > stats for slot 'test_slot' again.' > # at t/006_logical_decoding.pl line 261. > # got: 'f' > # expected: 't' > # Looks like you failed 1 test of 20. > [19:59:58] t/006_logical_decoding.pl > > This is the test code itself: > > is( $node_primary->safe_psql( > 'postgres', > qq(SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1') > ), > qq(t), > qq(Check that reset timestamp is later after resetting stats for > slot '$stats_test_slot1' again.) > ); > > This is the relevant SQL statement: > > SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1' > > When this statement is executed, reset1 is as shown: > > 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: > statement: SELECT stats_reset > '2022-05-12 > 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE > slot_name = 'test_slot' > I don't know if this is related, but I noticed that the log timestamp (19:59:58.342) is reporting the $reset1 value (19:59:58.402808). I did not understand how a timestamp saved from the past could be ahead of the timestamp of the log. -- Kind Regards, Peter Smith. Fujitsu Australia
RE: Data is copied twice when specifying both child and parent table in publication
On Thu, May 12, 2022 9:48 AM osumi.takami...@fujitsu.com wrote: > On Wednesday, May 11, 2022 11:33 AM I wrote: > > On Monday, May 9, 2022 10:51 AM wangw.f...@fujitsu.com > > wrote: > > > Attach new patches. > > > The patch for HEAD: > > > 1. Modify the approach. Enhance the API of function > > > pg_get_publication_tables to handle one publication or an array of > > > publications. > > > The patch for REL14: > > > 1. Improve the table sync SQL. [suggestions by Shi yu] > > Hi, thank you for updating the patch ! > > > > Minor comments on your patch for HEAD v2. Thanks for your comments. > > (1) commit message sentence > > > > I suggest below sentence. > > > > Kindly change from > > "... when subscribing to both publications using one subscription, the data > > is > > replicated twice in inital copy" > > to "subscribing to both publications from one subscription causes initial > > copy > > twice". Improved it according to your suggestion. > > (2) unused variable > > > > pg_publication.c: In function ‘pg_get_publication_tables’: > > pg_publication.c:1091:11: warning: unused variable ‘pubname’ > > [-Wunused-variable] > > char*pubname; > > > > We can remove this. Fixed it. > > (3) free of allocated memory > > > > In the pg_get_publication_tables(), > > we don't free 'elems'. Don't we need it ? Improved it according to your suggestion. Freed 'elems'. > > (4) some coding alignments > > > > 4-1. > > > > + List *tables_viaroot = NIL, > > ... > > + *current_table = NIL; > > > > I suggest we can put some variables > > into the condition for the first time call of this function, like > > tables_viaroot and > > current_table. > > When you agree, kindly change it. Improved these according to your suggestions. Also, I put the code for getting publication(s) into the condition for the first time call of this function. > > 4-2. > > > > + /* > > +* Publications support partitioned tables, although > > all changes > > +* are replicated using leaf partition identity and > > schema, so we > > +* only need those. > > +*/ > > + if (publication->alltables) > > + { > > + current_table = > > GetAllTablesPublicationRelations(publication->pubviaroot); > > + } > > > > This is not related to the change itself and now we are inheriting the > > previous > > curly brackets, but I think there's no harm in removing it, since it's only > > for one > > statement. Improved these according to your suggestions. > Hi, > One more thing I'd like to add is that > we don't hit the below code by tests. > In the HEAD v2, we add a new filtering logic in pg_get_publication_tables. > Although I'm not sure if this is related to the bug fix itself, > when we want to include it in this patch, then > I feel it's better to add some simple test for this part, > to cover all the new main paths and check if > new logic works correctly. > > > + /* > +* If a partition table is published in a publication with > viaroot, > +* and its parent or child table is published in another > publication > +* without viaroot. Then we need to move the parent or child > table > +* from tables to tables_viaroot. > +* > +* If all publication(s)'s viaroot are the same, then skip > this part. > +*/ > >if (ancestor_viaroot == ancestor) > + { + tables = > foreach_delete_current(tables, lc2); + change_tables = > list_append_unique_oid(change_tables, + > relid); + } Yes, I agree. But when I was adding the test, I found we could improve this part. So, I removed this part of the code. I also rebased the patch because of a change in HEAD (23e7b38).
Attached the patches (only the patch for HEAD changed): 1. Improve the commit message. [suggestion by Osumi-san] 2. Improve coding alignments and the usage of SRFs. [suggestions by Osumi-san and me] 3. Simplify the modifications in function pg_get_publication_tables. Regards, Wang wei HEAD_v3-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch Description: HEAD_v3-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch REL14_v4-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch Description: REL14_v4-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch
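Going by the HEAD patch's description (and the query fragment quoted elsewhere in this thread), the enhanced function is meant to be called with an array of publication names, along these lines — a sketch of the proposed, not yet committed, interface:

    SELECT * FROM pg_get_publication_tables(array['pub_parent', 'pub_leaf']);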
Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508
Hi, I finally pushed the fix for this. Erik, thanks for the report! And thanks Michael for the ping... On 2022-05-11 20:32:14 -0700, Andres Freund wrote: > On 2022-05-11 15:48:40 +0900, Michael Paquier wrote: > > Ping. It looks annoying to release beta1 with that, as assertions are > > likely going to be enabled in a lot of test builds. FWIW, it's somewhat hard to hit (basically the sender needs to stall while sending out a transaction / network being really slow), so it'd not have been likely to be hit by all that many people. Greetings, Andres Freund
Re: recovery test failure on morepork with timestamp mystery
On Fri, May 13, 2022 at 12:01:09PM +1000, Peter Smith wrote: > I don't know if this is related, but I noticed that the log timestamp > (19:59:58.342) is reporting the $reset1 value (19:59:58.402808). > > I did not understand how a timestamp saved from the past could be > ahead of the timestamp of the log. morepork is not completely in the clear in this area. See the following thread: https://www.postgresql.org/message-id/x+r2vufkzdkcf...@paquier.xyz -- Michael signature.asc Description: PGP signature
Re: recovery test failure on morepork with timestamp mystery
Hi, On 2022-05-12 21:42:43 -0400, Melanie Plageman wrote: > Andres drew my attention to this [1] build farm failure. > > Looks like a test I wrote resetting pg_stat_replication_slots is > failing: > > # Failed test 'Check that reset timestamp is later after resetting > stats for slot 'test_slot' again.' > # at t/006_logical_decoding.pl line 261. > # got: 'f' > # expected: 't' > # Looks like you failed 1 test of 20. > [19:59:58] t/006_logical_decoding.pl > > This is the test code itself: > > is( $node_primary->safe_psql( > 'postgres', > qq(SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1') > ), > qq(t), > qq(Check that reset timestamp is later after resetting stats for > slot '$stats_test_slot1' again.) > ); > > This is the relevant SQL statement: > > SELECT stats_reset > '$reset1'::timestamptz FROM > pg_stat_replication_slots WHERE slot_name = '$stats_test_slot1' > > When this statement is executed, reset1 is as shown: > > 2022-05-12 19:59:58.342 CEST [88829:3] 006_logical_decoding.pl LOG: > statement: SELECT stats_reset > '2022-05-12 > 19:59:58.402808+02'::timestamptz FROM pg_stat_replication_slots WHERE > slot_name = 'test_slot' > > Note the timestamp of this execution. The stats reset occurred in the > past, and as such *must* have come before '2022-05-12 > 19:59:58.402808+02'::timestamptz. The timestamp is computed during: > 2022-05-12 19:59:58.317 CEST [18214:3] 006_logical_decoding.pl LOG: > statement: SELECT pg_stat_reset_replication_slot(NULL) One interesting tidbit is that the log timestamps are computed differently (with elog.c:get_formatted_log_time()) than the reset timestamp (with GetCurrentTimestamp()). Both use gettimeofday() internally. I wonder if there's a chance that somehow openbsd ends up with more usecs than "fit" in a second in the result of gettimeofday()? The elog.c case would truncate everything above a second away afaics: /* 'paste' milliseconds into place... */ sprintf(msbuf, ".%03d", (int) (saved_timeval.tv_usec / 1000)); memcpy(formatted_log_time + 19, msbuf, 4); whereas GetCurrentTimestamp() would add them to the timestamp: result = (TimestampTz) tp.tv_sec - ((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY); result = (result * USECS_PER_SEC) + tp.tv_usec; Greetings, Andres Freund
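To spell the hypothesis out with concrete (invented) numbers: if gettimeofday() handed back a reading whose tv_usec carries a whole extra second, the two consumers would place it in different seconds —

    /* hypothetical overflowed reading; 1652378397 is 2022-05-12 19:59:57 CEST */
    struct timeval tp = {.tv_sec = 1652378397, .tv_usec = 1402808};

    /* GetCurrentTimestamp() folds the excess microseconds forward: */
    TimestampTz result = (TimestampTz) tp.tv_sec -
    	((POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * SECS_PER_DAY);
    result = (result * USECS_PER_SEC) + tp.tv_usec;	/* -> 19:59:58.402808 */

    /*
     * get_formatted_log_time() instead formats tv_sec (still second :57)
     * and only pastes truncated milliseconds over it, so the very same
     * reading would be logged inside second 19:59:57.
     */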
Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes
On Thu, May 12, 2022 at 4:57 PM Thomas Munro wrote: > On Thu, May 12, 2022 at 3:13 PM Thomas Munro wrote: > > error running SQL: 'psql::1: ERROR: source database > > "conflict_db_template" is being accessed by other users > > DETAIL: There is 1 other session using the database.' > > Oh, for this one I think it may just be that the autovacuum worker > with PID 23757 took longer to exit than the 5 seconds > CountOtherDBBackends() is prepared to wait, after sending it SIGTERM. In this test, autovacuum_naptime is set to 1s (per Andres, AV was implicated when he first saw the problem with pg_upgrade, hence desire to crank it up). That's not necessary: commenting out the active line in ProcessBarrierSmgrRelease() shows that the tests reliably reproduce data corruption without it. Let's just take that out. As for skink failing, the timeout was hard coded 300s for the whole test, but apparently that wasn't enough under valgrind. Let's use the standard PostgreSQL::Test::Utils::timeout_default (180s usually), but reset it for each query we send. See attached. From ffcb61004fd06c9b2db56c7fe045c7c726d67a72 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 13 May 2022 13:40:03 +1200 Subject: [PATCH] Fix slow animal timeouts in 032_relfilenode_reuse.pl. Per BF animal chipmunk: CREATE DATABASE could apparently fail due to an AV process being in the template database and not quitting fast enough for the 5 second timeout in CountOtherDBBackends(). The test script had autovacuum_naptime=1s to encourage more activity that opens fds, but that wasn't strictly necessary for this test. Take it out. Per BF animal skink: the test had a global 300s timeout, but apparently that was not enough under valgrind. Use the standard timeout PostgreSQL::Test::Utils::timeout_default, but reset it for each query we run. Discussion: --- src/test/recovery/t/032_relfilenode_reuse.pl | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/src/test/recovery/t/032_relfilenode_reuse.pl b/src/test/recovery/t/032_relfilenode_reuse.pl index ac9340b7dd..5a6a759aa5 100644 --- a/src/test/recovery/t/032_relfilenode_reuse.pl +++ b/src/test/recovery/t/032_relfilenode_reuse.pl @@ -14,7 +14,6 @@ log_connections=on # to avoid "repairing" corruption full_page_writes=off log_min_messages=debug2 -autovacuum_naptime=1s shared_buffers=1MB ]); $node_primary->start; @@ -28,11 +27,8 @@ $node_standby->init_from_backup($node_primary, $backup_name, has_streaming => 1); $node_standby->start; -# To avoid hanging while expecting some specific input from a psql -# instance being driven by us, add a timeout high enough that it -# should never trigger even on very slow machines, unless something -# is really wrong. -my $psql_timeout = IPC::Run::timer(300); +# We'll reset this timeout for each individual query we run. +my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default); my %psql_primary = (stdin => '', stdout => '', stderr => ''); $psql_primary{run} = IPC::Run::start( @@ -202,6 +198,9 @@ sub send_query_and_wait my ($psql, $query, $untl) = @_; my $ret; + $psql_timeout->reset(); + $psql_timeout->start(); + # send query $$psql{stdin} .= $query; $$psql{stdin} .= "\n"; -- 2.36.0
Re: First draft of the PG 15 release notes
On Fri, May 13, 2022 at 7:19 AM Amit Langote wrote: > > On Thu, May 12, 2022 at 10:52 PM Bruce Momjian wrote: > > Okay, I went with: > > > > Previously, such updates ran delete actions on the source > > partition and insert actions on the target partition. PostgreSQL > > will > > now run update actions on the referenced partition root. > > WFM, thanks. > > Btw, perhaps the following should be listed under E.1.3.2.1. Logical > Replication, not E.1.3.1.1. Partitioning? > Right. > > > > > Remove incorrect duplicate partitions in system view > pg_publication_tables (Hou Zhijie) > > > > Attached a patch to do so. > I don't see any attachment. -- With Regards, Amit Kapila.
Re: First draft of the PG 15 release notes
On Fri, May 13, 2022 at 11:44 AM Amit Kapila wrote: > On Fri, May 13, 2022 at 7:19 AM Amit Langote wrote: > > > > On Thu, May 12, 2022 at 10:52 PM Bruce Momjian wrote: > > > Okay, I went with: > > > > > > Previously, such updates ran delete actions on the source > > > partition and insert actions on the target partition. PostgreSQL > > > will > > > now run update actions on the referenced partition root. > > > > WFM, thanks. > > > > Btw, perhaps the following should be listed under E.1.3.2.1. Logical > > Replication, not E.1.3.1.1. Partitioning? > > > > Right. > > > > > > > > > > > Remove incorrect duplicate partitions in system view > > pg_publication_tables (Hou Zhijie) > > > > > > > > Attached a patch to do so. > > > > I don't see any attachment. Oops, attached this time. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com move-logirep-item.diff Description: Binary data
Re: First draft of the PG 15 release notes
On Fri, May 13, 2022 at 6:02 AM Euler Taveira wrote: > > On Thu, May 12, 2022, at 11:22 AM, Bruce Momjian wrote: > > On Thu, May 12, 2022 at 11:12:54AM -0300, Euler Taveira wrote: > OB> On Thu, May 12, 2022, at 11:03 AM, Bruce Momjian wrote: > > > > I looked at that but thought that everyone would already assume we > > skipped replication of empty transactions, and I didn't see much impact > > for the user, so I didn't include it. > > > > It certainly has an impact on heavy workloads that replicate tables with few > > modifications. It receives a high traffic of 'begin' and 'commit' messages > > that > > the previous Postgres versions have to handle (discard). I would classify > > it as > > a performance improvement for logical replication. Don't have a strong > > opinion > > if it should be mentioned or not. > > Oh, so your point is that a transaction that only has SELECT would > previously send an empty transaction? I thought this was only for apps > that create literal empty transactions, which seem rare. > > No. It should be a write transaction. If you have a replication setup that > publish only table foo (that isn't modified often) and most of your > workload does not contain table foo, Postgres sends 'begin' and 'commit' > messages to subscriber even if there is no change to replicate. > It reduces network traffic and improves performance by 3-14% on simple tests [1] like the one shown by Euler. I see a value in adding this as for the workloads where it hits, it seems more than 99% of network traffic [2] is due to these empty messages. [1] - https://www.postgresql.org/message-id/OSZPR01MB63105A71CFAA46F5BD7C9D7CFD1E9%40OSZPR01MB6310.jpnprd01.prod.outlook.com [2] - https://www.postgresql.org/message-id/CAMkU=1yohp9-dv48flosprmqyeyys5zwkazgd41rjr10xin...@mail.gmail.com -- With Regards, Amit Kapila.
Re: recovery test failure on morepork with timestamp mystery
Hi, On 2022-05-12 19:14:13 -0700, Andres Freund wrote: > On 2022-05-12 21:42:43 -0400, Melanie Plageman wrote: > > Andres drew my attention to this [1] build farm failure. I just saw that there's another recent timestamp related failure: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gombessa&dt=2022-05-13%2002%3A58%3A52 It's pretty odd that we have two timestamp related failures in stats code that hasn't changed in >30 days, both only on openbsd within the last ~10h. There's not been a similar isolationtest failure before. Greetings, Andres Freund
Re: Skipping schema changes in publication
On Thu, May 12, 2022 at 2:24 PM vignesh C wrote: > ... > The attached patch has the implementation for "ALTER PUBLICATION > pubname RESET". This command will reset the publication to default > state which includes resetting the publication options, setting ALL > TABLES option to false and dropping the relations and schemas that are > associated with the publication. > Please see below my review comments for the v1-0001 (RESET) patch == 1. Commit message This patch adds a new RESET option to ALTER PUBLICATION which Wording: "RESET option" -> "RESET clause" ~~~ 2. doc/src/sgml/ref/alter_publication.sgml + + The RESET clause will reset the publication to default + state which includes resetting the publication options, setting + ALL TABLES option to false and drop the + relations and schemas that are associated with the publication. 2a. Wording: "to default state" -> "to the default state" 2b. Wording: "and drop the relations..." -> "and dropping all relations..." ~~~ 3. doc/src/sgml/ref/alter_publication.sgml + invoking user to be a superuser. RESET of publication + requires invoking user to be a superuser. To alter the owner, you must also Wording: "requires invoking user" -> "requires the invoking user" ~~~ 4. doc/src/sgml/ref/alter_publication.sgml - Example @@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales; production_publication: ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production; + + + + Resetting the publication production_publication: + +ALTER PUBLICATION production_publication RESET; Wording: "Resetting the publication" -> "Reset the publication" ~~~ 5. src/backend/commands/publicationcmds.c + /* Check and reset the options */ IMO the code can just reset all these options unconditionally. I did not see the point to check for existing option values first. I feel the simpler code outweighs any negligible performance difference in this case. ~~~ 6. src/backend/commands/publicationcmds.c + /* Check and reset the options */ Somehow it seemed a pity having to hardcode all these default values true/false in multiple places; e.g. the same is already hardcoded in the parse_publication_options function. To avoid multiple hard coded bools you could just call the parse_publication_options with an empty options list. That would set the defaults which you can then use: values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert); Alternatively, maybe there should be #defines to use instead of having the scattered hardcoded bool defaults: #define PUBACTION_DEFAULT_INSERT true #define PUBACTION_DEFAULT_UPDATE true etc ~~~ 7. src/include/nodes/parsenodes.h @@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction { AP_AddObjects, /* add objects to publication */ AP_DropObjects, /* remove objects from publication */ - AP_SetObjects /* set list of objects */ + AP_SetObjects, /* set list of objects */ + AP_ReSetPublication /* reset the publication */ } AlterPublicationAction; Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication" ~~~ 8. src/test/regress/sql/publication.sql 8a. +-- Test for RESET PUBLICATION SUGGESTED +-- Tests for ALTER PUBLICATION ... RESET 8b. +-- Verify that 'ALL TABLES' option is reset SUGGESTED: +-- Verify that 'ALL TABLES' flag is reset 8c. +-- Verify that publish option and publish via root option is reset SUGGESTED: +-- Verify that publish options and publish_via_partition_root option are reset 8d. 
+-- Verify that only superuser can execute RESET publication SUGGESTED +-- Verify that only superuser can reset a publication -- Kind Regards, Peter Smith. Fujitsu Australia
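To make the behavior under review concrete, the proposed (v1, uncommitted) syntax would work like this sketch:

    CREATE PUBLICATION p1 FOR TABLE t1
        WITH (publish = 'insert', publish_via_partition_root = true);
    ALTER PUBLICATION p1 RESET;
    -- p1 now has no tables or schemas, ALL TABLES is false, and the
    -- options are back to defaults: publish = 'insert, update, delete,
    -- truncate', publish_via_partition_root = false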
Re: Comments on Custom RMGRs
On Thu, 2022-05-12 at 22:26 +0100, Simon Riggs wrote: > I see multiple uses for the rm_checkpoint() point proposed and I've > been asked multiple times for a checkpoint hook. Can you elaborate and/or link to a discussion? Regards, Jeff Davis
Correct comment in ProcedureCreate() for pgstat_create_function() call.
Hi, PFA, attached patch to $SUBJECT. -- Regards, Amul Sul EDB: http://www.enterprisedb.com
Re: Correct comment in ProcedureCreate() for pgstat_create_function() call.
Sorry, hit the send button too early :| Attached here !! On Fri, May 13, 2022 at 10:20 AM Amul Sul wrote: > > Hi, > > PFA, attached patch to $SUBJECT. > > -- > Regards, > Amul Sul > EDB: http://www.enterprisedb.com code_comment.patch Description: Binary data
Re: Data is copied twice when specifying both child and parent table in publication
On Fri, May 13, 2022 at 7:32 AM wangw.f...@fujitsu.com wrote: > > Attached the patches (only the patch for HEAD changed): > Few comments: ============= 1. @@ -1135,6 +1172,15 @@ pg_get_publication_tables(PG_FUNCTION_ARGS) if (publication->pubviaroot) tables = filter_partitions(tables); } + pfree(elems); + + /* + * We need an additional filter for this case : A partition table is + * published in a publication with viaroot, and its parent or child + * table is published in another publication without viaroot. In this + * case, we should publish only parent table. + */ + tables = filter_partitions(tables); Do we need to filter partitions twice? Can't we check if any of the publications has 'pubviaroot' option set, if so, call filter_partitions at the end? 2. " FROM pg_class c JOIN pg_namespace n" + " ON n.oid = c.relnamespace," + " LATERAL pg_get_publication_tables(array[ %s ]) gst" Here, it would be better to use gpt as the alias name. 3. } + pfree(elems); + An extra line between these two lines makes it look slightly better. 4. Not able to apply patch cleanly. patching file src/test/subscription/t/013_partition.pl Hunk #1 FAILED at 477. Hunk #2 FAILED at 556. Hunk #3 FAILED at 584. 3 out of 3 hunks FAILED -- saving rejects to file src/test/subscription/t/013_partition.pl.rej patching file src/test/subscription/t/028_row_filter.pl Hunk #1 succeeded at 394 (offset 1 line). Hunk #2 FAILED at 722. 1 out of 2 hunks FAILED -- saving rejects to file src/test/subscription/t/028_row_filter.pl.rej patching file src/test/subscription/t/031_column_list.pl Hunk #1 succeeded at 948 (offset -92 lines). Hunk #2 FAILED at 1050. 1 out of 2 hunks FAILED -- saving rejects to file src/test/subscription/t/031_column_list.pl.rej -- With Regards, Amit Kapila.
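For comment 1, the shape of the suggestion is roughly this fragment (the list name "publications" is illustrative):

    /*
     * Record once whether any publication uses pubviaroot, and call
     * filter_partitions() a single time at the end.
     */
    bool		need_filter = false;
    ListCell   *lc;

    foreach(lc, publications)
    {
    	Publication *pub = (Publication *) lfirst(lc);

    	if (pub->pubviaroot)
    		need_filter = true;
    }

    if (need_filter)
    	tables = filter_partitions(tables);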
RE: bogus: logical replication rows/cols combinations
On Thursday, May 12, 2022 2:45 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 12:55 PM houzj.f...@fujitsu.com > wrote: > > > > On Wednesday, May 11, 2022 11:33 AM Amit Kapila > wrote: > > > > > > Fair enough, then we should go with a simpler approach to detect it > > > in pgoutput.c (get_rel_sync_entry). > > > > OK, here is the patch that try to check column list in that way. The > > patch also check the column list when CREATE SUBSCRIPTION and when > starting initial copy. > > > > Few comments: > === > 1. > initStringInfo(&cmd); > - appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n" > -" FROM pg_catalog.pg_publication_tables t\n" > + appendStringInfoString(&cmd, > +"SELECT DISTINCT t.schemaname,\n" > +"t.tablename,\n" > +"(CASE WHEN (array_length(pr.prattrs, 1) = t.relnatts)\n" > +"THEN NULL ELSE pr.prattrs END)\n" > +" FROM (SELECT P.pubname AS pubname,\n" > +" N.nspname AS schemaname,\n" > +" C.relname AS tablename,\n" > +" P.oid AS pubid,\n" > +" C.oid AS reloid,\n" > +" C.relnatts\n" > +" FROM pg_publication P,\n" > +" LATERAL pg_get_publication_tables(P.pubname) GPT,\n" > +" pg_class C JOIN pg_namespace N\n" > +" ON (N.oid = C.relnamespace)\n" > +" WHERE C.oid = GPT.relid) t\n" > +" LEFT OUTER JOIN pg_publication_rel pr\n" > +" ON (t.pubid = pr.prpubid AND\n" > +"pr.prrelid = reloid)\n" > > Can we modify pg_publication_tables to get the row filter and column list and > then use it directly instead of constructing this query? Agreed. If we can get columnlist and rowfilter from pg_publication_tables, it will be more convenient. And I think users that want to fetch the columnlist and rowfilter of a table can also benefit from it. > 2. > + if (list_member(tablelist, rv)) > + ereport(WARNING, > + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot use different column lists for table \"%s.%s\" in > different publications", > +nspname, relname)); > + else > > Can we write comments to explain why we are using WARNING here instead of > ERROR? > > 3. > static void > -pgoutput_ensure_entry_cxt(PGOutputData *data, RelationSyncEntry *entry) > +pgoutput_ensure_entry_cxt(PGOutputData *data, RelationSyncEntry *entry, > + Relation relation) > > What is the need to change this interface as part of this patch? Attached the new version patch, which addresses these comments and updates the document. The 0001 patch extends the view and the 0002 patch adds a restriction for the column list. Best regards, Hou zj 0002-Disallow-combining-publication-when-column-list-is-d.patch Description: 0002-Disallow-combining-publication-when-column-list-is-d.patch 0001-extent-pg_publication_tables.patch Description: 0001-extent-pg_publication_tables.patch
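If pg_publication_tables is extended as agreed, the subscriber-side query could collapse to roughly the following sketch (the attnames/rowfilter column names are assumptions about the 0001 patch):

    SELECT DISTINCT t.schemaname, t.tablename, t.attnames, t.rowfilter
      FROM pg_publication_tables t
     WHERE t.pubname IN ('pub1', 'pub2');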
Re: Privileges on PUBLICATION
Euler Taveira wrote: > On Tue, May 10, 2022, at 5:37 AM, Antonin Houska wrote: > > My understanding is that the rows/columns filtering is a way for the > *publisher* to control which data is available to a particular replica. From > this point of view, the publication privileges would just make the control > complete. > > I agree. IMO it is a new feature. We already require high privilege for > logical > replication. Hence, we expect the replication user to have access to all data. > Unfortunately, nobody mentioned this requirement during the row filter / > column list development; someone could have written a patch for GRANT ... ON > PUBLICATION. I can try that for PG 16, unless someone is already working on it. > I understand your concern. Like I said in my last sentence in the previous > email: it is a fine-grained access control on the publisher. Keep in mind that > it will *only* work for non-superusers (REPLICATION attribute). It is not > exposing something that we didn't expose before. In this particular case, > there > is no mechanism to prevent the subscriber from obtaining data provided by the > various row filters if they know the publication names. We could probably add > a > sentence to "Logical Replication > Security" section: > > There are no privileges for publications. If you have multiple publications in > a > database, a subscription can use all publications available. Attached is my proposal. It tries to be more specific and does not mention the absence of the privileges explicitly. -- Antonin Houska Web: https://www.cybertec-postgresql.com diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml index 1a828e8d2ff..b74ba625649 100644 --- a/doc/src/sgml/ref/create_publication.sgml +++ b/doc/src/sgml/ref/create_publication.sgml @@ -94,6 +94,16 @@ CREATE PUBLICATION name list is specified, it must include the replica identity columns. + + + If you are using the WHERE clause or the column list + to omit some table data from replication for security reasons, + please make sure that the same data is not exposed via other + publications which contain the same table and have a different (or + no) WHERE clause or column list. + + + Only persistent base tables and partitioned tables can be part of a publication. Temporary tables, unlogged tables, foreign tables,
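The exposure the proposed paragraph warns about is easy to construct (all names invented):

    -- the same table behind two publications with different row filters
    CREATE PUBLICATION pub_masked FOR TABLE accounts WHERE (is_public);
    CREATE PUBLICATION pub_full FOR TABLE accounts;

    -- nothing prevents a subscriber that knows the name from using pub_full
    CREATE SUBSCRIPTION sub1
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION pub_full;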
Re: Privileges on PUBLICATION
Peter Eisentraut wrote: > On 10.05.22 10:37, Antonin Houska wrote: > > My understanding is that the rows/columns filtering is a way for the > > *publisher* to control which data is available to particular replica. From > > this point of view, the publication privileges would just make the control > > complete. > > I think privileges on publications would eventually be useful. But are you > arguing that we need them for PG15 to make the new features usable safely? I didn't think that far, but user should be aware of the problem. My proposal of documentation is in https://www.postgresql.org/message-id/5859.1652423797%40antos -- Antonin Houska Web: https://www.cybertec-postgresql.com