Re: Direct SSL connection with ALPN and HBA rules

2024-04-23 Thread Michael Paquier
On Tue, Apr 23, 2024 at 01:48:04AM +0300, Heikki Linnakangas wrote:
> Here's the patch for that. The error message is:
> 
> "direct SSL connection was established without ALPN protocol negotiation
> extension"

WFM.

> That's accurate, but I wonder if we could make it more useful to a user
> who's wondering what went wrong. I'd imagine that if the server doesn't
> support ALPN, it's because you have some kind of a (not necessarily
> malicious) generic SSL man-in-the-middle that doesn't support it. Or you're
> trying to connect to an HTTPS server. Suggestions welcome.

Hmm.  Is there any point in calling SSL_get0_alpn_selected() in
open_client_SSL() to get the ALPN if current_enc_method is not
ENC_DIRECT_SSL?

In the documentation of PQsslAttribute(), it is mentioned that empty
string is returned for "alpn" if ALPN was not used, however the code
returns NULL in this case:
SSL_get0_alpn_selected(conn->ssl, &data, &len);
if (data == NULL || len == 0 || len > sizeof(alpn_str) - 1)
return NULL;
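
For illustration only, a minimal sketch (not a proposed patch) of how that
branch could be made to match the documented behavior, i.e. returning an
empty string rather than NULL when ALPN was not used:

    /* Sketch: return "" as documented when ALPN was not negotiated. */
    SSL_get0_alpn_selected(conn->ssl, &data, &len);
    if (data == NULL || len == 0)
        return "";              /* ALPN not used */
    if (len > sizeof(alpn_str) - 1)
        return NULL;            /* unexpectedly long protocol name */
    memcpy(alpn_str, data, len);
    alpn_str[len] = 0;
    return alpn_str;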
--
Michael


signature.asc
Description: PGP signature


Re: pg_stat_statements and "IN" conditions

2024-04-23 Thread Dmitry Dolgov
> On Mon, Apr 15, 2024 at 06:09:29PM +0900, Sutou Kouhei wrote:
>
> Thanks. I'm not familiar with this code base but I've
> reviewed these patches because I'm interested in this
> feature too.

Thanks for the review! The commentaries for the first patch make sense
to me, will apply.

> 0003:
>
> > diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
> > b/contrib/pg_stat_statements/pg_stat_statements.c
> > index d7841b51cc9..00eec30feb1 100644
> > --- a/contrib/pg_stat_statements/pg_stat_statements.c
> > +++ b/contrib/pg_stat_statements/pg_stat_statements.c
> > ...
> > @@ -2883,12 +2886,22 @@ generate_normalized_query(JumbleState *jstate, 
> > const char *query,
> > /* The firsts merged constant */
> > else if (!skip)
> > {
> > +   static const uint32 powers_of_ten[] = {
> > +   1, 10, 100,
> > +   1000, 10000, 100000,
> > +   1000000, 10000000, 100000000,
> > +   1000000000
> > +   };
> > +   int lower_merged = powers_of_ten[magnitude - 1];
> > +   int upper_merged = powers_of_ten[magnitude];
>
> How about adding a reverse function of decimalLength32() to
> numutils.h and use it here?

I was pondering that at some point, but eventually decided to keep it
this way, because:

* the use case is quite specific, I can't imagine it being used anywhere
  else

* it would not be strictly reverse, as the transformation itself is not
  reversible in the strict sense
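
For what it's worth, a rough sketch of what such a helper could look like if
it were added anyway (hypothetical, not part of numutils.h, and as noted it
would only give a lower bound rather than a strict inverse; uses PostgreSQL's
uint32 typedef):

    static inline uint32
    decimalLength32_lower_bound(int ndigits)
    {
        /* smallest uint32 value that has "ndigits" decimal digits */
        static const uint32 powers_of_ten[] = {
            1, 10, 100, 1000, 10000, 100000,
            1000000, 10000000, 100000000, 1000000000
        };

        return powers_of_ten[ndigits - 1];
    }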

> > -   n_quer_loc += sprintf(norm_query + n_quer_loc, "...");
> > +   n_quer_loc += sprintf(norm_query + n_quer_loc, "... [%d-%d entries]",
> > + lower_merged, upper_merged - 1);
>
> Do we still have enough space in norm_query for this change?
> It seems that norm_query expects up to 10 additional bytes
> per jstate->clocations[i].

As far as I understand there should be enough space, because we're going
to replace at least 10 constants with one merge record. But it's a good
point, this should be called out in the commentary explaining why 10
additional bytes are added.

> It seems that we can merge 0001, 0003 and 0004 to one patch.
> (Sorry. I haven't read all discussions yet. If we already
> discuss this, sorry for this noise.)

There is a certain disagreement about which portion of this feature
makes sense to go with first, thus I think keeping all options open is a
good idea. In the end a committer can squash the patches if needed.




POC: make mxidoff 64 bits

2024-04-23 Thread Maxim Orlov
Hi!

I've been trying to introduce 64-bit transaction identifiers to
Postgres for quite a while [0]. This implies, of course, an enormous
amount of change across various modules. As a result, the patch set
became too big to be committed “at once”.

The obvious solution was to try to split the patch set into smaller ones.
But that brings a new challenge: not every one of these parts makes
Postgres better on its own. Actually, even switching to a
FullTransactionId in PGPROC has the disadvantage of increasing WAL
size [1].

In fact, I believe we're dealing with a chicken-and-egg problem here:
we are not able to commit the full patch set since it is too big to
handle, and not able to commit parts of it, since they only make sense
all together and do not improve Postgres on their own.

But it's not that bad. Since commit 4ed8f0913bfdb5f, added in [3], we
are able to use 64 bits for indexing SLRUs.

PROPOSAL
Make multixact offsets 64 bit.

RATIONALE
It is not very difficult to overflow 32-bit mxidoff, since an offset is
consumed for every unique combination of transactions for each tuple,
including XIDs and the respective flags, and when a transaction is added
to a specific multixact, it is rewritten with a new offset. In other
words, it is possible to overflow the offsets of multixacts without
overflowing the multixacts themselves and/or ordinary transactions. I
believe there was something about this on the hackers mailing list, but
I could not find the links now.

PFA the patch. This is a WIP version; upgrade machinery should be added later.

As always, any opinions on the subject are very welcome!

[0]
https://www.postgresql.org/message-id/flat/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
[1]
https://www.postgresql.org/message-id/flat/CACG%3DezY7msw%2Bjip%3Drtfvnfz051dRqz4s-diuO46v3rAoAE0T0g%40mail.gmail.com
[3]
https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com

-- 
Best regards,
Maxim Orlov.


0001-WIP-mxidoff-to-64bit.patch
Description: Binary data


Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-04-23 Thread Tender Wang
Andrey M. Borodin wrote on Mon, Apr 8, 2024 at 17:40:

>
>
> > On 27 Sep 2023, at 16:06, tender wang  wrote:
> >
> >Do you have any comments or suggestions on this issue? Thanks.
> Hi Tender,
>
> there are some review comments in the thread, that you might be interested
> in.
> I'll mark this [0] entry "Waiting on Author" and move to next CF.
>
> Thanks!
>
>
> Best regards, Andrey Borodin.
>
> [0]https://commitfest.postgresql.org/47/4549/


I have rebased onto master and fixed a plan diff case.
-- 
Tender Wang
OpenPie:  https://en.openpie.com/


v3-0001-Support-materializing-inner-path-on-parallel-oute.patch
Description: Binary data


Re: POC: make mxidoff 64 bits

2024-04-23 Thread Heikki Linnakangas

On 23/04/2024 11:23, Maxim Orlov wrote:

PROPOSAL
Make multixact offsets 64 bit.


+1, this is a good next step and useful regardless of 64-bit XIDs.


@@ -156,7 +148,7 @@
((uint32) ((0xFFFFFFFF % MULTIXACT_MEMBERS_PER_PAGE) + 1))
 
 /* page in which a member is to be found */

-#define MXOffsetToMemberPage(xid) ((xid) / (TransactionId) MULTIXACT_MEMBERS_PER_PAGE)
+#define MXOffsetToMemberPage(xid) ((xid) / (MultiXactOffset) MULTIXACT_MEMBERS_PER_PAGE)
 #define MXOffsetToMemberSegment(xid) (MXOffsetToMemberPage(xid) / SLRU_PAGES_PER_SEGMENT)
 
 /* Location (byte offset within page) of flag word for a given member */


This is really a bug fix. It didn't matter when TransactionId and 
MultiXactOffset were both typedefs of uint32, but it was always wrong. 
The argument name 'xid' is also misleading.


I think there are some more like that, MXOffsetToFlagsBitShift for example.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: query_id, pg_stat_activity, extended query protocol

2024-04-23 Thread Andrei Lepikhov

On 4/23/24 12:49, Michael Paquier wrote:

On Tue, Apr 23, 2024 at 11:42:41AM +0700, Andrei Lepikhov wrote:

On 23/4/2024 11:16, Imseih (AWS), Sami wrote:

+   pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, true);
  set_ps_display("BIND");
@@ -2146,6 +2147,7 @@ exec_execute_message(const char *portal_name, long max_rows)
  debug_query_string = sourceText;
  pgstat_report_activity(STATE_RUNNING, sourceText);
+   pgstat_report_query_id(portal->queryDesc->plannedstmt->queryId, true);
  cmdtagname = GetCommandTagNameAndLen(portal->commandTag, &cmdtaglen);


In exec_bind_message, how can you be sure that queryId exists in query_list
before the call of GetCachedPlan(), which will validate and lock the plan?
What if some OIDs were altered in the middle?


I am also a bit surprised with the choice of using the first Query
available in the list for the ID, FWIW.

Did you consider using \bind to show how this behaves in a regression
test?
I'm not sure how to invent a test based on the \bind command - we need 
some pause in the middle.
But a simplistic case with a prepared statement shows how the value of 
queryId can be changed if you don't acquire all the objects needed for 
the execution:


CREATE TABLE test();
PREPARE name AS SELECT * FROM test;
EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
DROP TABLE test;
CREATE TABLE test();
EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;

/*
QUERY PLAN
---
 Seq Scan on public.test (actual time=0.002..0.004 rows=0 loops=1)
 Query Identifier: 6750745711909650694

QUERY PLAN
---
 Seq Scan on public.test (actual time=0.004..0.004 rows=0 loops=1)
 Query Identifier: -2597546769858730762
*/

We have different objects which can be changed - I have just invented 
the most trivial example to discuss.


--
regards, Andrei Lepikhov





Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-04-23 Thread Alvaro Herrera
On 2024-Apr-22, Tom Lane wrote:

> The main reason there's a delta is that people don't manage to
> maintain the in-tree copy perfectly (at least, they certainly
> haven't done so for this past year).  So we need to do that
> to clean up every now and then.

Out of curiosity, I downloaded the buildfarm-generated file and
re-indented the whole tree.  It turns out that most commits seem to have
maintained the in-tree typedefs list correctly when adding entries (even
if out of alphabetical order), but a few haven't; and some people have
added entries that the buildfarm script does not detect.  So the import
from BF will delete those entries and mess up the overall indent.  For
example it does stuff like

+++ b/src/backend/commands/async.c
@@ -399,7 +399,7 @@ typedef struct NotificationList
 typedef struct NotificationHash
 {
Notification *event;/* => the actual Notification struct */
-} NotificationHash;
+}  NotificationHash;

There's a good half dozen of those.

I wonder if we're interested in keeping a (very short) manually-
maintained list of symbols that we know are in use but the scripts
don't extract for whatever reason.


The change of NotificationHash looks surprising at first sight:
apparently 095d109ccd7 deleted the only use of that type as a variable
anywhere.  But then I wonder if that datatype is useful at all anymore,
since it only contains one pointer -- it seems we could just remove it.

But there are others: InjectionPointEntry, ResourceOwnerData,
JsonNonTerminal, JsonParserSem, ...

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php




Minor document typo

2024-04-23 Thread Tatsuo Ishii
Hi,

doc/src/sgml/monitoring.sgml seems to have a minor typo:

In pg_stat_database_conflicts section (around line 3621) we have:

  
   Number of uses of logical slots in this database that have been
   canceled due to old snapshots or too low a <xref linkend="guc-wal-level"/>
   on the primary
  

I think "too low a" should be "too low" ('a' is not
necessary). Attached is the patch.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 053da8d6e4..56e79b1304 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3618,7 +3618,7 @@ description | Waiting for a newly initialized WAL file to reach durable storage
   
   
Number of uses of logical slots in this database that have been
-   canceled due to old snapshots or too low a <xref linkend="guc-wal-level"/>
+   canceled due to old snapshots or too low <xref linkend="guc-wal-level"/>
on the primary
   
  


Re: Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-04-23 Thread Amit Kapila
On Wed, Mar 13, 2024 at 11:59 AM vignesh C  wrote:
>
> On Wed, 13 Mar 2024 at 10:12, Zhijie Hou (Fujitsu)
>  wrote:
> >
> >
> > For 0002, instead of avoid resetting the latch, is it possible to let the
> > logical rep worker wake up the launcher once after attaching ?
>
> Waking up of the launcher process uses the same latch that is used for
> subscription creation/modification and apply worker process exit. As
> the handling of this latch for subscription creation/modification and
> worker process exit can be done only by ApplyLauncherMain, we will not
> be able to reset the latch in WaitForReplicationWorkerAttach. I feel
> waking up the launcher process might not help in this case as
> currently we will not be able to differentiate between worker
> attached, subscription creation/modification and apply worker process
> exit.
>

IIUC, even if we set the latch once the worker attaches, the other
set_latch by subscription creation/modification or apply_worker_exit
could also be consumed due to reset of latch in
WaitForReplicationWorkerAttach(). Is that understanding correct? If
so, can we use some other way to wake up
WaitForReplicationWorkerAttach(), say a condition variable? The current
proposal can fix the issue but looks a bit ad hoc.
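
To sketch what that could look like (the field name and the "not yet
attached" check are illustrative assumptions, not a patch; the condition
variable calls are the existing API from storage/condition_variable.h):

    /* shared state, e.g. a new field in LogicalRepCtxStruct (assumption) */
    ConditionVariable   worker_attach_cv;

    /* logical rep worker, right after attaching: */
    ConditionVariableBroadcast(&LogicalRepCtx->worker_attach_cv);

    /* launcher, in WaitForReplicationWorkerAttach(): */
    ConditionVariablePrepareToSleep(&LogicalRepCtx->worker_attach_cv);
    while (worker->in_use && worker->proc == NULL)  /* illustrative check */
        ConditionVariableSleep(&LogicalRepCtx->worker_attach_cv,
                               WAIT_EVENT_BGWORKER_STARTUP);
    ConditionVariableCancelSleep();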

-- 
With Regards,
Amit Kapila.




Re: Minor document typo

2024-04-23 Thread David Rowley
On Tue, 23 Apr 2024 at 23:17, Tatsuo Ishii  wrote:
>Number of uses of logical slots in this database that have been
>canceled due to old snapshots or too low a <xref linkend="guc-wal-level"/>
>on the primary
>
> I think "too low a" should be "too low" ('a' is not
> necessary). Attached is the patch.

The existing text looks fine to me.  The other form would use "of a"
and become "too low of a wal_level on the primary".

"too low wal_level on the primary" sounds wrong to my native
English-speaking ear.

There's some discussion in [1] that might be of interest to you.

David

[1] 
https://www.reddit.com/r/grammar/comments/qr9z6e/i_need_help_with_sothat_adj_of_a_sing_noun/?ref=share&ref_source=link




Re: subscription/026_stats test is intermittently slow?

2024-04-23 Thread Amit Kapila
On Tue, Apr 23, 2024 at 11:49 AM vignesh C  wrote:
>
> On Sat, 20 Apr 2024 at 10:30, Alexander Lakhin  wrote:
> >
> > Hello Michael and Robert,
> >
> > 20.04.2024 05:57, Michael Paquier wrote:
> > > On Fri, Apr 19, 2024 at 01:57:41PM -0400, Robert Haas wrote:
> > >> It looks to me like in the first run it took 3 minutes for the
> > >> replay_lsn to catch up to the desired value, and in the second run,
> > >> two seconds. I think I have seen previous instances where something
> > >> similar happened, although in those cases I did not stop to record any
> > >> details. Have others seen this? Is there something we can/should do
> > >> about it?
> > > FWIW, I've also seen delays as well with this test on a few occasions.
> > > Thanks for mentioning it.
> >
> > It reminds me of
> > https://www.postgresql.org/message-id/858a7622-2c81-1687-d1df-1322dfcb2e72%40gmail.com
>
> Thanks Alexander for the test, I was able to reproduce the issue with
> the test you shared and also verify that the patch at [1] fixes the
> same.
>

One of the issues reported in the thread you referred to has the same
symptoms [1]. I'll review and analyze your proposal.

[1] - 
https://www.postgresql.org/message-id/858a7622-2c81-1687-d1df-1322dfcb2e72%40gmail.com

-- 
With Regards,
Amit Kapila.




Re: Minor document typo

2024-04-23 Thread Tatsuo Ishii
>> I think "too low a" should be "too low" ('a' is not
>> necessary). Attached is the patch.
> 
> The existing text looks fine to me.  The other form would use "of a"
> and become "too low of a wal_level on the primary".
> 
> "too low wal_level on the primary" sounds wrong to my native
> English-speaking ear.
> 
> There's some discussion in [1] that might be of interest to you.
> 
> David
> 
> [1] 
> https://www.reddit.com/r/grammar/comments/qr9z6e/i_need_help_with_sothat_adj_of_a_sing_noun/?ref=share&ref_source=link

Thank you for the explanation. English is difficult :-)

Just out of curiosity, is it possible to say "low a wal_level on the
primary"? (just "too" is removed)
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp





Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-04-23 Thread Robert Haas
On Tue, Apr 23, 2024 at 6:23 AM Alvaro Herrera  wrote:
> I wonder if we're interested in keeping a (very short) manually-
> maintained list of symbols that we know are in use but the scripts
> don't extract for whatever reason.

+1. I think this idea has been proposed and rejected before, but I
think it's more important to have our code indented correctly than to
be able to rely on a 100% automated process for gathering typedefs.

There is of course the risk that the manually generated file will
accumulate stale cruft over time, but I don't really see that being a
big problem. First, it doesn't cost much to have a few extra symbols
in there. Second, I suspect someone will go through it at least every
couple of years, if not more often, and figure out which entries are
still doing something.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-04-23 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

Per recent commit (b29cbd3da), our patch needed to be rebased.
Here is an updated version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 
 


v6-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v6-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v6-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch
Description:  v6-0002-Alter-slot-option-two_phase-only-when-altering-tr.patch


v6-0003-Abort-prepared-transactions-while-altering-two_ph.patch
Description:  v6-0003-Abort-prepared-transactions-while-altering-two_ph.patch


v6-0004-Add-force_alter-option.patch
Description: v6-0004-Add-force_alter-option.patch


Re: slightly misleading Error message in guc.c

2024-04-23 Thread Daniel Gustafsson
> On 22 Apr 2024, at 18:04, Tom Lane  wrote:

> Seems like a useful change

Agreed.

> ... about like this?

Patch LGTM.

--
Daniel Gustafsson





Re: GUC-ify walsender MAX_SEND_SIZE constant

2024-04-23 Thread Jakub Wartak
On Tue, Apr 23, 2024 at 2:24 AM Michael Paquier  wrote:
>
> On Mon, Apr 22, 2024 at 03:40:01PM +0200, Majid Garoosi wrote:
> > Any news, comments, etc. about this thread?
>
> FWIW, I'd still be in favor of doing a GUC-ification of this part, but
> at this stage I'd need more time to do a proper study of a case where
> this shows benefits to prove your point, or somebody else could come
> in and show it.
>
> Andres has objected to this change, on the ground that this was not
> worth it, though you are telling the contrary.  I would be curious to
> hear from others, first, so as we gather more opinions to reach a
> consensus.

I'm more with Andres on this one. I vaguely remember researching the
impact of MAX_SEND_SIZE on two independent occasions (an earlier async
case and a more recent sync case where I've investigated a variety of
ways to keep latency down) and my summary would be:

First: it's very hard to get a *reliable* replication setup for
benchmarking, where one could demonstrate a correlation between e.g.
increasing MAX_SEND_SIZE and observing benefits (in sync rep it is
easier, as you are simply stalled in pgbench). Part of the problem are
the following things:

a) workload can be tricky, for this purpose it needs to be trivial but bulky
b) it needs to be on isolated network and with guaranteed bandwidth
c) wal_init_zero impact should be ruled out
d) the OS should be properly tuned: TCP autotuning max (3rd value) plus
rmem_max/wmem_max set up properly
e) a more serious TCP congestion control algorithm should be used than
the default one in the OS
f) one should prevent any I/O stalls on walreceiver writeback during
huge WAL activity and restart points on standby (dirty_bytes and so
on, isolated pg_xlog, BDI max_ratio)

Second: once you perform the above and ensure that there are no network
or I/O stalls, back then I *think* I couldn't see any impact of playing
with MAX_SEND_SIZE, from what I remember, as probably something else is
saturated first.

I can offer help with trying to test this with artificial tests and
even injecting proper latency (WAN-like), but I think the OP (Majid)
first needs to describe his env much better (exact latency, bandwidth,
workload, TCP sysctl values, duration of the tests, number of attempts
tried, exact commands used, iperf3 TCP results demonstrating the hw used
and so on). So in short the patch is easy, but demonstrating the
effect and persuading others here would be hard.

-J.




Re: POC: make mxidoff 64 bits

2024-04-23 Thread Andrey M. Borodin



> On 23 Apr 2024, at 11:23, Maxim Orlov  wrote:
> 
> Make multixact offsets 64 bit.

-   ereport(ERROR,
-   (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-errmsg("multixact \"members\" limit exceeded"),
Personally, I'd be happy with this! We had some incidents where the only 
mitigation was vacuum settings tweaking.

BTW as a side note... I see lots of casts to (unsigned long long), can't we 
just cast to MultiXactOffset?


Best regards, Andrey Borodin.



Re: POC: make mxidoff 64 bits

2024-04-23 Thread wenhui qiu
Hi Maxim Orlov,
   Thank you so much for your tireless work on this. Increasing the WAL
size by a few bytes should have very little impact with today's disk
performance. (The WAL generated for logical replication also grows, but
logical replication is a milestone feature that the community keeps
improving.) I believe removing PostgreSQL's troublesome transaction ID
wraparound would be another milestone feature, so adding a few bytes is
worth it!

Best regards

On Tue, 23 Apr 2024 at 17:37, Heikki Linnakangas  wrote:

> On 23/04/2024 11:23, Maxim Orlov wrote:
> > PROPOSAL
> > Make multixact offsets 64 bit.
>
> +1, this is a good next step and useful regardless of 64-bit XIDs.
>
> > @@ -156,7 +148,7 @@
> >   ((uint32) ((0xFFFFFFFF % MULTIXACT_MEMBERS_PER_PAGE) + 1))
> >
> >  /* page in which a member is to be found */
> > -#define MXOffsetToMemberPage(xid) ((xid) / (TransactionId) MULTIXACT_MEMBERS_PER_PAGE)
> > +#define MXOffsetToMemberPage(xid) ((xid) / (MultiXactOffset) MULTIXACT_MEMBERS_PER_PAGE)
> >  #define MXOffsetToMemberSegment(xid) (MXOffsetToMemberPage(xid) / SLRU_PAGES_PER_SEGMENT)
> >
> >  /* Location (byte offset within page) of flag word for a given member */
>
> This is really a bug fix. It didn't matter when TransactionId and
> MultiXactOffset were both typedefs of uint32, but it was always wrong.
> The argument name 'xid' is also misleading.
>
> I think there are some more like that, MXOffsetToFlagsBitShift for example.
>
> --
> Heikki Linnakangas
> Neon (https://neon.tech)
>
>
>
>


Re: Minor document typo

2024-04-23 Thread David Rowley
On Wed, 24 Apr 2024 at 00:11, Tatsuo Ishii  wrote:
> Just out of curiosity, is it possible to say "low a wal_level on the
> primary"? (just "too" is removed)

Prefixing the adjective with "too" means it's beyond the acceptable
range.  "This coffee is too hot".

https://dictionary.cambridge.org/grammar/british-grammar/too

David




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-04-23 Thread Tom Lane
Alvaro Herrera  writes:
> On 2024-Apr-22, Tom Lane wrote:
>> The main reason there's a delta is that people don't manage to
>> maintain the in-tree copy perfectly (at least, they certainly
>> haven't done so for this past year).  So we need to do that
>> to clean up every now and then.

> Out of curiosity, I downloaded the buildfarm-generated file and
> re-indented the whole tree.  It turns out that most commits seem to have
> maintained the in-tree typedefs list correctly when adding entries (even
> if out of alphabetical order), but a few haven't; and some people have
> added entries that the buildfarm script does not detect.

Yeah.  I believe that happens when there is no C variable or field
anywhere that has that specific struct type.  In your example,
NotificationHash appears to only be referenced in a sizeof()
call, which suggests that maybe the coding is a bit squirrely
and could be done another way.

Having said that, there already are manually-curated lists of
inclusions and exclusions hard-wired into pgindent (see around
line 70).  I wouldn't have any great objection to adding more
entries there.  Or if somebody wanted to do the work, they
could be pulled out into separate files.

regards, tom lane




pg_trgm comparison bug on cross-architecture replication due to different char implementation

2024-04-23 Thread Guo, Adam
Hi all,

I would like to report an issue with the pg_trgm extension on 
cross-architecture replication scenarios. When an x86_64 standby server is 
replicating from an aarch64 primary server or vice versa, the gist_trgm_ops 
opclass returns different results on the primary and standby. Masahiko 
previously reported a similar issue affecting the pg_bigm extension [1].

To reproduce, execute the following on the x86_64 primary server:

CREATE EXTENSION pg_trgm;
CREATE TABLE tbl (c text);
CREATE INDEX ON tbl USING gist (c gist_trgm_ops);
INSERT INTO tbl VALUES ('Bóbr');

On the x86_64 primary server:

postgres=> select * from tbl where c like '%Bób%';
  c
--
Bóbr
(1 row)

On the aarch64 replica server:

postgres=> select * from tbl where c like '%Bób%';
c
---
(0 rows)

The root cause looks the same as the pg_bigm issue that Masahiko reported. To 
compare trigrams, pg_trgm uses a numerical comparison of chars [2]. On x86_64 a 
char is signed by default, whereas on aarch64 it is unsigned by default. 
gist_trgm_ops expects the trigram list to be sorted, but due to the different 
signedness of chars, the sort order is broken when replicating the values 
across architectures.
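
As a small standalone illustration of that signedness difference (just a
demonstration, not part of the proposed fix; 0xC3 is the first byte of 'ó'
in UTF-8):

    #include <stdio.h>

    int
    main(void)
    {
        char    c = (char) 0xC3;    /* first byte of 'ó' in UTF-8 */
        char    space = ' ';

        /* signed char (x86_64 default): 0xC3 is -61, so it sorts before ' ' */
        /* unsigned char (aarch64 default): 0xC3 is 195, so it sorts after ' ' */
        printf("0xC3 as char: %d\n", (int) c);
        printf("0xC3 < ' ': %s\n", (c < space) ? "true" : "false");
        return 0;
    }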

The different sort behaviour can be demonstrated using show_trgm.

On the x86_64 primary server:

postgres=> SELECT show_trgm('Bóbr');
show_trgm
--
{0x89194c,"  b","br ",0x707c72,0x7f7849}
(1 row)

On the aarch64 replica server:

postgres=> SELECT show_trgm('Bóbr');
show_trgm
--
{"  b","br ",0x707c72,0x7f7849,0x89194c}
(1 row)

One simple solution for this specific case is to declare the char signedness in 
the CMPPCHAR macro.

--- a/contrib/pg_trgm/trgm.h
+++ b/contrib/pg_trgm/trgm.h
@@ -42,7 +42,7 @@
typedef char trgm[3];
 #define CMPCHAR(a,b) ( ((a)==(b)) ? 0 : ( ((a)<(b)) ? -1 : 1 ) )
-#define CMPPCHAR(a,b,i)  CMPCHAR( *(((const char*)(a))+i), *(((const char*)(b))+i) )
+#define CMPPCHAR(a,b,i)  CMPCHAR( *(((unsigned char*)(a))+i), *(((unsigned char*)(b))+i) )
#define CMPTRGM(a,b) ( CMPPCHAR(a,b,0) ? CMPPCHAR(a,b,0) : ( CMPPCHAR(a,b,1) ? CMPPCHAR(a,b,1) : CMPPCHAR(a,b,2) ) )
 #define CPTRGM(a,b) do {   
\

Alternatively, Postgres can be compiled with -funsigned-char or -fsigned-char. 
I came across a code comment suggesting that this may not be a good idea in 
general [3].

Given that this problem has come up before and seems likely to come up
again, I'm curious what other broad solutions there might be to resolve it.
Looking forward to any feedback, thanks!

Best,

Adam Guo
Amazon Web Services: https://aws.amazon.com

[1] 
https://osdn.net/projects/pgbigm/lists/archive/hackers/2024-February/000370.html
[2] 
https://github.com/postgres/postgres/blob/480bc6e3ed3a5719cdec076d4943b119890e8171/contrib/pg_trgm/trgm.h#L45
[3] 
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/cash.c#L114-L123


Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

2024-04-23 Thread Tom Lane
"Guo, Adam"  writes:
> I would like to report an issue with the pg_trgm extension on
> cross-architecture replication scenarios. When an x86_64 standby
> server is replicating from an aarch64 primary server or vice versa,
> the gist_trgm_ops opclass returns different results on the primary
> and standby.

I do not think that is a supported scenario.  Hash functions and
suchlike are not guaranteed to produce the same results on different
CPU architectures.  As a quick example, I get

regression=# select hashfloat8(34);
 hashfloat8 
------------
   21570837
(1 row)

on x86_64 but

postgres=# select hashfloat8(34);
 hashfloat8 
------------
 -602898821
(1 row)

on ppc32 thanks to the endianness difference.

> Given that this problem has come up before and seems likely to
> come up again, I'm curious what other broad solutions there might be
> to resolve it?

Reject as not a bug.  Discourage people from thinking that physical
replication will work across architectures.

regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-23 Thread Robert Haas
On Mon, Apr 22, 2024 at 5:19 PM Jelte Fennema-Nio  wrote:
> On Mon, 22 Apr 2024 at 16:26, Robert Haas  wrote:
> > That's a fair point, but I'm still not seeing much practical
> > advantage. It's unlikely that a client is going to set a random bit in
> > a format parameter for no reason.
>
> I think you're missing an important point of mine here. The client
> wouldn't be "setting a random bit in a format parameter for no
> reason". The client would decide it is allowed to set this bit,
> because the PG version it connected to supports column encryption
> (e.g. PG18). But this completely breaks protocol and application layer
> separation.

I can't see what the problem is here. If the client is connected to a
database that contains encrypted columns, and its response to seeing
an encrypted column is to set this bit, that's fine and nothing should
break. If a client doesn't know about encrypted columns and sets that
bit at random, that will break things, and formally I think that's a
risk, because I don't believe we document anywhere that you shouldn't
set unused bits in the format mask. But practically, it's not likely.
(And also, maybe we should document that you shouldn't do that.)

> It doesn't seem completely outside of the realm of possibility for a
> pooler to gather some statistics on the amount of Bind messages that
> use text vs binary query parameters. That's very easily doable now,
> while looking only at the protocol layer. If a client then sets the
> new format parameter bit, this pooler could then get confused and
> close the connection.

Right, this is the kind of risk I was worried about. I think it's
similar to my example of a client setting an unused bit for no reason
and breaking everything. Here, you've hypothesized a pooler that tries
to interpret the bit and just errors out when it sees something it
doesn't understand. I agree that *formally* this is enough to justify
bumping the protocol version, but I think *practically* it isn't,
because the incompatibility is so minor as to inconvenience almost
nobody, whereas changing the protocol version affects everybody.

Let's consider a hypothetical country much like Canada except that
there are three official languages rather than two: English, French,
and Robertish. Robertish is just like English except that the meanings
of the words cabbage and rutabaga are reversed. Shall we mandate that
all signs in the country be printed in three languages rather than
two? Formally, we ought, because the substantial minority of our
hypothetical country that proudly speaks Robertish as their mother
tongue will not want to feel that they are second class citizens. But
practically, there are very few situations where the differences
between the two languages are going to inconvenience anyone. Indeed,
the French speakers might be a bit put out if English is effectively
represented twice on every sign while their mother tongue is there
only once. Of course, people are entitled to organize their countries
politically in any way that works for the people who live in them, but
as a practical matter, English and Robertish are mutually
intelligible.

And so here. If someone codes a connection pooler in the way you
suppose, then it will break. But, first of all, they probably won't do
that, both because it's not particularly likely that someone wants to
gather that particular set of statistics and also because erroring out
seems like an overreaction. And secondly, let's imagine that we do
bump the protocol version and think about whether and how that solves
the problem. A client will request from the pooler a version 3.1
connection and the pooler will say, sorry, no can do, I only
understand 3.0. So the client will now say, oh ok, no problem, I'm
going to refrain from setting that parameter format bit. Cool, right?

Well, no, not really. First, now the client application is probably
broken. If the client is varying its behavior based on the server's
protocol version, that must mean that it cares about accessing
encrypted columns, and that means that the bit in question is not an
optional feature. So actually, the fact that the pooler can force the
client to downgrade hasn't fixed anything at all.

Second, if the connection pooler were written to do something other
than close the connection, like say mask out the one bit that it knows
how to deal with or have an "unknown" bucket to count values that it
doesn't recognize, then it wouldn't have needed to care about the
protocol version in the first place. It would have been better off not
even knowing, because then it wouldn't have forced a downgrade onto
the client application for no real reason. Throwing an error wasn't a
wrong decision on the part of the person writing the pooler, but there
are other things they could have done that would have been less
brittle.

Third, applications, drivers, and connection poolers now all need to
worry about handling downgrades smoothly. If a connection pooler
requests a v3.1 conne

Background Processes in Postgres Extension

2024-04-23 Thread Sushrut Shivaswamy
Hey,

I'm developing a postgres extension as a custom Table Interface method
definition.
Within the extension, I'm planning to create two background processes using
`fork()` that will process data in the background.

Are there any recommendations / guidelines around creating background
processes within extensions in postgres?

Thanks,
Sushrut


Re: pgsql: Introduce "builtin" collation provider.

2024-04-23 Thread Andrew Dunstan



On 2024-03-14 Th 02:39, Jeff Davis wrote:

Introduce "builtin" collation provider.



The new value "b" for pg_collation.collprovider doesn't seem to be 
documented. Is that deliberate?



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: slightly misleading Error message in guc.c

2024-04-23 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 22 Apr 2024, at 18:04, Tom Lane  wrote:
>> Seems like a useful change

> Agreed.

>> ... about like this?

> Patch LGTM.

Pushed.

regards, tom lane




Re: Background Processes in Postgres Extension

2024-04-23 Thread Tom Lane
Sushrut Shivaswamy  writes:
> I'm developing a postgres extension as a custom Table Interface method
> definition.
> Within the extension, I'm planning to create two background processes using
> `fork()` that will process data in the background.

> Are there any recommendations / guidelines around creating background
> processes within extensions in postgres?

fork() is entirely the wrong way to do it, mainly because when the
creating session exits the postmaster will be unaware of those
now-disconnected child processes.  See the mechanisms for creating
background worker processes (start with bgworker.h).
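
For the archives, a rough sketch of that approach using the bgworker API
(the library name, entry-point name, and flags are assumptions chosen for
illustration, not a recommendation for this particular extension):

    #include "postgres.h"
    #include "miscadmin.h"
    #include "postmaster/bgworker.h"

    /* Hypothetical: register one dynamic background worker for the extension */
    static void
    register_my_worker(void)
    {
        BackgroundWorker worker;
        BackgroundWorkerHandle *handle;

        memset(&worker, 0, sizeof(worker));
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
            BGWORKER_BACKEND_DATABASE_CONNECTION;
        worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
        worker.bgw_restart_time = BGW_NEVER_RESTART;
        snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_extension");     /* assumption */
        snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_worker_main");  /* assumption */
        snprintf(worker.bgw_name, BGW_MAXLEN, "my_extension background worker");
        worker.bgw_notify_pid = MyProcPid;

        if (!RegisterDynamicBackgroundWorker(&worker, &handle))
            ereport(WARNING,
                    (errmsg("could not register background worker")));
    }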

regards, tom lane




Re: Popcount optimization using AVX512

2024-04-23 Thread Nathan Bossart
On Thu, Apr 18, 2024 at 05:13:58PM -0500, Nathan Bossart wrote:
> Makes sense, thanks.  I'm planning to commit this fix sometime early next
> week.

Committed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: BitmapHeapScan streaming read user and prelim refactoring

2024-04-23 Thread Melanie Plageman
On Mon, Apr 22, 2024 at 1:01 PM Melanie Plageman
 wrote:
>
> On Thu, Apr 18, 2024 at 5:39 AM Tomas Vondra
>  wrote:
> >
> > On 4/18/24 09:10, Michael Paquier wrote:
> > > On Sun, Apr 07, 2024 at 10:54:56AM -0400, Melanie Plageman wrote:
> > >> I've added an open item [1], because what's one open item when you can
> > >> have two? (me)
> > >
> > > And this is still an open item as of today.  What's the plan to move
> > > forward here?
> >
> > AFAIK the plan is to replace the asserts with actually resetting the
> > rs_empty_tuples_pending field to 0, as suggested by Melanie a week ago.
> > I assume she was busy with the post-freeze AM reworks last week, so this
> > was on a back burner.
>
> yep, sorry. Also I took a few days off. I'm just catching up today. I
> want to pop in one of Richard or Tomas' examples as a test, since it
> seems like it would add some coverage. I will have a patch soon.

The patch with a fix is attached. I put the test in
src/test/regress/sql/join.sql. It isn't the perfect location because
it is testing something exercisable with a join but not directly
related to the fact that it is a join. I also considered
src/test/regress/sql/select.sql, but it also isn't directly related to
the query being a SELECT query. If there is a better place for a test
of a bitmap heap scan edge case, let me know.

One other note: there is some concurrency effect in the parallel
schedule group containing "join" where you won't trip the assert if
all the tests in that group in the parallel schedule are run. But, if
you would like to verify that the test exercises the correct code,
just reduce the group containing "join".

- Melanie
From 6ad777979c335f6cc16d3936defb634176a44995 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Tue, 23 Apr 2024 11:45:37 -0400
Subject: [PATCH v1] BitmapHeapScan: Remove incorrect assert

04e72ed617be pushed the skip fetch optimization (allowing bitmap heap
scans to operate like index-only scans if none of the underlying data is
needed) into heap AM-specific bitmap heap scan code.

04e72ed617be added an assert that all tuples in blocks eligible for the
optimization had been NULL-filled and emitted by the end of the scan.
This assert is incorrect when not all tuples need be scanned to execute
the query; for example: a join in which not all inner tuples need to be
scanned before skipping to the next outer tuple.

Author: Melanie Plageman
Reviewed-by: Richard Guo, Tomas Vondra
Discussion: https://postgr.es/m/CAMbWs48orzZVXa7-vP9Nt7vQWLTE04Qy4PePaLQYsVNQgo6qRg%40mail.gmail.com
---
 src/backend/access/heap/heapam.c   |  4 ++--
 src/test/regress/expected/join.out | 36 ++
 src/test/regress/sql/join.sql  | 24 
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 4a4cf76269d..8906f161320 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1184,7 +1184,7 @@ heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params,
 		scan->rs_vmbuffer = InvalidBuffer;
 	}
 
-	Assert(scan->rs_empty_tuples_pending == 0);
+	scan->rs_empty_tuples_pending = 0;
 
 	/*
 	 * The read stream is reset on rescan. This must be done before
@@ -1216,7 +1216,7 @@ heap_endscan(TableScanDesc sscan)
 	if (BufferIsValid(scan->rs_vmbuffer))
 		ReleaseBuffer(scan->rs_vmbuffer);
 
-	Assert(scan->rs_empty_tuples_pending == 0);
+	scan->rs_empty_tuples_pending = 0;
 
 	/*
 	 * Must free the read stream before freeing the BufferAccessStrategy.
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 8b640c2fc2f..ce73939c267 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8956,3 +8956,39 @@ where exists (select 1 from j3
 (13 rows)
 
 drop table j3;
+-- Check the case when:
+--   1. A join doesn't require all inner tuples to be scanned for each outer
+--  tuple, and
+--   2. The inner side is scanned using a bitmap heap scan, and
+--   3. The bitmap heap scan is eligible for the "skip fetch" optimization.
+--  This optimization is usable when no data from the underlying table is
+--  needed.
+CREATE TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10);
+INSERT INTO skip_fetch SELECT i % 3, i FROM GENERATE_SERIES(0,30) i;
+CREATE INDEX ON skip_fetch(a);
+VACUUM (ANALYZE) skip_fetch;
+SET enable_indexonlyscan = off;
+set enable_indexscan = off;
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF)
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+   QUERY PLAN
+-
+ Nested Loop Anti Join
+   ->  Seq Scan on skip_fetch t1
+   ->  Materialize
+ ->  Bitmap Heap Scan on skip_fetch t2
+   Recheck Cond: (a = 1)
+   ->  Bitmap Index Scan on skip_fetch_a_idx
+ Index Cond: (a = 1)
+(7 rows)

Re: Avoid orphaned objects dependencies, take 3

2024-04-23 Thread Bertrand Drouvot
Hi,

On Tue, Apr 23, 2024 at 04:59:09AM +, Bertrand Drouvot wrote:
> Hi,
> 
> On Mon, Apr 22, 2024 at 03:00:00PM +0300, Alexander Lakhin wrote:
> > 22.04.2024 13:52, Bertrand Drouvot wrote:
> > > 
> > > That's weird, I just launched it several times with the patch applied and 
> > > I'm not
> > > able to see the seg fault (while I can see it constently failing on the 
> > > master
> > > branch).
> > > 
> > > Are you 100% sure you tested it against a binary with the patch applied?
> > > 
> > 
> > Yes, at least I can't see what I'm doing wrong. Please try my
> > self-contained script attached.
> 
> Thanks for sharing your script!
> 
> Yeah your script ensures the patch is applied before the repro is executed.
> 
> I do confirm that I can also see the issue with the patch applied (but I had 
> to
> launch multiple attempts, while on master one attempt is enough).
> 
> I'll have a look.

Please find attached v2 that should not produce the issue anymore (I launched a
lot of attempts without any issues). v1 was not strong enough as it was not
always checking for the dependent object existence. v2 now always checks if the
object still exists after the additional lock acquisition attempt while 
recording
the dependency.

I still need to think about v2 but in the meantime could you please also give
v2 a try on your side?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From f80fdfc32e791463555aa318f26ff5e7363ac3ac Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Fri, 29 Mar 2024 15:43:26 +
Subject: [PATCH v2] Avoid orphaned objects dependencies

It's currently possible to create orphaned objects dependencies, for example:

Scenario 1:

session 1: begin; drop schema schem;
session 2: create a function in the schema schem
session 1: commit;

With the above, the function created in session 2 would be linked to a non
existing schema.

Scenario 2:

session 1: begin; create a function in the schema schem
session 2: drop schema schem;
session 1: commit;

With the above, the function created in session 1 would be linked to a non
existing schema.

To avoid those scenarios, a new lock (that conflicts with a lock taken by DROP)
has been put in place when the dependencies are being recorded. With this in
place, the drop schema in scenario 2 would be locked.

Also, after locking the object, the patch checks that the object still exists:
with this in place session 2 in scenario 1 would be locked and would report an
error once session 1 commits (that would not be the case should session 1 abort
the transaction).

The patch adds a few tests for some dependency cases (that would currently produce
orphaned objects):

- schema and function (as the above scenarios)
- function and type
- table and type
---
 src/backend/catalog/dependency.c  | 48 +
 src/backend/catalog/objectaddress.c   | 70 +++
 src/backend/catalog/pg_depend.c   |  6 ++
 src/include/catalog/dependency.h  |  1 +
 src/include/catalog/objectaddress.h   |  1 +
 src/test/modules/Makefile |  1 +
 src/test/modules/meson.build  |  1 +
 .../test_dependencies_locks/.gitignore|  3 +
 .../modules/test_dependencies_locks/Makefile  | 14 
 .../expected/test_dependencies_locks.out  | 49 +
 .../test_dependencies_locks/meson.build   | 12 
 .../specs/test_dependencies_locks.spec| 39 +++
 12 files changed, 245 insertions(+)
  38.1% src/backend/catalog/
  30.7% src/test/modules/test_dependencies_locks/expected/
  19.5% src/test/modules/test_dependencies_locks/specs/
   8.7% src/test/modules/test_dependencies_locks/

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index d4b5b2ade1..e3b66025dd 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1517,6 +1517,54 @@ AcquireDeletionLock(const ObjectAddress *object, int flags)
 	}
 }
 
+/*
+ * depLockAndCheckObject
+ *
+ * Lock the object that we are about to record a dependency on.
+ * After it's locked, verify that it hasn't been dropped while we
+ * weren't looking.  If the object has been dropped, this function
+ * does not return!
+ */
+void
+depLockAndCheckObject(const ObjectAddress *object)
+{
+	char	   *object_description;
+
+	/*
+	 * Those don't rely on LockDatabaseObject() when being dropped (see
+	 * AcquireDeletionLock()). Also it looks like they can not produce
+	 * orphaned dependent objects when being dropped.
+	 */
+	if (object->classId == RelationRelationId || object->classId == AuthMemRelationId)
+		return;
+
+	object_description = getObjectDescription(object, true);
+
+	/* assume we should lock the whole object not a sub-object */
+	LockDatabaseObject(object->classId, object->objectId, 0, AccessShareLock);
+
+	/* check if object still exists */
+	if (!ObjectByIdExist(object, false))
+	{
+		/*
+		 * 

Re: Statistics Import and Export

2024-04-23 Thread Matthias van de Meent
On Tue, 23 Apr 2024, 05:52 Tom Lane,  wrote:
> Jeff Davis  writes:
> > On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote:
> >> Loading data without stats, and hoping
> >> that auto-analyze will catch up sooner not later, is exactly the
> >> current behavior that we're doing all this work to get out of.
>
> > That's the disconnect, I think. For me, the main reason I'm excited
> > about this work is as a way to solve the bad-plans-after-upgrade
> > problem and to repro planner issues outside of production. Avoiding the
> > need to ANALYZE at the end of a data load is also a nice convenience,
> > but not a primary driver (for me).
>
> Oh, I don't doubt that there are use-cases for dumping stats without
> data.  I'm just dubious about the reverse.  I think data+stats should
> be the default, even if only because pg_dump's default has always
> been to dump everything.  Then there should be a way to get stats
> only, and maybe a way to get data only.  Maybe this does argue for a
> four-section definition, despite the ensuing churn in the pg_dump API.

I've heard of use cases where dumping stats without data would help
with production database planner debugging on a non-prod system.

Sure, some planner inputs would have to be taken into account too, but
having an exact copy of production stats is at least a start and can
help build models and alerts for what'll happen when the tables grow
larger with the current stats.

As for other planner inputs: table size is relatively easy to shim
with sparse files; cumulative statistics can be copied from a donor
replica if needed, and btree indexes only really really need to
contain their highest and lowest values (and need their height set
correctly).

Kind regards,

Matthias van de Meent




Re: gcc 12.1.0 warning

2024-04-23 Thread Andres Freund
Hi,

On 2024-04-15 11:25:05 +0300, Nazir Bilal Yavuz wrote:
> I am able to reproduce this. I regenerated the debian bookworm image
> and ran CI on REL_15_STABLE with this image.
> 
> CI Run: https://cirrus-ci.com/task/4978799442395136

Hm, not sure why I wasn't able to repro - now I can.

It actually seems like a legitimate warning: The caller allocates the key as

static struct config_generic *
find_option(const char *name, bool create_placeholders, bool skip_errors,
			int elevel)
{
	const char **key = &name;

and then does

	res = (struct config_generic **) bsearch((void *) &key,
											 (void *) guc_variables,
											 num_guc_variables,
											 sizeof(struct config_generic *),
											 guc_var_compare);

while guc_var_compare() assumes it's being passed a full config_generic:

static int
guc_var_compare(const void *a, const void *b)
{
	const struct config_generic *confa = *(struct config_generic *const *) a;
	const struct config_generic *confb = *(struct config_generic *const *) b;

	return guc_name_compare(confa->name, confb->name);
}


which several versions of gcc then complain about:

In function ‘guc_var_compare’,
inlined from ‘bsearch’ at 
/usr/include/x86_64-linux-gnu/bits/stdlib-bsearch.h:33:23,
inlined from ‘find_option’ at 
/home/andres/src/postgresql-15/src/backend/utils/misc/guc.c:5640:35:
/home/andres/src/postgresql-15/src/backend/utils/misc/guc.c:5727:38: warning: 
array subscript ‘const struct config_generic[0]’ is partly outside array bounds 
of ‘const char[8]’ [-Warray-bounds=]
 5727 | return guc_name_compare(confa->name, confb->name);
  | ~^~
/home/andres/src/postgresql-15/src/backend/utils/misc/guc.c: In function 
‘find_option’:
/home/andres/src/postgresql-15/src/backend/utils/misc/guc.c:5627:25: note: 
object ‘name’ of size 8
 5627 | find_option(const char *name, bool create_placeholders, bool 
skip_errors,


Which seems entirely legitimate. ISTM that guc_var_compare() ought to only
cast the pointers to the key type, i.e. char *.  And incidentally that does
prevent the warning.
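
A sketch of what that could look like (relying, as the existing bsearch()
usage already does, on "name" being the first field of struct
config_generic; this is just an illustration, not necessarily the committed
fix):

static int
guc_var_compare(const void *a, const void *b)
{
	const char *namea = **(const char *const *const *) a;
	const char *nameb = **(const char *const *const *) b;

	return guc_name_compare(namea, nameb);
}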

The reason it doesn't happen in newer versions of postgres is that we aren't
using guc_var_compare() in the relevant places anymore...

Greetings,

Andres Freund




Re: Cleanup: remove unused fields from nodes

2024-04-23 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Apr 22, 2024 at 06:46:27PM +0200, Matthias van de Meent wrote:
>> On Mon, 22 Apr 2024 at 17:41, Tom Lane  wrote:
>>> I think it would be a good idea to push 0003 for v17, just so nobody
>>> grows an unnecessary dependency on that field.  0001 and 0005 could
>>> be left for v18, but on the other hand they're so trivial that it
>>> could also be sensible to just push them to get them out of the way.

> Tweaking the APIs should be OK until GA, as long as we agree that the
> current interfaces can be improved.
> 0003 is new in v17, so let's apply it now.  I don't see much a strong
> argument in waiting for the removal of 0001 and 0005, either, to keep
> the interfaces cleaner moving on.  However, this is not a regression
> and these have been around for years, so I'd suggest for v18 to open
> before moving on with the removal.

I went ahead and pushed 0001 and 0003, figuring there was little
point in waiting on 0001.  I'd intended to push 0005 (remove "isall")
as well, but it failed check-world:

diff -U3 /home/postgres/pgsql/contrib/pg_stat_statements/expected/utility.out 
/home/postgres/pgsql/contrib/pg_stat_statements/results/utility.out
--- /home/postgres/pgsql/contrib/pg_stat_statements/expected/utility.out
2023-12-08 15:14:55.689347888 -0500
+++ /home/postgres/pgsql/contrib/pg_stat_statements/results/utility.out 
2024-04-23 12:17:22.187721947 -0400
@@ -536,12 +536,11 @@
 SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
  calls | rows |   query
 ---+--+
- 2 |0 | DEALLOCATE $1
- 2 |0 | DEALLOCATE ALL
+ 4 |0 | DEALLOCATE $1
  2 |2 | PREPARE stat_select AS SELECT $1 AS a
  1 |1 | SELECT $1 as a
  1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
-(5 rows)
+(4 rows)
 
 SELECT pg_stat_statements_reset() IS NOT NULL AS t;

That is, query jumbling no longer distinguishes "DEALLOCATE x" from
"DEALLOCATE ALL", because the DeallocateStmt.name field is marked
query_jumble_ignore.  Now maybe that's fine, but it's a point
we'd not considered so far in this thread.  Thoughts?

regards, tom lane




Re: Direct SSL connection with ALPN and HBA rules

2024-04-23 Thread Jacob Champion
On Fri, Apr 19, 2024 at 2:43 PM Heikki Linnakangas  wrote:
>
> On 19/04/2024 19:48, Jacob Champion wrote:
> > On Fri, Apr 19, 2024 at 6:56 AM Heikki Linnakangas  wrote:
> >> With direct SSL negotiation, we always require ALPN.
> >
> >   (As an aside: I haven't gotten to test the version of the patch that
> > made it into 17 yet, but from a quick glance it looks like we're not
> > rejecting mismatched ALPN during the handshake as noted in [1].)
>
> Ah, good catch, that fell through the cracks. Agreed, the client should
> reject a direct SSL connection if the server didn't send ALPN. I'll add
> that to the Open Items so we don't forget again.

Yes, the client should also reject, but that's not what I'm referring
to above. The server needs to fail the TLS handshake itself with the
proper error code (I think it's `no_application_protocol`?); otherwise
a client implementing a different protocol could consume the
application-level bytes coming back from the server and act on them.
That's the protocol confusion attack from ALPACA we're trying to
avoid.

> > Well, assuming the user is okay with plaintext negotiation at all.
> > (Was that fixed before the patch went in? Is GSS negotiation still
> > allowed even with requiredirect?)
>
> To disable sending the startup packet in plaintext, you need to use
> sslmode=require. Same as before the patch. GSS is still allowed, as it
> takes precedence over SSL if both are enabled in libpq. Per the docs:
>
> > Note that if gssencmode is set to prefer, a GSS connection is
> > attempted first. If the server rejects GSS encryption, SSL is
> > negotiated over the same TCP connection using the traditional
> > postgres protocol, regardless of sslnegotiation. In other words, the
> > direct SSL handshake is not used, if a TCP connection has already
> > been established and can be used for the SSL handshake.

Oh. That's actually disappointing, since gssencmode=prefer is the
default. A question I had in the original thread was, what's the
rationale behind a "require direct ssl" option that doesn't actually
require it?

> >> What would be the use case of requiring
> >> direct SSL in the server? What extra security do you get?
> >
> > You get protection against attacks that could have otherwise happened
> > during the plaintext portion of the handshake. That has architectural
> > implications for more advanced uses of SCRAM, and it should prevent
> > any repeats of CVE-2021-23222/23214. And if the peer doesn't pass the
> > TLS handshake, they can't send you anything that you might forget is
> > untrusted (like, say, an error message).
>
> Can you elaborate on the more advanced uses of SCRAM?

If you're using SCRAM to authenticate the server, as opposed to just a
really strong password auth, then it really helps an analysis of the
security to know that there are no plaintext bytes that have been
interpreted by the client. This came up briefly in the conversations
that led to commit d0f4824a.

To be fair, it's a more academic concern at the moment; my imagination
can only come up with problems for SCRAM-based TLS that would also be
vulnerabilities for standard certificate-based TLS. But whether or not
it's an advantage for the code today is also kind of orthogonal to my
point. The security argument of direct SSL mode is that it reduces
risk for the system as a whole, even in the face of future code
changes or regressions. If you can't force its use, you're not
reducing that risk very much. (If anything, a "require" option that
doesn't actually require it makes the analysis more complicated, not
less...)

> >> Controlling these in HBA is a bit inconvenient, because you only find
> >> out after authentication if it's allowed or not. So if e.g. direct SSL
> >> connections are disabled for a user,
> >
> > Hopefully disabling direct SSL piecemeal is not a desired use case?
> > I'm not sure it makes sense to focus on that. Forcing it to be enabled
> > shouldn't have the same problem, should it?
>
> Forcing it to be enabled piecemeal based on role or database has similar
> problems.

Hm. For some reason I thought it was easier the other direction, but I
can't remember why I thought that. I'll withdraw the comment for now
:)

--Jacob




Re: Direct SSL connection with ALPN and HBA rules

2024-04-23 Thread Jacob Champion
On Mon, Apr 22, 2024 at 10:42 PM Michael Paquier  wrote:
>
> On Mon, Apr 22, 2024 at 10:47:51AM +0300, Heikki Linnakangas wrote:
> > On 22/04/2024 10:19, Michael Paquier wrote:
> >> As a whole, I can get behind a unique GUC that forces the use of
> >> direct.  Or, we could extend the existing "ssl" GUC with a new
> >> "direct" value to accept only direct connections and restrict the
> >> original protocol (and a new "postgres" for the pre-16 protocol,
> >> rejecting direct?), while "on" is able to accept both.
> >
> > I'd be OK with that, although I still don't really see the point of forcing
> > this from the server side. We could also add this later.
>
> I'd be OK with doing something only in v18, if need be.  Jacob, what
> do you think?

I think it would be nice to have an option like that. Whether it's
done now or in 18, I don't have a strong opinion about. But I do think
it'd be helpful to have a consensus on whether or not this is a
security improvement, or a performance enhancement only, before adding
said option. As it's implemented, if the requiredirect option doesn't
actually requiredirect, I think it looks like security but isn't
really.

(My ideal server-side option removes all plaintext negotiation and
forces the use of direct SSL for every connection, paired with a new
postgresqls:// scheme for the client. But I don't have any experience
making a switchover like that at scale, and I'd like to avoid a
StartTLS-vs-LDAPS sort of situation. That's obviously not a
conversation for 17.)

As for HBA control: overall, I don't see a burning need for an
HBA-based configuration, honestly. I'd prefer to reduce the number of
knobs and make it easier to apply the strongest security with a broad
brush.

--Jacob




Re: soliciting patches to review

2024-04-23 Thread Robert Haas
Hi,

Just a quick update. We have so far had 8 suggested patches from 6
people, if I haven't missed anything. I'm fairly certain that not all
of those patches are going to be good candidates for this session, so
it would be great if a few more people wanted to volunteer their
patches.

Thanks,

...Robert




Re: soliciting patches to review

2024-04-23 Thread Melanie Plageman
On Tue, Apr 23, 2024 at 1:27 PM Robert Haas  wrote:
>
> Hi,
>
> Just a quick update. We have so far had 8 suggested patches from 6
> people, if I haven't missed anything. I'm fairly certain that not all
> of those patches are going to be good candidates for this session, so
> it would be great if a few more people wanted to volunteer their
> patches.

Since you are going to share the patches anyway at the workshop, do
you mind giving an example of a patch that is a good fit for the
workshop? Alternatively, you could provide a hypothetical example. I,
of course, have patches that I'd like reviewed. But, I'm unconvinced
any of them would be particularly interesting in a workshop.

- Melanie




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-23 Thread Jacob Champion
On Mon, Apr 22, 2024 at 2:20 PM Jelte Fennema-Nio  wrote:
> 1. I strongly believe minor protocol version bumps after the initial
> 3.1 one can be made painless for clients/poolers (so the ones to
> 3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
> having to worry about breaking TLS 1.2 communication.

Apologies for focusing on a single portion of your argument, but this
claim in particular stuck out to me. To my understanding, IETF worried
a _lot_ about breaking TLS 1.2 implementations with the TLS 1.3
change, to the point that TLS 1.3 clients and servers advertise
themselves as TLS 1.2 and sneak the actual version used into a TLS
extension (roughly analogous to the _pq_ stuff). I vaguely recall that
the engineering work done for that update was pretty far from
painless.

--Jacob




Re: Direct SSL connection with ALPN and HBA rules

2024-04-23 Thread Robert Haas
On Tue, Apr 23, 2024 at 1:22 PM Jacob Champion
 wrote:
> On Mon, Apr 22, 2024 at 10:42 PM Michael Paquier  wrote:
> > On Mon, Apr 22, 2024 at 10:47:51AM +0300, Heikki Linnakangas wrote:
> > > On 22/04/2024 10:19, Michael Paquier wrote:
> > >> As a whole, I can get behind a unique GUC that forces the use of
> > >> direct.  Or, we could extend the existing "ssl" GUC with a new
> > >> "direct" value to accept only direct connections and restrict the
> > >> original protocol (and a new "postgres" for the pre-16 protocol,
> > >> rejecting direct?), while "on" is able to accept both.
> > >
> > > I'd be OK with that, although I still don't really see the point of 
> > > forcing
> > > this from the server side. We could also add this later.
> >
> > I'd be OK with doing something only in v18, if need be.  Jacob, what
> > do you think?
>
> I think it would be nice to have an option like that. Whether it's
> done now or in 18, I don't have a strong opinion about. But I do think
> it'd be helpful to have a consensus on whether or not this is a
> security improvement, or a performance enhancement only, before adding
> said option. As it's implemented, if the requiredirect option doesn't
> actually requiredirect, I think it looks like security but isn't
> really.

I've not followed this thread closely enough to understand the comment
about requiredirect maybe not actually requiring direct, but if that
were true it seems like it might be concerning.

But as far as having a GUC to force direct SSL or not, I agree that's
a good idea, and that it's better than only being able to control the
behavior through pg_hba.conf, because it removes room for any possible
doubt about whether you're really enforcing the behavior you want to
be enforcing. It might also mean that the connection can be rejected
earlier in the handshaking process on the basis of the GUC value,
which could conceivably prevent a client from reaching some piece of
code that turns out to have a security vulnerability. For example, if
we found out that direct SSL connections let you take over the
Pentagon before reaching the authentication stage, but for some reason
regular connections don't have the same problem, being able to
categorically shut off direct SSL would be valuable.

However, I don't really see why this has to be done for this release.
It seems like a separate feature from direct SSL itself. If direct SSL
hadn't been committed at the very last minute, then it would have been
great if this had been done for this release, too. But it was. The
moral we ought to take from that is "perhaps get the big features in a
bit further in advance of the freeze," not "well we'll just keep
hacking after the freeze."

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: soliciting patches to review

2024-04-23 Thread Robert Haas
On Tue, Apr 23, 2024 at 1:39 PM Melanie Plageman
 wrote:
> Since you are going to share the patches anyway at the workshop, do
> you mind giving an example of a patch that is a good fit for the
> workshop? Alternatively, you could provide a hypothetical example. I,
> of course, have patches that I'd like reviewed. But, I'm unconvinced
> any of them would be particularly interesting in a workshop.

Andres and I haven't discussed our selection criteria yet, but my
feeling is that we're going to want patches that are somewhat
medium-sized. If your patch makes PostgreSQL capable of
faster-than-light travel, it's probably too big to be reviewed
meaningfully in the time we will have. If your patch just corrects
a bunch of typos, it probably lacks enough substance to be worth
discussing. I hesitate to propose more specific parameters. On the one
hand, a patch that changes something user-visible that someone could
reasonably like or dislike is probably easier to review, in some
sense, than a patch that refactors code or tries to improve
performance. However, talking about how to review patches where it's
less obvious what you should be trying to evaluate might be an
important part of the workshop, so my feeling is that I would prefer
it if more people would volunteer and then let Andres and me sort
through what we think makes sense to include.

I would also be happy to have people "blanket submit" without naming
patches i.e. if anyone wants to email and say "hey, feel free to
include any of my stuff if you want" that is great. Our concern was
that we didn't want to look like we were picking on anyone who wasn't
up for it. I'm happy to keep getting emails from people with specific
patches they want reviewed -- if we can hit a patch that someone wants
reviewed that is better for everyone than if we just pick randomly --
but my number one concern is not offending anyone.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Direct SSL connection with ALPN and HBA rules

2024-04-23 Thread Jacob Champion
On Tue, Apr 23, 2024 at 10:43 AM Robert Haas  wrote:
> I've not followed this thread closely enough to understand the comment
> about requiredirect maybe not actually requiring direct, but if that
> were true it seems like it might be concerning.

It may be my misunderstanding. This seems to imply bad behavior:

> If the server rejects GSS encryption, SSL is
> negotiated over the same TCP connection using the traditional postgres
> protocol, regardless of sslnegotiation.

As does this comment:

> +   /*
> +* If enabled, try direct SSL. Unless we have a valid TCP connection that
> +* failed negotiating GSSAPI encryption or a plaintext connection in case
> +* of sslmode='allow'; in that case we prefer to reuse the connection with
> +* negotiated SSL, instead of reconnecting to do direct SSL. The point of
> +* direct SSL is to avoid the roundtrip from the negotiation, but
> +* reconnecting would also incur a roundtrip.
> +*/

but when I actually try those cases, I see that requiredirect does
actually cause a direct SSL connection to be done, even with
sslmode=allow. So maybe it's just misleading documentation (or my
misreading of it) that needs to be expanded? Am I missing a different
corner case where requiredirect is ignored, Heikki?

I still question the utility of allowing sslmode=allow with
sslnegotiation=requiredirect, because it seems like you've made both
the performance and security characteristics actively worse if you
choose that combination. But I want to make sure I understand the
current behavior correctly before I derail the discussion too much...

Thanks,
--Jacob




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-04-23 Thread Nathan Bossart
On Mon, Apr 22, 2024 at 03:20:10PM -0500, Nathan Bossart wrote:
> On Mon, Apr 22, 2024 at 04:08:08PM -0400, Tom Lane wrote:
>> The problem with the README is that it describes that process,
>> rather than the now-typical workflow of incrementally keeping
>> the tree indented.  I don't think we want to remove the info
>> about how to do the full-monty process, but you're right that
>> the README needs to explain the incremental method as being
>> the one most developers would usually use.
>> 
>> Want to write some text?
> 
> Yup, I'll put something together.

Here is a first attempt.  I'm not tremendously happy with it, but it at
least gets something on the page to build on.  I was originally going to
copy/paste the relevant steps into the description of the incremental
process, but that seemed kind-of silly, so I instead just pointed to the
relevant steps of the "full" process, along with the deviations from those
steps.  That's a little more work for the reader, but maybe it isn't too
bad...  I plan to iterate on this patch some more.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
index b649a21f59..221e3fc010 100644
--- a/src/tools/pgindent/README
+++ b/src/tools/pgindent/README
@@ -1,8 +1,11 @@
 pgindent'ing the PostgreSQL source tree
 ===
 
-We run this process at least once in each development cycle,
-to maintain uniform layout style in our C and Perl code.
+The following process is used to maintain uniform layout style in our C and Perl
+code.  Once in each development cycle, a complete, independent run is performed.
+Additionally, an abridged run should be performed on every new commit to
+minimize deviations from pgindent's preferred style.  See below for more details
+on each type of run.
 
 You might find this blog post interesting:
 http://adpgtech.blogspot.com/2015/05/running-pgindent-on-non-core-code-or.html
@@ -25,7 +28,20 @@ PREREQUISITES:
Or if you have cpanm installed, you can just use:
cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20230309.tar.gz
 
-DOING THE INDENT RUN:
+DOING AN INCREMENTAL, PER-COMMIT INDENT RUN:
+
+First, complete steps 1-3 of a complete run listed below.  Note that there is
+no need to download the typedef file from the buildfarm.  Standard practice is
+to manually update this file as needed in the same commit that adds or removes
+types.  The once-per-development-cycle run will resolve any differences between
+the incrementally updated version of the file and a clean, autogenerated one.
+
+Finally, complete steps 1-3 of the validation section below.  Ensure that any
+changes made by pgindent are included in the patch prior to committing.  If you
+forget to do so, add the missed pgindent changes in a follow-up commit and also
+do step 4 of the validation section below.
+
+DOING A COMPLETE, ONCE-PER-DEVELOPMENT-CYCLE INDENT RUN:
 
 1) Change directory to the top of the source tree.
 


Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-23 Thread Daniel Gustafsson
> On 19 Apr 2024, at 10:06, Peter Eisentraut  wrote:
> 
> On 19.04.24 07:37, Michael Paquier wrote:
>> On Thu, Apr 18, 2024 at 12:53:43PM +0200, Peter Eisentraut wrote:
>>> If everything is addressed, I agree that 0001, 0003, and 0004 can go into
>>> PG17, the rest later.
>> About the PG17 bits, would you agree about a backpatch?  Or perhaps
>> you disagree?
> 
> I don't think any of these need to be backpatched, at least right now.
> 
> 0001 is just a cosmetic documentation tweak, has no reason to be backpatched.
> 
> 0003 adds new functionality for LibreSSL.  While the code looks 
> straightforward, we have little knowledge about how it works in practice.  
> How is the buildfarm coverage of LibreSSL (with SSL tests enabled!)?  If 
> people are keen on this, it might be better to get it into PG17 and at least 
> let it go through a few months of beta testing.
> 
> 0004 effectively just enhances an error message for LibreSSL; there is little 
> reason to backpatch this.

Hearing no objections to this plan (and the posted v10), I'll go ahead with
0001, 0003 and 0004 into v17 tomorrow.

--
Daniel Gustafsson





Re: Direct SSL connection with ALPN and HBA rules

2024-04-23 Thread Heikki Linnakangas

On 23/04/2024 22:33, Jacob Champion wrote:

On Tue, Apr 23, 2024 at 10:43 AM Robert Haas  wrote:

I've not followed this thread closely enough to understand the comment
about requiredirect maybe not actually requiring direct, but if that
were true it seems like it might be concerning.


It may be my misunderstanding. This seems to imply bad behavior:


If the server rejects GSS encryption, SSL is
negotiated over the same TCP connection using the traditional postgres
protocol, regardless of sslnegotiation.


As does this comment:


+   /*
+* If enabled, try direct SSL. Unless we have a valid TCP connection that
+* failed negotiating GSSAPI encryption or a plaintext connection in case
+* of sslmode='allow'; in that case we prefer to reuse the connection with
+* negotiated SSL, instead of reconnecting to do direct SSL. The point of
+* direct SSL is to avoid the roundtrip from the negotiation, but
+* reconnecting would also incur a roundtrip.
+*/


but when I actually try those cases, I see that requiredirect does
actually cause a direct SSL connection to be done, even with
sslmode=allow. So maybe it's just misleading documentation (or my
misreading of it) that needs to be expanded? Am I missing a different
corner case where requiredirect is ignored, Heikki?


You're right, the comment is wrong about sslmode=allow. There is no 
negotiation of a plaintext connection, the client just sends the startup 
packet directly. The HBA rules can reject it, but the client will have 
to disconnect and reconnect in that case.


The documentation and that comment are misleading about failed GSSAPI 
encryption too, and I also misremembered that. With 
sslnegotiation=requiredirect, libpq never uses negotiated SSL mode. It 
will reconnect after a rejected GSSAPI request. So that comment applies 
to sslnegotiation=direct, but not sslnegotiation=requiredirect.


Attached patch tries to fix and clarify those.

(Note that the client will only attempt GSSAPI encryption if it can find 
kerberos credentials in the environment.)


--
Heikki Linnakangas
Neon (https://neon.tech)
From 664555decb695123a4bf25ea56f789202b926ea0 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Wed, 24 Apr 2024 00:10:24 +0300
Subject: [PATCH 1/1] Fix documentation and comments on what happens after GSS
 rejection

The paragraph in the docs and the comment applied to
sslnegotiaton=direct, but not sslnegotiation=requiredirect. In
'requiredirect' mode, negotiated SSL is never used. Move the paragraph
in the docs under the description of 'direct' mode, and rephrase it.

Also the comment's reference to reusing a plaintext connection was
bogus. Authentication failure in plaintext mode only happens after
sending the startup packet, so the connection cannot be reused.
---
 doc/src/sgml/libpq.sgml   | 19 +--
 src/interfaces/libpq/fe-connect.c | 11 ++-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 9199d0d2e5..7f854edfa2 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1803,6 +1803,15 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
  process adds significant latency if the initial SSL connection
  fails.

+   
+ An exception is if gssencmode is set
+ to prefer, but the server rejects GSS encryption.
+ In that case, SSL is negotiated over the same TCP connection using
+ PostgreSQL protocol negotiation. In
+ other words, the direct SSL handshake is not used, if a TCP
+ connection has already been established and can be used for the
+ SSL handshake.
+   
   
  
 
@@ -1816,16 +1825,6 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
  
 
-
-   
-Note that if gssencmode is set
-to prefer, a GSS connection is
-attempted first. If the server rejects GSS encryption, SSL is
-negotiated over the same TCP connection using the traditional postgres
-protocol, regardless of sslnegotiation. In other
-words, the direct SSL handshake is not used, if a TCP connection has
-already been established and can be used for the SSL handshake.
-   
   
  
 
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index ec20e3f3a9..b1d3bfa59d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4430,11 +4430,12 @@ select_next_encryption_method(PGconn *conn, bool have_valid_connection)
 
 	/*
 	 * If enabled, try direct SSL. Unless we have a valid TCP connection that
-	 * failed negotiating GSSAPI encryption or a plaintext connection in case
-	 * of sslmode='allow'; in that case we prefer to reuse the connection with
-	 * negotiated SSL, instead of reconnecting to do direct SSL. The point of
-	 * direct SSL is to avoid the ro

Re: GUC-ify walsender MAX_SEND_SIZE constant

2024-04-23 Thread Andres Freund
Hi,

On 2024-04-23 14:47:31 +0200, Jakub Wartak wrote:
> On Tue, Apr 23, 2024 at 2:24 AM Michael Paquier  wrote:
> >
> > > Any news, comments, etc. about this thread?
> >
> > FWIW, I'd still be in favor of doing a GUC-ification of this part, but
> > at this stage I'd need more time to do a proper study of a case where
> > this shows benefits to prove your point, or somebody else could come
> > in and show it.
> >
> > Andres has objected to this change, on the ground that this was not
> > worth it, though you are telling the contrary.  I would be curious to
> > hear from others, first, so as we gather more opinions to reach a
> > consensus.

I think it's a bad idea to make it configurable. It's just one more guc that
nobody has a chance of realistically tuning.  I'm not saying we shouldn't
improve the code - just that making MAX_SEND_SIZE configurable doesn't really
seem like a good answer.

FWIW, I have a hard time believing that MAX_SEND_SIZE is going to be the
only or even primary issue with high latency, high bandwidth storage devices.



> First: it's very hard to get *reliable* replication setup for
> benchmark, where one could demonstrate correlation between e.g.
> increasing MAX_SEND_SIZE and observing benefits (in sync rep it is
> easier, as you are simply stalled in pgbench). Part of the problem are
> the following things:

Depending on the workload, it's possible to measure streaming-out performance
without actually regenerating WAL. E.g. by using pg_receivewal to stream the
data out multiple times.


Another way to get fairly reproducible WAL workloads is to drive
pg_logical_emit_message() from pgbench, which tends to have a lot less
variability than running tpcb-like or such.
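
For example, a custom pgbench script containing just a single statement
along these lines (a rough sketch; the prefix and message size are
arbitrary) generates WAL at a steady rate without touching any table:

-- emit_message.sql: each execution writes one ~1 kB logical message to WAL
SELECT pg_logical_emit_message(false, 'bench', repeat('x', 1024));

Something like "pgbench -n -f emit_message.sql -c 4 -T 60" then gives a
fairly reproducible WAL stream to replay or stream out.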


> Second: once you do the above and ensure that there are no network or
> I/O stalls, back then I *think* I couldn't see any impact of playing
> with MAX_SEND_SIZE; from what I remember, probably something else is
> saturated first.

My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the
bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is
that it determines the max size passed to WALRead(), which in turn determines
how much we read from the OS at once.  If the storage has high latency but
also high throughput, and readahead is disabled or just not aggressive enough
after crossing segment boundaries, larger reads reduce the number of times
you're likely to be blocked waiting for read IO.

Which is also why I think that making MAX_SEND_SIZE configurable is a really
poor proxy for improving the situation.

We're imo much better off working on read_stream.[ch] support for reading WAL.

Greetings,

Andres Freund




Tarball builds in the new world order

2024-04-23 Thread Tom Lane
With one eye on the beta-release calendar, I thought it'd be a
good idea to test whether our tarball build script works for
the new plan where we'll use "git archive" instead of the
traditional process.

It doesn't.

It makes tarballs all right, but whatever commit ID you specify
is semi-ignored, and you get a tarball corresponding to HEAD
of master.  (The PDFs come from the right version, though!)

The reason for that is that the mk-one-release script does this
(shorn of not-relevant-here details):

export BASE=/home/pgsql
export GIT_DIR=$BASE/postgresql.git

mkdir pgsql

# Export the selected git ref
git archive ${gitref} | tar xf - -C pgsql

cd pgsql
./configure

# Produce .tar.gz and .tar.bz2 files
make dist

Since 619bc23a1, what "make dist" does is

$(GIT) -C $(srcdir) -c core.autocrlf=false archive --format tar.gz -9 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
$(GIT) -C $(srcdir) -c core.autocrlf=false -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@

Since GIT_DIR is set, git consults that repo not the current working
directory, so HEAD means whatever it means in a just-fetched repo,
and mk-one-release's efforts to select the ${gitref} commit mean
nothing.  (If git had tried to consult the current working directory,
it would've failed for lack of any .git subdir therein.)

I really really don't want to put version-specific coding into
mk-one-release, but fortunately I think we don't have to.
What I suggest is doing this in mk-one-release:

-make dist
+make dist PG_COMMIT_HASH=${gitref}

and changing the "make dist" rules to write $(PG_COMMIT_HASH) not
HEAD.  The extra make variable will have no effect in the back
branches, while it should cause the right thing to happen with
the new implementation of "make dist".

This change seems like a good thing anyway for anyone who's tempted
to use "make dist" manually, since they wouldn't necessarily want
to package HEAD either.  Now, if we just do it exactly like that
then trying to "make dist" without setting PG_COMMIT_HASH will
fail, since "git archive" has no default for that
argument.  I can't quite decide if that's a good thing, or if we
should hack the makefile a little further to allow PG_COMMIT_HASH
to default to HEAD.

Thoughts, better ideas?

regards, tom lane




Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-23 Thread Michael Paquier
On Tue, Apr 23, 2024 at 10:08:13PM +0200, Daniel Gustafsson wrote:
> Hearing no objections to this plan (and the posted v10), I'll go ahead with
> 0001, 0003 and 0004 into v17 tomorrow.

WFM, thanks.
--
Michael


signature.asc
Description: PGP signature


Re: BitmapHeapScan streaming read user and prelim refactoring

2024-04-23 Thread Tomas Vondra



On 4/23/24 18:05, Melanie Plageman wrote:
> On Mon, Apr 22, 2024 at 1:01 PM Melanie Plageman
>  wrote:
>>
>> On Thu, Apr 18, 2024 at 5:39 AM Tomas Vondra
>>  wrote:
>>>
>>> On 4/18/24 09:10, Michael Paquier wrote:
 On Sun, Apr 07, 2024 at 10:54:56AM -0400, Melanie Plageman wrote:
> I've added an open item [1], because what's one open item when you can
> have two? (me)

 And this is still an open item as of today.  What's the plan to move
 forward here?
>>>
>>> AFAIK the plan is to replace the asserts with actually resetting the
>>> rs_empty_tuples_pending field to 0, as suggested by Melanie a week ago.
>>> I assume she was busy with the post-freeze AM reworks last week, so this
>>> was on a back burner.
>>
>> yep, sorry. Also I took a few days off. I'm just catching up today. I
>> want to pop in one of Richard or Tomas' examples as a test, since it
>> seems like it would add some coverage. I will have a patch soon.
> 
> The patch with a fix is attached. I put the test in
> src/test/regress/sql/join.sql. It isn't the perfect location because
> it is testing something exercisable with a join but not directly
> related to the fact that it is a join. I also considered
> src/test/regress/sql/select.sql, but it also isn't directly related to
> the query being a SELECT query. If there is a better place for a test
> of a bitmap heap scan edge case, let me know.
> 

I don't see a problem with adding this to join.sql - why wouldn't this
count as something related to a join? Sure, it's not like this code
matters only for joins, but if you look at join.sql that applies to a
number of other tests (e.g. there are a couple btree tests).

That being said, it'd be good to explain in the comment why we're
testing this particular plan, not just what the plan looks like.

> One other note: there is some concurrency effect in the parallel
> schedule group containing "join" where you won't trip the assert if
> all the tests in that group in the parallel schedule are run. But, if
> you would like to verify that the test exercises the correct code,
> just reduce the group containing "join".
> 

That is ... interesting. Doesn't that mean that most test runs won't
actually detect the problem? That would make the test a bit useless.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pg_combinebackup does not detect missing files

2024-04-23 Thread David Steele

On 4/22/24 23:53, Robert Haas wrote:

On Sun, Apr 21, 2024 at 8:47 PM David Steele  wrote:

I figured that wouldn't be particularly meaningful, and
that's pretty much the only kind of validation that's even
theoretically possible without a bunch of extra overhead, since we
compute checksums on entire files rather than, say, individual blocks.
And you could really only do it for the final backup in the chain,
because you should end up accessing all of those files, but the same
is not true for the predecessor backups. So it's a very weak form of
verification.


I don't think it is weak if you can verify that the output is exactly as
expected, i.e. all files are present and have the correct contents.


I don't understand what you mean here. I thought we were in agreement
that verifying contents would cost a lot more. The verification that
we can actually do without much cost can only check for missing files
in the most recent backup, which is quite weak. pg_verifybackup is
available if you want more comprehensive verification and you're
willing to pay the cost of it.


I simply meant that it is *possible* to verify the output of 
pg_combinebackup without explicitly verifying all the backups. There 
would be overhead, yes, but it would be less than verifying each backup 
individually. For my 2c that efficiency would make it worth doing 
verification in pg_combinebackup, with perhaps a switch to turn it off 
if the user is confident in their sources.



I think it is a worthwhile change and we are still a month away from
beta1. We'll see if anyone disagrees.


I don't plan to press forward with this in this release unless we get
a couple of +1s from disinterested parties. We're now two weeks after
feature freeze and this is design behavior, not a bug. Perhaps the
design should have been otherwise, but two weeks after feature freeze
is not the time to debate that.


It doesn't appear that anyone but me is terribly concerned about 
verification, even in this weak form, so probably best to hold this 
patch until the next release. As you say, it is late in the game.


Regards,
-David




Re: Requiring LLVM 14+ in PostgreSQL 18

2024-04-23 Thread Thomas Munro
Rebased over ca89db5f.

I looked into whether we could drop the "old pass manager" code
too[1].  Almost, but nope, even the C++ API lacks a way to set the
inline threshold before LLVM 16, so that would cause a regression.
Although we just hard-code the threshold to 512 with a comment that
sounds like it's pretty arbitrary, a change to the default (225?)
would be unjustifiable just for code cleanup.  Oh well.

[1] 
https://github.com/macdice/postgres/commit/0d40abdf1feb75210c3a3d2a35e3d6146185974c


v2-0001-jit-Require-at-least-LLVM-14-if-enabled.patch
Description: Binary data


v2-0002-jit-Use-opaque-pointers-in-all-supported-LLVM-ver.patch
Description: Binary data


Re: query_id, pg_stat_activity, extended query protocol

2024-04-23 Thread Imseih (AWS), Sami
> I am also a bit surprised with the choice of using the first Query
> available in the list for the ID, FWIW.


IIUC, the query trees returned from QueryRewrite
will all have the same queryId, so it appears valid to 
use the queryId from the first tree in the list. Right?

Here is an example I was working with that includes user-defined rules
that has a list with more than 1 tree.
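
(The setup was roughly along these lines; this is only a sketch, and the
table and rule names are illustrative. The DO ALSO rules are what make the
rewriter return several query trees for a single INSERT:)

CREATE TABLE mytab  (id int);
CREATE TABLE mytab2 (id int);
CREATE TABLE mytab3 (id int);
CREATE TABLE mytab4 (id int);
CREATE RULE mytab_r2 AS ON INSERT TO mytab DO ALSO INSERT INTO mytab2 VALUES (NEW.id);
CREATE RULE mytab_r3 AS ON INSERT TO mytab DO ALSO INSERT INTO mytab3 VALUES (NEW.id);
CREATE RULE mytab_r4 AS ON INSERT TO mytab DO ALSO INSERT INTO mytab4 VALUES (NEW.id);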


postgres=# explain (verbose, generic_plan) insert into mytab values ($1) 
RETURNING pg_sleep($1), id ;
QUERY PLAN 
---
Insert on public.mytab (cost=0.00..0.01 rows=1 width=4)
Output: pg_sleep(($1)::double precision), mytab.id
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425


Insert on public.mytab2 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425


Insert on public.mytab3 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425


Insert on public.mytab4 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425
(20 rows)



> Did you consider using \bind to show how this behaves in a regression
> test?


Yes, this is precisely how I tested. Without the patch, I could not
see a queryId after 9 seconds of a pg_sleep, but with the patch it 
appears. See the test below.


## test query
select pg_sleep($1) \bind 30


## unpatched
postgres=# select 
query_id, 
query, 
now()-query_start query_duration, 
state 
from pg_stat_activity where pid <> pg_backend_pid()
and state = 'active';
query_id |query | query_duration  | state  
--+--+-+
  | select pg_sleep($1) +| 00:00:08.604845 | active
  | ;| | 
(1 row)

## patched

postgres=# truncate table large;^C
postgres=# select 
query_id, 
query, 
now()-query_start query_duration, 
state 
from pg_stat_activity where pid <> pg_backend_pid()
and state = 'active';
  query_id   |query | query_duration | state  
-+--++
 2433215470630378210 | select pg_sleep($1) +| 00:00:09.6881  | active
 | ;|| 
(1 row)


For exec_execute_message, I realized that to report queryId for
Utility and non-utility statements, we need to report the queryId 
inside the portal routines where PlannedStmt contains the queryId.

Attached is the first real attempt at the fix. 

Regards,


Sami







0001-v1-Fix-Extended-QUery-Protocol-handling-of-queryId.patch
Description: 0001-v1-Fix-Extended-QUery-Protocol-handling-of-queryId.patch


Re: Streaming I/O, vectored I/O (WIP)

2024-04-23 Thread David Rowley
I've attached a patch with a few typo fixes and what looks like an
incorrect type for max_ios. It's an int16 and I think it needs to be
an int. Doing "max_ios = Min(max_ios, PG_INT16_MAX);" doesn't do
anything when max_ios is int16.

David
diff --git a/src/backend/storage/aio/read_stream.c 
b/src/backend/storage/aio/read_stream.c
index 634cf4f0d1..74b9bae631 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -26,12 +26,12 @@
  *
  * B) I/O is necessary, but fadvise is undesirable because the access is
  * sequential, or impossible because direct I/O is enabled or the system
- * doesn't support advice.  There is no benefit in looking ahead more than
- * io_combine_limit, because in this case only goal is larger read system
+ * doesn't support fadvise.  There is no benefit in looking ahead more than
+ * io_combine_limit, because in this case the only goal is larger read system
  * calls.  Looking further ahead would pin many buffers and perform
  * speculative work looking ahead for no benefit.
  *
- * C) I/O is necesssary, it appears random, and this system supports fadvise.
+ * C) I/O is necessary, it appears random, and this system supports fadvise.
  * We'll look further ahead in order to reach the configured level of I/O
  * concurrency.
  *
@@ -418,7 +418,7 @@ read_stream_begin_relation(int flags,
ReadStream *stream;
size_t  size;
int16   queue_size;
-   int16   max_ios;
+   int max_ios;
int strategy_pin_limit;
uint32  max_pinned_buffers;
Oid tablespace_id;
@@ -447,6 +447,8 @@ read_stream_begin_relation(int flags,
max_ios = 
get_tablespace_maintenance_io_concurrency(tablespace_id);
else
max_ios = get_tablespace_io_concurrency(tablespace_id);
+
+   /* Cap to INT16_MAX to avoid overflowing below */
max_ios = Min(max_ios, PG_INT16_MAX);
 
/*


Extend ALTER DEFAULT PRIVILEGES for large objects

2024-04-23 Thread Yugo NAGATA
Hi,

Currently, ALTER DEFAULT PRIVILEGES doesn't support large objects,
so if we want to allow users other than the owner to use a large
object, we need to grant a privilege on it every time a large object
is created. One of our clients feels that this is annoying, so I would
like to propose extending ALTER DEFAULT PRIVILEGES to large objects.

Here are the new actions allowed in abbreviated_grant_or_revoke;

+GRANT { { SELECT | UPDATE }
+[, ...] | ALL [ PRIVILEGES ] }
+ON LARGE OBJECTS
+TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]

+REVOKE [ GRANT OPTION FOR ]
+{ { SELECT | UPDATE }
+[, ...] | ALL [ PRIVILEGES ] }
+ON LARGE OBJECTS
+FROM { [ GROUP ] role_name | PUBLIC } [, ...]
+[ CASCADE | RESTRICT ]

A new keyword OBJECTS is introduced so that the syntax uses the plural
form, as for the other supported object types. A schema name is not allowed
to be specified for large objects, since large objects don't belong to any
schema.
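
With the patch, the intended usage would look something like this (a
sketch only; the role names are just for illustration):

-- run once by (or on behalf of) the role that will create large objects
ALTER DEFAULT PRIVILEGES FOR ROLE alice
    GRANT SELECT ON LARGE OBJECTS TO bob;

-- any large object alice creates afterwards is readable by bob
-- without an explicit per-object GRANT
SELECT lo_create(0);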

The attached patch was originally proposed by Haruka Takatsuka,
with some fixes and tests added by me.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
>From fe2cb39bd83d09a052d5d63889acd0968c1817b6 Mon Sep 17 00:00:00 2001
From: Yugo Nagata 
Date: Fri, 8 Mar 2024 17:43:43 +0900
Subject: [PATCH] Extend ALTER DEFAULT PRIVILEGES for large objects

Original patch by Haruka Takatsuka, some fixes and tests by
Yugo Nagata.
---
 doc/src/sgml/catalogs.sgml|   3 +-
 .../sgml/ref/alter_default_privileges.sgml|  15 ++-
 src/backend/catalog/aclchk.c  |  21 
 src/backend/catalog/pg_largeobject.c  |  18 ++-
 src/backend/parser/gram.y |   5 +-
 src/include/catalog/pg_default_acl.h  |   1 +
 src/include/parser/kwlist.h   |   1 +
 src/test/regress/expected/privileges.out  | 104 +-
 src/test/regress/sql/privileges.sql   |  47 
 9 files changed, 208 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2907079e2a..b8cc822aeb 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3330,7 +3330,8 @@ SCRAM-SHA-256$:&l
S = sequence,
f = function,
T = type,
-   n = schema
+   n = schema,
+   L = large object
   
  
 
diff --git a/doc/src/sgml/ref/alter_default_privileges.sgml b/doc/src/sgml/ref/alter_default_privileges.sgml
index 1de4c5c1b4..3b358b7a88 100644
--- a/doc/src/sgml/ref/alter_default_privileges.sgml
+++ b/doc/src/sgml/ref/alter_default_privileges.sgml
@@ -50,6 +50,11 @@ GRANT { USAGE | CREATE | ALL [ PRIVILEGES ] }
 ON SCHEMAS
 TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]
 
+GRANT { { SELECT | UPDATE }
+[, ...] | ALL [ PRIVILEGES ] }
+ON LARGE OBJECTS
+TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]
+
 REVOKE [ GRANT OPTION FOR ]
 { { SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES | TRIGGER | MAINTAIN }
 [, ...] | ALL [ PRIVILEGES ] }
@@ -81,6 +86,13 @@ REVOKE [ GRANT OPTION FOR ]
 ON SCHEMAS
 FROM { [ GROUP ] role_name | PUBLIC } [, ...]
 [ CASCADE | RESTRICT ]
+
+REVOKE [ GRANT OPTION FOR ]
+{ { SELECT | UPDATE }
+[, ...] | ALL [ PRIVILEGES ] }
+ON LARGE OBJECTS
+FROM { [ GROUP ] role_name | PUBLIC } [, ...]
+[ CASCADE | RESTRICT ]
 
  
 
@@ -159,7 +171,8 @@ REVOKE [ GRANT OPTION FOR ]
   If IN SCHEMA is omitted, the global default privileges
   are altered.
   IN SCHEMA is not allowed when setting privileges
-  for schemas, since schemas can't be nested.
+  for schemas and large objects, since schemas can't be nested and
+  large objects don't belong to a schema.
  
 

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 7abf3c2a74..41baf81a1d 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1077,6 +1077,10 @@ ExecAlterDefaultPrivilegesStmt(ParseState *pstate, AlterDefaultPrivilegesStmt *s
 			all_privileges = ACL_ALL_RIGHTS_SCHEMA;
 			errormsg = gettext_noop("invalid privilege type %s for schema");
 			break;
+		case OBJECT_LARGEOBJECT:
+			all_privileges = ACL_ALL_RIGHTS_LARGEOBJECT;
+			errormsg = gettext_noop("invalid privilege type %s for large object");
+			break;
 		default:
 			elog(ERROR, "unrecognized GrantStmt.objtype: %d",
  (int) action->objtype);
@@ -1268,6 +1272,16 @@ SetDefaultACL(InternalDefaultACL *iacls)
 this_privileges = ACL_ALL_RIGHTS_SCHEMA;
 			break;
 
+		case OBJECT_LARGEOBJECT:
+			if (OidIsValid(iacls->nspid))
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_GRANT_OPERATION),
+		 errmsg("cannot use IN SCHEMA clause when using GRANT/REVOKE ON LARGE OBJECTS")));
+			objtype = DEFACLOBJ_LARGEOBJECT;
+			if (iacls->all_privs && this_privileges == ACL_NO_RIGHTS)
+this_privileges = ACL_ALL_RIGHTS_LARGEOBJECT;
+			break;
+
 		default:
 			elog(ERROR, "unrecogniz

Re: Cleanup: remove unused fields from nodes

2024-04-23 Thread Michael Paquier
On Tue, Apr 23, 2024 at 01:01:04PM -0400, Tom Lane wrote:
> That is, query jumbling no longer distinguishes "DEALLOCATE x" from
> "DEALLOCATE ALL", because the DeallocateStmt.name field is marked
> query_jumble_ignore.  Now maybe that's fine, but it's a point
> we'd not considered so far in this thread.  Thoughts?

And of course, I've managed to forget about bb45156f342c: the
reason behind the addition of the field is to be able to distinguish
between the named and ALL cases for DEALLOCATE, around
here:
https://www.postgresql.org/message-id/ZNq9kRwWbKzvR%2B2a%40paquier.xyz

This is new in v17, so perhaps it could be changed, but I think it's
important to preserve the distinction here for monitoring purposes, as
DEALLOCATE ALL can be used as a way to clean up prepared statements
in connection poolers (for example, pgbouncer's server_reset_query).
And doing this tweak in the Node structure of DeallocateStmt is
simpler than having to maintain a new pg_node_attr for query jumbling.
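
To illustrate the distinction that would otherwise be lost (a minimal
sketch):

PREPARE q AS SELECT 1;
DEALLOCATE q;    -- drops one named prepared statement
DEALLOCATE ALL;  -- what a pooler's server_reset_query typically issues

Folding both forms into a single pg_stat_statements entry would hide how
often sessions are reset by a pooler versus how often individual
statements are deallocated.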
--
Michael


signature.asc
Description: PGP signature


Re: Cleanup: remove unused fields from nodes

2024-04-23 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Apr 23, 2024 at 01:01:04PM -0400, Tom Lane wrote:
>> That is, query jumbling no longer distinguishes "DEALLOCATE x" from
>> "DEALLOCATE ALL", because the DeallocateStmt.name field is marked
>> query_jumble_ignore.  Now maybe that's fine, but it's a point
>> we'd not considered so far in this thread.  Thoughts?

> And of course, I've managed to forget about bb45156f342c and the
> reason behind the addition of the field is to be able to make the
> difference between the named and ALL cases for DEALLOCATE, around
> here:
> https://www.postgresql.org/message-id/ZNq9kRwWbKzvR%2B2a%40paquier.xyz

Hah.  Seems like the comment for isall needs to explain that it
exists for this purpose, so we don't make this mistake again.

regards, tom lane




Re: Row pattern recognition

2024-04-23 Thread Tatsuo Ishii
Hi Vik and Champion,

I think the current RPR patch is not quite correct in handling
count(*).

(using slightly modified version of Vik's example query)

SELECT v.a, count(*) OVER w
FROM (VALUES ('A'),('B'),('B'),('C')) AS v (a)
WINDOW w AS (
  ORDER BY v.a
  ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  PATTERN (B+)
  DEFINE B AS a = 'B'
)
 a | count 
---+---
 A | 0
 B | 2
 B |  
 C | 0
(4 rows)

Here row 3 is skipped because the pattern B matches rows 2 and 3. In
this case I think count(*) should return 0 rather than NULL for row 3.

What do you think?

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Disallow changing slot's failover option in transaction block

2024-04-23 Thread Amit Kapila
On Mon, Apr 22, 2024 at 2:31 PM Bertrand Drouvot
 wrote:
>
> On Mon, Apr 22, 2024 at 11:32:14AM +0530, shveta malik wrote:
> > On Mon, Apr 22, 2024 at 5:57 AM Zhijie Hou (Fujitsu)
> >  wrote:
> > > Attach the V3 patch which also addressed Shveta[1] and Bertrand[2]'s 
> > > comments.
>
> Thanks!
>
> >  Tested the patch, works well.
>
> Same here.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: Extend ALTER DEFAULT PRIVILEGES for large objects

2024-04-23 Thread Tom Lane
Yugo NAGATA  writes:
> Currently, ALTER DEFAULT PRIVILEGE doesn't support large objects,
> so if we want to allow users other than the owner to use the large
> object, we need to grant a privilege on it every time a large object
> is created. One of our clients feels that this is annoying, so I would
> like propose to extend  ALTER DEFAULT PRIVILEGE to large objects. 

I wonder how this plays with pg_dump, and in particular whether it
breaks the optimizations that a45c78e32 installed for large numbers
of large objects.  The added test cases seem to go out of their way
to leave no trace behind that the pg_dump/pg_upgrade tests might
encounter.

I think you broke psql's \ddp, too.  And some other places; grepping
for DEFACLOBJ_NAMESPACE finds other oversights.

On the whole I find this proposed feature pretty unexciting
and dubiously worthy of the implementation/maintenance effort.

regards, tom lane




Re: promotion related handling in pg_sync_replication_slots()

2024-04-23 Thread shveta malik
On Tue, Apr 23, 2024 at 9:07 AM Amit Kapila  wrote:
>
> On Mon, Apr 22, 2024 at 7:04 PM Masahiko Sawada  wrote:
> >
> > On Mon, Apr 22, 2024 at 9:02 PM shveta malik  wrote:
> > >
> > > Thanks for the patch, the changes look good Amit. Please find the merged 
> > > patch.
> > >
> >
> > I've reviewed the patch and have some comments:

Thanks for the comments.

> > ---
> > /*
> > -* Early initialization.
> > +* Register slotsync_worker_onexit() before we register
> > +* ReplicationSlotShmemExit() in BaseInit(), to ensure that during the
> > +* exit of the slot sync worker, ReplicationSlotShmemExit() is called
> > +* first, followed by slotsync_worker_onexit(). The startup process 
> > during
> > +* promotion invokes ShutDownSlotSync() which waits for slot sync to
> > +* finish and it does that by checking the 'syncing' flag. Thus worker
> > +* must be done with the slots' release and cleanup before it marks 
> > itself
> > +* as finished syncing.
> >  */
> >
> > I'm slightly worried that we register the slotsync_worker_onexit()
> > callback before BaseInit(), because it could be a blocker when we want
> > to add more work in the callback, for example sending the stats.
> >
>
> The other possibility is that we do slot release/clean up in the
> slotsync_worker_onexit() call itself and then we can do it after
> BaseInit(). Do you have any other/better idea for this?

I have currently implemented it this way in v11.

> > ---
> > synchronize_slots(wrconn);
> > +
> > +   /* Cleanup the temporary slots */
> > +   ReplicationSlotCleanup();
> > +
> > +   /* We are done with sync, so reset sync flag */
> > +   reset_syncing_flag();
> >
> > I think it ends up removing other temp slots that are created by the
> > same backend process using
> > pg_create_{physical,logical_replication_slots() function, which could
> > be a large side effect of this function for users.

Yes, this is a problem. Thanks for catching it.

>
> True, I think here we should either remove only temporary and synced
> marked slots. The other possibility is to create slots as RS_EPHEMERAL
> initially when called from the SQL function but that doesn't sound
> like a neat approach.

Modified the logic to remove only synced temporary slots during
SQL-function exit.

Please find v11 with above changes.

thanks
Shveta


v11-0001-Handle-stopSignaled-during-sync-function-call.patch
Description: Binary data


Re: promotion related handling in pg_sync_replication_slots()

2024-04-23 Thread Masahiko Sawada
On Tue, Apr 23, 2024 at 12:37 PM Amit Kapila  wrote:
>
> On Mon, Apr 22, 2024 at 7:04 PM Masahiko Sawada  wrote:
> >
> > On Mon, Apr 22, 2024 at 9:02 PM shveta malik  wrote:
> > >
> > > Thanks for the patch, the changes look good Amit. Please find the merged 
> > > patch.
> > >
> >
> > I've reviewed the patch and have some comments:
> >
> > ---
> > /*
> > -* Early initialization.
> > +* Register slotsync_worker_onexit() before we register
> > +* ReplicationSlotShmemExit() in BaseInit(), to ensure that during the
> > +* exit of the slot sync worker, ReplicationSlotShmemExit() is called
> > +* first, followed by slotsync_worker_onexit(). The startup process 
> > during
> > +* promotion invokes ShutDownSlotSync() which waits for slot sync to
> > +* finish and it does that by checking the 'syncing' flag. Thus worker
> > +* must be done with the slots' release and cleanup before it marks 
> > itself
> > +* as finished syncing.
> >  */
> >
> > I'm slightly worried that we register the slotsync_worker_onexit()
> > callback before BaseInit(), because it could be a blocker when we want
> > to add more work in the callback, for example sending the stats.
> >
>
> The other possibility is that we do slot release/clean up in the
> slotsync_worker_onexit() call itself and then we can do it after
> BaseInit().

This approach sounds clearer and safer to me. The current approach
relies on the callback registration order of
ReplicationSlotShmemExit(). If it changes in the future, we will
silently have the same problem. Every slot sync related work should be
done before allowing someone to touch synced slots by clearing the
'syncing' flag.

>
> > ---
> > synchronize_slots(wrconn);
> > +
> > +   /* Cleanup the temporary slots */
> > +   ReplicationSlotCleanup();
> > +
> > +   /* We are done with sync, so reset sync flag */
> > +   reset_syncing_flag();
> >
> > I think it ends up removing other temp slots that are created by the
> > same backend process using
> > pg_create_{physical,logical_replication_slots() function, which could
> > be a large side effect of this function for users.
> >
>
> True, I think here we should either remove only temporary and synced
> marked slots. The other possibility is to create slots as RS_EPHEMERAL
> initially when called from the SQL function but that doesn't sound
> like a neat approach.
>
> >
>  Also, if users want
> > to have a process periodically calling pg_sync_replication_slots()
> > instead of the slotsync worker, it doesn't support a case where we
> > create a temp not-ready slot and turn it into a persistent slot if
> > it's ready for sync.
> >
>
> True, but eventually the API should be able to directly create the
> persistent slots and anyway this can happen only for the first time
> (till the slots are created and marked persistent) and one who wants
> to use this function periodically should be able to see regular syncs.

I agree with removing temp-and-synced slots created via the API at
the end of the API call. We end up creating and dropping slots in every
API call but since the pg_sync_replication_slots() function is a kind
of debug-purpose function and it will not be common to call this
function regularly instead of using the slot sync worker, we can live
with such overhead.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add memory context type to pg_backend_memory_contexts view

2024-04-23 Thread David Rowley
On Tue, 16 Apr 2024 at 13:30, David Rowley  wrote:
> In [1] Andres mentioned that there's no way to determine the memory
> context type in pg_backend_memory_contexts. This is a bit annoying as
> I'd like to add a test to exercise BumpStats().
>
> Having the context type in the test's expected output helps ensure we
> are exercising BumpStats() and any future changes to the choice of
> context type in tuplesort.c gets flagged up by the test breaking.

bea97cd02 added a new regression test in sysviews.sql to call
pg_backend_memory_contexts to test the BumpStats() function.

The attached updates the v1 patch to add the new type column to the
new call to pg_backend_memory_contexts() to ensure the type = "Bump"

No other changes.
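
For reference, a quick sanity check with the patch applied could look
something like this (assuming the new column keeps the name "type"):

SELECT type, name, total_bytes
FROM pg_backend_memory_contexts
ORDER BY total_bytes DESC
LIMIT 5;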

David
From 632be6de363e8f9975722debbd620665f3a0ea71 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Tue, 16 Apr 2024 13:05:34 +1200
Subject: [PATCH v2] Add context type field to pg_backend_memory_contexts

---
 doc/src/sgml/system-views.sgml |  9 +
 src/backend/utils/adt/mcxtfuncs.c  | 50 ++
 src/include/catalog/pg_proc.dat|  6 ++--
 src/test/regress/expected/rules.out|  5 +--
 src/test/regress/expected/sysviews.out | 16 -
 src/test/regress/sql/sysviews.sql  |  4 +--
 6 files changed, 61 insertions(+), 29 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7ed617170f..18ae5b9138 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -463,6 +463,15 @@
 
 
 
+ 
+  
+   type text
+  
+  
+   Type of the memory context
+  
+ 
+
  
   
name text
diff --git a/src/backend/utils/adt/mcxtfuncs.c 
b/src/backend/utils/adt/mcxtfuncs.c
index 4d4a70915b..fe52d57fd4 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -36,12 +36,13 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
 TupleDesc 
tupdesc, MemoryContext context,
 const char 
*parent, int level)
 {
-#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS9
+#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS10
 
Datum   values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
boolnulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS];
MemoryContextCounters stat;
MemoryContext child;
+   const char *type;
const char *name;
const char *ident;
 
@@ -67,10 +68,31 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
 
+   switch (context->type)
+   {
+   case T_AllocSetContext:
+   type = "AllocSet";
+   break;
+   case T_GenerationContext:
+   type = "Generation";
+   break;
+   case T_SlabContext:
+   type = "Slab";
+   break;
+   case T_BumpContext:
+   type = "Bump";
+   break;
+   default:
+   type = "???";
+   break;
+   }
+
+   values[0] = CStringGetTextDatum(type);
+
if (name)
-   values[0] = CStringGetTextDatum(name);
+   values[1] = CStringGetTextDatum(name);
else
-   nulls[0] = true;
+   nulls[1] = true;
 
if (ident)
{
@@ -86,22 +108,22 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
 
memcpy(clipped_ident, ident, idlen);
clipped_ident[idlen] = '\0';
-   values[1] = CStringGetTextDatum(clipped_ident);
+   values[2] = CStringGetTextDatum(clipped_ident);
}
else
-   nulls[1] = true;
+   nulls[2] = true;
 
if (parent)
-   values[2] = CStringGetTextDatum(parent);
+   values[3] = CStringGetTextDatum(parent);
else
-   nulls[2] = true;
-
-   values[3] = Int32GetDatum(level);
-   values[4] = Int64GetDatum(stat.totalspace);
-   values[5] = Int64GetDatum(stat.nblocks);
-   values[6] = Int64GetDatum(stat.freespace);
-   values[7] = Int64GetDatum(stat.freechunks);
-   values[8] = Int64GetDatum(stat.totalspace - stat.freespace);
+   nulls[3] = true;
+
+   values[4] = Int32GetDatum(level);
+   values[5] = Int64GetDatum(stat.totalspace);
+   values[6] = Int64GetDatum(stat.nblocks);
+   values[7] = Int64GetDatum(stat.freespace);
+   values[8] = Int64GetDatum(stat.freechunks);
+   values[9] = Int64GetDatum(stat.totalspace - stat.freespace);
tuplestore_putvalues(tupstore, tupdesc, values, nulls);
 
for (child = context->firstchild; child != NULL; child = 
child->nextchild)

Re: POC: GROUP BY optimization

2024-04-23 Thread jian he
hi.
I found an interesting case.

CREATE TABLE t1 AS
  SELECT (i % 10)::numeric AS x,(i % 10)::int8 AS y,'abc' || i % 10 AS
z, i::int4 AS w
  FROM generate_series(1, 100) AS i;
CREATE INDEX t1_x_y_idx ON t1 (x, y);
ANALYZE t1;
SET enable_hashagg = off;
SET enable_seqscan = off;

EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,y,w;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,y,z;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,w,y;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,z,y;
the above part will use:
  ->  Incremental Sort
 Sort Key: x, $, $, $
 Presorted Key: x
 ->  Index Scan using t1_x_y_idx on t1

EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY z,y,w,x;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY w,y,z,x;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,z,x,w;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,w,x,z;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,x,z,w;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,x,w,z;

these will use:
  ->  Incremental Sort
 Sort Key: x, y, $, $
 Presorted Key: x, y
 ->  Index Scan using t1_x_y_idx on t1

I guess this is fine, but not optimal?




Re: Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-04-23 Thread Amit Kapila
On Wed, Mar 13, 2024 at 9:19 AM vignesh C  wrote:
>
> On Tue, 12 Mar 2024 at 09:34, Ajin Cherian  wrote:
> >
> >
> >
> > On Tue, Mar 12, 2024 at 2:59 PM vignesh C  wrote:
> >>
> >>
> >> Thanks, I have created the following Commitfest entry for this:
> >> https://commitfest.postgresql.org/47/4816/
> >>
> >> Regards,
> >> Vignesh
> >
> >
> > Thanks for the patch, I have verified that the fix works well by following 
> > the steps mentioned to reproduce the problem.
> > Reviewing the patch, it seems good and is well documented. Just one minor 
> > comment I had was probably to change the name of the variable 
> > table_states_valid to table_states_validity. The current name made sense 
> > when it was a bool, but now that it is a tri-state enum, it doesn't fit 
> > well.
>
> Thanks for reviewing the patch, the attached v6 patch has the changes
> for the same.
>

v6_0001* looks good to me. This should be backpatched unless you or
others think otherwise.

-- 
With Regards,
Amit Kapila.




Re: ecpg_config.h symbol missing with meson

2024-04-23 Thread Peter Eisentraut

On 17.04.24 18:15, Tom Lane wrote:

Peter Eisentraut  writes:

I checked the generated ecpg_config.h with make and meson, and the meson
one is missing



#define HAVE_LONG_LONG_INT 1



This is obviously quite uninteresting, since that is required by C99.
But it would be more satisfactory if we didn't have discrepancies like
that.  Note that we also kept ENABLE_THREAD_SAFETY in ecpg_config.h for
compatibility.
...
Alternatively, we could remove the symbol from the make side.


Think I'd vote for removing it, since we use it nowhere.
The ENABLE_THREAD_SAFETY precedent feels a little bit different,
since there's not the C99-requires-the-feature angle.


Ok, fixed by removing instead.





Re: Extend ALTER DEFAULT PRIVILEGES for large objects

2024-04-23 Thread Yugo NAGATA
On Tue, 23 Apr 2024 23:47:38 -0400
Tom Lane  wrote:

> Yugo NAGATA  writes:
> > Currently, ALTER DEFAULT PRIVILEGES doesn't support large objects,
> > so if we want to allow users other than the owner to use the large
> > object, we need to grant a privilege on it every time a large object
> > is created. One of our clients feels that this is annoying, so I would
> > like to propose extending ALTER DEFAULT PRIVILEGES to large objects.
> 
> I wonder how this plays with pg_dump, and in particular whether it
> breaks the optimizations that a45c78e32 installed for large numbers
> of large objects.  The added test cases seem to go out of their way
> to leave no trace behind that the pg_dump/pg_upgrade tests might
> encounter.

Thank you for your comments.

The previous patch did not work with pg_dump since I forgot some fixes.
I attached an updated patch including the fixes.

I believe a45c78e32 is about already-existing large objects and is not
directly related to default privileges, so it will not be affected
by this patch.

> I think you broke psql's \ddp, too.  And some other places; grepping
> for DEFACLOBJ_NAMESPACE finds other oversights.

Yes, I did. The attached patch includes fixes for psql, too.
 
> On the whole I find this proposed feature pretty unexciting
> and dubiously worthy of the implementation/maintenance effort.

I believe this feature is beneficial to some users because it allows
them to omit the GRANT that was otherwise necessary after every large
object creation. It seems to me that the implementation/maintenance cost
is not so high compared to other objects (e.g. default privileges on
schemas), unless I am still missing something.
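
For example, with the patch applied the privileges could be set up once
instead of granting them on each new large object (a usage sketch based on
the syntax added by the patch; the role names are made up):

  ALTER DEFAULT PRIVILEGES FOR ROLE alice
    GRANT SELECT, UPDATE ON LARGE OBJECTS TO bob;

After this, large objects created by alice are readable and writable by bob
without a per-object GRANT.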

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
>From 0cfcdc2b297556248cfb64d67779d5fcb8dab227 Mon Sep 17 00:00:00 2001
From: Yugo Nagata 
Date: Fri, 8 Mar 2024 17:43:43 +0900
Subject: [PATCH v2] Extend ALTER DEFAULT PRIVILEGES for large objects

Original patch by Haruka Takatsuka, some fixes and tests by
Yugo Nagata.
---
 doc/src/sgml/catalogs.sgml|   3 +-
 .../sgml/ref/alter_default_privileges.sgml|  15 ++-
 src/backend/catalog/aclchk.c  |  21 
 src/backend/catalog/objectaddress.c   |  18 ++-
 src/backend/catalog/pg_largeobject.c  |  18 ++-
 src/backend/parser/gram.y |   5 +-
 src/bin/pg_dump/dumputils.c   |   3 +-
 src/bin/pg_dump/pg_dump.c |   3 +
 src/bin/psql/describe.c   |   6 +-
 src/bin/psql/tab-complete.c   |   2 +-
 src/include/catalog/pg_default_acl.h  |   1 +
 src/include/parser/kwlist.h   |   1 +
 src/test/regress/expected/privileges.out  | 104 +-
 src/test/regress/sql/privileges.sql   |  47 
 14 files changed, 235 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2907079e2a..b8cc822aeb 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3330,7 +3330,8 @@ SCRAM-SHA-256$:&l
S = sequence,
f = function,
T = type,
-   n = schema
+   n = schema,
+   L = large object
   
  
 
diff --git a/doc/src/sgml/ref/alter_default_privileges.sgml b/doc/src/sgml/ref/alter_default_privileges.sgml
index 1de4c5c1b4..3b358b7a88 100644
--- a/doc/src/sgml/ref/alter_default_privileges.sgml
+++ b/doc/src/sgml/ref/alter_default_privileges.sgml
@@ -50,6 +50,11 @@ GRANT { USAGE | CREATE | ALL [ PRIVILEGES ] }
 ON SCHEMAS
 TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]
 
+GRANT { { SELECT | UPDATE }
+[, ...] | ALL [ PRIVILEGES ] }
+ON LARGE OBJECTS
+TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]
+
 REVOKE [ GRANT OPTION FOR ]
 { { SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES | TRIGGER | MAINTAIN }
 [, ...] | ALL [ PRIVILEGES ] }
@@ -81,6 +86,13 @@ REVOKE [ GRANT OPTION FOR ]
 ON SCHEMAS
 FROM { [ GROUP ] role_name | PUBLIC } [, ...]
 [ CASCADE | RESTRICT ]
+
+REVOKE [ GRANT OPTION FOR ]
+{ { SELECT | UPDATE }
+[, ...] | ALL [ PRIVILEGES ] }
+ON LARGE OBJECTS
+FROM { [ GROUP ] role_name | PUBLIC } [, ...]
+[ CASCADE | RESTRICT ]
 
  
 
@@ -159,7 +171,8 @@ REVOKE [ GRANT OPTION FOR ]
   If IN SCHEMA is omitted, the global default privileges
   are altered.
   IN SCHEMA is not allowed when setting privileges
-  for schemas, since schemas can't be nested.
+  for schemas and large objects, since schemas can't be nested and
+  large objects don't belong to a schema.
  
 

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 7abf3c2a74..41baf81a1d 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1077,6 +1077,10 @@ ExecAlterDefaultPrivilegesStmt(ParseState *pstate, AlterDefaultPrivilegesStmt *s
 			all_privileges = ACL_ALL_RIGHTS_SCHEMA;
 			errormsg = gettext_noop(

Re: Fix parallel vacuum buffer usage reporting

2024-04-23 Thread Masahiko Sawada
On Mon, Apr 22, 2024 at 5:07 PM Anthonin Bonnefoy
 wrote:
>
> On Sat, Apr 20, 2024 at 2:00 PM Alena Rybakina  
> wrote:
>>
>> Hi, thank you for your work with this subject.
>>
>> While I was reviewing your code, I noticed that your patch conflicts with 
>> another patch [0] that been committed.
>>
>> [0] 
>> https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
>
>
> I've rebased the patch and also split the changes:

Thank you for updating the patch!

> 1: Use pgBufferUsage in Vacuum and Analyze block reporting

I think that if the analyze command doesn't have the same issue, we
don't need to change it. Making vacuum and analyze consistent
is a good point, but I'd like to avoid making unnecessary changes in
back branches. I think the patch set would contain:

(a) make lazy vacuum use BufferUsage instead of
VacuumPage{Hit,Miss,Dirty}. (backpatched down to pg13).
(b) make analyze use BufferUsage and remove VacuumPage{Hit,Miss,Dirty}
variables for consistency and simplicity (only for HEAD, if we agree).

BTW I realized that VACUUM VERBOSE running on a temp table always
shows the number of dirtied buffers as 0, which seems to be a bug.
Patch (a) will resolve that as well.
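
A minimal way to see it (hypothetical reproduction, not taken from the
thread; the exact VERBOSE output differs by version):

  CREATE TEMP TABLE tmp_vac AS SELECT g AS id FROM generate_series(1, 10000) g;
  UPDATE tmp_vac SET id = id + 1;
  VACUUM (VERBOSE) tmp_vac;
  -- the "buffer usage" line reports 0 dirtied buffers although pages were modified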

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Feature request: schema diff tool

2024-04-23 Thread Neszt Tibor
Hello,

A diff tool that would generate CREATE, DROP, ALTER, etc. commands from the
differences between two specified schemas would be very useful. The diff
could even cover data, producing INSERT, DELETE and UPDATE commands as
output, although I think the schema diff handling is the much more important
and necessary part.

Today, all modern applications are version-tracked, including the SQL
schema. Currently, schema changes must be handled twice: on the one hand,
the schema itself must be modified, and on the other hand, the schema
modification commands must also be written for the upgrade process. A good
diff tool would allow only the schema to be modified.
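
For illustration, the kind of input and output I mean (made-up example, not
an existing tool):

  -- schema as of version 1
  CREATE TABLE customer (id int PRIMARY KEY);
  -- schema as of version 2
  CREATE TABLE customer (id int PRIMARY KEY, email text);
  -- the diff tool would emit the upgrade command:
  ALTER TABLE customer ADD COLUMN email text;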

Such tools already exist because the community needed them, e.g. apgdiff.
I think the problem is that the concept itself isn't right: such a tool
should be part of PostgreSQL, because PostgreSQL always knows the complete,
current SQL syntax, whereas an external program such as apgdiff can only
follow the changes after the fact, which generates continuous problems.
Not to mention that an external application can simply stop; apgdiff, for
example, is no longer actively developed, so users who built on it are now
in trouble.

Furthermore, doing this on the PostgreSQL development side is the least
amount of work: you have the expertise, the SQL language processor, etc.

What is your opinion on this?

Regards,
 Neszt Tibor


Small fix on the documentation of ALTER DEFAULT PRIVILEGES

2024-04-23 Thread Yugo NAGATA
Hi,

We can specify more than one privilege type in 
"ALTER DEFAULT PRIVILEGES GRANT/REVOKE ON SCHEMAS",
for example,

  ALTER DEFAULT PRIVILEGES GRANT USAGE,CREATE ON SCHEMAS TO PUBLIC;

However, the syntax described in the documentation appears to
allow only one,

 GRANT { USAGE | CREATE | ALL [ PRIVILEGES ] }
ON SCHEMAS
TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]

while the syntaxes for tables and sequences are described correctly.

e.g.
 GRANT { { SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES | TRIGGER }
[, ...] | ALL [ PRIVILEGES ] }
ON TABLES
TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]

I attached a small patch to fix the description.


Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/ref/alter_default_privileges.sgml b/doc/src/sgml/ref/alter_default_privileges.sgml
index 1de4c5c1b4..89aacec4fa 100644
--- a/doc/src/sgml/ref/alter_default_privileges.sgml
+++ b/doc/src/sgml/ref/alter_default_privileges.sgml
@@ -46,7 +46,8 @@ GRANT { USAGE | ALL [ PRIVILEGES ] }
 ON TYPES
 TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]
 
-GRANT { USAGE | CREATE | ALL [ PRIVILEGES ] }
+GRANT { { USAGE | CREATE }
+[, ...] | ALL [ PRIVILEGES ] }
 ON SCHEMAS
 TO { [ GROUP ] role_name | PUBLIC } [, ...] [ WITH GRANT OPTION ]
 
@@ -77,7 +78,8 @@ REVOKE [ GRANT OPTION FOR ]
 [ CASCADE | RESTRICT ]
 
 REVOKE [ GRANT OPTION FOR ]
-{ USAGE | CREATE | ALL [ PRIVILEGES ] }
+{ { USAGE | CREATE }
+[, ...] | ALL [ PRIVILEGES ] }
 ON SCHEMAS
 FROM { [ GROUP ] role_name | PUBLIC } [, ...]
 [ CASCADE | RESTRICT ]


Re: GUC-ify walsender MAX_SEND_SIZE constant

2024-04-23 Thread Jakub Wartak
Hi,

> My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the
> bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is
> that it determines the max size passed to WALRead(), which in turn determines
> how much we read from the OS at once.  If the storage has high latency but
> also high throughput, and readahead is disabled or just not aggressive enough
> after crossing segment boundaries, larger reads reduce the number of times
> you're likely to be blocked waiting for read IO.
>
> Which is also why I think that making MAX_SEND_SIZE configurable is a really
> poor proxy for improving the situation.
>
> We're imo much better off working on read_stream.[ch] support for reading WAL.

Well, then that would be a consistent message at least, because earlier,
in [1], a patch to prefetch the WAL segment was rejected, but on the
standby side, where that patch was only helping in configurations
having readahead *disabled* for some reason.

Now, Majid stated that he uses "RBD" - Majid, any chance to specify
what that RBD really is? What's the tech? What fs? Any ioping or fio
results? What's the blockdev --report /dev/XXX output? (You stated
"high" latency and "high" bandwidth, but that is relative; for me 15ms+
is high latency and >1000MB/s is high sequential bandwidth. It would help
others in the future if you could give the exact numbers, please.) Maybe
it's just a matter of enabling readahead (like in [1]) there and/or
using a larger WAL segment size during initdb.

[1] - 
https://www.postgresql.org/message-id/flat/CADVKa1WsQMBYK_02_Ji%3DpbOFnms%2BCT7TV6qJxDdHsFCiC9V_hw%40mail.gmail.com