Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread Adrian Klaver

On 8/21/25 08:13, hubert depesz lubaczewski wrote:

On Thu, Aug 21, 2025 at 08:04:25AM -0700, Adrian Klaver wrote:



By bouncer I assume you mean something like pgBouncer, a connection pooler.
Is it possible to determine what bouncer the queries in question are coming
from?


 From the POV of db, all queries are coming from one of N localhost
bouncers. N is usually 2…6.
 From the POV of the local bouncers, the queries come from range of
remote bouncers.

Generally we haven't seen any correlation between queries coming from
specific ranges of ips. Logged queries, the ones that we see with
runtime of 1s, have comments that indicate source, and they some from
"all-around". Specifically, "DISCARD ALL" queries are generated by
bouncers themselves (both layers).


Well so much for that theory:)



Just so that it will be clear, I don't expect anyone to be able to
diagnose the problem based on description. I'm looking more into idea
what to look for. The issue is that with the situation being pretty
short, and happening on servers with non-trivial query load, I can't do
stuff, like, for example, strace, stuff.


Getting to the bottom of the bag of ideas:

Have you looked at the OS system log for the time period involved?

You mentioned this seemed to involve PREPARE and DISCARD ALL.
Is this the same set of statements or is it all over the place?

Also it would be helpful to know what bouncer you are actually using and 
what mode you are running in?





Best regards,

depesz




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Karsten Hilbert
Am Thu, Aug 21, 2025 at 08:46:00AM -0700 schrieb Adrian Klaver:

> >PG 17 documentation says that using "WITH ADMIN" allows the
> >role being added to another group role to grant/revoke
> >membership in said group to other roles.
> 
> I would start by reading this:
> 
> https://rhaas.blogspot.com/2023/01/surviving-without-superuser-coming-to.html

Thanks, I did, but did not find the answer to: Is there a
way for a role that can manage membership in a group role to
not itself be a member of that group role ?

Best regards,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Dominique Devienne
On Thu, Aug 21, 2025 at 6:00 PM Karsten Hilbert  wrote:
> Am Thu, Aug 21, 2025 at 08:46:00AM -0700 schrieb Adrian Klaver:
> > https://rhaas.blogspot.com/2023/01/surviving-without-superuser-coming-to.html
>
> Thanks, I did, but did not find the answer to: Is there a
> way for a role that can manage membership in a group role to
> not itself be a member of that group role ?

Yes and no. Depends what you mean by MEMBER...

You can be ADMIN of a ROLE, yet not have SET or INHERIT on that ROLE.
As ADMIN you can grant yourself those SET and INHERIT, sure.
But still, with only ADMIN, you can't SET ROLE. So are you a MEMBER?
In the pg_has_role(), you are. In reality, not so much.

In fact, I argued (after the fact it broke my architecture, introducing cycles)
that have just ADMIN shouldn't have considered it a MEMBER of the role.
But that ship had sailed already. 1 year later, we're finally catching up.
I truly hope v18 won't reserve us similar backward-compatibility breaks.

Read the docs for pg_auth_members. pg_has_role(). create role.
If you have CREATEROLE, and do a CREATE ROLE foo, you'll
have ADMIN on foo, but not SET or INHERIT (but you can grant them to yourself).
Also look at pg_auth_members.grantor::regrole::text and you'll see that the
postgres SUPERUSER itself gave you that ADMIN grant. But if you grant yourself
the role, it's a separate pg_auth_members row, and you're now the grantor.

So I didn't spend time studying your specific use case. That's your job :).
But given my painful experience of the past year, I'd answer yes to your
question, on logical grounds. If you see what I mean. --DD




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Karsten Hilbert
Am Thu, Aug 21, 2025 at 06:29:36PM +0200 schrieb Dominique Devienne:

> > Thanks, I did, but did not find the answer to: Is there a
> > way for a role that can manage membership in a group role to
> > not itself be a member of that group role ?
> 
> Yes and no. Depends what you mean by MEMBER...
...
> So I didn't spend time studying your specific use case. That's your job :).
> But given my painful experience of the past year, I'd answer yes to your
> question, on logical grounds. If you see what I mean. --DD

I followed your posts back then when you worked out your use
case so I did have _some_ idea where to look ;-)

I just wanted to confirm my understanding in relation to my
current usage. David kindly provided the needed affirmation.

Regards,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Domains vs data types

2025-08-21 Thread Merlin Moncure
On Thu, Aug 21, 2025 at 2:11 AM Dominique Devienne  wrote:
> On Wed, Aug 20, 2025 at 7:37 PM Florents Tselai
>  wrote:
> > > On 20 Aug 2025, at 7:47 AM, Ertan Küçükoglu  
> > > wrote:
> > > I would like to learn if there is any benefit of using domains over data 
> > > types for table column definitions in terms of performance gain/loss.
>
> > I know that this doesn’t answer your question, but before exploring custom 
> > types / domains,
> > and based on experience, I’d strongly recommend exploring jsonb instead as 
> > an alternative.
>
> I stayed out of that thread, but this makes me step out and react.
> domains are typically out constraining the value space of a type.
> I.e. a type with a CHECK constraint. W/o one, it's pretty useless IMHO.

+1 this.

The main use for domains is to allow for standard constraints.   If
you find yourself writing the same constraint over and over, that's
when you might consider using them.

For example, in the USA, vehicle identification numbers must be
exactly 17 characters long.

postgres@postgres=# create domain vin as text check (length(VALUE) = 17);
CREATE DOMAIN
postgres@postgres=# select 'abc'::TEXT::VIN;
ERROR:  value for domain vin violates check constraint "vin_check"

The other use for domains I see is global sequences where you have
many tables pulling from the same sequence.

postgres@postgres=# create sequence global_id_seq;
CREATE SEQUENCE
postgres@postgres=# create domain global_id as bigint default
nextval('global_id_seq');
CREATE DOMAIN

Using domains is not really a performance thing, it's more regarding
establishing rigor around type rules.  In general, I tend not to use
length constraints for tex columns anymore for most cases (just use
text!), but I could get behind setting some standard length
constraints, say, capped at 1k characters as a safety precaution.

merlin

merlin




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Dominique Devienne
On Thu, Aug 21, 2025 at 6:37 PM Karsten Hilbert  wrote:
> Am Thu, Aug 21, 2025 at 06:29:36PM +0200 schrieb Dominique Devienne:
> > > Thanks, I did, but did not find the answer to: Is there a
> > > way for a role that can manage membership in a group role to
> > > not itself be a member of that group role ?
> >
> > Yes and no. Depends what you mean by MEMBER...
> ...
> > So I didn't spend time studying your specific use case. That's your job :).
> > But given my painful experience of the past year, I'd answer yes to your
> > question, on logical grounds. If you see what I mean. --DD
>
> I followed your posts back then when you worked out your use
> case so I did have _some_ idea where to look ;-)

Glad it helped someone, maybe.

> I just wanted to confirm my understanding in relation to my
> current usage. David kindly provided the needed affirmation.

Well, I disagree with David that you're a MEMBER with just ADMIN.
pg_has_role(..., 'MEMBER') says yes. But pg_has_role(..., 'SET') says no.
If you can't endorse the role's privileges, you're not a member in my book.
That was kinda my point. --DD




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread David G. Johnston
On Thursday, August 21, 2025, Karsten Hilbert 
wrote:

> Am Thu, Aug 21, 2025 at 08:46:00AM -0700 schrieb Adrian Klaver:
>
> > >PG 17 documentation says that using "WITH ADMIN" allows the
> > >role being added to another group role to grant/revoke
> > >membership in said group to other roles.
> >
> > I would start by reading this:
> >
> > https://rhaas.blogspot.com/2023/01/surviving-without-
> superuser-coming-to.html
>
> Thanks, I did, but did not find the answer to: Is there a
> way for a role that can manage membership in a group role to
> not itself be a member of that group role ?
>

A superuser can do this.  Otherwise, no.  In order to be made admin of a
role on must be a member of said role - i.e., “with admin” is only part of
the “grant” command.  You won’t find docs talking about alternatives
because they don’t exist.

David J.


Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:
> Getting to the bottom of the bag of ideas:
> Have you looked at the OS system log for the time period involved?

Yes. Mostly dmesg. Nothing interesting logged around the time.

> You mentioned this seemed to involve PREPARE and DISCARD ALL.
> Is this the same set of statements or is it all over the place?

No. From what I can tell it's random sample.

> Also it would be helpful to know what bouncer you are actually using and
> what mode you are running in?

pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
Applications go using transaction pooling, but people (dbas, ops) have
session pooling.

Best regards,

depesz





Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 12:41:44PM +0100, Thom Brown wrote:
> Ah, yeah I meant transparent hugepage:
> cat /sys/kernel/mm/transparent_hugepage/enabled
> This should show it being set as "never".

Ah. Sorry, couldn't decipher. Yes, it's "never".

> > # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' 
> > postgresql-2025-08-19_22.csv | uniq -c | grep -C3 -P '^\s*\d\d'
> >   2 2025-08-19 22:09:29.084 UTC
> >   1 2025-08-19 22:09:29.094 UTC
> >   2 2025-08-19 22:09:29.097 UTC
> >  70 2025-08-19 22:09:29.109 UTC
> >  90 2025-08-19 22:09:29.110 UTC
> >   6 2025-08-19 22:09:29.111 UTC
> >   1 2025-08-19 22:09:29.153 UTC
> >   1 2025-08-19 22:09:29.555 UTC
…
> > 22:10:54 all2.410.000.280.220.000.100.00
> > 0.000.00   96.99
> > 22:10:59 all2.830.000.290.190.000.120.00
> > 0.000.00   96.57
> 
> This output looks fine, so it doesn't show anything concerning, so
> suggests the issue is somehow on the Postgres side.
> 
> Did you happen to poll pg_stat_activity at the time to see whether you
> had lots of IPC waits? I'm wondering whether the storage layer is
> freezing up for a moment.

So, we get select * from pg_stat_activity, for client backends that are
not idle, every 29 seconds.
So, 1 second "freeze" is impossible to cathc.

Plus - I suspect that if I ran select * from pg_stat_activity while "in
freeze", it would also get frozen.

Anyway, I have data from 22:09:22 and 22:09:51. In both cases only
4 non-idle backend.

6 of them had NULL in wait_event*

one was Client/ClientRead and one was IPC/BgWorkerShutdown.

State_change for the IPC/BgWorkerShutdown backend was 2025-08-19
22:09:51.79504+00 so it was well past the moment when the problem
struck.

Best regards,

depesz





Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread Adrian Klaver

On 8/21/25 03:07, hubert depesz lubaczewski wrote:

On Wed, Aug 20, 2025 at 10:45:13AM -0700, Adrian Klaver wrote:

On 8/20/25 09:08, hubert depesz lubaczewski wrote:

On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:

Hmm.

  From initial post:

"For ~ 1 second there are no logs going to log (we usually have at 5-20
messages logged per second), no connection, nothing. And then we get
bunch (30+) messages with the same milisecond time."
Are the 30+ messages all coming in on one connection or multiple
connections?

Multiple connections.

Also to be clear these are statements that are being run on the replica
locally, correct?

What do you mean locally?

I should have been clearer. Are the queries being run against the replica or
the primary?


All to replica. Primary has its own work, of course, but the problem
we're experiencing is on replicas.


If I am following there is more then one primary --> replica pair and 
the problem exists across all the pairs.



How many applications servers are hitting the database?


To be honest, I'm not sure. I have visibility into dbs, and bouncers,
not really into Apps. I know that these are automatically dynamically
scaled, so number of app server is very varying.

I'd say anything from 40 to 200 app servers hit first layer of bouncers,
which we usually have 6-9 (2-3 per az).

These go to 2nd layer of bouncers, on the db server itself.


By bouncer I assume you mean something like pgBouncer, a connection pooler.

Is it possible to determine what bouncer the queries in question are 
coming from?




Best regards,

depesz




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread Thom Brown
On Thu, 21 Aug 2025 at 11:03, hubert depesz lubaczewski
 wrote:
>
> On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote:
> > Do you have THP enabled? Can you use mpstat and see what %steal shows as?
>
> Sorry, what is "THP"? I tried searching for "what is thp", and most
> common search results are related to some chemical compound.

Ah, yeah I meant transparent hugepage:

cat /sys/kernel/mm/transparent_hugepage/enabled

This should show it being set as "never".

>
> mpstat, yes, we have it. Let me quickly show what I get:
>
> Two days ago, at ~ 10:10pm UTC I saw this in Pg logs:
>
> # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_22.csv 
> | uniq -c | grep -C3 -P '^\s*\d\d'
>   2 2025-08-19 22:09:29.084 UTC
>   1 2025-08-19 22:09:29.094 UTC
>   2 2025-08-19 22:09:29.097 UTC
>  70 2025-08-19 22:09:29.109 UTC
>  90 2025-08-19 22:09:29.110 UTC
>   6 2025-08-19 22:09:29.111 UTC
>   1 2025-08-19 22:09:29.153 UTC
>   1 2025-08-19 22:09:29.555 UTC
>
> As you can see we have 70, and then 90 messages all logged with the same
> timestamp. All of them (160) were "duration:x" lines.
>
> At the time, mpstat (called using `mpstat 5`) looked like this:
>
>  CPU%usr   %nice%sys %iowait%irq   %soft  %steal  
> %guest  %gnice   %idle
> 22:09:04 all2.250.000.270.240.000.100.00
> 0.000.00   97.14
> 22:09:09 all2.700.000.280.270.000.120.00
> 0.000.00   96.63
> 22:09:14 all2.830.000.630.230.000.090.00
> 0.000.00   96.22
> 22:09:19 all2.660.000.280.230.000.100.00
> 0.000.00   96.74
> 22:09:24 all2.360.000.270.210.000.100.00
> 0.000.00   97.06
> 22:09:29 all2.490.000.400.400.000.110.00
> 0.000.00   96.60
> 22:09:34 all2.650.000.280.230.000.110.00
> 0.000.00   96.73
> 22:09:39 all2.260.000.260.210.000.090.00
> 0.000.00   97.19
> 22:09:44 all2.310.000.250.220.000.090.00
> 0.000.00   97.13
> 22:09:49 all2.440.000.220.240.000.080.00
> 0.000.00   97.02
> 22:09:54 all2.230.000.230.280.000.100.00
> 0.000.00   97.16
> 22:09:59 all2.480.000.260.300.000.110.00
> 0.000.00   96.85
> 22:10:04 all2.290.000.280.260.000.120.00
> 0.000.00   97.06
> 22:10:09 all2.310.000.220.210.000.110.00
> 0.000.00   97.14
> 22:10:14 all2.380.000.250.210.000.100.00
> 0.000.00   97.05
> 22:10:19 all2.540.000.240.200.000.110.00
> 0.000.00   96.91
> 22:10:24 all2.260.000.250.220.000.100.00
> 0.000.00   97.17
> 22:10:29 all2.120.000.270.240.000.090.00
> 0.000.00   97.27
> 22:10:34 all2.560.000.260.260.000.120.00
> 0.000.00   96.80
> 22:10:39 all2.160.000.230.240.000.100.00
> 0.000.00   97.27
> 22:10:44 all2.370.000.260.220.000.100.00
> 0.000.00   97.05
> 22:10:49 all2.250.000.250.610.000.090.00
> 0.000.00   96.80
> 22:10:54 all2.410.000.280.220.000.100.00
> 0.000.00   96.99
> 22:10:59 all2.830.000.290.190.000.120.00
> 0.000.00   96.57

This output looks fine, so it doesn't show anything concerning, so
suggests the issue is somehow on the Postgres side.

Did you happen to poll pg_stat_activity at the time to see whether you
had lots of IPC waits? I'm wondering whether the storage layer is
freezing up for a moment.

Thom




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread hubert depesz lubaczewski
On Thu, Aug 21, 2025 at 08:04:25AM -0700, Adrian Klaver wrote:
> > > > > "For ~ 1 second there are no logs going to log (we usually have at 
> > > > > 5-20
> > > > > messages logged per second), no connection, nothing. And then we get
> > > > > bunch (30+) messages with the same milisecond time."
> > > > > Are the 30+ messages all coming in on one connection or multiple
> > > > > connections?
> > > > Multiple connections.
> > > > > Also to be clear these are statements that are being run on the 
> > > > > replica
> > > > > locally, correct?
> > > > What do you mean locally?
> > > I should have been clearer. Are the queries being run against the replica 
> > > or
> > > the primary?
> > All to replica. Primary has its own work, of course, but the problem
> > we're experiencing is on replicas.
> 
> If I am following there is more then one primary --> replica pair and the
> problem exists across all the pairs.

Not all. We have ~ 300 such clusters. The thing doesn't cause any
customer-visible issues (after all it's just 1 second delay every so
often), so it's generally overlooked when it happens.

But we were paying closer attention to one such cluster, and then couple
of other, and we've seen this behavior.

> > > How many applications servers are hitting the database?
> > 
> > To be honest, I'm not sure. I have visibility into dbs, and bouncers,
> > not really into Apps. I know that these are automatically dynamically
> > scaled, so number of app server is very varying.
> > 
> > I'd say anything from 40 to 200 app servers hit first layer of bouncers,
> > which we usually have 6-9 (2-3 per az).
> > 
> > These go to 2nd layer of bouncers, on the db server itself.
> 
> By bouncer I assume you mean something like pgBouncer, a connection pooler.
> Is it possible to determine what bouncer the queries in question are coming
> from?

>From the POV of db, all queries are coming from one of N localhost
bouncers. N is usually 2…6.
>From the POV of the local bouncers, the queries come from range of
remote bouncers.

Generally we haven't seen any correlation between queries coming from
specific ranges of ips. Logged queries, the ones that we see with
runtime of 1s, have comments that indicate source, and they some from
"all-around". Specifically, "DISCARD ALL" queries are generated by
bouncers themselves (both layers).

Just so that it will be clear, I don't expect anyone to be able to
diagnose the problem based on description. I'm looking more into idea
what to look for. The issue is that with the situation being pretty
short, and happening on servers with non-trivial query load, I can't do
stuff, like, for example, strace, stuff.

Best regards,

depesz





Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Adrian Klaver

On 8/21/25 08:36, Karsten Hilbert wrote:

Dear all,

PG 17 documentation says that using "WITH ADMIN" allows the
role being added to another group role to grant/revoke
membership in said group to other roles.


I would start by reading this:

https://rhaas.blogspot.com/2023/01/surviving-without-superuser-coming-to.html




Thanks,
Karsten



--
Adrian Klaver
adrian.kla...@aklaver.com




Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Karsten Hilbert
Dear all,

PG 17 documentation says that using "WITH ADMIN" allows the
role being added to another group role to grant/revoke
membership in said group to other roles.

Does this imply that an ADMIN role _must_ itself be a member
of the group role it is to maintain membership of ?

The question arises from a scenario where a DBA role would
not need to be a member of a clinical group role but would
be intended to maintain membership of clinical user roles
within that group role.

 From a security point of view the question might be moot
because an ADMIN role could always grant itself membership
in the group role -- but it feels wrong for reasons of
theoretical "correctness".

IOW:

- gm-dbo: user role for a DBA admin (not! superuser)
- gm-bones: user role for a LLAP doctor
- gm-doctors: group role for doctors, upon which are resting
  access permissions for clinical data
- gm-bones is to be a member of gm-doctors in order to access clinical data
- gm-dbo is intended to manage membership of gm-bones in gm-doctors
- however, gm-dbo need not itself be a member of gm-doctors

Is that possible within the current (as of PG 17) framework ?

Thanks,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Karsten Hilbert
Am Thu, Aug 21, 2025 at 09:11:57AM -0700 schrieb David G. Johnston:

> > Thanks, I did, but did not find the answer to: Is there a
> > way for a role that can manage membership in a group role to
> > not itself be a member of that group role ?
> 
> A superuser can do this.  Otherwise, no.  In order to be made admin of a
> role on must be a member of said role

Thanks, fine, that makes it clear to me.

Regards,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread Chris Wilson
If all your queries are coming through pgBouncer, and only those hang (the
server itself responds if you connect directly to it), then it might be
this pgBouncer issue:

https://github.com/pgbouncer/pgbouncer/issues/1054

Although that issue is now "closed", because the invisible "debug" log
message was upgraded to a warning (and I don't think that change is in any
released version), the underlying problem still exists: pgbouncer hangs
completely (stops forwarding packets) for a while if the PAM
authentication queue becomes full.

If you have a relatively slow PAM service (such as pam_ldap) then you can
trigger it by opening ~100 connections to pgBouncer simultaneously (without
waiting for previous ones to authenticate), something like this:

for i in `seq 1 100`; do psql -h pgbouncer -p 6432 -U user db_name -c
"SELECT 1" & done


Thanks, Chris.

On Thu, 21 Aug 2025 at 19:17, Adrian Klaver 
wrote:

> On 8/21/25 09:51, hubert depesz lubaczewski wrote:
> > On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:
> >> Getting to the bottom of the bag of ideas:
> >> Have you looked at the OS system log for the time period involved?
> >
> > Yes. Mostly dmesg. Nothing interesting logged around the time.
> >
> >> You mentioned this seemed to involve PREPARE and DISCARD ALL.
> >> Is this the same set of statements or is it all over the place?
> >
> > No. From what I can tell it's random sample.
> >
> >> Also it would be helpful to know what bouncer you are actually using and
> >> what mode you are running in?
> >
> > pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
> > Applications go using transaction pooling, but people (dbas, ops) have
> > session pooling.
>
> Have you looked at?:
>
> https://www.pgbouncer.org/changelog.html#pgbouncer-124x
>
> To see if anything stands out.
>
> Then there is:
>
> https://www.pgbouncer.org/config.html#max_prepared_statements
>
> The below may also be worth looking at:
>
> https://github.com/pgbouncer/pgbouncer/pull/1144
>
> I can't help thinking that there is a caching issue at stake, though
> that is just a guess.
>
>
> >
> > Best regards,
> >
> > depesz
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>


Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread Adrian Klaver

On 8/21/25 09:51, hubert depesz lubaczewski wrote:

On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote:

Getting to the bottom of the bag of ideas:
Have you looked at the OS system log for the time period involved?


Yes. Mostly dmesg. Nothing interesting logged around the time.


You mentioned this seemed to involve PREPARE and DISCARD ALL.
Is this the same set of statements or is it all over the place?


No. From what I can tell it's random sample.


Also it would be helpful to know what bouncer you are actually using and
what mode you are running in?


pgBouncer, version 1.23.1. As for more... mostly transaction pooling.
Applications go using transaction pooling, but people (dbas, ops) have
session pooling.


Have you looked at?:

https://www.pgbouncer.org/changelog.html#pgbouncer-124x

To see if anything stands out.

Then there is:

https://www.pgbouncer.org/config.html#max_prepared_statements

The below may also be worth looking at:

https://github.com/pgbouncer/pgbouncer/pull/1144

I can't help thinking that there is a caching issue at stake, though 
that is just a guess.





Best regards,

depesz




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Q: GRANT ... WITH ADMIN on PG 17

2025-08-21 Thread Adrian Klaver

On 8/21/25 09:29, Dominique Devienne wrote:

On Thu, Aug 21, 2025 at 6:00 PM Karsten Hilbert  wrote:

Am Thu, Aug 21, 2025 at 08:46:00AM -0700 schrieb Adrian Klaver:

https://rhaas.blogspot.com/2023/01/surviving-without-superuser-coming-to.html


Thanks, I did, but did not find the answer to: Is there a
way for a role that can manage membership in a group role to
not itself be a member of that group role ?


Yes and no. Depends what you mean by MEMBER...



Read the docs for pg_auth_members. pg_has_role(). create role.
If you have CREATEROLE, and do a CREATE ROLE foo, you'll
have ADMIN on foo, but not SET or INHERIT (but you can grant them to yourself).


That is a matter of choice as described here:

https://www.postgresql.org/docs/current/runtime-config-client.html

createrole_self_grant (string)

If a user who has CREATEROLE but not SUPERUSER creates a role, and 
if this is set to a non-empty value, the newly-created role will be 
granted to the creating user with the options specified. The value must 
be set, inherit, or a comma-separated list of these. The default value 
is an empty string, which disables the feature.


The purpose of this option is to allow a CREATEROLE user who is not 
a superuser to automatically inherit, or automatically gain the ability 
to SET ROLE to, any created users. Since a CREATEROLE user is always 
implicitly granted ADMIN OPTION on created roles, that user could always 
execute a GRANT statement that would achieve the same effect as this 
setting. However, it can be convenient for usability reasons if the 
grant happens automatically. A superuser automatically inherits the 
privileges of every role and can always SET ROLE to any role, and this 
setting can be used to produce a similar behavior for CREATEROLE users 
for users which they create.





Also look at pg_auth_members.grantor::regrole::text and you'll see that the
postgres SUPERUSER itself gave you that ADMIN grant. But if you grant yourself
the role, it's a separate pg_auth_members row, and you're now the grantor.

So I didn't spend time studying your specific use case. That's your job :).
But given my painful experience of the past year, I'd answer yes to your
question, on logical grounds. If you see what I mean. --DD



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Domains vs data types

2025-08-21 Thread Dominique Devienne
On Wed, Aug 20, 2025 at 7:37 PM Florents Tselai
 wrote:
> > On 20 Aug 2025, at 7:47 AM, Ertan Küçükoglu  
> > wrote:
> > I would like to learn if there is any benefit of using domains over data 
> > types for table column definitions in terms of performance gain/loss.


> I know that this doesn’t answer your question, but before exploring custom 
> types / domains,
> and based on experience, I’d strongly recommend exploring jsonb instead as an 
> alternative.

I stayed out of that thread, but this makes me step out and react.
domains are typically out constraining the value space of a type.
I.e. a type with a CHECK constraint. W/o one, it's pretty useless IMHO.

While json/jsonb is about denormalizing and stuffing unconstrained data,
of arbitrary (and often evolving) content. No need for complex relational
modeling and its associated constraints and "rigidity". I.e. easy evolution
of the data tier, at the cost of applications having to deal with the complexity
themselves of constrained data (must expect anything). Your SQL also becomes
less expressive or more complex, although PostgreSQL has excellent JSON support.

> Also note that using custom types can lead to some confusion initially for 
> basic stuff
> you can’t do things like SELECT mytype.f1 and instead you have to do 
> (mytype).f1

I guess you're talking about composite types? Not sure OP had that in mind.
But that's still a good point. Thanks for sharing.

Yes, using a custom type/domain is more metadata, since a new row in pg_type.
But that's mostly negligeable.

There's always an (integral) OID associated to columns, so only its value
changes if you start using a custom type, be it i memory or on disk. So no,
using a custom type is unlikely to make things bigger or slower.

That said, in my case, because I use the BINARY mode of LIBPQ and COPY,
it does matter, as my code knows about built-in OIDs, but not of custom OIDs
for custom types. But few people care about such things.

FWIW, --DD




Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 11:24:29AM -0600, Rob Sargent wrote:
> > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
> >> Hmm.
> >> 
> >> From initial post:
> >> 
> >> "For ~ 1 second there are no logs going to log (we usually have at 5-20
> >> messages logged per second), no connection, nothing. And then we get
> >> bunch (30+) messages with the same milisecond time."
> 
> Is that logger ms or db server ms?  The latter seems unlikely to me. 

Sorry, I don't understand your question.

I know it's unlikely, so it seems that something "froze" pg, for ~ 1s.
The question is how to diagnose/debug what it is.

Best regards,

depesz





Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote:
> Do you have THP enabled? Can you use mpstat and see what %steal shows as?

Sorry, what is "THP"? I tried searching for "what is thp", and most
common search results are related to some chemical compound.

mpstat, yes, we have it. Let me quickly show what I get:

Two days ago, at ~ 10:10pm UTC I saw this in Pg logs:

# grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_22.csv | 
uniq -c | grep -C3 -P '^\s*\d\d'
  2 2025-08-19 22:09:29.084 UTC
  1 2025-08-19 22:09:29.094 UTC
  2 2025-08-19 22:09:29.097 UTC
 70 2025-08-19 22:09:29.109 UTC
 90 2025-08-19 22:09:29.110 UTC
  6 2025-08-19 22:09:29.111 UTC
  1 2025-08-19 22:09:29.153 UTC
  1 2025-08-19 22:09:29.555 UTC

As you can see we have 70, and then 90 messages all logged with the same
timestamp. All of them (160) were "duration:x" lines.

At the time, mpstat (called using `mpstat 5`) looked like this:

 CPU%usr   %nice%sys %iowait%irq   %soft  %steal  
%guest  %gnice   %idle
22:09:04 all2.250.000.270.240.000.100.00
0.000.00   97.14
22:09:09 all2.700.000.280.270.000.120.00
0.000.00   96.63
22:09:14 all2.830.000.630.230.000.090.00
0.000.00   96.22
22:09:19 all2.660.000.280.230.000.100.00
0.000.00   96.74
22:09:24 all2.360.000.270.210.000.100.00
0.000.00   97.06
22:09:29 all2.490.000.400.400.000.110.00
0.000.00   96.60
22:09:34 all2.650.000.280.230.000.110.00
0.000.00   96.73
22:09:39 all2.260.000.260.210.000.090.00
0.000.00   97.19
22:09:44 all2.310.000.250.220.000.090.00
0.000.00   97.13
22:09:49 all2.440.000.220.240.000.080.00
0.000.00   97.02
22:09:54 all2.230.000.230.280.000.100.00
0.000.00   97.16
22:09:59 all2.480.000.260.300.000.110.00
0.000.00   96.85
22:10:04 all2.290.000.280.260.000.120.00
0.000.00   97.06
22:10:09 all2.310.000.220.210.000.110.00
0.000.00   97.14
22:10:14 all2.380.000.250.210.000.100.00
0.000.00   97.05
22:10:19 all2.540.000.240.200.000.110.00
0.000.00   96.91
22:10:24 all2.260.000.250.220.000.100.00
0.000.00   97.17
22:10:29 all2.120.000.270.240.000.090.00
0.000.00   97.27
22:10:34 all2.560.000.260.260.000.120.00
0.000.00   96.80
22:10:39 all2.160.000.230.240.000.100.00
0.000.00   97.27
22:10:44 all2.370.000.260.220.000.100.00
0.000.00   97.05
22:10:49 all2.250.000.250.610.000.090.00
0.000.00   96.80
22:10:54 all2.410.000.280.220.000.100.00
0.000.00   96.99
22:10:59 all2.830.000.290.190.000.120.00
0.000.00   96.57

Best regards,

depesz





Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug

2025-08-21 Thread hubert depesz lubaczewski
On Wed, Aug 20, 2025 at 10:45:13AM -0700, Adrian Klaver wrote:
> On 8/20/25 09:08, hubert depesz lubaczewski wrote:
> > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote:
> > > Hmm.
> > > 
> > >  From initial post:
> > > 
> > > "For ~ 1 second there are no logs going to log (we usually have at 5-20
> > > messages logged per second), no connection, nothing. And then we get
> > > bunch (30+) messages with the same milisecond time."
> > > Are the 30+ messages all coming in on one connection or multiple
> > > connections?
> > Multiple connections.
> > > Also to be clear these are statements that are being run on the replica
> > > locally, correct?
> > What do you mean locally?
> I should have been clearer. Are the queries being run against the replica or
> the primary?

All to replica. Primary has its own work, of course, but the problem
we're experiencing is on replicas.

> > Application servers are *not* on the same server as the db. So no, they
> > are not local.
> Where are the application servers relative to the replica server?

Well, there is a lot of them. All are in AWS EC2, same region, various
availability zones. Generally we use 3 AZs, with more or less equal
split, so I'd say ~ 33% of app servers is within the same AZ.

> How many applications servers are hitting the database?

To be honest, I'm not sure. I have visibility into dbs, and bouncers,
not really into Apps. I know that these are automatically dynamically
scaled, so number of app server is very varying.

I'd say anything from 40 to 200 app servers hit first layer of bouncers,
which we usually have 6-9 (2-3 per az).

These go to 2nd layer of bouncers, on the db server itself.

Best regards,

depesz