On Mon, Mar 17, 2025 at 10:22:47AM -0700, Jacob Champion wrote:
> I looked into switching over to pgstat_report_activity(), but that
> wasn't designed to be called in the middle of backend initialization.
> It would take more work to make those calls safe/sane when `st_state
> == STATE_STARTING`. I
On Thu, Mar 13, 2025 at 10:56 AM Andres Freund wrote:
> > Given the choice between a usually-working PAM module with known
> > architectural flaws, and not having PAM at all, I think many users
> > would rather continue using what's working for them.
>
> authentication_timeout currently doesn't re
On 2025-03-13 10:29:49 -0700, Jacob Champion wrote:
> On Thu, Mar 13, 2025 at 9:56 AM Andres Freund wrote:
> > I am wondering if PAM is so fundamentally incompatible with handling
> > interrupts / a non-blocking interface that we have little choice but to
> > eventually remove it...
>
> Given the
On Thu, Mar 13, 2025 at 9:56 AM Andres Freund wrote:
> I am wondering if PAM is so fundamentally incompatible with handling
> interrupts / a non-blocking interface that we have little choice but to
> eventually remove it...
Given the choice between a usually-working PAM module with known
architec
Hi,
On 2025-03-13 09:23:10 -0700, Jacob Champion wrote:
> On Wed, Mar 12, 2025 at 3:16 PM Jacob Champion
> wrote:
> > I missed PAM_CONV, sorry. I'm worried about the sendAuthRequest()
> > being done there; it doesn't seem safe to potentially ereport(ERROR)
> > and longjmp through a PAM call stack
On Wed, Mar 12, 2025 at 3:16 PM Jacob Champion
wrote:
> I missed PAM_CONV, sorry. I'm worried about the sendAuthRequest()
> being done there; it doesn't seem safe to potentially ereport(ERROR)
> and longjmp through a PAM call stack? But I'll switch those over to
> something safe or else drop that
On Fri, Mar 7, 2025 at 10:28 AM Jacob Champion
wrote:
> > I think some of the wrapped calls into library code might actually call back
> > into our code (to receive/send data), and our code then will use wait events
> > around lower level operations done as part of that.
>
> That would be a proble
On Fri, Mar 7, 2025 at 9:25 AM Andres Freund wrote:
> I should have clarified - there are a few that I think are ok, basically the
> places where we wrap syscalls, e.g. around the sendto, select and recvfrom in
> PerformRadiusTransaction().
Okay.
> OTOH that code is effectively completely broken
Hi,
On 2025-03-07 09:03:18 -0800, Jacob Champion wrote:
> On Fri, Mar 7, 2025 at 8:38 AM Andres Freund wrote:
> > FWIW, I continue to think that this is a misuse of wait events. We shouldn't
> > use them as a poor man's general purpose tracing framework.
>
> Well, okay. That's frustrating.
I sho
On Fri, Mar 7, 2025 at 8:38 AM Andres Freund wrote:
> FWIW, I continue to think that this is a misuse of wait events. We shouldn't
> use them as a poor man's general purpose tracing framework.
Well, okay. That's frustrating.
If I return to the original design, but replace all of the high-level
w
Hi,
On 2025-03-06 15:39:44 -0800, Jacob Champion wrote:
> I've reattached the wait event patches, to get the cfbot back to where it was.
FWIW, I continue to think that this is a misuse of wait events. We shouldn't
use them as a poor man's general purpose tracing framework.
Greetings,
Andres Fre
On Wed, Mar 5, 2025 at 8:08 PM Michael Paquier wrote:
> Honestly, I don't see a reason not to introduce that, like in the
> attached.
This new code races against the session timeout. I see this on timer expiration:
[14:19:55.224](0.000s) # issuing query 34 via background psql:
SELECT state F
On Thu, Mar 6, 2025 at 3:15 PM Michael Paquier wrote:
> I have applied the simplest patch for now, to silence the failures in
> the CI, and included your suggestion to add a check on the
> backend_type for the extra safety it offers.
Thanks! Initial CI run looks green, so that's a good start.
I'
On Thu, Mar 06, 2025 at 02:25:07PM -0800, Jacob Champion wrote:
> On Wed, Mar 5, 2025 at 8:08 PM Michael Paquier wrote:
>> + WHERE state = 'starting' and wait_event = 'init-pre-auth';});
>
> Did you have thoughts on expanding the check to backend_type [1]?
>
>> + # Give up. The output of the l
On Wed, Mar 05, 2025 at 08:55:55PM -0500, Andres Freund wrote:
> Once we've slept for 10+ seconds without reaching the target, sleeping for
> 100us is way too short a sleep and just wastes CPU cycles. A decent portion
> of the CPU time when running under valgrind is wasted just due to way too
> tig
Hi,
On 2025-03-05 16:19:04 -0800, Jacob Champion wrote:
> On Wed, Mar 5, 2025 at 9:28 AM Andres Freund wrote:
> > Unrelated to the change in this patch, but tests really shouldn't use
> > while(1)
> > loops without a termination condition. If something is wrong, the test will
> > hang indefinite
On Wed, Mar 5, 2025 at 9:28 AM Andres Freund wrote:
> Unrelated to the change in this patch, but tests really shouldn't use while(1)
> loops without a termination condition. If something is wrong, the test will
> hang indefinitely, instead of timing out. On the buildfarm that can take out
> an an
Hi,
On 2025-03-05 08:16:45 -0800, Jacob Champion wrote:
> From efc9fc3b3993601e9611131f229fbcaf4daa46f1 Mon Sep 17 00:00:00 2001
> From: Michael Paquier
> Date: Wed, 5 Mar 2025 13:30:43 +0900
> Subject: [PATCH 1/2] Fix race condition in pre-auth test
>
> ---
> src/test/authentication/t/007_pre_
On Wed, Mar 5, 2025 at 5:47 AM Jacob Champion
wrote:
>
> So while we're at it, should we add a
> `backend_type = 'client backend'` filter to stop that from flaking in
> the future? That would further align this query with the
> wait_for_event() implementation.
More concretely: here's a squashable
On Tue, Mar 4, 2025 at 8:45 PM Michael Paquier wrote:
> What this is telling us is that we should change the query scanning
> pg_stat_activity for a PID of a backend in 'starting' state so as we
> also check the wait_event init-pre-auth, as this is reported when
> using injection point waits. The
On Tue, Mar 4, 2025 at 4:10 PM Andres Freund wrote:
> This seems to trigger a bunch of CI failures, e.g.:
>
> https://cirrus-ci.com/task/5350341408980992
> https://cirrus-ci.com/task/5537391798124544
> https://cirrus-ci.com/task/4657439905153024
Hm. All Windows.
> https://api.cirrus-ci.com/v1/ar
On Tue, Mar 04, 2025 at 04:53:14PM -0800, Jacob Champion wrote:
> I'll work on a fix, but it probably won't be fast since I need to
> learn more about the injection points architecture. The test may need
> to be disabled, or the patch backed out, depending on how painful the
> flake is for everybod
On Tue, Mar 4, 2025 at 4:26 PM Jacob Champion
wrote:
> But attaching to that injection point succeeded above, for us to have
> gotten to this point... Does that error message indicate that the
> point itself doesn't exist, or that nothing is currently waiting?
Looks like the latter. With the foll
Hi,
On 2025-03-04 17:51:17 +0900, Michael Paquier wrote:
> On Mon, Mar 03, 2025 at 02:23:51PM +0900, Michael Paquier wrote:
> > This has always been set last and it's still the case in the patch, so
> > let's just remove that.
>
> This first one has been now applied as c76db55c9085. Attached is t
On Tue, Mar 4, 2025 at 12:51 AM Michael Paquier wrote:
>
> On Mon, Mar 03, 2025 at 02:23:51PM +0900, Michael Paquier wrote:
> > This has always been set last and it's still the case in the patch, so
> > let's just remove that.
>
> This first one has been now applied as c76db55c9085.
Thanks!
> At
On Mon, Mar 03, 2025 at 02:23:51PM +0900, Michael Paquier wrote:
> This has always been set last and it's still the case in the patch, so
> let's just remove that.
This first one has been now applied as c76db55c9085. Attached is the
rest to add the wait events (still need to have a closer look at
On Fri, Feb 28, 2025 at 12:40:13PM -0800, Jacob Champion wrote:
> v9 removes the first call, and moves the second (now only) call up and
> out of the if/else chain, just past client authentication. The SSL
> pre-auth tests have been removed.
I have put my eyes on 0001, and this version looks sensi
On Fri, Feb 14, 2025 at 5:34 PM Jacob Champion
wrote:
> (But it doesn't
> seem like we're going to agree on this for now; in the meantime I'll
> prepare a version of the patch that only calls
> pgstat_bestart_security() once.)
v9 removes the first call, and moves the second (now only) call up and
On Thu, Feb 13, 2025 at 4:03 PM Michael Paquier wrote:
> > If a CA is issuing Subject data that is somehow dangerous to the
> > operation of the server, I think that's a security problem in and of
> > itself: there are clientcert HBA modes that don't validate the
> > Subject, but they're still goi
On Thu, Feb 13, 2025 at 09:53:52AM -0800, Jacob Champion wrote:
> I guess I'm going to zero in on your definition of "may know nothing
> about". If that is true, something is very wrong IMO.
>
> My understanding of the backend code was that port->peer is only set
> after OpenSSL has verified that
On Tue, Feb 11, 2025 at 11:23 PM Michael Paquier wrote:
> +be_tls_get_peer_subject_name(MyProcPort, lsslstatus.ssl_client_dn,
> NAMEDATALEN);
> +be_tls_get_peer_serial(MyProcPort, lsslstatus.ssl_client_serial,
> NAMEDATALEN);
> +be_tls_get_peer_issuer_name(MyProcPort, lss
On Mon, Feb 10, 2025 at 11:05:32AM -0800, Jacob Champion wrote:
> Bad regex escaping on my part; fixed in v8. Thanks for the report!
>
> While debugging, I also noticed that a poorly timed autovacuum could
> also show up in my new pg_stat_activity query, so I've increased the
> specificity.
Now t
On Mon, Feb 10, 2025 at 8:23 AM Jacob Champion
wrote:
> The test is supposed to enforce that, but I see that it's not for some
> reason. That's concerning. I'll investigate, thanks for pointing it
> out.
Bad regex escaping on my part; fixed in v8. Thanks for the report!
While debugging, I also n
On Thu, Feb 6, 2025 at 12:35 AM Michael Paquier wrote:
> /* Update app name to current GUC setting */
> + /* TODO: ask the list: maybe do this before setting STATE_UNDEFINED?
> */
> if (application_name)
> pgstat_report_appname(application_name);
>
> Historic
On Mon, Nov 11, 2024 at 03:17:44PM -0800, Jacob Champion wrote:
> 1. pgstat_bestart_initial() reports STATE_STARTING, fills in the early
> fields and clears out the rest.
> 2. pgstat_bestart_security() reports the SSL/GSS status of the
> connection. This is only for backends with a valid MyProcPort
On Fri, Nov 8, 2024 at 4:23 PM Jacob Champion
wrote:
> While I work on breaking pgstat_bestart() apart, here is a v6 which
> pushes down the "coarse" wait events. No changes to 0001 yet.
v7 rewrites 0001 by splitting pgstat_bestart() into three phases.
(0002-3 are unchanged.)
1. pgstat_bestart_i
On Thu, Nov 7, 2024 at 4:38 PM Jacob Champion
wrote:
> Oh... I think that alone is enough to change my mind; I neglected the
> effects of that little pgstat_report_appname() stinger...
(Note that application_name is not yet set at the site of the first
call, so I think the set-unset-set can't hap
On Thu, Nov 7, 2024 at 2:56 PM Andres Freund wrote:
> It does actually make things harder - what if somebody added a
> pgstat_report_activity() somewhere between the call? It would suddenly get
> lost after the second "initialization". Actually, the proposed patch already
> has weird, externally
Hi,
On 2024-11-07 14:29:18 -0800, Jacob Champion wrote:
> On Thu, Nov 7, 2024 at 1:37 PM Andres Freund wrote:
> > > - the pre-auth step must always initialize the entire pgstat struct
> >
> > Correct. And that has to happen exactly once, not twice.
>
> What goes wrong if it happens twice?
Prima
On Thu, Nov 7, 2024 at 1:37 PM Andres Freund wrote:
> > - the pre-auth step must always initialize the entire pgstat struct
>
> Correct. And that has to happen exactly once, not twice.
What goes wrong if it happens twice?
> > - two-step initialization requires two PGSTAT_BEGIN_WRITE_ACTIVITY()
>
Hi,
On 2024-11-07 12:11:46 -0800, Jacob Champion wrote:
> On Thu, Nov 7, 2024 at 11:41 AM Andres Freund wrote:
> > I think the patch should not be merged as-is. It's just too ugly and
> > fragile.
>
> Understood; I'm trying to find a way forward, and I'm pointing out
> that the alternatives I'v
On Thu, Nov 7, 2024 at 11:41 AM Andres Freund wrote:
> I think the patch should not be merged as-is. It's just too ugly and fragile.
Understood; I'm trying to find a way forward, and I'm pointing out
that the alternatives I've tried seem to me to be _more_ fragile.
Are there any items in this li
Hi,
On 2024-11-07 10:44:25 -0800, Jacob Champion wrote:
> On Thu, Nov 7, 2024 at 10:12 AM Andres Freund wrote:
> > I don't understand why the pgstat_bestart()/pgstat_bestart_pre_auth() split
> > makes sense. The latter is going to redo most of the work that the former
> > did. What's the point of
On Thu, Nov 7, 2024 at 10:12 AM Andres Freund wrote:
> I don't understand why the pgstat_bestart()/pgstat_bestart_pre_auth() split
> makes sense. The latter is going to redo most of the work that the former
> did. What's the point of that?
>
> Why not have a new function that initializes just the
Hi,
On 2024-11-07 09:20:24 -0800, Jacob Champion wrote:
> From e755fdccf16cb4fcd3036e1209291750ffecd261 Mon Sep 17 00:00:00 2001
> From: Jacob Champion
> Date: Fri, 3 May 2024 15:54:58 -0700
> Subject: [PATCH v5 1/2] pgstat: report in earlier with STATE_STARTING
>
> Add pgstat_bestart_pre_auth()
On Tue, Nov 5, 2024 at 9:48 PM Michael Paquier wrote:
> +PAM_ACCT_MGMT "Waiting for the local PAM service to validate the user
> account."
> +PAM_AUTHENTICATE "Waiting for the local PAM service to authenticate
> the user."
>
> Is "local" required for both? Perhaps just use "the PAM servi
On Wed, Nov 06, 2024 at 02:48:31PM +0900, Michael Paquier wrote:
> I'm OK with 0002 to add the wait parameter to BackgroundPsql and be
> able to take some actions until a manual wait_connect(). I'll go do
> this one. Also perhaps 0001 while on it but I am a bit puzzled by the
> removal of the thr
On Fri, Nov 01, 2024 at 02:47:38PM -0700, Jacob Champion wrote:
> On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier wrote:
>> Could it be more transparent to use a "startup" or
>> "starting"" state instead that gets also used by pgstat_bestart() in
>> the case of the patch where !pre_auth?
>
> Done.
Hi all,
Here's a v4, with a separate wait event for each location. (I could
use some eyes on the specific phrasing I've chosen for each one.)
On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier wrote:
> Could it be more transparent to use a "startup" or
> "starting"" state instead that gets also used
On Wed, Sep 11, 2024 at 09:00:33AM -0400, Robert Haas wrote:
> On Tue, Sep 10, 2024 at 4:58 PM Noah Misch wrote:
> > ... a rule of "each wait event appears in one
> > pgstat_report_wait_start()" would be a rule I don't want.
>
> As the original committer of the wait event stuff, I intended for th
On Wed, Sep 11, 2024 at 02:29:49PM -0700, Jacob Champion wrote:
> On Mon, Sep 9, 2024 at 10:30 PM Michael Paquier wrote:
>> No. My question was about splitting pgstat_bestart() and
>> pgstat_bestart_pre_auth() in a cleaner way, because authenticated
>> connections finish by calling both, meaning
On Mon, Sep 9, 2024 at 10:30 PM Michael Paquier wrote:
> No. My question was about splitting pgstat_bestart() and
> pgstat_bestart_pre_auth() in a cleaner way, because authenticated
> connections finish by calling both, meaning that we do twice the same
> setup for backend entries depending on th
On Tue, Sep 10, 2024 at 4:58 PM Noah Misch wrote:
> ... a rule of "each wait event appears in one
> pgstat_report_wait_start()" would be a rule I don't want.
As the original committer of the wait event stuff, I intended for the
rule that you do not want to be the actual rule. However, I see that
On Tue, Sep 10, 2024 at 01:58:50PM -0700, Noah Misch wrote:
> On Tue, Sep 10, 2024 at 02:51:23PM -0400, Robert Haas wrote:
>> I think unique names are a good idea. If a user doesn't care about the
>> difference between sdgjsA and sdjgsB, they can easily ignore the
>> trailing suffix, and IME, peopl
On Tue, Sep 10, 2024 at 02:51:23PM -0400, Robert Haas wrote:
> On Tue, Sep 10, 2024 at 1:27 PM Noah Misch wrote:
> > On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:
> > > You are adding twelve event points with only 5
> > > new wait names. Couldn't it be better to have a one-one
On Tue, Sep 10, 2024 at 1:27 PM Noah Misch wrote:
> On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:
> > You are adding twelve event points with only 5
> > new wait names. Couldn't it be better to have a one-one mapping
> > instead, adding twelve entries in wait_event_names.txt?
>
On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:
> You are adding twelve event points with only 5
> new wait names. Couldn't it be better to have a one-one mapping
> instead, adding twelve entries in wait_event_names.txt?
No, I think the patch's level of detail is better. One sho
On Tue, Sep 03, 2024 at 02:47:57PM -0700, Jacob Champion wrote:
> On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier wrote:
>> that gets also used by pgstat_bestart() in
>> the case of the patch where !pre_auth?
>
> To clarify, do you want me to just add the new boolean directly to
> pgstat_bestart()
On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier wrote:
> On Fri, Aug 30, 2024 at 04:10:32PM -0400, Andrew Dunstan wrote:
> > Patch 0001 looks sane to me.
> So does 0002 to me.
Thanks both!
> I'm not much a fan of the addition of
> pgstat_bestart_pre_auth(), which is just a shortcut to set a diffe
On Fri, Aug 30, 2024 at 04:10:32PM -0400, Andrew Dunstan wrote:
>
> On 2024-08-29 Th 4:44 PM, Jacob Champion wrote:
> > As for the other patches, I'll ping Andrew about 0001,
>
>
> Patch 0001 looks sane to me.
So does 0002 to me. I'm not much a fan of the addition of
pgstat_bestart_pre_auth(),
On 2024-08-29 Th 4:44 PM, Jacob Champion wrote:
As for the other patches, I'll ping Andrew about 0001,
Patch 0001 looks sane to me.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
On Sun, Jun 30, 2024 at 10:48 AM Noah Misch wrote:v
> > Would anyone like me to be more aggressive, and create a pgstat entry
> > as soon as we have the opening transaction? Or... as soon as a
> > connection is made?
>
> All else being equal, I'd like backends to have one before taking any lmgr
>
On Mon, Jul 08, 2024 at 02:09:21PM -0700, Jacob Champion wrote:
> On Sun, Jun 30, 2024 at 10:48 AM Noah Misch wrote:
> > That said, it
> > may be more fruitful to arrange for authentication timeout to cut through
> > PAM
> > etc.
>
> That seems mostly out of our hands -- the misbehaving modules
On Sun, Jun 30, 2024 at 10:48 AM Noah Misch wrote:
> That looks like a reasonable user experience. Is any field newly-nullable?
Technically I think the answer is no, since backends such as walwriter
already have null database and user fields. It's new for a client
backend to have nulls there, th
On Mon, May 06, 2024 at 02:23:38PM -0700, Jacob Champion wrote:
> =# select * from pg_stat_activity where state = 'authenticating';
> -[ RECORD 1 ]+--
> datid|
> datname |
> pid | 745662
> leader_pid |
>
65 matches
Mail list logo