On Tue, Jan 14, 2025 at 6:00 AM Jacob Champion <
jacob.champ...@enterprisedb.com> wrote:

> On Mon, Jan 13, 2025 at 3:21 PM Jacob Champion
> <jacob.champ...@enterprisedb.com> wrote:
> > Next email will discuss the architectural bug that Kashif found.
>
> Okay, here goes. A standard OAuth connection attempt looks like this
> (oh, I hope Gmail doesn't mangle it):
>
>     Issuer    User    libpq   Backend
>       |        |        |
>       |        x -----> x -----> o      [1] Startup Packet
>       |        |        |        |
>       |        |        x <----- x      [2] OAUTHBEARER Request
>       |        |        |        |
>       |        |        x -----> x      [3] Parameter Discovery
>       |        |        |        |
>       |        |        x <----- o      [4] Parameters Stored
>       |        |        |
>       |        |        |
>       |        |        |
>       |        |        x -----> o      [5] New Startup Packet
>       |        |        |        |
>       |        |        x <----- x      [6] OAUTHBEARER Request
>       |        |        |        |
>       x <----- x <----> x        |
>       x <----- x <----> x        |      [7] OAuth Handshake
>       x <----- x <----> x        |
>       |        |        |        |
>       o        |        x -----> x      [8] Send Token
>                |        |        |
>                | <----- x <----- x      [9] Connection Established
>                |        |        |
>                x <----> x <----> x
>                x <----> x <----> x      [10] Use the DB
>                .        .        .
>                .        .        .
>                .        .        .
>
> When the server first asks for a token via OAUTHBEARER (step 2), the
> client doesn't necessarily know what the server's requirements are for
> a given user. It uses the rest of the doomed OAUTHBEARER exchange to
> store the issuer and scope information in the PGconn (step 3-4), then
> disconnects and sets need_new_connection in PQconnectPoll() so that a
> second connection is immediately opened (step 5). When the OAUTHBEARER
> mechanism takes control the second time, it has everything it needs to
> conduct the login flow with the issuer (step 7). It then sends the
> obtained token to establish a connection (steps 8 onward).
>
> The problem is that step 7 is consuming the authentication_timeout for
> the backend. I'm very good at completing these flows quickly, but if
> you can't complete the browser prompts in time, you will simply not be
> able to log into the server. Which is harsh to say the least. (Imagine
> the pain if the standard psql password prompt timed out.) DBAs can get
> around it by increasing the timeout, obviously, but that doesn't feel
> very good as a solution.
>
> Last week I looked into a fix where libpq would simply try again with
> the stored token if the backend hangs up on it during the handshake,
> but I think that will end up making the UX worse. The token validation
> on the server side isn't going to be instantaneous, so if the client
> is able to complete the token exchange in 59 seconds and send it to
> the backend, there's an excellent chance that the connection is still
> going to be torn down in a way that's indistinguishable from a crash.
> We don't want the two sides to fight for time.
>
> So I think what I'm going to need to do is modify v41-0003 to allow
> the mechanism to politely hang up the connection while the flow is in
> progress. This further decouples the lifetimes of the mechanism and
> the async auth -- the async state now has to live outside of the SASL
> exchange -- but I think it's probably more architecturally sound. Yell
> at me if that sounds unmaintainable or if there's a more obvious fix
> I'm missing.
>
> Huge thanks to Kashif for pointing this out!
>
Thanks Jacob, the latest patch fixed the issues.

>
> --Jacob
>

Reply via email to