[I considered splitting this off into a new thread, but I think Dave
has to wait for it to be resolved before much can happen with the
patch. Sorry Dave.]

On Wed, Dec 10, 2025 at 3:01 PM Jelte Fennema-Nio <[email protected]> wrote:
> If we keep the features that are bundled with a protocol version bump
> of the kind where a client, either has to do nothing to implement it,
> or at worst has to ignore the contents of a new message/field. Then
> implementing support becomes so trivial for clients that I don't think
> it'd be a hurdle for client authors to implement support for 3.3, 3.4,
> 3.5 and if they only wanted a feature from the 3.6 protocol.^1 I'll
> call these things "no-op implementations" from now on.

It's too late for that, isn't it? 3.2's only feature doesn't work that
way (and couldn't have been designed that way, as far as I can tell).
So I don't have any confidence that all future features will fall in
line with this new rule.

NegotiateProtocolVersion is the only in-band tool we have to ratchet
the protocol forward. Why go through all this pain of getting NPV
packets working, only to immediately limit its power to the most
trivial cases?
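
For concreteness, here's a minimal sketch of a client parsing that
message, following the layout in the protocol docs (Int32 newest
supported version, Int32 count of unrecognized _pq_.* options, then
that many NUL-terminated strings). The `_pq_.goaway` option name in
the usage example is hypothetical:

```python
import struct

def parse_negotiate_protocol_version(payload: bytes):
    """Parse the body of a NegotiateProtocolVersion ('v') message.

    `payload` is the message body after the type byte and length word.
    Returns (newest_supported_version, [unrecognized_option, ...]).
    The version is encoded as (major << 16) | minor.
    """
    version, n_options = struct.unpack_from("!ii", payload, 0)
    offset = 8
    options = []
    for _ in range(n_options):
        # Each unrecognized startup option is a NUL-terminated string.
        end = payload.index(b"\x00", offset)
        options.append(payload[offset:end].decode("ascii"))
        offset = end + 1
    return version, options

# Example: server supports up to 3.2 and didn't recognize one option.
payload = struct.pack("!ii", (3 << 16) | 2, 1) + b"_pq_.goaway\x00"
version, unknown = parse_negotiate_protocol_version(payload)
```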

> I think we disagree on this. I think the downside of using protocol
> extensions for everything is that we then end up with N*N different
> combinations of features in the wild that servers and clients need to
> deal with. We have to start to define what happens when features
> interact, but either of them is not enabled.

In the worst case? Yes. (That worst case doesn't really bother me.
Many other protocols regularly navigate extension combinations.)

But! The two extension proposals in flight at the moment -- GoAway and
cursor options -- are completely orthogonal, no? Both to each other,
and to the functionality in 3.2. There are no combinatorics yet. So it
seems strange to optimize for combinatorics out of the gate, by
burning through a client-mandatory minor version every year.

> With incrementing
> versions you don't have that problem,

You still have N*M. Implementers have to test each feature of their
3.10 client against server versions 3.0 through 3.9, rather than
testing against a single server that turns individual extension
support on and off. I prefer the latter (though maybe that's just
because it's what I'm used to). Middleboxes increase the matrix
further, as you point out below.

Paradoxically, if all N features happen to be orthogonal, the testing
burden for the extension strategy collapses to... N.
Minor-version-per-year is worse for that case.
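
Back-of-the-envelope, under my assumptions about which pairs actually
need testing:

```python
def version_matrix(n_minor_versions: int) -> int:
    # Minor-version strategy: every client/server version pair can
    # behave differently, so the matrix is roughly N * M pairs
    # (N == M here for simplicity).
    return n_minor_versions * n_minor_versions

def extension_matrix(n_extensions: int, orthogonal: bool) -> int:
    # Extension strategy: one server toggling each extension on/off.
    # Orthogonal extensions need only N runs; fully interacting ones
    # need every subset, 2**N (the worst case conceded above).
    return n_extensions if orthogonal else 2 ** n_extensions

# Ten features: 100 version pairs, vs. 10 runs if they're orthogonal.
version_matrix(10), extension_matrix(10, True), extension_matrix(10, False)
```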

> which results in simpler logic
> in the spec, servers and clients.

I don't want to dissuade a proof of concept for this, because simpler
logic everywhere sounds amazing. But it sounds like magical thinking
to me. A bit like telling Christoph that the dpkg dependency graph is
too complicated, so it should be a straight line instead -- if that
worked, presumably everyone would have done it that way, right?
Convince me that you're not just ignoring necessary complexity in an
attempt to stamp out unnecessary complexity.

An example of an established network protocol that follows this same
strategy would be helpful. How do their clients deal with the
minor-version treadmill?

> Finally, because we don't have any protocol extensions yet. All
> clients still need to build infrastructure for them, including libpq.

Clients still on 3.0 (the vast majority of them) would have to add
infrastructure for sliding minor-version ranges, too.

> So I'd argue that if we make such "no-op implementation" features use
> protocol extensions, then it'd be more work for everyone.

Why advertise a protocol extension if you plan to ignore it? Don't
advertise it. Do nothing. That's even less work than retrofitting
packet parsers to correctly ignore a byte range whenever the
negotiated minor version exceeds X.
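
To illustrate the retrofit I mean, here's a sketch with a completely
made-up message layout and option name (nothing in the actual
protocol): a version-gated parser has to hardcode the cutoff for
every such field, while an extension-gated parser only consults what
was negotiated at startup.

```python
import struct

def parse_stats_versioned(body: bytes, minor_version: int) -> dict:
    """Hypothetical message: an Int32 row count, plus -- from minor
    version 3 on -- a trailing Int32 flags word older-level clients
    must skip without interpreting."""
    out = {"rows": struct.unpack_from("!i", body, 0)[0]}
    if minor_version >= 3:
        out["flags"] = struct.unpack_from("!i", body, 4)[0]
    return out

def parse_stats_extension(body: bytes, negotiated: set) -> dict:
    """Same message gated on an explicit extension instead: the extra
    field is present only if '_pq_.stats_flags' was negotiated."""
    out = {"rows": struct.unpack_from("!i", body, 0)[0]}
    if "_pq_.stats_flags" in negotiated:
        out["flags"] = struct.unpack_from("!i", body, 4)[0]
    return out
```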

> > Plus I think it's unwise to introduce a 3.3 before we're
> > confident that 3.2 can be widely deployed, and I'm trying to put
> > effort into the latter for 19, so that I'm not just sitting here
> > gatekeeping.
>
> I'm not sure what you mean with this. People use libpq18 and PG18, and
> I've heard no complaints about protocol problems. So I think it was a
> success. Do you mean widely deployed by default?

Yes. Or even just "deployed". GitHub shows zero hits outside of the
Postgres fork graph.

Google's results show that an organization called "cardo" tried
max_protocol_version=latest. They had to revert it. :( Time for
grease.

> Why exactly does that
> matter for 3.3? Anything that stands default deployment in the way for
> 3.2, will continue to stand default deployment in the way for 3.3.

Exactly. Don't you want to make sure that clients in the ecosystem are
able to use this _before_ we rev the version again, and again? We
don't ever get these numbers back.

Like, I'm arguing as hard as I can against the very existence of the
treadmill. But if I'm outvoted on that, *please* don't start the
treadmill before other people can climb on -- otherwise, they won't be
able to give us any feedback at all!

> Personally, if we flip the default in e.g. 5 years from now. I'd much
> rather have it be flipped to a very nice 3.6 protocol, than still only
> having the single new feature that was added in 3.2.

Those are not the only two choices. I'd rather we get a bunch of nice
features without any flipping at all, if that's possible. It looks
possible to me.

> > IETF has a bunch of related case studies [1,2,3] that might be useful
> > reading, even if we decide that their experience differs heavily from
> > ours.
>
> I gave them a skim and they seem like a good read (which I'll do
> later). But I'm not sure part of them you thought was actionable for
> the discussion about version bumps vs protocol extensions. (I did see
> useful stuff for the grease thread, but that seems better to discuss
> there)

For this conversation, I'm focused on RFC 8170. Specifically the
concepts of incremental transitions and incentive alignment
(cost/benefit to individual community members).

I view minor-version-per-year as violating both of those principles.
It optimizes instead for the convenience of the people who are most
plugged into this mailing list, and who have the most power to change
things on a whim.

> ^1: You and I only talked about clients above, but obviously there's
> also proxies and other servers that implement the protocol to
> consider. If a feature that is "no-op implementation" on the client is
> a complicated implementation on the proxy/server then maybe a protocol
> extension is indeed the better choice. I think for GoAway it's trivial
> to "no-op implement" too on the proxy/server. For this cursor option
> proposal it's less clear cut imo. Proxies can probably simply forward
> the message to the server, although maybe PgBouncer would want to
> throw an error when a client uses a hold cursor (but it also doesn't
> do that for SQL level hold cursors, so that seems like an optional
> enhancement).

Personally, I think proposals should attempt to answer those questions
as a prerequisite to commit. Or at least we should be moving in that
direction, if that's too harsh on the first authors who are trying to
get things moving inside the protocol.

More generally, it bothers me that we still don't have a clear mental
model of middlebox extensibility. We're just retreading the
discussions from [1] instead of starting from where we stopped, and
that's exhausting for me.

(As a reminder: 3.2 broke my testing rig, which relied on implicit
assumptions around minor-version extensibility for middleboxes. I
didn't speak up until very late, because it was just a testing rig,
and I could change it. I should have spoken up immediately, because
IIRC, pgpool then broke as well.)

> Other servers might not even support hold cursors, but
> then they could simply throw a clear error (like pgbouncer would do).
> If throwing an error is an acceptable server implementation, then I
> think a "no-op implementation" is again trivial.

A server is always free to decide at the _application_ layer that it
will error out for a particular packet that it can parse at the
_network_ layer. But it seems a lot more user-friendly to just decline
the protocol bit, if it's directly tied to an application-level
feature that isn't implemented. I think we should encourage that when
possible; otherwise we've traded protocol fragmentation for
application fragmentation.
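
A sketch of the server-side choice I'm advocating, with made-up
option names: decline unimplemented _pq_.* options up front, so they
land in NegotiateProtocolVersion's unrecognized list, instead of
accepting them and erroring at the application layer later.

```python
def negotiate_startup_options(requested: dict, implemented: set):
    """Split requested _pq_.* startup options into those the server
    actually implements and those it should report back as
    unrecognized, so the client learns immediately at startup."""
    accepted, declined = {}, []
    for name, value in requested.items():
        if name in implemented:
            accepted[name] = value
        else:
            declined.append(name)  # no deferred runtime error
    return accepted, declined

# A server without hold-cursor support declines the bit cleanly.
negotiate_startup_options({"_pq_.hold_cursors": "1"}, {"_pq_.goaway"})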

--Jacob

[1] 
https://postgr.es/m/CAGECzQR5PMud4q8Atyz0gOoJ1xNH33g7g-MLXFML1_Vrhbzs6Q%40mail.gmail.com

