On 2020-10-09 21:02, Dave Cramer wrote:
For the most part we know exactly which types we want in binary for 99% of queries.

    The hard part around this really is whether and how to deal with changes
    in type definitions. From types just being created - comparatively
    simple - to extensions being dropped and recreated, with oids
    potentially being reused.


Fair point but this is going to be much more complex than just sending most of the results in binary which would speed up the overwhelming majority of queries

I've been studying in more detail how the JDBC driver handles binary format use. Having some kind of message "use binary for these types" would match its requirements quite exactly. (I have also studied npgsql, but it appears to work quite differently. More input from there and other places with similar requirements would be welcome.) The question as mentioned above is how to deal with type changes. Let's work through a couple of options.

We could send the type/format list with every query. For example, we could extend/enhance/alter the Bind message so that instead of a format-per-column it sends a format-per-type. But then you'd need to send the complete type list every time. The JDBC driver currently has 20+ types already hardcoded and more optionally, so you'd send 100+ bytes for every query, plus required effort for encoding and decoding. That seems unattractive.

Or we send the type/format list once near the beginning of the session. Then we need to deal with types being recreated or updated etc.

The first option is that we "lock" the types against changes (ignoring whether that's actually possible right now). That would mean you couldn't update an affected type/extension while a JDBC session is active. That's no good. (Imagine connection pools with hours of server lifetime.)

Another option is that we invalidate the session when a thus-registered type changes. Also no good. (We don't want an extension upgrade suddenly breaking all open connections.)

Finally, we could do it an a best-effort basis. We use binary format for registered types, until there is some invalidation event for the type, at which point we revert to default/text format until the end of a session (or until another protocol message arrives re-registering the type). This should work, because the result row descriptor contains the actual format type, and there is no guarantee that it's the same one that was requested.

So how about that last option? I imagine a new protocol message, say, TypeFormats, that contains a number of type/format pairs. The message would typically be sent right after the first ReadyForQuery, gets no response. It could also be sent at any other time, but I expect that to be less used in practice. Binary format is used for registered types if they have binary format support functions, otherwise text continues to be used. There is no error response for types without binary support. (There should probably be an error response for registering a type that does not exist.)

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply via email to