dynamic result sets support in extended query protocol

Peter Eisentraut Thu, 08 Oct 2020 00:47:16 -0700

I want to progress work on stored procedures returning multiple resultsets. Examples of how this could work on the SQL side have previouslybeen shown [0]. We also have ongoing work to make psql show multipleresult sets [1]. This appears to work fine in the simple queryprotocol. But the extended query protocol doesn't support multipleresult sets at the moment [2]. This would be desirable to be able touse parameter binding, and also since one of the higher-level goalswould be to support the use case of stored procedures returning multipleresult sets via JDBC.

[0]:https://www.postgresql.org/message-id/flat/4580ff7b-d610-eaeb-e06f-4d686896b93b%402ndquadrant.com

[1]: https://commitfest.postgresql.org/29/2096/
[2]: https://www.postgresql.org/message-id/9507.1534370765%40sss.pgh.pa.us

(Terminology: I'm calling this project "dynamic result sets", whichincludes several concepts: 1) multiple result sets, 2) those result setscan have different structures, 3) the structure of the result sets isdecided at run time, not declared in the schema/procedure definition/etc.)

One possibility I rejected was to invent a third query protocol besidethe simple and extended one. This wouldn't really match with therequirements of JDBC and similar APIs because the APIs for sendingqueries don't indicate whether dynamic result sets are expected orrequired, you only indicate that later by how you process the resultsets. So we really need to use the existing ways of sending off thequeries. Also, avoiding a third query protocol is probably desirable ingeneral to avoid extra code and APIs.

So here is my sketch on how this functionality could be woven into theextended query protocol. I'll go through how the existing protocolexchange works and then point out the additions that I have in mind.

These additions could be enabled by a _pq_ startup parameter sent by theclient. Alternatively, it might also work without that because theclient would just reject protocol messages it doesn't understand, butthat's probably less desirable behavior.


So here is how it goes:

C: Parse
S: ParseComplete

At this point, the server would know whether the statement it has parsedcan produce dynamic result sets. For a stored procedure, this would bedeclared with the procedure definition, so when the CALL statement isparsed, this can be noticed. I don't actually plan any other cases, butfor the sake of discussion, perhaps some variant of EXPLAIN could alsoreturn multiple result sets, and that could also be detected fromparsing the EXPLAIN invocation.


At this point a client would usually do

C: Describe (statement)
S: ParameterDescription
S: RowDescription

New would be that the server would now also respond with a new message, say,

S: DynamicResultInfo

that indicates that dynamic result sets will follow later. The messagewould otherwise be empty. (We could perhaps include the number ofresult sets, but this might not actually be useful, and perhaps it'sbetter not to spent effort on counting things that don't need to becounted.)

(If we don't guard this by a _pq_ startup parameter from the client, anold client would now error out because of an unexpected protocol message.)


Now the normal bind and execute sequence follows:

C: Bind
S: BindComplete
(C: Describe (portal))
(S: RowDescription)
C: Execute
S: ... (DataRows)
S: CommandComplete

In the case of a CALL with output parameters, this "primary" result setcontains one row with the output parameters (existing behavior).

Now, if the client has seen DynamicResultInfo earlier, it should now gointo a new subsequence to get the remaining result sets, like this(naming obviously to be refined):


C: NextResult
S: NextResultReady
C: Describe (portal)
S: RowDescription
C: Execute
....
S: CommandComplete
C: NextResult
...
C: NextResult
S: NoNextResult
C: Sync
S: ReadyForQuery

I think this would all have to use the unnamed portal, but perhaps therecould be other uses with named portals. Some details to be worked out.

One could perhaps also do without the DynamicResultInfo message and justput extra information into the CommandComplete message indicating "thereare more result sets after this one".

(Following the model from the simple query protocol, CommandCompletereally means one result set complete, not the whole top-level command.ReadyForQuery means the whole command is complete. This is perhapsdebatable, and interesting questions could also arise when consideringwhat should happen in the simple query protocol when a query stringconsists of multiple commands each returning multiple result sets. Butit doesn't really seem sensible to cater to that.)

One thing that's missing in this sequence is a way to specify thedesired output format (text/binary) for each result set. This could beadded to the NextResult message, but at that point the client doesn'tyet know the number of columns in the result set, so we could only do itglobally. Then again, since the result sets are dynamic, it's lesslikely that a client would be coded to set per-column output codes.Then again, I would hate to bake such a restriction into the protocol,because some is going to try. (I suspect what would be more useful inpractice is to designate output formats per data type.) So if we wantedto have this fully featured, it might have to look something like this:


C: NextResult
S: NextResultReady
C: Describe (dynamic) (new message subkind)
S: RowDescription
C: Bind (zero parameters, optionally format codes)
S: BindComplete
C: Describe (portal)
S: RowDescription
C: Execute
...

While this looks more complicated, client libraries could reuse existingcode that starts processing with a Bind message and continues toCommandComplete, and then just loops back around.


The mapping of this to libpq in a simple case could look like this:

PQsendQueryParams(conn, "CALL ...", ...);
PQgetResult(...);  // gets output parameters
PQnextResult(...);  // new: sends NextResult+Bind
PQgetResult(...);  // and repeat

Again, it's not clear here how to declare the result column outputformats. Since libpq doesn't appear to expose the Bind messageseparately, I'm not sure what to do here.

In JDBC, the NextResult message would correspond to theStatement.getMoreResults() method. It will need a bit of conceptualadjustment because the first result set sent on the protocol is actuallythe output parameters, which the JDBC API returns separately from aResultSet, so the initial CallableStatement.execute() call will need toprocess the primary result set and then send NextResult and obtain thefirst dynamic result as the first ResultSet for its API, but that can behandled internally.


Thoughts so far?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

dynamic result sets support in extended query protocol

Reply via email to