Re: dynamic result sets support in extended query protocol

Tatsuo Ishii Thu, 08 Oct 2020 01:24:25 -0700

Are you proposing to bump up the protocol version (either major or
minor)?  I am asking because it seems you are going to introduce some
new message types.


Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> I want to progress work on stored procedures returning multiple result
> sets.  Examples of how this could work on the SQL side have previously
> been shown [0].  We also have ongoing work to make psql show multiple
> result sets [1].  This appears to work fine in the simple query
> protocol.  But the extended query protocol doesn't support multiple
> result sets at the moment [2].  This would be desirable to be able to
> use parameter binding, and also since one of the higher-level goals
> would be to support the use case of stored procedures returning
> multiple result sets via JDBC.
> 
> [0]:
> https://www.postgresql.org/message-id/flat/4580ff7b-d610-eaeb-e06f-4d686896b93b%402ndquadrant.com
> [1]: https://commitfest.postgresql.org/29/2096/
> [2]:
> https://www.postgresql.org/message-id/9507.1534370765%40sss.pgh.pa.us
> 
> (Terminology: I'm calling this project "dynamic result sets", which
> includes several concepts: 1) multiple result sets, 2) those result
> sets can have different structures, 3) the structure of the result
> sets is decided at run time, not declared in the schema/procedure
> definition/etc.)
> 
> One possibility I rejected was to invent a third query protocol beside
> the simple and extended one.  This wouldn't really match with the
> requirements of JDBC and similar APIs because the APIs for sending
> queries don't indicate whether dynamic result sets are expected or
> required, you only indicate that later by how you process the result
> sets.  So we really need to use the existing ways of sending off the
> queries.  Also, avoiding a third query protocol is probably desirable
> in general to avoid extra code and APIs.
> 
> So here is my sketch on how this functionality could be woven into the
> extended query protocol.  I'll go through how the existing protocol
> exchange works and then point out the additions that I have in mind.
> 
> These additions could be enabled by a _pq_ startup parameter sent by
> the client.  Alternatively, it might also work without that because
> the client would just reject protocol messages it doesn't understand,
> but that's probably less desirable behavior.
> 
> So here is how it goes:
> 
> C: Parse
> S: ParseComplete
> 
> At this point, the server would know whether the statement it has
> parsed can produce dynamic result sets.  For a stored procedure, this
> would be declared with the procedure definition, so when the CALL
> statement is parsed, this can be noticed.  I don't actually plan any
> other cases, but for the sake of discussion, perhaps some variant of
> EXPLAIN could also return multiple result sets, and that could also be
> detected from parsing the EXPLAIN invocation.
> 
> At this point a client would usually do
> 
> C: Describe (statement)
> S: ParameterDescription
> S: RowDescription
> 
> New would be that the server would now also respond with a new
> message, say,
> 
> S: DynamicResultInfo
> 
> that indicates that dynamic result sets will follow later.  The
> message would otherwise be empty.  (We could perhaps include the
> number of result sets, but this might not actually be useful, and
> perhaps it's better not to spent effort on counting things that don't
> need to be counted.)
> 
> (If we don't guard this by a _pq_ startup parameter from the client,
> an old client would now error out because of an unexpected protocol
> message.)
> 
> Now the normal bind and execute sequence follows:
> 
> C: Bind
> S: BindComplete
> (C: Describe (portal))
> (S: RowDescription)
> C: Execute
> S: ... (DataRows)
> S: CommandComplete
> 
> In the case of a CALL with output parameters, this "primary" result
> set contains one row with the output parameters (existing behavior).
> 
> Now, if the client has seen DynamicResultInfo earlier, it should now
> go into a new subsequence to get the remaining result sets, like this
> (naming obviously to be refined):
> 
> C: NextResult
> S: NextResultReady
> C: Describe (portal)
> S: RowDescription
> C: Execute
> ....
> S: CommandComplete
> C: NextResult
> ...
> C: NextResult
> S: NoNextResult
> C: Sync
> S: ReadyForQuery
> 
> I think this would all have to use the unnamed portal, but perhaps
> there could be other uses with named portals.  Some details to be
> worked out.
> 
> One could perhaps also do without the DynamicResultInfo message and
> just put extra information into the CommandComplete message indicating
> "there are more result sets after this one".
> 
> (Following the model from the simple query protocol, CommandComplete
> really means one result set complete, not the whole top-level
> command. ReadyForQuery means the whole command is complete.  This is
> perhaps debatable, and interesting questions could also arise when
> considering what should happen in the simple query protocol when a
> query string consists of multiple commands each returning multiple
> result sets.  But it doesn't really seem sensible to cater to that.)
> 
> One thing that's missing in this sequence is a way to specify the
> desired output format (text/binary) for each result set.  This could
> be added to the NextResult message, but at that point the client
> doesn't yet know the number of columns in the result set, so we could
> only do it globally.  Then again, since the result sets are dynamic,
> it's less likely that a client would be coded to set per-column output
> codes. Then again, I would hate to bake such a restriction into the
> protocol, because some is going to try.  (I suspect what would be more
> useful in practice is to designate output formats per data type.)  So
> if we wanted to have this fully featured, it might have to look
> something like this:
> 
> C: NextResult
> S: NextResultReady
> C: Describe (dynamic) (new message subkind)
> S: RowDescription
> C: Bind (zero parameters, optionally format codes)
> S: BindComplete
> C: Describe (portal)
> S: RowDescription
> C: Execute
> ...
> 
> While this looks more complicated, client libraries could reuse
> existing code that starts processing with a Bind message and continues
> to CommandComplete, and then just loops back around.
> 
> The mapping of this to libpq in a simple case could look like this:
> 
> PQsendQueryParams(conn, "CALL ...", ...);
> PQgetResult(...);  // gets output parameters
> PQnextResult(...);  // new: sends NextResult+Bind
> PQgetResult(...);  // and repeat
> 
> Again, it's not clear here how to declare the result column output
> formats.  Since libpq doesn't appear to expose the Bind message
> separately, I'm not sure what to do here.
> 
> In JDBC, the NextResult message would correspond to the
> Statement.getMoreResults() method.  It will need a bit of conceptual
> adjustment because the first result set sent on the protocol is
> actually the output parameters, which the JDBC API returns separately
> from a ResultSet, so the initial CallableStatement.execute() call will
> need to process the primary result set and then send NextResult and
> obtain the first dynamic result as the first ResultSet for its API,
> but that can be handled internally.
> 
> Thoughts so far?
> 
> -- 
> Peter Eisentraut              http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> 
>

Re: dynamic result sets support in extended query protocol

Reply via email to