Hi Jim, Thanks for your feedback!
> Should this configuration be mentioned in the FLIP? Sure. > some way for the server to be able to limit the number of requests it > receives. I’m sorry that this FLIP is dedicated in implementing the Remote mode, so we didn't consider much about this. I think the option is enough currently. I will add the improvement suggestions to the ‘Future Work’. > I wonder if two other options are possible To forward the raw format to gateway and then to client is possible. The raw results from sink is in ‘CollectResultIterator#bufferedResult’. First, we can find a way to get this result without wrapping it. Second, constructing a ‘InternalTypeInfo’. We can construct it using the schema information (data’s logical type). After construction, we can get the ’TypeSerializer’ to deserialize the raw result. > 2022年12月1日 04:54,Jim Hughes <jhug...@confluent.io.INVALID> 写道: > > Hi Yu, > > Thanks for moving my comments to this thread! Also, thank you for > answering my questions; it is helping me understand the SQL Gateway > better. > > 5. >> Our idea is to introduce a new session option (like > 'sql-client.result.fetch-interval') to control > the fetching requests sending frequency. What do you think? > > Should this configuration be mentioned in the FLIP? > > One slight concern I have with having 'sql-client.result.fetch-interval' as > a session configuration is that users could set it low and cause the client > to send a large volume of requests to the SQL gateway. > > Generally, I'd like to see some way for the server to be able to limit the > number of requests it receives. If that really needs to be done by a proxy > in front of the SQL gateway, that is fine as well. (To be clear, I don't > think my concern here should be blocking in any way.) > > 7. >> What is the serialization lifecycle for results? > > I wonder if two other options are possible: > 3) Could the Gateway just forward the result byte array? (Or does the > Gateway need to deserialize the response in order to understand it for some > reason?) > 4) Could the JobManager prepare the results in JSON? (Or similarly could > the Client read the format which the JobManager sends?) > > Thanks again! > > Cheers, > > Jim > > On Wed, Nov 30, 2022 at 9:40 AM yu zelin <yuzelin....@gmail.com> wrote: > >> Hi, all >> >> Thanks Jim’s questions below. Here I’d like to reply to them. >> >>> 1. For the Client Parser, is it going to work with the extended syntax >>> from the Flink Table Store? >>> >>> 2. Relatedly, what will happen if an older Client tries to handle >> syntax >>> that a newer service supports? (Suppose I use a 1.17 client with a >> 1.18 >>> Gateway/system which has a new keyword. Is there anything we should be >>> designing for upfront?) >>> >>> 3. How will client and server version mismatches be handled? Will a >>> single gateway be able to support multiple endpoint versions? >>> 4. How are commands which change a session handled? Are those sent via >>> an ExecuteStatementRequest? >>> >>> 5. The remote POC uses polling for getting back status and getting back >>> results. Would it be possible to switch to web sockets or some other >>> mechanism to avoid polling? If polling is used for both, the polling >>> frequency should be different between local and remote configurations. >>> >>> 6. What does this sentence mean? "The reason why we didn't get the sql >>> type in client side is because it's hard for the lightweight >> client-level >>> parser to recognize some sql type sql, such as query with CTE. " >>> >>> 7. What is the serialization lifecycle for results? It makes sense to >>> have some control over whether the gateway returns results as SQL or >> JSON. >>> I'd love to see a way to avoid needing to serialize and deserialize >> results >>> on the SQL Gateway if possible. I'm still new enough to the project >> that >>> I'm not sure if that's readily possible. Maybe the SQL Gateway's >> return >>> type can be sent as part of the request so that the JobManager can send >>> back results in an advantageous format? >>> >>> 8. Does ErrorType need to be marked as @PublicEvolving? >>> >>> I'm excited for the SQL client to support gateway mode! Given the change >>> in design, do you think it'll still be part of the Flink 1.17 release? >> >> 1. ClientParser can work with new (and unknown) SQL syntax. It is because >> if the >> sql type is not recognized, the sql will be submitted to the gateway >> directly. >> >> For more information: Actually, the proposed ClientParser only do two >> things: >> (1) Tell client commands (help, clear, etc) and sqls apart. >> (2) parses several sql types (e.g. SHOW CREATE statement, we can print raw >> string >> for the SHOW CREATE result instead of table). Here the recognization of >> sql types >> mostly affects the print style, and unrecognized sql also can be submitted >> to cluster. >> So the Client with new ClientParser can work compatible with new syntax. >> >> 2. First, I'd like to explain that the gateway APIs and supported syntax >> is two things. >> For example, ‘configureSession' and 'completeStatement' are APIs. As >> mentioned >> in #1, the sql statements which syntax is unknown will be submitted to the >> gateway, >> and whether they can be executed normally depends on whether the execution >> environment supports the syntax. >> >>> Is there anything we should be designing for upfront? >> >> The 'SqlGatewayRestAPIVersion’ has been introduced. But it is for sql >> gateway APIs. >> >> 3. >>> How will client and server version mismatches be handled? >> >> A lower version client can work compatible with a higher version gateway >> because the >> old interfaces won’t be deleted. When a higher version client connects to >> a lower version >> gateway, the client should notify the users if they try to use unsupported >> features. For >> example, the client start option ‘-i’ means using initialization file to >> initialize the session. >> We plan to use the gateway’s ‘configureSession’ to implement it. But this >> API is not >> implemented in 1.16 Gateway (SqlGatewayRestAPIVersion = V1), so if the >> user try to >> use ‘-i’ option to start the client with the 1.16 gateway, the client >> should tell the user that >> Can’t execute ‘-i’ option with gateway which version is lower than V2. >> >>> Will a single gateway be able to support multiple endpoint versions? >> >> Currently, the gateway only starts a highest version endpoint and the >> higher version endpoint >> is compatible with the lower version endpoint’s protocol. >> >> 4. Yes. Mostly, we use ’SET’ and ‘RESET’ statements to change the session >> configuration. >> Notice: the client can’t change the session (I mean, close current session >> and open another >> one). I’m not sure if you have need to change the session itself? >> >> 5. >>> Would it be possible to switch to web sockets or some other mechanism >> to avoid polling? >> >> Your suggestion is very good, but this flip is for supporting the remote >> client. How about taking >> it as a future work? >> >>> If polling is used for both, the polling frequency should be different >> between local and remote >> configurations. >> >> Our idea is to introduce a new session option (like >> 'sql-client.result.fetch-interval') to control >> the fetching requests sending frequency. What do you think? >> >> For more information: we are inclined to keep the polling behavior in this >> version. For streaming >> query, fetching results synchronously may occupy resources of the gateway >> in a long period. >> For example, if the job doesn’t return results for a long time because the >> window has not been >> triggered, the synchronously fetching will keep occupying the connection. >> In asynchronous >> situation, the gateway can return a NOT_READY_RESULT quickly and release >> the resources >> for other clients to use. I think we can make some improvements for the >> whole flow path in the >> future. >> >> 6. Sorry for that there is mistakes in this sentence. Let me make it clear. >> >> We proposed to add 'ContentType' to indicates the result is for what kind >> of sql. In this sentence, >> I want to explain why we add 'ContentType' since the ClientParser can >> recognize the sql type too. >> It is because the proposed ClientParser can't recognize complex syntax. >> For example, it can’t >> recognize query with CTE. So the result should carry content type >> information to help the client to >> know the sql type. For example, the 'ContentType.QUERY_RESULT' indicates >> the result is for a >> query statement. >> >> 7. >>> What is the serialization lifecycle for results? >> >> 1) Sink to JobManager : RowData -> Byte[ ] (serialize) >> 2) JobManager to Gateway : Byte[ ] -> RowData (deserialize) >> 3) Gateway sending : RowData -> Byte[ ] (serialized to JSON >> format) >> 4) Client receiving : Byte[ ] -> RowData (deserialize) >> >>> Maybe the SQL Gateway's return type can be sent as part of the request >> so that the >> JobManager can send back results in an advantageous format? >> >> Yes. I think it's an improvement for the Client and Gateway. We have some >> ideas. For example, >> >> 1) We can move the Gateway into the JobManager and reduce the Ser/De costs >> from JM to Gateway. >> 2) Or the Gateway can collect the data from the sink function directly >> instead of JobManager. >> >> But I think we can leave this as a future work and discuss in another >> thread. >> >> 8. Yes. >> >>> Do you think it'll still be part of the Flink 1.17 release? >> Yes. We will try our best to finish the work. >> >> Feel free to talk to me if I’m wrong or you have any other questions. >> >> >>> 2022年11月25日 11:48,yu zelin <yuzelin....@gmail.com> 写道: >>> >>> Hi, all >>> >>> I want to initiate a discussion on the FLIP-275: Support Remote SQL >> Client Based on SQL Gateway[1]. >>> The motivation of this FLIP is that the current SQL Client allows only >> local connection which can not satisfy >>> the common need of connecting to a remote cluster. >>> >>> Since the FLIP-91[2] has introduced SQL Gateway, we proposed to >> implement the Remote SQL Client >>> based on SQL Gateway. In our design, we proposed two main changes: >>> >>> 1. New remote mode client which performs connection to the remote >> gateway through REST API. >>> 2. Migration of the current local mode client. We proposed to refactor >> the local client based on SQL Gateway >>> to unify the interface for two modes. >>> >>> Looking forward to your suggestions. >>> >>> Best, >>> Yu Zelin >>> >>> [1] https://cwiki.apache.org/confluence/x/T48ODg >>> [2] https://cwiki.apache.org/confluence/x/rIyMC >> >>