Hi, Shammon Thanks for your feedback. I think it’s good to support jdbc-sdk. However, it's not supported in the gateway side yet. In my opinion, this FLIP is more concerned with the SQL Client. How about put “supporting jdbc-sdk” in ‘Future Work’? We can discuss how to implement it in another thread.
Best, Yu Zelin > 2022年12月2日 18:12,Shammon FY <zjur...@gmail.com> 写道: > > Hi zelin > > Thanks for driving this discussion. > > I notice that the sql-client will interact with sql-gateway by `REST > Client` in the `Executor` in the FLIP, how about introducing jdbc-sdk for > sql-gateway? > > Then the sql-client can connect the gateway with jdbc-sdk, on the other > hand, the other applications and tools such as jmeter can use the jdbc-sdk > to connect sql-gateway too. > > Best, > Shammon > > > On Fri, Dec 2, 2022 at 4:10 PM yu zelin <yuzelin....@gmail.com> wrote: > >> Hi Jim, >> >> Thanks for your feedback! >> >>> Should this configuration be mentioned in the FLIP? >> >> Sure. >> >>> some way for the server to be able to limit the number of requests it >> receives. >> I’m sorry that this FLIP is dedicated in implementing the Remote mode, so >> we >> didn't consider much about this. I think the option is enough currently. >> I will add >> the improvement suggestions to the ‘Future Work’. >> >>> I wonder if two other options are possible >> >> To forward the raw format to gateway and then to client is possible. The >> raw >> results from sink is in ‘CollectResultIterator#bufferedResult’. First, we >> can find >> a way to get this result without wrapping it. Second, constructing a >> ‘InternalTypeInfo’. >> We can construct it using the schema information (data’s logical type). >> After >> construction, we can get the ’TypeSerializer’ to deserialize the raw >> result. >> >> >> >> >>> 2022年12月1日 04:54,Jim Hughes <jhug...@confluent.io.INVALID> 写道: >>> >>> Hi Yu, >>> >>> Thanks for moving my comments to this thread! Also, thank you for >>> answering my questions; it is helping me understand the SQL Gateway >>> better. >>> >>> 5. >>>> Our idea is to introduce a new session option (like >>> 'sql-client.result.fetch-interval') to control >>> the fetching requests sending frequency. What do you think? >>> >>> Should this configuration be mentioned in the FLIP? >>> >>> One slight concern I have with having 'sql-client.result.fetch-interval' >> as >>> a session configuration is that users could set it low and cause the >> client >>> to send a large volume of requests to the SQL gateway. >>> >>> Generally, I'd like to see some way for the server to be able to limit >> the >>> number of requests it receives. If that really needs to be done by a >> proxy >>> in front of the SQL gateway, that is fine as well. (To be clear, I don't >>> think my concern here should be blocking in any way.) >>> >>> 7. >>>> What is the serialization lifecycle for results? >>> >>> I wonder if two other options are possible: >>> 3) Could the Gateway just forward the result byte array? (Or does the >>> Gateway need to deserialize the response in order to understand it for >> some >>> reason?) >>> 4) Could the JobManager prepare the results in JSON? (Or similarly could >>> the Client read the format which the JobManager sends?) >>> >>> Thanks again! >>> >>> Cheers, >>> >>> Jim >>> >>> On Wed, Nov 30, 2022 at 9:40 AM yu zelin <yuzelin....@gmail.com> wrote: >>> >>>> Hi, all >>>> >>>> Thanks Jim’s questions below. Here I’d like to reply to them. >>>> >>>>> 1. For the Client Parser, is it going to work with the extended syntax >>>>> from the Flink Table Store? >>>>> >>>>> 2. Relatedly, what will happen if an older Client tries to handle >>>> syntax >>>>> that a newer service supports? (Suppose I use a 1.17 client with a >>>> 1.18 >>>>> Gateway/system which has a new keyword. Is there anything we should >> be >>>>> designing for upfront?) >>>>> >>>>> 3. How will client and server version mismatches be handled? Will a >>>>> single gateway be able to support multiple endpoint versions? >>>>> 4. How are commands which change a session handled? Are those sent >> via >>>>> an ExecuteStatementRequest? >>>>> >>>>> 5. The remote POC uses polling for getting back status and getting >> back >>>>> results. Would it be possible to switch to web sockets or some other >>>>> mechanism to avoid polling? If polling is used for both, the polling >>>>> frequency should be different between local and remote configurations. >>>>> >>>>> 6. What does this sentence mean? "The reason why we didn't get the >> sql >>>>> type in client side is because it's hard for the lightweight >>>> client-level >>>>> parser to recognize some sql type sql, such as query with CTE. " >>>>> >>>>> 7. What is the serialization lifecycle for results? It makes sense to >>>>> have some control over whether the gateway returns results as SQL or >>>> JSON. >>>>> I'd love to see a way to avoid needing to serialize and deserialize >>>> results >>>>> on the SQL Gateway if possible. I'm still new enough to the project >>>> that >>>>> I'm not sure if that's readily possible. Maybe the SQL Gateway's >>>> return >>>>> type can be sent as part of the request so that the JobManager can >> send >>>>> back results in an advantageous format? >>>>> >>>>> 8. Does ErrorType need to be marked as @PublicEvolving? >>>>> >>>>> I'm excited for the SQL client to support gateway mode! Given the >> change >>>>> in design, do you think it'll still be part of the Flink 1.17 release? >>>> >>>> 1. ClientParser can work with new (and unknown) SQL syntax. It is >> because >>>> if the >>>> sql type is not recognized, the sql will be submitted to the gateway >>>> directly. >>>> >>>> For more information: Actually, the proposed ClientParser only do two >>>> things: >>>> (1) Tell client commands (help, clear, etc) and sqls apart. >>>> (2) parses several sql types (e.g. SHOW CREATE statement, we can print >> raw >>>> string >>>> for the SHOW CREATE result instead of table). Here the recognization of >>>> sql types >>>> mostly affects the print style, and unrecognized sql also can be >> submitted >>>> to cluster. >>>> So the Client with new ClientParser can work compatible with new syntax. >>>> >>>> 2. First, I'd like to explain that the gateway APIs and supported syntax >>>> is two things. >>>> For example, ‘configureSession' and 'completeStatement' are APIs. As >>>> mentioned >>>> in #1, the sql statements which syntax is unknown will be submitted to >> the >>>> gateway, >>>> and whether they can be executed normally depends on whether the >> execution >>>> environment supports the syntax. >>>> >>>>> Is there anything we should be designing for upfront? >>>> >>>> The 'SqlGatewayRestAPIVersion’ has been introduced. But it is for sql >>>> gateway APIs. >>>> >>>> 3. >>>>> How will client and server version mismatches be handled? >>>> >>>> A lower version client can work compatible with a higher version gateway >>>> because the >>>> old interfaces won’t be deleted. When a higher version client connects >> to >>>> a lower version >>>> gateway, the client should notify the users if they try to use >> unsupported >>>> features. For >>>> example, the client start option ‘-i’ means using initialization file >> to >>>> initialize the session. >>>> We plan to use the gateway’s ‘configureSession’ to implement it. But >> this >>>> API is not >>>> implemented in 1.16 Gateway (SqlGatewayRestAPIVersion = V1), so if the >>>> user try to >>>> use ‘-i’ option to start the client with the 1.16 gateway, the client >>>> should tell the user that >>>> Can’t execute ‘-i’ option with gateway which version is lower than V2. >>>> >>>>> Will a single gateway be able to support multiple endpoint versions? >>>> >>>> Currently, the gateway only starts a highest version endpoint and the >>>> higher version endpoint >>>> is compatible with the lower version endpoint’s protocol. >>>> >>>> 4. Yes. Mostly, we use ’SET’ and ‘RESET’ statements to change the >> session >>>> configuration. >>>> Notice: the client can’t change the session (I mean, close current >> session >>>> and open another >>>> one). I’m not sure if you have need to change the session itself? >>>> >>>> 5. >>>>> Would it be possible to switch to web sockets or some other mechanism >>>> to avoid polling? >>>> >>>> Your suggestion is very good, but this flip is for supporting the remote >>>> client. How about taking >>>> it as a future work? >>>> >>>>> If polling is used for both, the polling frequency should be different >>>> between local and remote >>>> configurations. >>>> >>>> Our idea is to introduce a new session option (like >>>> 'sql-client.result.fetch-interval') to control >>>> the fetching requests sending frequency. What do you think? >>>> >>>> For more information: we are inclined to keep the polling behavior in >> this >>>> version. For streaming >>>> query, fetching results synchronously may occupy resources of the >> gateway >>>> in a long period. >>>> For example, if the job doesn’t return results for a long time because >> the >>>> window has not been >>>> triggered, the synchronously fetching will keep occupying the >> connection. >>>> In asynchronous >>>> situation, the gateway can return a NOT_READY_RESULT quickly and release >>>> the resources >>>> for other clients to use. I think we can make some improvements for the >>>> whole flow path in the >>>> future. >>>> >>>> 6. Sorry for that there is mistakes in this sentence. Let me make it >> clear. >>>> >>>> We proposed to add 'ContentType' to indicates the result is for what >> kind >>>> of sql. In this sentence, >>>> I want to explain why we add 'ContentType' since the ClientParser can >>>> recognize the sql type too. >>>> It is because the proposed ClientParser can't recognize complex syntax. >>>> For example, it can’t >>>> recognize query with CTE. So the result should carry content type >>>> information to help the client to >>>> know the sql type. For example, the 'ContentType.QUERY_RESULT' indicates >>>> the result is for a >>>> query statement. >>>> >>>> 7. >>>>> What is the serialization lifecycle for results? >>>> >>>> 1) Sink to JobManager : RowData -> Byte[ ] (serialize) >>>> 2) JobManager to Gateway : Byte[ ] -> RowData (deserialize) >>>> 3) Gateway sending : RowData -> Byte[ ] (serialized to JSON >>>> format) >>>> 4) Client receiving : Byte[ ] -> RowData (deserialize) >>>> >>>>> Maybe the SQL Gateway's return type can be sent as part of the request >>>> so that the >>>> JobManager can send back results in an advantageous format? >>>> >>>> Yes. I think it's an improvement for the Client and Gateway. We have >> some >>>> ideas. For example, >>>> >>>> 1) We can move the Gateway into the JobManager and reduce the Ser/De >> costs >>>> from JM to Gateway. >>>> 2) Or the Gateway can collect the data from the sink function directly >>>> instead of JobManager. >>>> >>>> But I think we can leave this as a future work and discuss in another >>>> thread. >>>> >>>> 8. Yes. >>>> >>>>> Do you think it'll still be part of the Flink 1.17 release? >>>> Yes. We will try our best to finish the work. >>>> >>>> Feel free to talk to me if I’m wrong or you have any other questions. >>>> >>>> >>>>> 2022年11月25日 11:48,yu zelin <yuzelin....@gmail.com> 写道: >>>>> >>>>> Hi, all >>>>> >>>>> I want to initiate a discussion on the FLIP-275: Support Remote SQL >>>> Client Based on SQL Gateway[1]. >>>>> The motivation of this FLIP is that the current SQL Client allows only >>>> local connection which can not satisfy >>>>> the common need of connecting to a remote cluster. >>>>> >>>>> Since the FLIP-91[2] has introduced SQL Gateway, we proposed to >>>> implement the Remote SQL Client >>>>> based on SQL Gateway. In our design, we proposed two main changes: >>>>> >>>>> 1. New remote mode client which performs connection to the remote >>>> gateway through REST API. >>>>> 2. Migration of the current local mode client. We proposed to refactor >>>> the local client based on SQL Gateway >>>>> to unify the interface for two modes. >>>>> >>>>> Looking forward to your suggestions. >>>>> >>>>> Best, >>>>> Yu Zelin >>>>> >>>>> [1] https://cwiki.apache.org/confluence/x/T48ODg >>>>> [2] https://cwiki.apache.org/confluence/x/rIyMC >>>> >>>> >> >>