Hi, zeklin >The CLI will use default print style for the non-query result. Please make sure the print results of EXPLAIN/DESC/SHOW CREATE TABLE commands are clear.
> We think it’s better to add the root cause to the ErrorResponseBody. LGTM Best, Godfrey yu zelin <yuzelin....@gmail.com> 于2022年12月6日周二 17:51写道: > > Hi, Godfrey > > Thanks for your feedback. Below is my thoughts about your questions. > > 1. About RowFormat. > I agree to your opinion. So we decided to revert the RowFormat related changes > and let the client to resolve the print format. > > 2. About ContentType > I agree that the definition of the ContentType is not clear. But how to > define the > statement type is another big question. So, we decided to only tell the query > result > and non-query result apart. The CLI will use default print style for the > non-query > result. > > 3. About ErrorHandling > I think reuse the current ErrorResponseBody is good, but parse the root cause > from the exception stack strings is quite hacking. We think it’s better to > add the > root cause to the ErrorResponseBody. > > 4. About Runtime REST API Modifications > I agree, too. This part is moved to the ‘Future Work’. > > Best, > Yu Zelin > > > > 2022年12月5日 18:33,godfrey he <godfre...@gmail.com> 写道: > > > > Hi Zelin, > > > > Thanks for driving this discussion. > > > > I have a few comments, > > > >> Add RowFormat to ResultSet to indicate the format of rows. > > We should not require SqlGateway server to meet the display > > requirements of a CliClient. > > Because different CliClients may have different display style. The > > server just need to response the data, > > and the CliClient prints the result as needed. So RowFormat is not needed. > > > >> Add ContentType to ResultSet to indicate what kind of data the result > >> contains. > > from my first sight, the values of ContentType are intersected, such > > as: A select query will return QUERY_RESULT, > > but it also has JOB_ID. OTHER is too ambiguous, I don't know which > > kind of query will return OTHER. > > I recommend returning the concrete type for each statement, such as > > "CREATE TABLE" for "create table xx (...) with ()", > > "SELECT" for "select * from xxx". The statement type can be maintained > > in `Operation`s. > > > >> Error Handling > > I think current design of error handling mechanism can meet the > > requirement of CliClient, we can get the root cause from > > the stack (see ErrorResponseBody#errors). If it becomes a common > > requirement (for many clients) in the future, > > we can introduce this interface. > > > >> Runtime REST API Modification for Local Client Migration > > I think this part is over-engineered, this part belongs to optimization. > > The client does not require very high performance, the current design > > can already meet our needs. > > If we find performance problems in the future, do such optimizations. > > > > Best, > > Godfrey > > > > yu zelin <yuzelin....@gmail.com> 于2022年12月5日周一 11:11写道: > >> > >> Hi, Shammon > >> > >> Thanks for your feedback. I think it’s good to support jdbc-sdk. However, > >> it's not supported in the gateway side yet. In my opinion, this FLIP is > >> more > >> concerned with the SQL Client. How about put “supporting jdbc-sdk” in > >> ‘Future Work’? We can discuss how to implement it in another thread. > >> > >> Best, > >> Yu Zelin > >>> 2022年12月2日 18:12,Shammon FY <zjur...@gmail.com> 写道: > >>> > >>> Hi zelin > >>> > >>> Thanks for driving this discussion. > >>> > >>> I notice that the sql-client will interact with sql-gateway by `REST > >>> Client` in the `Executor` in the FLIP, how about introducing jdbc-sdk for > >>> sql-gateway? > >>> > >>> Then the sql-client can connect the gateway with jdbc-sdk, on the other > >>> hand, the other applications and tools such as jmeter can use the jdbc-sdk > >>> to connect sql-gateway too. > >>> > >>> Best, > >>> Shammon > >>> > >>> > >>> On Fri, Dec 2, 2022 at 4:10 PM yu zelin <yuzelin....@gmail.com> wrote: > >>> > >>>> Hi Jim, > >>>> > >>>> Thanks for your feedback! > >>>> > >>>>> Should this configuration be mentioned in the FLIP? > >>>> > >>>> Sure. > >>>> > >>>>> some way for the server to be able to limit the number of requests it > >>>> receives. > >>>> I’m sorry that this FLIP is dedicated in implementing the Remote mode, so > >>>> we > >>>> didn't consider much about this. I think the option is enough currently. > >>>> I will add > >>>> the improvement suggestions to the ‘Future Work’. > >>>> > >>>>> I wonder if two other options are possible > >>>> > >>>> To forward the raw format to gateway and then to client is possible. The > >>>> raw > >>>> results from sink is in ‘CollectResultIterator#bufferedResult’. First, we > >>>> can find > >>>> a way to get this result without wrapping it. Second, constructing a > >>>> ‘InternalTypeInfo’. > >>>> We can construct it using the schema information (data’s logical type). > >>>> After > >>>> construction, we can get the ’TypeSerializer’ to deserialize the raw > >>>> result. > >>>> > >>>> > >>>> > >>>> > >>>>> 2022年12月1日 04:54,Jim Hughes <jhug...@confluent.io.INVALID> 写道: > >>>>> > >>>>> Hi Yu, > >>>>> > >>>>> Thanks for moving my comments to this thread! Also, thank you for > >>>>> answering my questions; it is helping me understand the SQL Gateway > >>>>> better. > >>>>> > >>>>> 5. > >>>>>> Our idea is to introduce a new session option (like > >>>>> 'sql-client.result.fetch-interval') to control > >>>>> the fetching requests sending frequency. What do you think? > >>>>> > >>>>> Should this configuration be mentioned in the FLIP? > >>>>> > >>>>> One slight concern I have with having 'sql-client.result.fetch-interval' > >>>> as > >>>>> a session configuration is that users could set it low and cause the > >>>> client > >>>>> to send a large volume of requests to the SQL gateway. > >>>>> > >>>>> Generally, I'd like to see some way for the server to be able to limit > >>>> the > >>>>> number of requests it receives. If that really needs to be done by a > >>>> proxy > >>>>> in front of the SQL gateway, that is fine as well. (To be clear, I > >>>>> don't > >>>>> think my concern here should be blocking in any way.) > >>>>> > >>>>> 7. > >>>>>> What is the serialization lifecycle for results? > >>>>> > >>>>> I wonder if two other options are possible: > >>>>> 3) Could the Gateway just forward the result byte array? (Or does the > >>>>> Gateway need to deserialize the response in order to understand it for > >>>> some > >>>>> reason?) > >>>>> 4) Could the JobManager prepare the results in JSON? (Or similarly > >>>>> could > >>>>> the Client read the format which the JobManager sends?) > >>>>> > >>>>> Thanks again! > >>>>> > >>>>> Cheers, > >>>>> > >>>>> Jim > >>>>> > >>>>> On Wed, Nov 30, 2022 at 9:40 AM yu zelin <yuzelin....@gmail.com> wrote: > >>>>> > >>>>>> Hi, all > >>>>>> > >>>>>> Thanks Jim’s questions below. Here I’d like to reply to them. > >>>>>> > >>>>>>> 1. For the Client Parser, is it going to work with the extended syntax > >>>>>>> from the Flink Table Store? > >>>>>>> > >>>>>>> 2. Relatedly, what will happen if an older Client tries to handle > >>>>>> syntax > >>>>>>> that a newer service supports? (Suppose I use a 1.17 client with a > >>>>>> 1.18 > >>>>>>> Gateway/system which has a new keyword. Is there anything we should > >>>> be > >>>>>>> designing for upfront?) > >>>>>>> > >>>>>>> 3. How will client and server version mismatches be handled? Will a > >>>>>>> single gateway be able to support multiple endpoint versions? > >>>>>>> 4. How are commands which change a session handled? Are those sent > >>>> via > >>>>>>> an ExecuteStatementRequest? > >>>>>>> > >>>>>>> 5. The remote POC uses polling for getting back status and getting > >>>> back > >>>>>>> results. Would it be possible to switch to web sockets or some other > >>>>>>> mechanism to avoid polling? If polling is used for both, the polling > >>>>>>> frequency should be different between local and remote configurations. > >>>>>>> > >>>>>>> 6. What does this sentence mean? "The reason why we didn't get the > >>>> sql > >>>>>>> type in client side is because it's hard for the lightweight > >>>>>> client-level > >>>>>>> parser to recognize some sql type sql, such as query with CTE. " > >>>>>>> > >>>>>>> 7. What is the serialization lifecycle for results? It makes sense to > >>>>>>> have some control over whether the gateway returns results as SQL or > >>>>>> JSON. > >>>>>>> I'd love to see a way to avoid needing to serialize and deserialize > >>>>>> results > >>>>>>> on the SQL Gateway if possible. I'm still new enough to the project > >>>>>> that > >>>>>>> I'm not sure if that's readily possible. Maybe the SQL Gateway's > >>>>>> return > >>>>>>> type can be sent as part of the request so that the JobManager can > >>>> send > >>>>>>> back results in an advantageous format? > >>>>>>> > >>>>>>> 8. Does ErrorType need to be marked as @PublicEvolving? > >>>>>>> > >>>>>>> I'm excited for the SQL client to support gateway mode! Given the > >>>> change > >>>>>>> in design, do you think it'll still be part of the Flink 1.17 release? > >>>>>> > >>>>>> 1. ClientParser can work with new (and unknown) SQL syntax. It is > >>>> because > >>>>>> if the > >>>>>> sql type is not recognized, the sql will be submitted to the gateway > >>>>>> directly. > >>>>>> > >>>>>> For more information: Actually, the proposed ClientParser only do two > >>>>>> things: > >>>>>> (1) Tell client commands (help, clear, etc) and sqls apart. > >>>>>> (2) parses several sql types (e.g. SHOW CREATE statement, we can print > >>>> raw > >>>>>> string > >>>>>> for the SHOW CREATE result instead of table). Here the recognization of > >>>>>> sql types > >>>>>> mostly affects the print style, and unrecognized sql also can be > >>>> submitted > >>>>>> to cluster. > >>>>>> So the Client with new ClientParser can work compatible with new > >>>>>> syntax. > >>>>>> > >>>>>> 2. First, I'd like to explain that the gateway APIs and supported > >>>>>> syntax > >>>>>> is two things. > >>>>>> For example, ‘configureSession' and 'completeStatement' are APIs. As > >>>>>> mentioned > >>>>>> in #1, the sql statements which syntax is unknown will be submitted to > >>>> the > >>>>>> gateway, > >>>>>> and whether they can be executed normally depends on whether the > >>>> execution > >>>>>> environment supports the syntax. > >>>>>> > >>>>>>> Is there anything we should be designing for upfront? > >>>>>> > >>>>>> The 'SqlGatewayRestAPIVersion’ has been introduced. But it is for sql > >>>>>> gateway APIs. > >>>>>> > >>>>>> 3. > >>>>>>> How will client and server version mismatches be handled? > >>>>>> > >>>>>> A lower version client can work compatible with a higher version > >>>>>> gateway > >>>>>> because the > >>>>>> old interfaces won’t be deleted. When a higher version client connects > >>>> to > >>>>>> a lower version > >>>>>> gateway, the client should notify the users if they try to use > >>>> unsupported > >>>>>> features. For > >>>>>> example, the client start option ‘-i’ means using initialization file > >>>> to > >>>>>> initialize the session. > >>>>>> We plan to use the gateway’s ‘configureSession’ to implement it. But > >>>> this > >>>>>> API is not > >>>>>> implemented in 1.16 Gateway (SqlGatewayRestAPIVersion = V1), so if the > >>>>>> user try to > >>>>>> use ‘-i’ option to start the client with the 1.16 gateway, the client > >>>>>> should tell the user that > >>>>>> Can’t execute ‘-i’ option with gateway which version is lower than V2. > >>>>>> > >>>>>>> Will a single gateway be able to support multiple endpoint versions? > >>>>>> > >>>>>> Currently, the gateway only starts a highest version endpoint and the > >>>>>> higher version endpoint > >>>>>> is compatible with the lower version endpoint’s protocol. > >>>>>> > >>>>>> 4. Yes. Mostly, we use ’SET’ and ‘RESET’ statements to change the > >>>> session > >>>>>> configuration. > >>>>>> Notice: the client can’t change the session (I mean, close current > >>>> session > >>>>>> and open another > >>>>>> one). I’m not sure if you have need to change the session itself? > >>>>>> > >>>>>> 5. > >>>>>>> Would it be possible to switch to web sockets or some other mechanism > >>>>>> to avoid polling? > >>>>>> > >>>>>> Your suggestion is very good, but this flip is for supporting the > >>>>>> remote > >>>>>> client. How about taking > >>>>>> it as a future work? > >>>>>> > >>>>>>> If polling is used for both, the polling frequency should be different > >>>>>> between local and remote > >>>>>> configurations. > >>>>>> > >>>>>> Our idea is to introduce a new session option (like > >>>>>> 'sql-client.result.fetch-interval') to control > >>>>>> the fetching requests sending frequency. What do you think? > >>>>>> > >>>>>> For more information: we are inclined to keep the polling behavior in > >>>> this > >>>>>> version. For streaming > >>>>>> query, fetching results synchronously may occupy resources of the > >>>> gateway > >>>>>> in a long period. > >>>>>> For example, if the job doesn’t return results for a long time because > >>>> the > >>>>>> window has not been > >>>>>> triggered, the synchronously fetching will keep occupying the > >>>> connection. > >>>>>> In asynchronous > >>>>>> situation, the gateway can return a NOT_READY_RESULT quickly and > >>>>>> release > >>>>>> the resources > >>>>>> for other clients to use. I think we can make some improvements for the > >>>>>> whole flow path in the > >>>>>> future. > >>>>>> > >>>>>> 6. Sorry for that there is mistakes in this sentence. Let me make it > >>>> clear. > >>>>>> > >>>>>> We proposed to add 'ContentType' to indicates the result is for what > >>>> kind > >>>>>> of sql. In this sentence, > >>>>>> I want to explain why we add 'ContentType' since the ClientParser can > >>>>>> recognize the sql type too. > >>>>>> It is because the proposed ClientParser can't recognize complex syntax. > >>>>>> For example, it can’t > >>>>>> recognize query with CTE. So the result should carry content type > >>>>>> information to help the client to > >>>>>> know the sql type. For example, the 'ContentType.QUERY_RESULT' > >>>>>> indicates > >>>>>> the result is for a > >>>>>> query statement. > >>>>>> > >>>>>> 7. > >>>>>>> What is the serialization lifecycle for results? > >>>>>> > >>>>>> 1) Sink to JobManager : RowData -> Byte[ ] (serialize) > >>>>>> 2) JobManager to Gateway : Byte[ ] -> RowData (deserialize) > >>>>>> 3) Gateway sending : RowData -> Byte[ ] (serialized to JSON > >>>>>> format) > >>>>>> 4) Client receiving : Byte[ ] -> RowData (deserialize) > >>>>>> > >>>>>>> Maybe the SQL Gateway's return type can be sent as part of the request > >>>>>> so that the > >>>>>> JobManager can send back results in an advantageous format? > >>>>>> > >>>>>> Yes. I think it's an improvement for the Client and Gateway. We have > >>>> some > >>>>>> ideas. For example, > >>>>>> > >>>>>> 1) We can move the Gateway into the JobManager and reduce the Ser/De > >>>> costs > >>>>>> from JM to Gateway. > >>>>>> 2) Or the Gateway can collect the data from the sink function directly > >>>>>> instead of JobManager. > >>>>>> > >>>>>> But I think we can leave this as a future work and discuss in another > >>>>>> thread. > >>>>>> > >>>>>> 8. Yes. > >>>>>> > >>>>>>> Do you think it'll still be part of the Flink 1.17 release? > >>>>>> Yes. We will try our best to finish the work. > >>>>>> > >>>>>> Feel free to talk to me if I’m wrong or you have any other questions. > >>>>>> > >>>>>> > >>>>>>> 2022年11月25日 11:48,yu zelin <yuzelin....@gmail.com> 写道: > >>>>>>> > >>>>>>> Hi, all > >>>>>>> > >>>>>>> I want to initiate a discussion on the FLIP-275: Support Remote SQL > >>>>>> Client Based on SQL Gateway[1]. > >>>>>>> The motivation of this FLIP is that the current SQL Client allows only > >>>>>> local connection which can not satisfy > >>>>>>> the common need of connecting to a remote cluster. > >>>>>>> > >>>>>>> Since the FLIP-91[2] has introduced SQL Gateway, we proposed to > >>>>>> implement the Remote SQL Client > >>>>>>> based on SQL Gateway. In our design, we proposed two main changes: > >>>>>>> > >>>>>>> 1. New remote mode client which performs connection to the remote > >>>>>> gateway through REST API. > >>>>>>> 2. Migration of the current local mode client. We proposed to refactor > >>>>>> the local client based on SQL Gateway > >>>>>>> to unify the interface for two modes. > >>>>>>> > >>>>>>> Looking forward to your suggestions. > >>>>>>> > >>>>>>> Best, > >>>>>>> Yu Zelin > >>>>>>> > >>>>>>> [1] https://cwiki.apache.org/confluence/x/T48ODg > >>>>>>> [2] https://cwiki.apache.org/confluence/x/rIyMC > >>>>>> > >>>>>> > >>>> > >>>> > >> >