Hi, regarding the (un-)quoted question, compatibility is of course an important argument, but in terms of consistency I'd find it a bit surprising that WITH handles it differently than SET, and I wonder if that could cause friction for developers when writing their SQL.
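
To make the asymmetry concrete (`tEnv` is the usual TableEnvironment; the connector and its options are only an example):

// option keys in a WITH clause are quoted string literals today
tEnv.executeSql(
    "CREATE TABLE t (id INT) WITH ('connector' = 'datagen', 'rows-per-second' = '1')");

// whereas the proposed client command keeps the key unquoted:
//   SET table.exec.mini-batch.enabled = true;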

Regards
Ingo

On Thu, Feb 4, 2021 at 9:38 AM Jark Wu <imj...@gmail.com> wrote:
> Hi all,
>
> Regarding "One Parser", I think it's not possible for now because the Calcite parser
> can't parse special characters (e.g. "-") unless they are quoted as string literals.
> That's why the WITH option keys are string literals, not identifiers.
>
> SET table.exec.mini-batch.enabled = true and ADD JAR /local/my-home/test.jar
> have the same problem. That's why we propose two parsers: one splits lines into
> multiple statements and matches special commands through regexes, which is
> light-weight, and delegates the other statements to the other parser, which is the
> Calcite parser.
>
> Note: we should stick to the unquoted SET table.exec.mini-batch.enabled = true
> syntax, both for backward compatibility and ease of use, and all the other systems
> don't have quotes on the key.
>
> Regarding "table.planner" vs "sql-client.planner":
> if we want to use "table.planner", I think we should explain clearly in the
> documentation what its scope is. Otherwise, there will be users complaining why the
> planner doesn't change when setting the configuration on TableEnv. It would be
> better to throw an exception to indicate to users that it's not allowed to change
> the planner after the TableEnv has been initialized. However, it seems not easy to
> implement.
>
> Best,
> Jark
>
> On Thu, 4 Feb 2021 at 15:49, godfrey he <godfre...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > Regarding "table.planner" and "table.execution-mode":
> > If we define that those two options are just used to initialize the
> > TableEnvironment, +1 for introducing table options instead of sql-client options.
> >
> > Regarding "the sql client, we will maintain two parsers", I want to give more input:
> > We want to introduce a sql-gateway into the Flink project (see FLIP-24 & FLIP-91
> > for more info [1] [2]). In the "gateway" mode, the CLI client and the gateway
> > service will communicate through a REST API. The "ADD JAR /local/path/jar"
> > statement will be executed on the CLI client machine. So when we submit a sql file
> > which contains multiple statements, the CLI client needs to pick out the "ADD JAR"
> > lines, and the statements also need to be submitted or executed one by one to make
> > sure the result is correct. The sql file may look like:
> >
> > SET xxx=yyy;
> > create table my_table ...;
> > create table my_sink ...;
> > ADD JAR /local/path/jar1;
> > create function my_udf as com....MyUdf;
> > insert into my_sink select ..., my_udf(xx) from ...;
> > REMOVE JAR /local/path/jar1;
> > drop function my_udf;
> > ADD JAR /local/path/jar2;
> > create function my_udf as com....MyUdf2;
> > insert into my_sink select ..., my_udf(xx) from ...;
> >
> > The lines need to be split into multiple statements first in the CLI client.
> > There are two approaches:
> >
> > 1. The CLI client depends on the sql-parser: the sql-parser splits the lines and
> > tells which lines are "ADD JAR".
> > pro: there is only one parser.
> > cons: It's a little heavy that the CLI client depends on the sql-parser, because
> > the CLI client is just a simple tool which receives the user commands and displays
> > the result. The non-"ADD JAR" commands will be parsed twice.
> >
> > 2. The CLI client splits the lines into multiple statements and finds the ADD JAR
> > commands through regex matching (a rough sketch follows below).
> > pro: The CLI client is very light-weight.
> > cons: there are two parsers.
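> >
> > To make the regex idea a bit more concrete, a purely illustrative sketch using
> > java.util.regex (the pattern and the helpers splitStatements /
> > addJarToClientClassloader / executor are placeholders, not a concrete design):
> >
> > // inside the CLI client
> > Pattern addJar = Pattern.compile("^ADD\\s+JAR\\s+(\\S+)$", Pattern.CASE_INSENSITIVE);
> >
> > for (String statement : splitStatements(scriptContent)) {   // naive split on ';'
> >     Matcher matcher = addJar.matcher(statement.trim());
> >     if (matcher.matches()) {
> >         addJarToClientClassloader(matcher.group(1));  // handled locally by the client
> >     } else {
> >         executor.executeStatement(statement);         // delegated to the Calcite-based parser
> >     }
> > }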
> >
> > (personally, I prefer the second option)
> >
> > Regarding "SHOW or LIST JARS", I think we can support them both.
> > For the default dialect we support SHOW JARS, but if we switch to the hive
> > dialect, LIST JARS is also supported.
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client
> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
> >
> > Best,
> > Godfrey
> >
> > Rui Li <lirui.fu...@gmail.com> 于2021年2月4日周四 上午10:40写道:
> >
> > > Hi guys,
> > >
> > > Regarding #3 and #4, I agree SHOW JARS is more consistent with the other
> > > commands than LIST JARS. I don't have a strong opinion about REMOVE vs DELETE
> > > though.
> > >
> > > While flink doesn't need to follow hive syntax, as far as I know, most users who
> > > are requesting these features were previously hive users. So I wonder whether we
> > > can support both LIST/SHOW JARS and REMOVE/DELETE JARS as synonyms? It's just
> > > like lots of systems accept both EXIT and QUIT as the command to terminate the
> > > program. So if that's not hard to achieve, and will make users happier, I don't
> > > see a reason why we must choose one over the other.
> > >
> > > On Wed, Feb 3, 2021 at 10:33 PM Timo Walther <twal...@apache.org> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > some feedback regarding the open questions. Maybe we can discuss the
> > > > `TableEnvironment.executeMultiSql` story offline to determine how we proceed
> > > > with this in the near future.
> > > >
> > > > 1) "whether the table environment has the ability to update itself"
> > > >
> > > > Maybe there was some misunderstanding. I don't think that we should support
> > > > `tEnv.getConfig.getConfiguration.setString("table.planner", "old")`. Instead
> > > > I'm proposing to support `TableEnvironment.create(Configuration)` where
> > > > planner and execution mode are read immediately and subsequent changes to
> > > > these options will have no effect. We are doing it similarly in
> > > > `new StreamExecutionEnvironment(Configuration)`. These two ConfigOptions must
> > > > not be SQL Client specific but can be part of the core table code base. Many
> > > > users would like to get a 100% preconfigured environment from just a
> > > > Configuration. And this is not possible right now. We can solve both use
> > > > cases in one change.
> > > >
> > > > 2) "the sql client, we will maintain two parsers"
> > > >
> > > > I remember we had some discussion about this and decided that we would like
> > > > to maintain only one parser. In the end it is "One Flink SQL" where commands
> > > > influence each other also with respect to keywords. It should be fine to
> > > > include the SQL Client commands in the Flink parser. Of course the table
> > > > environment would not be able to handle the `Operation` instance that would
> > > > be the result, but we can introduce hooks to handle those `Operation`s. Or we
> > > > introduce parser extensions.
> > > >
> > > > Can we skip `table.job.async` in the first version? We should further discuss
> > > > whether we introduce a special SQL clause for wrapping async behavior or
> > > > whether we use a config option. Especially for streaming queries we need to
> > > > be careful and should force users to either "one INSERT INTO" or "one
> > > > STATEMENT SET".
> > > >
> > > > 3) 4) "HIVE also uses these commands"
> > > >
> > > > In general, Hive is not a good reference.
> > > > Aligning the commands more with the remaining commands should be our goal. We
> > > > just had a MODULE discussion where we selected SHOW instead of LIST. But it is
> > > > true that JARs are not part of the catalog, which is why I would not use
> > > > CREATE/DROP. ADD/REMOVE are commonly siblings in the English language. Take a
> > > > look at the Java collection API as another example.
> > > >
> > > > 6) "Most of the commands should belong to the table environment"
> > > >
> > > > Thanks for updating the FLIP, this makes things easier to understand. It is
> > > > good to see that most commands will be available in TableEnvironment. However,
> > > > I would also support SET and RESET for consistency. Again, from an
> > > > architectural point of view, if we would allow some kind of `Operation` hook
> > > > in the table environment, we could check for SQL Client specific options and
> > > > forward to the regular `TableConfig.getConfiguration` otherwise. What do you
> > > > think?
> > > >
> > > > Regards,
> > > > Timo
> > > >
> > > > On 03.02.21 08:58, Jark Wu wrote:
> > > > > Hi Timo,
> > > > >
> > > > > I will respond to some of the questions:
> > > > >
> > > > > 1) SQL client specific options
> > > > >
> > > > > Whether it starts with "table" or "sql-client" depends on where the
> > > > > configuration takes effect. If it is a table configuration, we should make
> > > > > clear what the behavior is when users change the configuration during the
> > > > > lifecycle of the TableEnvironment.
> > > > >
> > > > > I agree with Shengkai that `sql-client.planner` and
> > > > > `sql-client.execution.mode` are something special that can't be changed
> > > > > after the TableEnvironment has been initialized. You can see that
> > > > > `StreamExecutionEnvironment` provides a `configure()` method to override
> > > > > the configuration after the StreamExecutionEnvironment has been initialized.
> > > > >
> > > > > Therefore, I think it would be better to still use `sql-client.planner`
> > > > > and `sql-client.execution.mode`.
> > > > >
> > > > > 2) Execution file
> > > > >
> > > > > From my point of view, there is a big difference between
> > > > > `sql-client.job.detach` and `TableEnvironment.executeMultiSql()`:
> > > > > `sql-client.job.detach` will affect every single DML statement in the
> > > > > terminal, not only the statements in SQL files. I think the single DML
> > > > > statement in the interactive terminal is something like tEnv#executeSql()
> > > > > instead of tEnv#executeMultiSql. So I don't like the "multi" and "sql"
> > > > > keywords in `table.multi-sql-async`.
> > > > > I just found that the runtime provides a configuration called
> > > > > "execution.attached" [1], false by default, which specifies whether the
> > > > > pipeline is submitted in attached or detached mode. It provides exactly the
> > > > > same functionality as `sql-client.job.detach`. What do you think about
> > > > > using this option?
> > > > >
> > > > > If we also want to support this config in TableEnvironment, I think it
> > > > > should also affect the DML execution of `tEnv#executeSql()`, not only DMLs
> > > > > in `tEnv#executeMultiSql()`.
> > > > > Therefore, the behavior may look like this:
> > > > >
> > > > > val tableResult = tEnv.executeSql("INSERT INTO ...")   ==> async by default
> > > > > tableResult.await()                                    ==> manually block until finished
> > > > > tEnv.getConfig().getConfiguration().setString("execution.attached", "true")
> > > > > val tableResult2 = tEnv.executeSql("INSERT INTO ...")  ==> sync, no need to wait on the TableResult
> > > > > tEnv.executeMultiSql(
> > > > >   """
> > > > >   CREATE TABLE ....          ==> always sync
> > > > >   INSERT INTO ...            ==> sync, because we set the configuration above
> > > > >   SET execution.attached = false;
> > > > >   INSERT INTO ...            ==> async
> > > > >   """)
> > > > >
> > > > > On the other hand, I think `sql-client.job.detach` and
> > > > > `TableEnvironment.executeMultiSql()` should be two separate topics. As
> > > > > Shengkai mentioned above, the SQL CLI only depends on
> > > > > `TableEnvironment#executeSql()` to support multi-line statements. I'm fine
> > > > > with making `executeMultiSql()` clearer, but I don't want it to block this
> > > > > FLIP; maybe we can discuss this in another thread.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > [1]:
> > > > > https://ci.apache.org/projects/flink/flink-docs-master/deployment/config.html#execution-attached
> > > > >
> > > > > On Wed, 3 Feb 2021 at 15:33, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > >
> > > > >> Hi, Timo.
> > > > >> Thanks for your detailed feedback. I have some thoughts about it.
> > > > >>
> > > > >> *Regarding #1*: I think the main problem is whether the table environment
> > > > >> has the ability to update itself. Let's take a simple program as an example:
> > > > >>
> > > > >> ```
> > > > >> TableEnvironment tEnv = TableEnvironment.create(...);
> > > > >>
> > > > >> tEnv.getConfig.getConfiguration.setString("table.planner", "old");
> > > > >>
> > > > >> tEnv.executeSql("...");
> > > > >> ```
> > > > >>
> > > > >> If we regard this option as a table option, users don't have to create
> > > > >> another table environment manually. In that case, tEnv needs to check
> > > > >> whether the current mode and planner are the same as before whenever
> > > > >> executeSql or explainSql is called. I don't think that's easy work for the
> > > > >> table environment, especially if users have a StreamExecutionEnvironment
> > > > >> but set the old planner and batch mode. But when we make this option a sql
> > > > >> client option, users only use the SET command to change the setting, and we
> > > > >> can rebuild a new table environment when the SET succeeds.
> > > > >>
> > > > >> *Regarding #2*: I think we need to discuss the implementation before
> > > > >> continuing this topic. In the sql client, we will maintain two parsers. The
> > > > >> first parser (client parser) will only match the sql client commands. If
> > > > >> the client parser can't parse the statement, we will leverage the power of
> > > > >> the table environment to execute it. According to our blueprint,
> > > > >> TableEnvironment#executeSql is enough for the sql client. Therefore,
> > > > >> TableEnvironment#executeMultiSql is out of scope for this FLIP.
> > > > >>
> > > > >> But if we need to introduce `TableEnvironment.executeMultiSql` in the
> > > > >> future, I think it's OK to use the option `table.multi-sql-async` rather
> > > > >> than the option `sql-client.job.detach`. But we think that name is not
> > > > >> suitable because it is confusing for others: when setting the option to
> > > > >> false, we just mean it will block the execution of INSERT INTO statements,
> > > > >> not DDL or others (other sql statements are always executed synchronously).
> > > > >> So how about `table.job.async`? It only works for the sql-client and
> > > > >> executeMultiSql. If we set this value to false, the table environment will
> > > > >> not return the result until the job finishes.
> > > > >>
> > > > >> *Regarding #3, #4*: I still think we should use DELETE JAR and LIST JAR,
> > > > >> because HIVE also uses these commands to add a jar into the classpath or
> > > > >> delete a jar. If we use such commands, it can reduce our work for hive
> > > > >> compatibility.
> > > > >>
> > > > >> For SHOW JAR, I think the main concern is that the jars are not maintained
> > > > >> by the Catalog. If we really need to keep consistent with the SQL grammar,
> > > > >> maybe we should use
> > > > >>
> > > > >> `ADD JAR`    -> `CREATE JAR`,
> > > > >> `DELETE JAR` -> `DROP JAR`,
> > > > >> `LIST JAR`   -> `SHOW JAR`.
> > > > >>
> > > > >> *Regarding #5*: I agree with you that we'd better keep it consistent.
> > > > >>
> > > > >> *Regarding #6*: Yes. Most of the commands should belong to the table
> > > > >> environment. In the Summary section, I use the <NOTE> tag to identify which
> > > > >> commands should belong to the sql client and which commands should belong
> > > > >> to the table environment. I also added a new section about implementation
> > > > >> details to the FLIP.
> > > > >>
> > > > >> Best,
> > > > >> Shengkai
> > > > >>
> > > > >> Timo Walther <twal...@apache.org> 于2021年2月2日周二 下午6:43写道:
> > > > >>
> > > > >>> Thanks for this great proposal Shengkai. This will give the SQL Client a
> > > > >>> very good update and make it production ready.
> > > > >>>
> > > > >>> Here is some feedback from my side:
> > > > >>>
> > > > >>> 1) SQL client specific options
> > > > >>>
> > > > >>> I don't think that `sql-client.planner` and `sql-client.execution.mode`
> > > > >>> are SQL Client specific. Similar to `StreamExecutionEnvironment` and
> > > > >>> `ExecutionConfig#configure`, which have been added recently, we should
> > > > >>> offer such a possibility for TableEnvironment. How about we offer
> > > > >>> `TableEnvironment.create(ReadableConfig)` and add a `table.planner` and
> > > > >>> `table.execution-mode` to
> > > > >>> `org.apache.flink.table.api.config.TableConfigOptions`?
> > > > >>>
> > > > >>> 2) Execution file
> > > > >>>
> > > > >>> Did you have a look at the Appendix of FLIP-84 [1], including the mailing
> > > > >>> list thread at that time? Could you further elaborate how the
> > > > >>> multi-statement execution should work for a unified batch/streaming
> > > > >>> story? According to our past discussions, each line in an execution file
> > > > >>> should be executed blocking, which means a streaming query needs a
> > > > >>> statement set to execute multiple INSERT INTO statements, correct?
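> > > > >>>
> > > > >>> Just as a reminder, this is roughly what a statement set already looks
> > > > >>> like in the Table API today (`tEnv` and the source/sink names are only
> > > > >>> placeholders):
> > > > >>>
> > > > >>> StatementSet statementSet = tEnv.createStatementSet();
> > > > >>> statementSet.addInsertSql("INSERT INTO sink_a SELECT * FROM source_t");
> > > > >>> statementSet.addInsertSql("INSERT INTO sink_b SELECT * FROM source_t");
> > > > >>> // both INSERT INTO statements are bundled into a single job;
> > > > >>> // await() blocks until that job finishes
> > > > >>> statementSet.execute().await();
> > > > >>>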
> > > > >>> We should also offer this functionality in
> > > > >>> `TableEnvironment.executeMultiSql()`. Whether `sql-client.job.detach` is
> > > > >>> SQL Client specific needs to be determined; it could also be a general
> > > > >>> `table.multi-sql-async` option?
> > > > >>>
> > > > >>> 3) DELETE JAR
> > > > >>>
> > > > >>> Shouldn't the opposite of "ADD" be "REMOVE"? "DELETE" sounds like one is
> > > > >>> actively deleting the JAR in the corresponding path.
> > > > >>>
> > > > >>> 4) LIST JAR
> > > > >>>
> > > > >>> This should be `SHOW JARS` according to other SQL commands such as
> > > > >>> `SHOW CATALOGS`, `SHOW TABLES`, etc. [2].
> > > > >>>
> > > > >>> 5) EXPLAIN [ExplainDetail[, ExplainDetail]*]
> > > > >>>
> > > > >>> We should keep the details in sync with
> > > > >>> `org.apache.flink.table.api.ExplainDetail` and avoid confusion about
> > > > >>> differently named ExplainDetails. I would vote for `ESTIMATED_COST`
> > > > >>> instead of `COST`. I'm sure the original author had a reason to call it
> > > > >>> that way.
> > > > >>>
> > > > >>> 6) Implementation details
> > > > >>>
> > > > >>> It would be nice to understand how we plan to implement the given
> > > > >>> features. Most of the commands and config options should go into
> > > > >>> TableEnvironment and SqlParser directly, correct? This way users have a
> > > > >>> unified way of using Flink SQL. TableEnvironment would provide a similar
> > > > >>> user experience in notebooks or interactive programs as the SQL Client.
> > > > >>>
> > > > >>> [1]
> > > > >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> > > > >>> [2]
> > > > >>> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/show.html
> > > > >>>
> > > > >>> Regards,
> > > > >>> Timo
> > > > >>>
> > > > >>> On 02.02.21 10:13, Shengkai Fang wrote:
> > > > >>>> Sorry for the typo. I mean `RESET` is much better rather than `UNSET`.
> > > > >>>>
> > > > >>>> Shengkai Fang <fskm...@gmail.com> 于2021年2月2日周二 下午4:44写道:
> > > > >>>>
> > > > >>>>> Hi, Jingsong.
> > > > >>>>>
> > > > >>>>> Thanks for your reply. I think `UNSET` is much better.
> > > > >>>>>
> > > > >>>>> 1. We don't need to introduce another command `UNSET`. `RESET` is
> > > > >>>>> already supported in the current sql client. Our proposal just extends
> > > > >>>>> its grammar and allows users to reset specific keys.
> > > > >>>>> 2. Hive beeline also uses `RESET` to set a key back to its default
> > > > >>>>> value [1]. I think it is more friendly for batch users.
> > > > >>>>>
> > > > >>>>> Best,
> > > > >>>>> Shengkai
> > > > >>>>>
> > > > >>>>> [1]
> > > > >>>>> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> > > > >>>>>
> > > > >>>>> Jingsong Li <jingsongl...@gmail.com> 于2021年2月2日周二 下午1:56写道:
> > > > >>>>>
> > > > >>>>>> Thanks for the proposal; yes, the sql-client is too outdated. +1 for
> > > > >>>>>> improving it.
> > > > >>>>>>
> > > > >>>>>> About "SET" and "RESET", why not "SET" and "UNSET"?
> > > > >>>>>>
> > > > >>>>>> Best,
> > > > >>>>>> Jingsong
> > > > >>>>>>
> > > > >>>>>> On Mon, Feb 1, 2021 at 2:46 PM Rui Li <lirui.fu...@gmail.com> wrote:
> > > > >>>>>>
> > > > >>>>>>> Thanks Shengkai for the update! The proposed changes look good to me.
> > > > >>>>>>>
> > > > >>>>>>> On Fri, Jan 29, 2021 at 8:26 PM Shengkai Fang <fskm...@gmail.com> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi, Rui.
> > > > >>>>>>>> You are right. I have already modified the FLIP.
> > > > >>>>>>>>
> > > > >>>>>>>> The main changes:
> > > > >>>>>>>>
> > > > >>>>>>>> # The -f parameter has no restriction on the statement type.
> > > > >>>>>>>> Sometimes, users use a pipe to redirect the result of queries for
> > > > >>>>>>>> debugging when submitting a job with the -f parameter. It's much more
> > > > >>>>>>>> convenient compared to writing INSERT INTO statements.
> > > > >>>>>>>>
> > > > >>>>>>>> # Add a new sql client option `sql-client.job.detach`.
> > > > >>>>>>>> Users prefer to execute jobs one by one in batch mode. Users can set
> > > > >>>>>>>> this option to false and the client will not process the next job
> > > > >>>>>>>> until the current job finishes. By default, the client will execute
> > > > >>>>>>>> the next job as soon as the current job is submitted.
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Shengkai
> > > > >>>>>>>>
> > > > >>>>>>>> Rui Li <lirui.fu...@gmail.com> 于2021年1月29日周五 下午4:52写道:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Shengkai,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Regarding #2, maybe the -f options in flink and hive have different
> > > > >>>>>>>>> implications, and we should clarify the behavior. For example, if
> > > > >>>>>>>>> the client just submits the job and exits, what happens if the file
> > > > >>>>>>>>> contains two INSERT statements? I don't think we should treat them
> > > > >>>>>>>>> as a statement set, because users should explicitly write
> > > > >>>>>>>>> BEGIN STATEMENT SET in that case. And the client shouldn't
> > > > >>>>>>>>> asynchronously submit the two jobs, because the 2nd may depend on
> > > > >>>>>>>>> the 1st, right?
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Fri, Jan 29, 2021 at 4:30 PM Shengkai Fang <fskm...@gmail.com> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Hi Rui,
> > > > >>>>>>>>>> Thanks for your feedback. I agree with your suggestions.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> For suggestion 1: Yes, we plan to strengthen the SET command. In
> > > > >>>>>>>>>> the implementation, it will just put the key-value pair into the
> > > > >>>>>>>>>> `Configuration`, which will be used to generate the table config.
> > > > >>>>>>>>>> If hive supports reading the settings from the table config, users
> > > > >>>>>>>>>> are able to set the hive-related settings.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> For suggestion 2: The -f parameter will submit the job and exit.
> > > > >>>>>>>>>> If the queries never end, users have to cancel the jobs by
> > > > >>>>>>>>>> themselves, which is not reliable (people may forget their jobs).
> > > > >>>>>>>>>> In most cases, queries are used to analyze the data, so users
> > > > >>>>>>>>>> should use queries in the interactive mode.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Best,
> > > > >>>>>>>>>> Shengkai
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Rui Li <lirui.fu...@gmail.com> 于2021年1月29日周五 下午3:18写道:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Thanks Shengkai for bringing up this discussion. I think it
> > > > >>>>>>>>>>> covers a lot of useful features which will dramatically improve
> > > > >>>>>>>>>>> the usability of our SQL Client. I have two questions regarding
> > > > >>>>>>>>>>> the FLIP.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> 1. Do you think we can let users set arbitrary configurations via
> > > > >>>>>>>>>>> the SET command? A connector may have its own configurations and
> > > > >>>>>>>>>>> we don't have a way to dynamically change such configurations in
> > > > >>>>>>>>>>> SQL Client. For example, users may want to be able to change the
> > > > >>>>>>>>>>> hive conf when using the hive connector [1].
> > > > >>>>>>>>>>> 2. Any reason why we have to forbid queries in SQL files specified
> > > > >>>>>>>>>>> with the -f option? Hive supports a similar -f option but allows
> > > > >>>>>>>>>>> queries in the file. And a common use case is to run some query
> > > > >>>>>>>>>>> and redirect the results to a file. So I think maybe flink users
> > > > >>>>>>>>>>> would like to do the same, especially in batch scenarios.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-20590
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Fri, Jan 29, 2021 at 10:46 AM Sebastian Liu
> > > > >>>>>>>>>>> <liuyang0...@gmail.com> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi Shengkai,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Glad to see this improvement. And I have some additional
> > > > >>>>>>>>>>>> suggestions:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> #1. Unify the TableEnvironment in ExecutionContext to
> > > > >>>>>>>>>>>> StreamTableEnvironment for both streaming and batch sql.
> > > > >>>>>>>>>>>> #2. Improve the way of results retrieval: the sql client
> > > > >>>>>>>>>>>> currently collects the results locally all at once using
> > > > >>>>>>>>>>>> accumulators, which may cause memory issues in the JM or locally
> > > > >>>>>>>>>>>> for big query results. Accumulators are only suitable for
> > > > >>>>>>>>>>>> testing purposes. We may change to use SelectTableSink, which is
> > > > >>>>>>>>>>>> based on CollectSinkOperatorCoordinator (see the small sketch
> > > > >>>>>>>>>>>> after this list).
> > > > >>>>>>>>>>>> #3. Do we need to consider the Flink SQL gateway which is in
> > > > >>>>>>>>>>>> FLIP-91? It seems that this FLIP has not moved forward for a
> > > > >>>>>>>>>>>> long time. Providing a long-running service out of the box to
> > > > >>>>>>>>>>>> facilitate sql submission is necessary.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> What do you think of these?
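> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> For #2, roughly the shape I have in mind on the client side,
> > > > >>>>>>>>>>>> consuming rows incrementally instead of accumulating everything
> > > > >>>>>>>>>>>> (`tEnv` and `print` are just placeholders):
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> TableResult result = tEnv.executeSql("SELECT ...");
> > > > >>>>>>>>>>>> try (CloseableIterator<Row> rows = result.collect()) {
> > > > >>>>>>>>>>>>     while (rows.hasNext()) {
> > > > >>>>>>>>>>>>         print(rows.next());  // stream rows to the terminal as they arrive
> > > > >>>>>>>>>>>>     }
> > > > >>>>>>>>>>>> }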
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> [1]
> > > > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Shengkai Fang <fskm...@gmail.com> 于2021年1月28日周四 下午8:54写道:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Hi devs,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Jark and I want to start a discussion about FLIP-163: SQL
> > > > >>>>>>>>>>>>> Client Improvements.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Many users have complained about the problems of the sql
> > > > >>>>>>>>>>>>> client. For example, users can not register the tables proposed
> > > > >>>>>>>>>>>>> by FLIP-95.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> The main changes in this FLIP:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> - use the -i parameter to specify a sql file to initialize the
> > > > >>>>>>>>>>>>>   table environment and deprecate the YAML file;
> > > > >>>>>>>>>>>>> - add -f to submit a sql file and deprecate the '-u' parameter;
> > > > >>>>>>>>>>>>> - add more interactive commands, e.g. ADD JAR;
> > > > >>>>>>>>>>>>> - support the statement set syntax;
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> For more detailed changes, please refer to FLIP-163 [1].
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Look forward to your feedback.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>> Shengkai
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> [1]
> > > > >>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-163%3A+SQL+Client+Improvements
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> --
> > > > >>>>>>>>>>>> *With kind regards
> > > > >>>>>>>>>>>> ------------------------------------------------------------
> > > > >>>>>>>>>>>> Sebastian Liu 刘洋
> > > > >>>>>>>>>>>> Institute of Computing Technology, Chinese Academy of Science
> > > > >>>>>>>>>>>> Mobile\WeChat: +86—15201613655
> > > > >>>>>>>>>>>> E-mail: liuyang0...@gmail.com <liuyang0...@gmail.com>
> > > > >>>>>>>>>>>> QQ: 3239559*
> > >
> > > --
> > > Best regards!
> > > Rui Li