I'm +1 for either: 1. introduce a sql client specific option, or 2. Introduce a table config option and make it apply to both table module & sql client.
It would be the FLIP owner's call to decide. Best, Kurt On Mon, Mar 1, 2021 at 3:25 PM Timo Walther <twal...@apache.org> wrote: > We could also think about reading this config option in Table API. The > effect would be to call `await()` directly in an execute call. I could > also imagine this to be useful esp. when you fire a lot of insert into > queries. We had the case before that users where confused that the > execution happens asynchronously, such an option could prevent this to > happen again. > > Regards, > Timo > > On 01.03.21 05:14, Kurt Young wrote: > > I also asked some users about their opinion that if we introduce some > > config prefixed with "table" but doesn't > > have affection with methods in Table API and SQL. All of them are kind of > > shocked by such question, asking > > why we would do anything like this. > > > > This kind of reaction actually doesn't surprise me a lot, so I jumped in > > and challenged this config option even > > after the FLIP had already been accepted. > > > > If we only have to define the execution behavior for multiple statements > in > > SQL client, we should only introduce > > a config option which would tell users it's affection scope by its name. > > Prefixing with "table" is definitely not a good > > idea here. > > > > Best, > > Kurt > > > > > > On Fri, Feb 26, 2021 at 9:39 PM Leonard Xu <xbjt...@gmail.com> wrote: > > > >> Hi, all > >> > >> Look like there’s only one divergence about option [ table | sql-client > >> ].dml-sync in this thread, correct me if I’m wrong. > >> > >> 1. Leaving the context of this thread, from a user's perspective, > >> the table.xx configurations should take effect in Table API & SQL, > >> the sql-client.xx configurations should only take effect in sql-client. > >> In my(the user's) opinion, other explanations are counterintuitive. > >> > >> 2. It should be pointed out that both all existed table.xx > configurations > >> like table.exec.state.ttl, table.optimizer.agg-phase-strategy, > >> table.local-time-zone,etc.. and the proposed sql-client.xx > configurations > >> like sql-client.verbose, sql-client.execution.max-table-result.rows > >> comply with this convention. > >> > >> 3. Considering the portability to support different CLI tools > (sql-client, > >> sql-gateway, etc.), I prefer table.dml-sync. > >> > >> In addition, I think sql-client/sql-gateway/other CLI tools can be > placed > >> out of flink-table module even in an external project, this should not > >> affect our conclusion. > >> > >> > >> Hope this can help you. > >> > >> > >> Best, > >> Leonard > >> > >> > >> > >>> 在 2021年2月25日,18:51,Shengkai Fang <fskm...@gmail.com> 写道: > >>> > >>> Hi, everyone. > >>> > >>> I do some summaries about the discussion about the option. If the > summary > >>> has errors, please correct me. > >>> > >>> `table.dml-sync`: > >>> - take effect for `executeMultiSql` and sql client > >>> - benefit: SQL script portability. One script for all platforms. > >>> - drawback: Don't work for `TableEnvironment#executeSql`. > >>> > >>> `table.multi-dml-sync`: > >>> - take effect for `executeMultiSql` and sql client > >>> - benefit: SQL script portability > >>> - drawback: It's confused when the sql script has one dml statement but > >>> need to set option `table.multi-dml-sync` > >>> > >>> `client.dml-sync`: > >>> - take effect for sql client only > >>> - benefit: clear definition. > >>> - drawback: Every platform needs to define its own option. Bad SQL > script > >>> portability. > >>> > >>> Just as Jark said, I think the `table.dml-sync` is a good choice if we > >> can > >>> extend its scope and make this option works for `executeSql`. > >>> It's straightforward and users can use this option now in table api. > The > >>> drawback is the `TableResult#await` plays the same role as the option. > >> I > >>> don't think the drawback is really critical because many systems have > >>> commands play the same role with the different names. > >>> > >>> Best, > >>> Shengkai > >>> > >>> Timo Walther <twal...@apache.org> 于2021年2月25日周四 下午4:23写道: > >>> > >>>> The `table.` prefix is meant to be a general option in the table > >>>> ecosystem. Not necessarily attached to Table API or SQL Client. That's > >>>> why SQL Client is also located in the `flink-table` module. > >>>> > >>>> My main concern is the SQL script portability. Declaring the > sync/async > >>>> behavior will happen in many SQL scripts. And users should be easily > >>>> switch from SQL Client to some commercial product without the need of > >>>> changing the script again. > >>>> > >>>> Sure, we can change from `sql-client.dml-sync` to `table.dml-sync` > later > >>>> but that would mean introducing future confusion. An app name (what > >>>> `sql-client` kind of is) should not be part of a config option key if > >>>> other apps will need the same kind of option. > >>>> > >>>> Regards, > >>>> Timo > >>>> > >>>> > >>>> On 24.02.21 08:59, Jark Wu wrote: > >>>>>> From my point of view, I also prefer "sql-client.dml-sync", > >>>>> because the behavior of this configuration is very clear. > >>>>> Even if we introduce a new config in the future, e.g. > `table.dml-sync`, > >>>>> we can also deprecate the sql-client one. > >>>>> > >>>>> Introducing a "table." configuration without any implementation > >>>>> will confuse users a lot, as they expect it should take effect on > >>>>> the Table API. > >>>>> > >>>>> If we want to introduce an unified "table.dml-sync" option, I prefer > >>>>> it should be implemented on Table API and affect all the DMLs on > >>>>> Table API (`tEnv.executeSql`, `Table.executeInsert`, `StatementSet`), > >>>>> as I have mentioned before [1]. > >>>>> > >>>>>> It would be very straightforward that it affects all the DMLs on SQL > >> CLI > >>>>> and > >>>>> TableEnvironment (including `executeSql`, `StatementSet`, > >>>>> `Table#executeInsert`, etc.). > >>>>> This can also make SQL CLI easy to support this configuration by > >> passing > >>>>> through to the TableEnv. > >>>>> > >>>>> Best, > >>>>> Jark > >>>>> > >>>>> > >>>>> [1]: > >>>>> > >>>> > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-163-SQL-Client-Improvements-tp48354p48665.html > >>>>> > >>>>> > >>>>> On Wed, 24 Feb 2021 at 10:39, Kurt Young <ykt...@gmail.com> wrote: > >>>>> > >>>>>> If we all agree the option should only be handled by sql client, > then > >>>> why > >>>>>> don't we > >>>>>> just call it `sql-client.dml-sync`? As you said, calling it > >>>>>> `table.dml-sync` but has no > >>>>>> affection in `TableEnv.executeSql("INSERT INTO")` will also cause a > >> big > >>>>>> confusion for > >>>>>> users. > >>>>>> > >>>>>> The only concern I saw is if we introduce > >>>>>> "TableEnvironment.executeMultiSql()" in the > >>>>>> future, how do we control the synchronization between statements? > TBH > >> I > >>>>>> don't really > >>>>>> see a strong requirement for such interfaces. Right now, we have a > >>>> pretty > >>>>>> clear semantic > >>>>>> of `TableEnv.executeSql`, and it's very convenient for users if they > >>>> want > >>>>>> to execute multiple > >>>>>> sql statements. They can simulate either synced or async execution > >> with > >>>>>> this building block. > >>>>>> > >>>>>> This will introduce slight overhead for users, but compared to the > >>>>>> confusion we might > >>>>>> cause if we introduce such a method of our own, I think it's better > to > >>>> wait > >>>>>> for some more > >>>>>> feedback. > >>>>>> > >>>>>> Best, > >>>>>> Kurt > >>>>>> > >>>>>> > >>>>>> On Tue, Feb 23, 2021 at 9:45 PM Timo Walther <twal...@apache.org> > >>>> wrote: > >>>>>> > >>>>>>> Hi Kurt, > >>>>>>> > >>>>>>> we can also shorten it to `table.dml-sync` if that would help. Then > >> it > >>>>>>> would confuse users that do a regular `.executeSql("INSERT INTO")` > >> in a > >>>>>>> notebook session. > >>>>>>> > >>>>>>> In any case users will need to learn the semantics of this option. > >>>>>>> `table.multi-dml-sync` should be described as "If a you are in a > >> multi > >>>>>>> statement environment, execute DMLs synchrounous.". I don't have a > >>>>>>> strong opinion on shortening it to `table.dml-sync`. > >>>>>>> > >>>>>>> Just to clarify the implementation: The option should be handled by > >> the > >>>>>>> SQL Client only, but the name can be shared accross platforms. > >>>>>>> > >>>>>>> Regards, > >>>>>>> Timo > >>>>>>> > >>>>>>> > >>>>>>> On 23.02.21 09:54, Kurt Young wrote: > >>>>>>>> Sorry for the late reply, but I'm confused by > >> `table.multi-dml-sync`. > >>>>>>>> > >>>>>>>> IIUC this config will take effect with 2 use cases: > >>>>>>>> 1. SQL client, either interactive mode or executing multiple > >>>> statements > >>>>>>> via > >>>>>>>> -f. In most cases, > >>>>>>>> there will be only one INSERT INTO statement but we are > controlling > >>>> the > >>>>>>>> sync/async behavior > >>>>>>>> with "*multi-dml*-sync". I think this will confuse a lot of users. > >>>>>>> Besides, > >>>>>>>> > >>>>>>>> 2. TableEnvironment#executeMultiSql(), but this is future work, we > >> are > >>>>>>> also > >>>>>>>> not sure if we will > >>>>>>>> really introduce this in the future. > >>>>>>>> > >>>>>>>> I would prefer to introduce this option for only sql client. For > >>>>>>> platforms > >>>>>>>> Timo mentioned which > >>>>>>>> need to control such behavior, I think it's easy and flexible to > >>>>>>> introduce > >>>>>>>> one on their own. > >>>>>>>> > >>>>>>>> Best, > >>>>>>>> Kurt > >>>>>>>> > >>>>>>>> > >>>>>>>> On Sat, Feb 20, 2021 at 10:23 AM Shengkai Fang <fskm...@gmail.com > > > >>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Hi everyone. > >>>>>>>>> > >>>>>>>>> Sorry for the late response. > >>>>>>>>> > >>>>>>>>> For `execution.runtime-mode`, I think it's much better than > >>>>>>>>> `table.execution.mode`. Thanks for Timo's suggestions! > >>>>>>>>> > >>>>>>>>> For `SHOW CREATE TABLE`, I'm +1 with Jark's comments. We should > >>>>>> clarify > >>>>>>> the > >>>>>>>>> usage of the SHOW CREATE TABLE statements. It should be allowed > to > >>>>>>> specify > >>>>>>>>> the table that is fully qualified and only works for the table > that > >>>> is > >>>>>>>>> created by the sql statements. > >>>>>>>>> > >>>>>>>>> I have updated the FLIP with suggestions. It seems we have > reached > >> a > >>>>>>>>> consensus, I'd like to start a formal vote for the FLIP. > >>>>>>>>> > >>>>>>>>> Please vote +1 to approve the FLIP, or -1 with a comment. > >>>>>>>>> > >>>>>>>>> Best, > >>>>>>>>> Shengkai > >>>>>>>>> > >>>>>>>>> Jark Wu <imj...@gmail.com> 于2021年2月15日周一 下午10:50写道: > >>>>>>>>> > >>>>>>>>>> Hi Ingo, > >>>>>>>>>> > >>>>>>>>>> 1) I think you are right, the table path should be > >> fully-qualified. > >>>>>>>>>> > >>>>>>>>>> 2) I think this is also a good point. The SHOW CREATE TABLE > >>>>>>>>>> only aims to print DDL for the tables registered using SQL > CREATE > >>>>>> TABLE > >>>>>>>>>> DDL. > >>>>>>>>>> If a table is registered using Table API, e.g. > >>>>>>>>>> `StreamTableEnvironment#createTemporaryView(String, > DataStream)`, > >>>>>>>>>> currently it's not possible to print DDL for such tables. > >>>>>>>>>> I think we should point it out in the FLIP. > >>>>>>>>>> > >>>>>>>>>> Best, > >>>>>>>>>> Jark > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Mon, 15 Feb 2021 at 21:33, Ingo Bürk <i...@ververica.com> > >> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi all, > >>>>>>>>>>> > >>>>>>>>>>> I have a couple questions about the SHOW CREATE TABLE > statement. > >>>>>>>>>>> > >>>>>>>>>>> 1) Contrary to the example in the FLIP I think the returned DDL > >>>>>> should > >>>>>>>>>>> always have the table identifier fully-qualified. Otherwise the > >> DDL > >>>>>>>>>> depends > >>>>>>>>>>> on the current context (catalog/database), which could be > >>>>>> surprising, > >>>>>>>>>>> especially since "the same" table can behave differently if > >> created > >>>>>> in > >>>>>>>>>>> different catalogs. > >>>>>>>>>>> 2) How should this handle tables which cannot be fully > >>>> characterized > >>>>>>> by > >>>>>>>>>>> properties only? I don't know if there's an example for this > yet, > >>>>>> but > >>>>>>>>>>> hypothetically this is not currently a requirement, right? This > >>>>>> isn't > >>>>>>>>> as > >>>>>>>>>>> much of a problem if this syntax is SQL-client-specific, but if > >>>> it's > >>>>>>>>>>> general Flink SQL syntax we should consider this (one way or > >>>>>> another). > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Regards > >>>>>>>>>>> Ingo > >>>>>>>>>>> > >>>>>>>>>>> On Fri, Feb 12, 2021 at 3:53 PM Timo Walther < > twal...@apache.org > >>> > >>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hi Shengkai, > >>>>>>>>>>>> > >>>>>>>>>>>> thanks for updating the FLIP. > >>>>>>>>>>>> > >>>>>>>>>>>> I have one last comment for the option `table.execution.mode`. > >>>>>> Should > >>>>>>>>>> we > >>>>>>>>>>>> already use the global Flink option `execution.runtime-mode` > >>>>>> instead? > >>>>>>>>>>>> > >>>>>>>>>>>> We are using Flink's options where possible (e.g. ` > >> pipeline.name` > >>>>>>>>> and > >>>>>>>>>>>> `parallism.default`) why not also for batch/streaming mode? > >>>>>>>>>>>> > >>>>>>>>>>>> The description of the option matches to the Blink planner > >>>>>> behavior: > >>>>>>>>>>>> > >>>>>>>>>>>> ``` > >>>>>>>>>>>> Among other things, this controls task scheduling, network > >> shuffle > >>>>>>>>>>>> behavior, and time semantics. > >>>>>>>>>>>> ``` > >>>>>>>>>>>> > >>>>>>>>>>>> Regards, > >>>>>>>>>>>> Timo > >>>>>>>>>>>> > >>>>>>>>>>>> On 10.02.21 06:30, Shengkai Fang wrote: > >>>>>>>>>>>>> Hi, guys. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I have updated the FLIP. It seems we have reached agreement. > >>>>>> Maybe > >>>>>>>>>> we > >>>>>>>>>>>> can > >>>>>>>>>>>>> start the vote soon. If anyone has other questions, please > >> leave > >>>>>>>>> your > >>>>>>>>>>>>> comments. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Best, > >>>>>>>>>>>>> Shengkai > >>>>>>>>>>>>> > >>>>>>>>>>>>> Rui Li <lirui.fu...@gmail.com>于2021年2月9日 周二下午7:52写道: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Hi guys, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> The conclusion sounds good to me. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Tue, Feb 9, 2021 at 5:39 PM Shengkai Fang < > >> fskm...@gmail.com > >>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi, Timo, Jark. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I am fine with the new option name. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>> Shengkai > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Timo Walther <twal...@apache.org>于2021年2月9日 周二下午5:35写道: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Yes, `TableEnvironment#executeMultiSql()` can be future > >> work. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> @Rui, Shengkai: Are you also fine with this conclusion? > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On 09.02.21 10:14, Jark Wu wrote: > >>>>>>>>>>>>>>>>> I'm fine with `table.multi-dml-sync`. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> My previous concern about "multi" is that DML in CLI > looks > >>>>>> like > >>>>>>>>>>>>>> single > >>>>>>>>>>>>>>>>> statement. > >>>>>>>>>>>>>>>>> But we can treat CLI as a multi-line accepting statements > >>>> from > >>>>>>>>>>>>>> opening > >>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>> closing. > >>>>>>>>>>>>>>>>> Thus, I'm fine with `table.multi-dml-sync`. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> So the conclusion is `table.multi-dml-sync` (false by > >>>>>> default), > >>>>>>>>>> and > >>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>> support this config > >>>>>>>>>>>>>>>>> in SQL CLI first, will support it in > >>>>>>>>>>>>>> TableEnvironment#executeMultiSql() > >>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>> the future, right? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>> Jark > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Tue, 9 Feb 2021 at 16:37, Timo Walther < > >>>> twal...@apache.org > >>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hi everyone, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I understand Rui's concerns. `table.dml-sync` should not > >>>>>> apply > >>>>>>>>>> to > >>>>>>>>>>>>>>>>>> regular `executeSql`. Actually, this option makes only > >> sense > >>>>>>>>>> when > >>>>>>>>>>>>>>>>>> executing multi statements. Once we have a > >>>>>>>>>>>>>>>>>> `TableEnvironment.executeMultiSql()` this config could > be > >>>>>>>>>>>>>> considered. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Maybe we can find a better generic name? Other platforms > >>>> will > >>>>>>>>>> also > >>>>>>>>>>>>>>> need > >>>>>>>>>>>>>>>>>> to have this config option, which is why I would like to > >>>>>>>>> avoid a > >>>>>>>>>>> SQL > >>>>>>>>>>>>>>>>>> Client specific option. Otherwise every platform has to > >> come > >>>>>>>>> up > >>>>>>>>>>> with > >>>>>>>>>>>>>>>>>> this important config option separately. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Maybe `table.multi-dml-sync` `table.multi-stmt-sync`? Or > >>>>>> other > >>>>>>>>>>>>>>> opinions? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Regards, > >>>>>>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On 09.02.21 08:50, Shengkai Fang wrote: > >>>>>>>>>>>>>>>>>>> Hi, all. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I think it may cause user confused. The main problem is > >> we > >>>>>>>>>> have > >>>>>>>>>>> no > >>>>>>>>>>>>>>>> means > >>>>>>>>>>>>>>>>>>> to detect the conflict configuration, e.g. users set > the > >>>>>>>>> option > >>>>>>>>>>>>>> true > >>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>> use `TableResult#await` together. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>> Shengkai. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> Best regards! > >>>>>>>>>>>>>> Rui Li > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>> > >>>> > >> > >> > > > >