Hi, everyone. I do some summaries about the discussion about the option. If the summary has errors, please correct me.
`table.dml-sync`: - take effect for `executeMultiSql` and sql client - benefit: SQL script portability. One script for all platforms. - drawback: Don't work for `TableEnvironment#executeSql`. `table.multi-dml-sync`: - take effect for `executeMultiSql` and sql client - benefit: SQL script portability - drawback: It's confused when the sql script has one dml statement but need to set option `table.multi-dml-sync` `client.dml-sync`: - take effect for sql client only - benefit: clear definition. - drawback: Every platform needs to define its own option. Bad SQL script portability. Just as Jark said, I think the `table.dml-sync` is a good choice if we can extend its scope and make this option works for `executeSql`. It's straightforward and users can use this option now in table api. The drawback is the `TableResult#await` plays the same role as the option. I don't think the drawback is really critical because many systems have commands play the same role with the different names. Best, Shengkai Timo Walther <twal...@apache.org> 于2021年2月25日周四 下午4:23写道: > The `table.` prefix is meant to be a general option in the table > ecosystem. Not necessarily attached to Table API or SQL Client. That's > why SQL Client is also located in the `flink-table` module. > > My main concern is the SQL script portability. Declaring the sync/async > behavior will happen in many SQL scripts. And users should be easily > switch from SQL Client to some commercial product without the need of > changing the script again. > > Sure, we can change from `sql-client.dml-sync` to `table.dml-sync` later > but that would mean introducing future confusion. An app name (what > `sql-client` kind of is) should not be part of a config option key if > other apps will need the same kind of option. > > Regards, > Timo > > > On 24.02.21 08:59, Jark Wu wrote: > >>From my point of view, I also prefer "sql-client.dml-sync", > > because the behavior of this configuration is very clear. > > Even if we introduce a new config in the future, e.g. `table.dml-sync`, > > we can also deprecate the sql-client one. > > > > Introducing a "table." configuration without any implementation > > will confuse users a lot, as they expect it should take effect on > > the Table API. > > > > If we want to introduce an unified "table.dml-sync" option, I prefer > > it should be implemented on Table API and affect all the DMLs on > > Table API (`tEnv.executeSql`, `Table.executeInsert`, `StatementSet`), > > as I have mentioned before [1]. > > > >> It would be very straightforward that it affects all the DMLs on SQL CLI > > and > > TableEnvironment (including `executeSql`, `StatementSet`, > > `Table#executeInsert`, etc.). > > This can also make SQL CLI easy to support this configuration by passing > > through to the TableEnv. > > > > Best, > > Jark > > > > > > [1]: > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-163-SQL-Client-Improvements-tp48354p48665.html > > > > > > On Wed, 24 Feb 2021 at 10:39, Kurt Young <ykt...@gmail.com> wrote: > > > >> If we all agree the option should only be handled by sql client, then > why > >> don't we > >> just call it `sql-client.dml-sync`? As you said, calling it > >> `table.dml-sync` but has no > >> affection in `TableEnv.executeSql("INSERT INTO")` will also cause a big > >> confusion for > >> users. > >> > >> The only concern I saw is if we introduce > >> "TableEnvironment.executeMultiSql()" in the > >> future, how do we control the synchronization between statements? TBH I > >> don't really > >> see a strong requirement for such interfaces. Right now, we have a > pretty > >> clear semantic > >> of `TableEnv.executeSql`, and it's very convenient for users if they > want > >> to execute multiple > >> sql statements. They can simulate either synced or async execution with > >> this building block. > >> > >> This will introduce slight overhead for users, but compared to the > >> confusion we might > >> cause if we introduce such a method of our own, I think it's better to > wait > >> for some more > >> feedback. > >> > >> Best, > >> Kurt > >> > >> > >> On Tue, Feb 23, 2021 at 9:45 PM Timo Walther <twal...@apache.org> > wrote: > >> > >>> Hi Kurt, > >>> > >>> we can also shorten it to `table.dml-sync` if that would help. Then it > >>> would confuse users that do a regular `.executeSql("INSERT INTO")` in a > >>> notebook session. > >>> > >>> In any case users will need to learn the semantics of this option. > >>> `table.multi-dml-sync` should be described as "If a you are in a multi > >>> statement environment, execute DMLs synchrounous.". I don't have a > >>> strong opinion on shortening it to `table.dml-sync`. > >>> > >>> Just to clarify the implementation: The option should be handled by the > >>> SQL Client only, but the name can be shared accross platforms. > >>> > >>> Regards, > >>> Timo > >>> > >>> > >>> On 23.02.21 09:54, Kurt Young wrote: > >>>> Sorry for the late reply, but I'm confused by `table.multi-dml-sync`. > >>>> > >>>> IIUC this config will take effect with 2 use cases: > >>>> 1. SQL client, either interactive mode or executing multiple > statements > >>> via > >>>> -f. In most cases, > >>>> there will be only one INSERT INTO statement but we are controlling > the > >>>> sync/async behavior > >>>> with "*multi-dml*-sync". I think this will confuse a lot of users. > >>> Besides, > >>>> > >>>> 2. TableEnvironment#executeMultiSql(), but this is future work, we are > >>> also > >>>> not sure if we will > >>>> really introduce this in the future. > >>>> > >>>> I would prefer to introduce this option for only sql client. For > >>> platforms > >>>> Timo mentioned which > >>>> need to control such behavior, I think it's easy and flexible to > >>> introduce > >>>> one on their own. > >>>> > >>>> Best, > >>>> Kurt > >>>> > >>>> > >>>> On Sat, Feb 20, 2021 at 10:23 AM Shengkai Fang <fskm...@gmail.com> > >>> wrote: > >>>> > >>>>> Hi everyone. > >>>>> > >>>>> Sorry for the late response. > >>>>> > >>>>> For `execution.runtime-mode`, I think it's much better than > >>>>> `table.execution.mode`. Thanks for Timo's suggestions! > >>>>> > >>>>> For `SHOW CREATE TABLE`, I'm +1 with Jark's comments. We should > >> clarify > >>> the > >>>>> usage of the SHOW CREATE TABLE statements. It should be allowed to > >>> specify > >>>>> the table that is fully qualified and only works for the table that > is > >>>>> created by the sql statements. > >>>>> > >>>>> I have updated the FLIP with suggestions. It seems we have reached a > >>>>> consensus, I'd like to start a formal vote for the FLIP. > >>>>> > >>>>> Please vote +1 to approve the FLIP, or -1 with a comment. > >>>>> > >>>>> Best, > >>>>> Shengkai > >>>>> > >>>>> Jark Wu <imj...@gmail.com> 于2021年2月15日周一 下午10:50写道: > >>>>> > >>>>>> Hi Ingo, > >>>>>> > >>>>>> 1) I think you are right, the table path should be fully-qualified. > >>>>>> > >>>>>> 2) I think this is also a good point. The SHOW CREATE TABLE > >>>>>> only aims to print DDL for the tables registered using SQL CREATE > >> TABLE > >>>>>> DDL. > >>>>>> If a table is registered using Table API, e.g. > >>>>>> `StreamTableEnvironment#createTemporaryView(String, DataStream)`, > >>>>>> currently it's not possible to print DDL for such tables. > >>>>>> I think we should point it out in the FLIP. > >>>>>> > >>>>>> Best, > >>>>>> Jark > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Mon, 15 Feb 2021 at 21:33, Ingo Bürk <i...@ververica.com> wrote: > >>>>>> > >>>>>>> Hi all, > >>>>>>> > >>>>>>> I have a couple questions about the SHOW CREATE TABLE statement. > >>>>>>> > >>>>>>> 1) Contrary to the example in the FLIP I think the returned DDL > >> should > >>>>>>> always have the table identifier fully-qualified. Otherwise the DDL > >>>>>> depends > >>>>>>> on the current context (catalog/database), which could be > >> surprising, > >>>>>>> especially since "the same" table can behave differently if created > >> in > >>>>>>> different catalogs. > >>>>>>> 2) How should this handle tables which cannot be fully > characterized > >>> by > >>>>>>> properties only? I don't know if there's an example for this yet, > >> but > >>>>>>> hypothetically this is not currently a requirement, right? This > >> isn't > >>>>> as > >>>>>>> much of a problem if this syntax is SQL-client-specific, but if > it's > >>>>>>> general Flink SQL syntax we should consider this (one way or > >> another). > >>>>>>> > >>>>>>> > >>>>>>> Regards > >>>>>>> Ingo > >>>>>>> > >>>>>>> On Fri, Feb 12, 2021 at 3:53 PM Timo Walther <twal...@apache.org> > >>>>> wrote: > >>>>>>> > >>>>>>>> Hi Shengkai, > >>>>>>>> > >>>>>>>> thanks for updating the FLIP. > >>>>>>>> > >>>>>>>> I have one last comment for the option `table.execution.mode`. > >> Should > >>>>>> we > >>>>>>>> already use the global Flink option `execution.runtime-mode` > >> instead? > >>>>>>>> > >>>>>>>> We are using Flink's options where possible (e.g. `pipeline.name` > >>>>> and > >>>>>>>> `parallism.default`) why not also for batch/streaming mode? > >>>>>>>> > >>>>>>>> The description of the option matches to the Blink planner > >> behavior: > >>>>>>>> > >>>>>>>> ``` > >>>>>>>> Among other things, this controls task scheduling, network shuffle > >>>>>>>> behavior, and time semantics. > >>>>>>>> ``` > >>>>>>>> > >>>>>>>> Regards, > >>>>>>>> Timo > >>>>>>>> > >>>>>>>> On 10.02.21 06:30, Shengkai Fang wrote: > >>>>>>>>> Hi, guys. > >>>>>>>>> > >>>>>>>>> I have updated the FLIP. It seems we have reached agreement. > >> Maybe > >>>>>> we > >>>>>>>> can > >>>>>>>>> start the vote soon. If anyone has other questions, please leave > >>>>> your > >>>>>>>>> comments. > >>>>>>>>> > >>>>>>>>> Best, > >>>>>>>>> Shengkai > >>>>>>>>> > >>>>>>>>> Rui Li <lirui.fu...@gmail.com>于2021年2月9日 周二下午7:52写道: > >>>>>>>>> > >>>>>>>>>> Hi guys, > >>>>>>>>>> > >>>>>>>>>> The conclusion sounds good to me. > >>>>>>>>>> > >>>>>>>>>> On Tue, Feb 9, 2021 at 5:39 PM Shengkai Fang <fskm...@gmail.com > > > >>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi, Timo, Jark. > >>>>>>>>>>> > >>>>>>>>>>> I am fine with the new option name. > >>>>>>>>>>> > >>>>>>>>>>> Best, > >>>>>>>>>>> Shengkai > >>>>>>>>>>> > >>>>>>>>>>> Timo Walther <twal...@apache.org>于2021年2月9日 周二下午5:35写道: > >>>>>>>>>>> > >>>>>>>>>>>> Yes, `TableEnvironment#executeMultiSql()` can be future work. > >>>>>>>>>>>> > >>>>>>>>>>>> @Rui, Shengkai: Are you also fine with this conclusion? > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> Timo > >>>>>>>>>>>> > >>>>>>>>>>>> On 09.02.21 10:14, Jark Wu wrote: > >>>>>>>>>>>>> I'm fine with `table.multi-dml-sync`. > >>>>>>>>>>>>> > >>>>>>>>>>>>> My previous concern about "multi" is that DML in CLI looks > >> like > >>>>>>>>>> single > >>>>>>>>>>>>> statement. > >>>>>>>>>>>>> But we can treat CLI as a multi-line accepting statements > from > >>>>>>>>>> opening > >>>>>>>>>>> to > >>>>>>>>>>>>> closing. > >>>>>>>>>>>>> Thus, I'm fine with `table.multi-dml-sync`. > >>>>>>>>>>>>> > >>>>>>>>>>>>> So the conclusion is `table.multi-dml-sync` (false by > >> default), > >>>>>> and > >>>>>>>>>> we > >>>>>>>>>>>> will > >>>>>>>>>>>>> support this config > >>>>>>>>>>>>> in SQL CLI first, will support it in > >>>>>>>>>> TableEnvironment#executeMultiSql() > >>>>>>>>>>>> in > >>>>>>>>>>>>> the future, right? > >>>>>>>>>>>>> > >>>>>>>>>>>>> Best, > >>>>>>>>>>>>> Jark > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Tue, 9 Feb 2021 at 16:37, Timo Walther < > twal...@apache.org > >>> > >>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Hi everyone, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I understand Rui's concerns. `table.dml-sync` should not > >> apply > >>>>>> to > >>>>>>>>>>>>>> regular `executeSql`. Actually, this option makes only sense > >>>>>> when > >>>>>>>>>>>>>> executing multi statements. Once we have a > >>>>>>>>>>>>>> `TableEnvironment.executeMultiSql()` this config could be > >>>>>>>>>> considered. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Maybe we can find a better generic name? Other platforms > will > >>>>>> also > >>>>>>>>>>> need > >>>>>>>>>>>>>> to have this config option, which is why I would like to > >>>>> avoid a > >>>>>>> SQL > >>>>>>>>>>>>>> Client specific option. Otherwise every platform has to come > >>>>> up > >>>>>>> with > >>>>>>>>>>>>>> this important config option separately. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Maybe `table.multi-dml-sync` `table.multi-stmt-sync`? Or > >> other > >>>>>>>>>>> opinions? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Regards, > >>>>>>>>>>>>>> Timo > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 09.02.21 08:50, Shengkai Fang wrote: > >>>>>>>>>>>>>>> Hi, all. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I think it may cause user confused. The main problem is we > >>>>>> have > >>>>>>> no > >>>>>>>>>>>> means > >>>>>>>>>>>>>>> to detect the conflict configuration, e.g. users set the > >>>>> option > >>>>>>>>>> true > >>>>>>>>>>>> and > >>>>>>>>>>>>>>> use `TableResult#await` together. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>> Shengkai. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> Best regards! > >>>>>>>>>> Rui Li > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>> > >>> > >>> > >> > > > >