Hi, everyone.

I do some summaries about the discussion about the option. If the summary
has errors, please correct me.

`table.dml-sync`:
- take effect for `executeMultiSql` and sql client
- benefit: SQL script portability. One script for all platforms.
- drawback: Don't work for `TableEnvironment#executeSql`.

`table.multi-dml-sync`:
- take effect for `executeMultiSql` and sql client
- benefit: SQL script portability
- drawback: It's confused when the sql script has one dml statement but
need to set option `table.multi-dml-sync`

`client.dml-sync`:
- take effect for sql client only
- benefit: clear definition.
- drawback: Every platform needs to define its own option. Bad SQL script
portability.

Just as Jark said, I think the `table.dml-sync` is a good choice if we can
extend its scope and make this option works for `executeSql`.
It's straightforward and users can use this option now in table api.  The
drawback is the  `TableResult#await` plays the same role as the option.  I
don't think the drawback is really critical because many systems have
commands play the same role with the different names.

Best,
Shengkai

Timo Walther <twal...@apache.org> 于2021年2月25日周四 下午4:23写道:

> The `table.` prefix is meant to be a general option in the table
> ecosystem. Not necessarily attached to Table API or SQL Client. That's
> why SQL Client is also located in the `flink-table` module.
>
> My main concern is the SQL script portability. Declaring the sync/async
> behavior will happen in many SQL scripts. And users should be easily
> switch from SQL Client to some commercial product without the need of
> changing the script again.
>
> Sure, we can change from `sql-client.dml-sync` to `table.dml-sync` later
> but that would mean introducing future confusion. An app name (what
> `sql-client` kind of is) should not be part of a config option key if
> other apps will need the same kind of option.
>
> Regards,
> Timo
>
>
> On 24.02.21 08:59, Jark Wu wrote:
> >>From my point of view, I also prefer "sql-client.dml-sync",
> > because the behavior of this configuration is very clear.
> > Even if we introduce a new config in the future, e.g. `table.dml-sync`,
> > we can also deprecate the sql-client one.
> >
> > Introducing a "table."  configuration without any implementation
> > will confuse users a lot, as they expect it should take effect on
> > the Table API.
> >
> > If we want to introduce an unified "table.dml-sync" option, I prefer
> > it should be implemented on Table API and affect all the DMLs on
> > Table API (`tEnv.executeSql`, `Table.executeInsert`, `StatementSet`),
> > as I have mentioned before [1].
> >
> >> It would be very straightforward that it affects all the DMLs on SQL CLI
> > and
> > TableEnvironment (including `executeSql`, `StatementSet`,
> > `Table#executeInsert`, etc.).
> > This can also make SQL CLI easy to support this configuration by passing
> > through to the TableEnv.
> >
> > Best,
> > Jark
> >
> >
> > [1]:
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-163-SQL-Client-Improvements-tp48354p48665.html
> >
> >
> > On Wed, 24 Feb 2021 at 10:39, Kurt Young <ykt...@gmail.com> wrote:
> >
> >> If we all agree the option should only be handled by sql client, then
> why
> >> don't we
> >> just call it `sql-client.dml-sync`? As you said, calling it
> >> `table.dml-sync` but has no
> >> affection in `TableEnv.executeSql("INSERT INTO")` will also cause a big
> >> confusion for
> >> users.
> >>
> >> The only concern I saw is if we introduce
> >> "TableEnvironment.executeMultiSql()" in the
> >> future, how do we control the synchronization between statements? TBH I
> >> don't really
> >> see a strong requirement for such interfaces. Right now, we have a
> pretty
> >> clear semantic
> >> of `TableEnv.executeSql`, and it's very convenient for users if they
> want
> >> to execute multiple
> >> sql statements. They can simulate either synced or async execution with
> >> this building block.
> >>
> >> This will introduce slight overhead for users, but compared to the
> >> confusion we might
> >> cause if we introduce such a method of our own, I think it's better to
> wait
> >> for some more
> >> feedback.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Tue, Feb 23, 2021 at 9:45 PM Timo Walther <twal...@apache.org>
> wrote:
> >>
> >>> Hi Kurt,
> >>>
> >>> we can also shorten it to `table.dml-sync` if that would help. Then it
> >>> would confuse users that do a regular `.executeSql("INSERT INTO")` in a
> >>> notebook session.
> >>>
> >>> In any case users will need to learn the semantics of this option.
> >>> `table.multi-dml-sync` should be described as "If a you are in a multi
> >>> statement environment, execute DMLs synchrounous.". I don't have a
> >>> strong opinion on shortening it to `table.dml-sync`.
> >>>
> >>> Just to clarify the implementation: The option should be handled by the
> >>> SQL Client only, but the name can be shared accross platforms.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>>
> >>> On 23.02.21 09:54, Kurt Young wrote:
> >>>> Sorry for the late reply, but I'm confused by `table.multi-dml-sync`.
> >>>>
> >>>> IIUC this config will take effect with 2 use cases:
> >>>> 1. SQL client, either interactive mode or executing multiple
> statements
> >>> via
> >>>> -f. In most cases,
> >>>> there will be only one INSERT INTO statement but we are controlling
> the
> >>>> sync/async behavior
> >>>> with "*multi-dml*-sync". I think this will confuse a lot of users.
> >>> Besides,
> >>>>
> >>>> 2. TableEnvironment#executeMultiSql(), but this is future work, we are
> >>> also
> >>>> not sure if we will
> >>>> really introduce this in the future.
> >>>>
> >>>> I would prefer to introduce this option for only sql client. For
> >>> platforms
> >>>> Timo mentioned which
> >>>> need to control such behavior, I think it's easy and flexible to
> >>> introduce
> >>>> one on their own.
> >>>>
> >>>> Best,
> >>>> Kurt
> >>>>
> >>>>
> >>>> On Sat, Feb 20, 2021 at 10:23 AM Shengkai Fang <fskm...@gmail.com>
> >>> wrote:
> >>>>
> >>>>> Hi everyone.
> >>>>>
> >>>>> Sorry for the late response.
> >>>>>
> >>>>> For `execution.runtime-mode`, I think it's much better than
> >>>>> `table.execution.mode`. Thanks for Timo's suggestions!
> >>>>>
> >>>>> For `SHOW CREATE TABLE`, I'm +1 with Jark's comments. We should
> >> clarify
> >>> the
> >>>>> usage of the SHOW CREATE TABLE statements. It should be allowed to
> >>> specify
> >>>>> the table that is fully qualified and only works for the table that
> is
> >>>>> created by the sql statements.
> >>>>>
> >>>>> I have updated the FLIP with suggestions. It seems we have reached a
> >>>>> consensus, I'd like to start a formal vote for the FLIP.
> >>>>>
> >>>>> Please vote +1 to approve the FLIP, or -1 with a comment.
> >>>>>
> >>>>> Best,
> >>>>> Shengkai
> >>>>>
> >>>>> Jark Wu <imj...@gmail.com> 于2021年2月15日周一 下午10:50写道:
> >>>>>
> >>>>>> Hi Ingo,
> >>>>>>
> >>>>>> 1) I think you are right, the table path should be fully-qualified.
> >>>>>>
> >>>>>> 2) I think this is also a good point. The SHOW CREATE TABLE
> >>>>>> only aims to print DDL for the tables registered using SQL CREATE
> >> TABLE
> >>>>>> DDL.
> >>>>>> If a table is registered using Table API,  e.g.
> >>>>>> `StreamTableEnvironment#createTemporaryView(String, DataStream)`,
> >>>>>> currently it's not possible to print DDL for such tables.
> >>>>>> I think we should point it out in the FLIP.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Mon, 15 Feb 2021 at 21:33, Ingo Bürk <i...@ververica.com> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I have a couple questions about the SHOW CREATE TABLE statement.
> >>>>>>>
> >>>>>>> 1) Contrary to the example in the FLIP I think the returned DDL
> >> should
> >>>>>>> always have the table identifier fully-qualified. Otherwise the DDL
> >>>>>> depends
> >>>>>>> on the current context (catalog/database), which could be
> >> surprising,
> >>>>>>> especially since "the same" table can behave differently if created
> >> in
> >>>>>>> different catalogs.
> >>>>>>> 2) How should this handle tables which cannot be fully
> characterized
> >>> by
> >>>>>>> properties only? I don't know if there's an example for this yet,
> >> but
> >>>>>>> hypothetically this is not currently a requirement, right? This
> >> isn't
> >>>>> as
> >>>>>>> much of a problem if this syntax is SQL-client-specific, but if
> it's
> >>>>>>> general Flink SQL syntax we should consider this (one way or
> >> another).
> >>>>>>>
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Ingo
> >>>>>>>
> >>>>>>> On Fri, Feb 12, 2021 at 3:53 PM Timo Walther <twal...@apache.org>
> >>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Shengkai,
> >>>>>>>>
> >>>>>>>> thanks for updating the FLIP.
> >>>>>>>>
> >>>>>>>> I have one last comment for the option `table.execution.mode`.
> >> Should
> >>>>>> we
> >>>>>>>> already use the global Flink option `execution.runtime-mode`
> >> instead?
> >>>>>>>>
> >>>>>>>> We are using Flink's options where possible (e.g. `pipeline.name`
> >>>>> and
> >>>>>>>> `parallism.default`) why not also for batch/streaming mode?
> >>>>>>>>
> >>>>>>>> The description of the option matches to the Blink planner
> >> behavior:
> >>>>>>>>
> >>>>>>>> ```
> >>>>>>>> Among other things, this controls task scheduling, network shuffle
> >>>>>>>> behavior, and time semantics.
> >>>>>>>> ```
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>> On 10.02.21 06:30, Shengkai Fang wrote:
> >>>>>>>>> Hi, guys.
> >>>>>>>>>
> >>>>>>>>> I have updated the FLIP.  It seems we have reached agreement.
> >> Maybe
> >>>>>> we
> >>>>>>>> can
> >>>>>>>>> start the vote soon. If anyone has other questions, please leave
> >>>>> your
> >>>>>>>>> comments.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Shengkai
> >>>>>>>>>
> >>>>>>>>> Rui Li <lirui.fu...@gmail.com>于2021年2月9日 周二下午7:52写道:
> >>>>>>>>>
> >>>>>>>>>> Hi guys,
> >>>>>>>>>>
> >>>>>>>>>> The conclusion sounds good to me.
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Feb 9, 2021 at 5:39 PM Shengkai Fang <fskm...@gmail.com
> >
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi, Timo, Jark.
> >>>>>>>>>>>
> >>>>>>>>>>> I am fine with the new option name.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Shengkai
> >>>>>>>>>>>
> >>>>>>>>>>> Timo Walther <twal...@apache.org>于2021年2月9日 周二下午5:35写道:
> >>>>>>>>>>>
> >>>>>>>>>>>> Yes, `TableEnvironment#executeMultiSql()` can be future work.
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Rui, Shengkai: Are you also fine with this conclusion?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Timo
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 09.02.21 10:14, Jark Wu wrote:
> >>>>>>>>>>>>> I'm fine with `table.multi-dml-sync`.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My previous concern about "multi" is that DML in CLI looks
> >> like
> >>>>>>>>>> single
> >>>>>>>>>>>>> statement.
> >>>>>>>>>>>>> But we can treat CLI as a multi-line accepting statements
> from
> >>>>>>>>>> opening
> >>>>>>>>>>> to
> >>>>>>>>>>>>> closing.
> >>>>>>>>>>>>> Thus, I'm fine with `table.multi-dml-sync`.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So the conclusion is `table.multi-dml-sync` (false by
> >> default),
> >>>>>> and
> >>>>>>>>>> we
> >>>>>>>>>>>> will
> >>>>>>>>>>>>> support this config
> >>>>>>>>>>>>> in SQL CLI first, will support it in
> >>>>>>>>>> TableEnvironment#executeMultiSql()
> >>>>>>>>>>>> in
> >>>>>>>>>>>>> the future, right?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, 9 Feb 2021 at 16:37, Timo Walther <
> twal...@apache.org
> >>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I understand Rui's concerns. `table.dml-sync` should not
> >> apply
> >>>>>> to
> >>>>>>>>>>>>>> regular `executeSql`. Actually, this option makes only sense
> >>>>>> when
> >>>>>>>>>>>>>> executing multi statements. Once we have a
> >>>>>>>>>>>>>> `TableEnvironment.executeMultiSql()` this config could be
> >>>>>>>>>> considered.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Maybe we can find a better generic name? Other platforms
> will
> >>>>>> also
> >>>>>>>>>>> need
> >>>>>>>>>>>>>> to have this config option, which is why I would like to
> >>>>> avoid a
> >>>>>>> SQL
> >>>>>>>>>>>>>> Client specific option. Otherwise every platform has to come
> >>>>> up
> >>>>>>> with
> >>>>>>>>>>>>>> this important config option separately.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Maybe `table.multi-dml-sync` `table.multi-stmt-sync`? Or
> >> other
> >>>>>>>>>>> opinions?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 09.02.21 08:50, Shengkai Fang wrote:
> >>>>>>>>>>>>>>> Hi, all.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think it may cause user confused. The main problem is  we
> >>>>>> have
> >>>>>>> no
> >>>>>>>>>>>> means
> >>>>>>>>>>>>>>> to detect the conflict configuration, e.g. users set the
> >>>>> option
> >>>>>>>>>> true
> >>>>>>>>>>>> and
> >>>>>>>>>>>>>>> use `TableResult#await` together.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Shengkai.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best regards!
> >>>>>>>>>> Rui Li
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>

Reply via email to