I'm +1 for either:
1. introduce a sql client specific option, or
2. Introduce a table config option and make it apply to both table module &
sql client.

It would be the FLIP owner's call to decide.

Best,
Kurt


On Mon, Mar 1, 2021 at 3:25 PM Timo Walther <twal...@apache.org> wrote:

> We could also think about reading this config option in Table API. The
> effect would be to call `await()` directly in an execute call. I could
> also imagine this to be useful esp. when you fire a lot of insert into
> queries. We had the case before that users where confused that the
> execution happens asynchronously, such an option could prevent this to
> happen again.
>
> Regards,
> Timo
>
> On 01.03.21 05:14, Kurt Young wrote:
> > I also asked some users about their opinion that if we introduce some
> > config prefixed with "table" but doesn't
> > have affection with methods in Table API and SQL. All of them are kind of
> > shocked by such question, asking
> > why we would do anything like this.
> >
> > This kind of reaction actually doesn't surprise me a lot, so I jumped in
> > and challenged this config option even
> > after the FLIP had already been accepted.
> >
> > If we only have to define the execution behavior for multiple statements
> in
> > SQL client, we should only introduce
> > a config option which would tell users it's affection scope by its name.
> > Prefixing with "table" is definitely not a good
> > idea here.
> >
> > Best,
> > Kurt
> >
> >
> > On Fri, Feb 26, 2021 at 9:39 PM Leonard Xu <xbjt...@gmail.com> wrote:
> >
> >> Hi, all
> >>
> >> Look like there’s only one divergence about option [ table | sql-client
> >> ].dml-sync in this thread, correct me if I’m wrong.
> >>
> >> 1. Leaving the context of this thread, from a user's perspective,
> >> the table.xx configurations should take effect in Table API & SQL,
> >> the sql-client.xx configurations should only take effect in sql-client.
> >>   In my(the user's) opinion, other explanations are counterintuitive.
> >>
> >> 2.  It should be pointed out that both all existed table.xx
> configurations
> >> like table.exec.state.ttl, table.optimizer.agg-phase-strategy,
> >> table.local-time-zone,etc..  and the proposed sql-client.xx
> configurations
> >> like sql-client.verbose, sql-client.execution.max-table-result.rows
> >> comply with this convention.
> >>
> >> 3. Considering the portability to support different CLI tools
> (sql-client,
> >> sql-gateway, etc.), I prefer table.dml-sync.
> >>
> >> In addition, I think sql-client/sql-gateway/other CLI tools can be
> placed
> >> out of flink-table module even in an external project, this should not
> >> affect our conclusion.
> >>
> >>
> >> Hope this can help you.
> >>
> >>
> >> Best,
> >> Leonard
> >>
> >>
> >>
> >>> 在 2021年2月25日,18:51,Shengkai Fang <fskm...@gmail.com> 写道:
> >>>
> >>> Hi, everyone.
> >>>
> >>> I do some summaries about the discussion about the option. If the
> summary
> >>> has errors, please correct me.
> >>>
> >>> `table.dml-sync`:
> >>> - take effect for `executeMultiSql` and sql client
> >>> - benefit: SQL script portability. One script for all platforms.
> >>> - drawback: Don't work for `TableEnvironment#executeSql`.
> >>>
> >>> `table.multi-dml-sync`:
> >>> - take effect for `executeMultiSql` and sql client
> >>> - benefit: SQL script portability
> >>> - drawback: It's confused when the sql script has one dml statement but
> >>> need to set option `table.multi-dml-sync`
> >>>
> >>> `client.dml-sync`:
> >>> - take effect for sql client only
> >>> - benefit: clear definition.
> >>> - drawback: Every platform needs to define its own option. Bad SQL
> script
> >>> portability.
> >>>
> >>> Just as Jark said, I think the `table.dml-sync` is a good choice if we
> >> can
> >>> extend its scope and make this option works for `executeSql`.
> >>> It's straightforward and users can use this option now in table api.
> The
> >>> drawback is the  `TableResult#await` plays the same role as the option.
> >> I
> >>> don't think the drawback is really critical because many systems have
> >>> commands play the same role with the different names.
> >>>
> >>> Best,
> >>> Shengkai
> >>>
> >>> Timo Walther <twal...@apache.org> 于2021年2月25日周四 下午4:23写道:
> >>>
> >>>> The `table.` prefix is meant to be a general option in the table
> >>>> ecosystem. Not necessarily attached to Table API or SQL Client. That's
> >>>> why SQL Client is also located in the `flink-table` module.
> >>>>
> >>>> My main concern is the SQL script portability. Declaring the
> sync/async
> >>>> behavior will happen in many SQL scripts. And users should be easily
> >>>> switch from SQL Client to some commercial product without the need of
> >>>> changing the script again.
> >>>>
> >>>> Sure, we can change from `sql-client.dml-sync` to `table.dml-sync`
> later
> >>>> but that would mean introducing future confusion. An app name (what
> >>>> `sql-client` kind of is) should not be part of a config option key if
> >>>> other apps will need the same kind of option.
> >>>>
> >>>> Regards,
> >>>> Timo
> >>>>
> >>>>
> >>>> On 24.02.21 08:59, Jark Wu wrote:
> >>>>>>  From my point of view, I also prefer "sql-client.dml-sync",
> >>>>> because the behavior of this configuration is very clear.
> >>>>> Even if we introduce a new config in the future, e.g.
> `table.dml-sync`,
> >>>>> we can also deprecate the sql-client one.
> >>>>>
> >>>>> Introducing a "table."  configuration without any implementation
> >>>>> will confuse users a lot, as they expect it should take effect on
> >>>>> the Table API.
> >>>>>
> >>>>> If we want to introduce an unified "table.dml-sync" option, I prefer
> >>>>> it should be implemented on Table API and affect all the DMLs on
> >>>>> Table API (`tEnv.executeSql`, `Table.executeInsert`, `StatementSet`),
> >>>>> as I have mentioned before [1].
> >>>>>
> >>>>>> It would be very straightforward that it affects all the DMLs on SQL
> >> CLI
> >>>>> and
> >>>>> TableEnvironment (including `executeSql`, `StatementSet`,
> >>>>> `Table#executeInsert`, etc.).
> >>>>> This can also make SQL CLI easy to support this configuration by
> >> passing
> >>>>> through to the TableEnv.
> >>>>>
> >>>>> Best,
> >>>>> Jark
> >>>>>
> >>>>>
> >>>>> [1]:
> >>>>>
> >>>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-163-SQL-Client-Improvements-tp48354p48665.html
> >>>>>
> >>>>>
> >>>>> On Wed, 24 Feb 2021 at 10:39, Kurt Young <ykt...@gmail.com> wrote:
> >>>>>
> >>>>>> If we all agree the option should only be handled by sql client,
> then
> >>>> why
> >>>>>> don't we
> >>>>>> just call it `sql-client.dml-sync`? As you said, calling it
> >>>>>> `table.dml-sync` but has no
> >>>>>> affection in `TableEnv.executeSql("INSERT INTO")` will also cause a
> >> big
> >>>>>> confusion for
> >>>>>> users.
> >>>>>>
> >>>>>> The only concern I saw is if we introduce
> >>>>>> "TableEnvironment.executeMultiSql()" in the
> >>>>>> future, how do we control the synchronization between statements?
> TBH
> >> I
> >>>>>> don't really
> >>>>>> see a strong requirement for such interfaces. Right now, we have a
> >>>> pretty
> >>>>>> clear semantic
> >>>>>> of `TableEnv.executeSql`, and it's very convenient for users if they
> >>>> want
> >>>>>> to execute multiple
> >>>>>> sql statements. They can simulate either synced or async execution
> >> with
> >>>>>> this building block.
> >>>>>>
> >>>>>> This will introduce slight overhead for users, but compared to the
> >>>>>> confusion we might
> >>>>>> cause if we introduce such a method of our own, I think it's better
> to
> >>>> wait
> >>>>>> for some more
> >>>>>> feedback.
> >>>>>>
> >>>>>> Best,
> >>>>>> Kurt
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Feb 23, 2021 at 9:45 PM Timo Walther <twal...@apache.org>
> >>>> wrote:
> >>>>>>
> >>>>>>> Hi Kurt,
> >>>>>>>
> >>>>>>> we can also shorten it to `table.dml-sync` if that would help. Then
> >> it
> >>>>>>> would confuse users that do a regular `.executeSql("INSERT INTO")`
> >> in a
> >>>>>>> notebook session.
> >>>>>>>
> >>>>>>> In any case users will need to learn the semantics of this option.
> >>>>>>> `table.multi-dml-sync` should be described as "If a you are in a
> >> multi
> >>>>>>> statement environment, execute DMLs synchrounous.". I don't have a
> >>>>>>> strong opinion on shortening it to `table.dml-sync`.
> >>>>>>>
> >>>>>>> Just to clarify the implementation: The option should be handled by
> >> the
> >>>>>>> SQL Client only, but the name can be shared accross platforms.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Timo
> >>>>>>>
> >>>>>>>
> >>>>>>> On 23.02.21 09:54, Kurt Young wrote:
> >>>>>>>> Sorry for the late reply, but I'm confused by
> >> `table.multi-dml-sync`.
> >>>>>>>>
> >>>>>>>> IIUC this config will take effect with 2 use cases:
> >>>>>>>> 1. SQL client, either interactive mode or executing multiple
> >>>> statements
> >>>>>>> via
> >>>>>>>> -f. In most cases,
> >>>>>>>> there will be only one INSERT INTO statement but we are
> controlling
> >>>> the
> >>>>>>>> sync/async behavior
> >>>>>>>> with "*multi-dml*-sync". I think this will confuse a lot of users.
> >>>>>>> Besides,
> >>>>>>>>
> >>>>>>>> 2. TableEnvironment#executeMultiSql(), but this is future work, we
> >> are
> >>>>>>> also
> >>>>>>>> not sure if we will
> >>>>>>>> really introduce this in the future.
> >>>>>>>>
> >>>>>>>> I would prefer to introduce this option for only sql client. For
> >>>>>>> platforms
> >>>>>>>> Timo mentioned which
> >>>>>>>> need to control such behavior, I think it's easy and flexible to
> >>>>>>> introduce
> >>>>>>>> one on their own.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Kurt
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Sat, Feb 20, 2021 at 10:23 AM Shengkai Fang <fskm...@gmail.com
> >
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi everyone.
> >>>>>>>>>
> >>>>>>>>> Sorry for the late response.
> >>>>>>>>>
> >>>>>>>>> For `execution.runtime-mode`, I think it's much better than
> >>>>>>>>> `table.execution.mode`. Thanks for Timo's suggestions!
> >>>>>>>>>
> >>>>>>>>> For `SHOW CREATE TABLE`, I'm +1 with Jark's comments. We should
> >>>>>> clarify
> >>>>>>> the
> >>>>>>>>> usage of the SHOW CREATE TABLE statements. It should be allowed
> to
> >>>>>>> specify
> >>>>>>>>> the table that is fully qualified and only works for the table
> that
> >>>> is
> >>>>>>>>> created by the sql statements.
> >>>>>>>>>
> >>>>>>>>> I have updated the FLIP with suggestions. It seems we have
> reached
> >> a
> >>>>>>>>> consensus, I'd like to start a formal vote for the FLIP.
> >>>>>>>>>
> >>>>>>>>> Please vote +1 to approve the FLIP, or -1 with a comment.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Shengkai
> >>>>>>>>>
> >>>>>>>>> Jark Wu <imj...@gmail.com> 于2021年2月15日周一 下午10:50写道:
> >>>>>>>>>
> >>>>>>>>>> Hi Ingo,
> >>>>>>>>>>
> >>>>>>>>>> 1) I think you are right, the table path should be
> >> fully-qualified.
> >>>>>>>>>>
> >>>>>>>>>> 2) I think this is also a good point. The SHOW CREATE TABLE
> >>>>>>>>>> only aims to print DDL for the tables registered using SQL
> CREATE
> >>>>>> TABLE
> >>>>>>>>>> DDL.
> >>>>>>>>>> If a table is registered using Table API,  e.g.
> >>>>>>>>>> `StreamTableEnvironment#createTemporaryView(String,
> DataStream)`,
> >>>>>>>>>> currently it's not possible to print DDL for such tables.
> >>>>>>>>>> I think we should point it out in the FLIP.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Jark
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 15 Feb 2021 at 21:33, Ingo Bürk <i...@ververica.com>
> >> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi all,
> >>>>>>>>>>>
> >>>>>>>>>>> I have a couple questions about the SHOW CREATE TABLE
> statement.
> >>>>>>>>>>>
> >>>>>>>>>>> 1) Contrary to the example in the FLIP I think the returned DDL
> >>>>>> should
> >>>>>>>>>>> always have the table identifier fully-qualified. Otherwise the
> >> DDL
> >>>>>>>>>> depends
> >>>>>>>>>>> on the current context (catalog/database), which could be
> >>>>>> surprising,
> >>>>>>>>>>> especially since "the same" table can behave differently if
> >> created
> >>>>>> in
> >>>>>>>>>>> different catalogs.
> >>>>>>>>>>> 2) How should this handle tables which cannot be fully
> >>>> characterized
> >>>>>>> by
> >>>>>>>>>>> properties only? I don't know if there's an example for this
> yet,
> >>>>>> but
> >>>>>>>>>>> hypothetically this is not currently a requirement, right? This
> >>>>>> isn't
> >>>>>>>>> as
> >>>>>>>>>>> much of a problem if this syntax is SQL-client-specific, but if
> >>>> it's
> >>>>>>>>>>> general Flink SQL syntax we should consider this (one way or
> >>>>>> another).
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Regards
> >>>>>>>>>>> Ingo
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Feb 12, 2021 at 3:53 PM Timo Walther <
> twal...@apache.org
> >>>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Shengkai,
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks for updating the FLIP.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I have one last comment for the option `table.execution.mode`.
> >>>>>> Should
> >>>>>>>>>> we
> >>>>>>>>>>>> already use the global Flink option `execution.runtime-mode`
> >>>>>> instead?
> >>>>>>>>>>>>
> >>>>>>>>>>>> We are using Flink's options where possible (e.g. `
> >> pipeline.name`
> >>>>>>>>> and
> >>>>>>>>>>>> `parallism.default`) why not also for batch/streaming mode?
> >>>>>>>>>>>>
> >>>>>>>>>>>> The description of the option matches to the Blink planner
> >>>>>> behavior:
> >>>>>>>>>>>>
> >>>>>>>>>>>> ```
> >>>>>>>>>>>> Among other things, this controls task scheduling, network
> >> shuffle
> >>>>>>>>>>>> behavior, and time semantics.
> >>>>>>>>>>>> ```
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Timo
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 10.02.21 06:30, Shengkai Fang wrote:
> >>>>>>>>>>>>> Hi, guys.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I have updated the FLIP.  It seems we have reached agreement.
> >>>>>> Maybe
> >>>>>>>>>> we
> >>>>>>>>>>>> can
> >>>>>>>>>>>>> start the vote soon. If anyone has other questions, please
> >> leave
> >>>>>>>>> your
> >>>>>>>>>>>>> comments.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Rui Li <lirui.fu...@gmail.com>于2021年2月9日 周二下午7:52写道:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi guys,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The conclusion sounds good to me.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Feb 9, 2021 at 5:39 PM Shengkai Fang <
> >> fskm...@gmail.com
> >>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi, Timo, Jark.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I am fine with the new option name.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Timo Walther <twal...@apache.org>于2021年2月9日 周二下午5:35写道:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Yes, `TableEnvironment#executeMultiSql()` can be future
> >> work.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> @Rui, Shengkai: Are you also fine with this conclusion?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 09.02.21 10:14, Jark Wu wrote:
> >>>>>>>>>>>>>>>>> I'm fine with `table.multi-dml-sync`.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> My previous concern about "multi" is that DML in CLI
> looks
> >>>>>> like
> >>>>>>>>>>>>>> single
> >>>>>>>>>>>>>>>>> statement.
> >>>>>>>>>>>>>>>>> But we can treat CLI as a multi-line accepting statements
> >>>> from
> >>>>>>>>>>>>>> opening
> >>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> closing.
> >>>>>>>>>>>>>>>>> Thus, I'm fine with `table.multi-dml-sync`.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> So the conclusion is `table.multi-dml-sync` (false by
> >>>>>> default),
> >>>>>>>>>> and
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>> will
> >>>>>>>>>>>>>>>>> support this config
> >>>>>>>>>>>>>>>>> in SQL CLI first, will support it in
> >>>>>>>>>>>>>> TableEnvironment#executeMultiSql()
> >>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> the future, right?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, 9 Feb 2021 at 16:37, Timo Walther <
> >>>> twal...@apache.org
> >>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I understand Rui's concerns. `table.dml-sync` should not
> >>>>>> apply
> >>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> regular `executeSql`. Actually, this option makes only
> >> sense
> >>>>>>>>>> when
> >>>>>>>>>>>>>>>>>> executing multi statements. Once we have a
> >>>>>>>>>>>>>>>>>> `TableEnvironment.executeMultiSql()` this config could
> be
> >>>>>>>>>>>>>> considered.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Maybe we can find a better generic name? Other platforms
> >>>> will
> >>>>>>>>>> also
> >>>>>>>>>>>>>>> need
> >>>>>>>>>>>>>>>>>> to have this config option, which is why I would like to
> >>>>>>>>> avoid a
> >>>>>>>>>>> SQL
> >>>>>>>>>>>>>>>>>> Client specific option. Otherwise every platform has to
> >> come
> >>>>>>>>> up
> >>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>> this important config option separately.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Maybe `table.multi-dml-sync` `table.multi-stmt-sync`? Or
> >>>>>> other
> >>>>>>>>>>>>>>> opinions?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 09.02.21 08:50, Shengkai Fang wrote:
> >>>>>>>>>>>>>>>>>>> Hi, all.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I think it may cause user confused. The main problem is
> >> we
> >>>>>>>>>> have
> >>>>>>>>>>> no
> >>>>>>>>>>>>>>>> means
> >>>>>>>>>>>>>>>>>>> to detect the conflict configuration, e.g. users set
> the
> >>>>>>>>> option
> >>>>>>>>>>>>>> true
> >>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>> use `TableResult#await` together.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>> Shengkai.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Best regards!
> >>>>>>>>>>>>>> Rui Li
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >>
> >
>
>

Reply via email to