Re: [DISCUSS] FLIP-84 Feedback Summary

Fabian Hueske Wed, 29 Apr 2020 01:57:22 -0700

Hi Godfrey,

Thanks for starting this discussion!


In my mind, WATERMARK is a property (or constraint) of a field, just like
PRIMARY KEY.
Take this example from MySQL:

mysql> CREATE TABLE people (id INT NOT NULL, name VARCHAR(128) NOT NULL,
age INT, PRIMARY KEY (id));
Query OK, 0 rows affected (0.06 sec)

mysql> describe people;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id    | int          | NO   | PRI | NULL    |       |
| name  | varchar(128) | NO   |     | NULL    |       |
| age   | int          | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
3 rows in set (0.01 sec)

Here, PRIMARY KEY is marked in the Key column of the id field.
We could do the same for watermarks by adding a Watermark column.
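
To make that concrete, here is a minimal sketch of such a result. It is plain
Java, not Flink code: the class `DescribeSketch`, its `describe` method, and
the example fields and watermark expression are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
class DescribeSketch {

    /** Builds one {name, type, watermark} row per field. */
    static List<String[]> describe(Map<String, String> fields,
                                   String watermarkField,
                                   String watermarkExpr) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // The watermark expression is shown on the field it is declared for;
            // every other field carries null in that column.
            String watermark =
                field.getKey().equals(watermarkField) ? watermarkExpr : null;
            rows.add(new String[] {field.getKey(), field.getValue(), watermark});
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "BIGINT");
        fields.put("ts", "TIMESTAMP(3)");
        for (String[] row : describe(fields, "ts", "ts - INTERVAL '5' SECOND")) {
            System.out.println(
                String.join(" | ", row[0], row[1], String.valueOf(row[2])));
        }
    }
}
```

With this layout the watermark shows up in the row of the field it is declared
on, much like PRI shows up in the Key column above.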

Best, Fabian


On Wed, Apr 29, 2020 at 10:43 AM godfrey he <godfre...@gmail.com> wrote:

> Hi everyone,
>
> I would like to start a discussion about the result type of the `describe`
> statement, which is introduced in FLIP-84 [1].
> In the previous version, we defined the result of a `describe` statement as
> a single column, as follows:
>
> Statement:      DESCRIBE xx
> Result Schema:  field name: result
>                 field type: VARCHAR(n) (n is the max length of values)
> Result Value:   describe the detail of an object (single row)
> Result Kind:    SUCCESS_WITH_CONTENT
> Examples:       DESCRIBE table_name
>
> For "describe table_name", the result value is the `toString` value of
> `TableSchema`, which is unstructured data.
> It is hard for users to use this information.
>
> For example:
>
> TableSchema schema = TableSchema.builder()
>    .field("f0", DataTypes.BIGINT())
>    .field("f1", DataTypes.ROW(
>       DataTypes.FIELD("q1", DataTypes.STRING()),
>       DataTypes.FIELD("q2", DataTypes.TIMESTAMP(3))))
>    .field("f2", DataTypes.STRING())
>    .field("f3", DataTypes.BIGINT(), "f0 + 1")
>    .watermark("f1.q2", WATERMARK_EXPRESSION, WATERMARK_DATATYPE)
>    .build();
>
> its `toString` value is:
> root
>  |-- f0: BIGINT
>  |-- f1: ROW<`q1` STRING, `q2` TIMESTAMP(3)>
>  |-- f2: STRING
>  |-- f3: BIGINT AS f0 + 1
>  |-- WATERMARK FOR f1.q2 AS now()
>
> For Hive, MySQL, etc., the describe result is in table form, including field
> names and field types, which is more familiar to users.
> TableSchema [2] also has watermark expressions and computed columns, so we
> should put them into the table as well:
> a computed column is column-level information, so we add a new column named
> `expr`;
> a watermark expression is table-level information, so we add a special row
> named `WATERMARK` to represent it.
>
> For the example above, the result will look like this:
>
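> +-----------+-------------------------------------+----------------+
> | name      | type                                | expr           |
> +-----------+-------------------------------------+----------------+
> | f0        | BIGINT                              | (NULL)         |
> | f1        | ROW<`q1` STRING, `q2` TIMESTAMP(3)> | (NULL)         |
> | f2        | STRING                              | (NULL)         |
> | f3        | BIGINT                              | f0 + 1         |
> | WATERMARK | (NULL)                              | f1.q2 AS now() |
> +-----------+-------------------------------------+----------------+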
>
> There is now a PR for FLINK-17112 [3] that implements the DESCRIBE statement.
>
> What do you think about this update?
> Any feedback is welcome~
>
> Best,
> Godfrey
>
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> [2] https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/api/TableSchema.java
> [3] https://github.com/apache/flink/pull/11892
>
>
> On Mon, Apr 6, 2020 at 10:38 PM godfrey he <godfre...@gmail.com> wrote:
>
> > Hi Timo,
> >
> > Sorry for the late reply, and thanks for your correction.
> > I missed DQL in the job submission scenario.
> > I'll fix the document right away.
> >
> > Best,
> > Godfrey
> >
> > On Fri, Apr 3, 2020 at 9:53 PM Timo Walther <twal...@apache.org> wrote:
> >
> >> Hi Godfrey,
> >>
> >> I'm sorry to jump in again but I still need to clarify some things
> >> around TableResult.
> >>
> >> The FLIP says:
> >> "For DML, this method returns TableResult until the job is submitted.
> >> For other statements, TableResult is returned until the execution is
> >> finished."
> >>
> >> I thought we agreed on making every execution async? This also means
> >> returning a TableResult for DQLs even though the execution is not done
> >> yet. People need access to the JobClient also for batch jobs in order to
> >> cancel long lasting queries. If people want to wait for the completion
> >> they can hook into JobClient or collect().
> >>
> >> Can we rephrase this part to:
> >>
> >> "For DML and DQL, this method returns TableResult once the job has been
> >> submitted. For DDL and DCL statements, TableResult is returned once the
> >> operation has finished."
> >>
> >> Regards,
> >> Timo
> >>
> >>
> >> On 02.04.20 05:27, godfrey he wrote:
> >> > Hi Aljoscha, Dawid, Timo,
> >> >
> >> > Thanks so much for the detailed explanation.
> >> > I agree that the multiline story is not complete yet, and we can
> >> > continue the discussion.
> >> > I will add current discussions and conclusions to the FLIP.
> >> >
> >> > Best,
> >> > Godfrey
> >> >
> >> >
> >> >
> >> > On Wed, Apr 1, 2020 at 11:27 PM Timo Walther <twal...@apache.org> wrote:
> >> >
> >> >> Hi Godfrey,
> >> >>
> >> >> first of all, I agree with Dawid. The multiline story is not completed
> >> >> by this FLIP. It just verifies the big picture.
> >> >>
> >> >> 1. "control the execution logic through the proposed method if they
> >> >> know what the statements are"
> >> >>
> >> >> This is a good point that Fabian also raised in the linked Google doc.
> >> >> I could also imagine returning a more complex POJO when calling
> >> >> `executeMultiSql()`.
> >> >>
> >> >> The POJO would include some `getSqlProperties()` such that a platform
> >> >> gets insights into the query before executing it. We could also trigger
> >> >> the execution more explicitly instead of hiding it behind an iterator.
> >> >>
> >> >> 2. "there are some special commands introduced in SQL client"
> >> >>
> >> >> For platform-specific and SQL Client-specific commands, we could offer a
> >> >> hook to the parser or a fallback parser in case the regular table
> >> >> environment parser cannot deal with the statement.
> >> >>
> >> >> However, all of that is future work and can be discussed in a separate
> >> >> FLIP.
> >> >>
> >> >> 3. +1 for the `Iterator` instead of `Iterable`.
> >> >>
> >> >> 4. "we should convert the checked exception to unchecked exception"
> >> >>
> >> >> Yes, I meant using a runtime exception instead of a checked exception.
> >> >> There was no consensus on putting the exception into the `TableResult`.
> >> >>
> >> >> Regards,
> >> >> Timo
> >> >>
> >> >> On 01.04.20 15:35, Dawid Wysakowicz wrote:
> >> >>> When considering the multi-line support I think it is helpful to start
> >> >>> with a use case in mind. In my opinion consumers of this method will be:
> >> >>>
> >> >>>   1. sql-client
> >> >>>   2. third-party SQL-based platforms
> >> >>>
> >> >>> @Godfrey As for the quit/source/... commands: I think those are the
> >> >>> responsibility of the aforementioned consumers. I think they should not
> >> >>> be understood by the TableEnvironment. What would quit on a
> >> >>> TableEnvironment do? Moreover, I think such commands should be prefixed
> >> >>> appropriately. It is common practice to prefix those with e.g.
> >> >>> ! or : to indicate they are meta commands of the tool rather than a query.
> >> >>>
> >> >>> I also don't necessarily understand why platform users need to know the
> >> >>> kind of the query to use the proposed method. They should get the type
> >> >>> from the TableResult#ResultKind. If the ResultKind is SUCCESS, it was a
> >> >>> DCL/DDL. If SUCCESS_WITH_CONTENT, it was a DML/DQL. If that's not enough
> >> >>> we can enrich the TableResult with a more explicit kind of query, but so
> >> >>> far I don't see such a need.
> >> >>>
> >> >>> @Kurt In those cases I would assume the developers want to present
> >> >>> results of the queries anyway. Moreover I think it is safe to assume
> >> >>> they can adhere to such a contract that the results must be iterated.
> >> >>>
> >> >>> For direct users of TableEnvironment/Table API this method does not make
> >> >>> much sense anyway, in my opinion. I think we can rather safely assume in
> >> >>> this scenario they do not want to submit multiple queries at a single
> >> >>> time.
> >> >>>
> >> >>> Best,
> >> >>>
> >> >>> Dawid
> >> >>>
> >> >>>
> >> >>> On 01/04/2020 15:07, Kurt Young wrote:
> >> >>>> One comment on `executeMultilineSql`: I'm afraid users might sometimes
> >> >>>> forget to iterate the returned iterator, e.g. a user submits a bunch of
> >> >>>> DDLs and expects the framework to execute them one by one, but it does
> >> >>>> not.
> >> >>>>
> >> >>>> Best,
> >> >>>> Kurt
> >> >>>>
> >> >>>>
> >> >>>> On Wed, Apr 1, 2020 at 5:10 PM Aljoscha Krettek<aljos...@apache.org> wrote:
> >> >>>>
> >> >>>>> Agreed to what Dawid and Timo said.
> >> >>>>>
> >> >>>>> To answer your question about multi line SQL: no, we don't think we
> >> >>>>> need this in Flink 1.11, we only wanted to make sure that the
> >> >>>>> interfaces that we now put in place will potentially allow this in the
> >> >>>>> future.
> >> >>>>>
> >> >>>>> Best,
> >> >>>>> Aljoscha
> >> >>>>>
> >> >>>>> On 01.04.20 09:31, godfrey he wrote:
> >> >>>>>> Hi, Timo & Dawid,
> >> >>>>>>
> >> >>>>>> Thanks so much for the effort on multiline statement support.
> >> >>>>>> I have a few questions about this method:
> >> >>>>>>
> >> >>>>>> 1. Users can control the execution logic well through the proposed
> >> >>>>>> method if they know what the statements are (whether a statement is a
> >> >>>>>> DDL, a DML or something else).
> >> >>>>>> But if the statements come from a file, users do not know what the
> >> >>>>>> statements are, and the execution behavior is unclear.
> >> >>>>>> As a platform user, I think this method is hard to use unless the
> >> >>>>>> platform defines a set of rules about the statement order, such as: no
> >> >>>>>> select in the middle, DML must be at the tail of the SQL file (which
> >> >>>>>> may be the most common case in production environments).
> >> >>>>>> Otherwise the platform must parse the SQL first to know what the
> >> >>>>>> statements are.
> >> >>>>>> If it does that, the platform can handle all cases through
> >> >>>>>> `executeSql` and `StatementSet`.
> >> >>>>>>
> >> >>>>>> 2. The SQL client also cannot use `executeMultilineSql` to support
> >> >>>>>> multiline statements, because there are some special commands
> >> >>>>>> introduced in the SQL client, such as `quit`, `source`, `load jar`
> >> >>>>>> (does not exist now, but maybe we need this command to support dynamic
> >> >>>>>> table sources and UDFs).
> >> >>>>>> Does TableEnvironment also support those commands?
> >> >>>>>>
> >> >>>>>> 3. By the way, must we have this feature in release-1.11? I find there
> >> >>>>>> are a few use cases in the feedback document whose behavior is unclear
> >> >>>>>> now.
> >> >>>>>>
> >> >>>>>> Regarding "change the return value from `Iterable<Row>` to
> >> >>>>>> `Iterator<Row>`", I couldn't agree more with this change. Just as Dawid
> >> >>>>>> mentioned, "The contract of the Iterable#iterator is that it returns a
> >> >>>>>> new iterator each time, which effectively means we can iterate the
> >> >>>>>> results multiple times."
> >> >>>>>> We do not support iterating the results multiple times.
> >> >>>>>> If we wanted to do that, the client would have to buffer all results,
> >> >>>>>> which is impossible for a streaming job.
> >> >>>>>>
> >> >>>>>> Best,
> >> >>>>>> Godfrey
> >> >>>>>>
> >> >>>>>> On Wed, Apr 1, 2020 at 3:14 AM Dawid Wysakowicz<dwysakow...@apache.org> wrote:
> >> >>>>>>
> >> >>>>>>> Thank you Timo for the great summary! It covers (almost) all the
> >> >>>>>>> topics. Even though in the end we are not suggesting many changes to
> >> >>>>>>> the current state of the FLIP, I think it is important to lay out all
> >> >>>>>>> possible use cases so that we do not change the execution model every
> >> >>>>>>> release.
> >> >>>>>>>
> >> >>>>>>> There is one additional thing we discussed. Could we change the
> >> >>>>>>> result type of TableResult#collect to Iterator<Row>? Even though those
> >> >>>>>>> interfaces do not differ much, I think Iterator better describes that
> >> >>>>>>> the results might not be materialized on the client side, but can be
> >> >>>>>>> retrieved on a per record basis. The contract of the Iterable#iterator
> >> >>>>>>> is that it returns a new iterator each time, which effectively means
> >> >>>>>>> we can iterate the results multiple times. Iterating the results
> >> >>>>>>> multiple times is not possible when we don't retrieve all the results
> >> >>>>>>> from the cluster at once.
> >> >>>>>>> I think we should also use Iterator for
> >> >>>>>>> TableEnvironment#executeMultilineSql(String statements):
> >> >>>>>>> Iterator<TableResult>.
> >> >>>>>>>
> >> >>>>>>> Best,
> >> >>>>>>>
> >> >>>>>>> Dawid
> >> >>>>>>>
> >> >>>>>>> On 31/03/2020 19:27, Timo Walther wrote:
> >> >>>>>>>> Hi Godfrey,
> >> >>>>>>>>
> >> >>>>>>>> Aljoscha, Dawid, Klou, and I had another discussion around FLIP-84.
> >> >>>>>>>> In particular, we discussed how the current status of the FLIP and
> >> >>>>>>>> the future requirements around multiline statements, async/sync,
> >> >>>>>>>> collect() fit together.
> >> >>>>>>>>
> >> >>>>>>>> We also updated the FLIP-84 Feedback Summary document [1] with some
> >> >>>>>>>> use cases.
> >> >>>>>>>>
> >> >>>>>>>> We believe that we found a good solution that also fits to what is in
> >> >>>>>>>> the current FLIP. So no bigger changes necessary, which is great!
> >> >>>>>>>>
> >> >>>>>>>> Our findings were:
> >> >>>>>>>>
> >> >>>>>>>> 1. Async vs sync submission of Flink jobs:
> >> >>>>>>>>
> >> >>>>>>>> Having a blocking `execute()` in DataStream API was rather a mistake.
> >> >>>>>>>> Instead all submissions should be async because this allows
> >> >>>>>>>> supporting both modes if necessary. Thus, submitting all queries
> >> >>>>>>>> async sounds good to us. If users want to run a job sync, they can
> >> >>>>>>>> use the JobClient and wait for completion (or collect() in case of
> >> >>>>>>>> batch jobs).
> >> >>>>>>>>
> >> >>>>>>>> 2. Multi-statement execution:
> >> >>>>>>>>
> >> >>>>>>>> For the multi-statement execution, we don't see a contradiction with
> >> >>>>>>>> the async execution behavior. We imagine a method like:
> >> >>>>>>>>
> >> >>>>>>>> TableEnvironment#executeMultilineSql(String statements):
> >> >>>>>>>> Iterable<TableResult>
> >> >>>>>>>>
> >> >>>>>>>> Where the `Iterator#next()` method would trigger the next statement
> >> >>>>>>>> submission. This allows a caller to decide synchronously when to
> >> >>>>>>>> submit statements async to the cluster. Thus, a service such as the
> >> >>>>>>>> SQL Client can handle the result of each statement individually and
> >> >>>>>>>> process statement by statement sequentially.
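
(Inline note: the lazy contract described above could be sketched roughly as
follows. This is plain Java under assumptions, not the FLIP's API:
`LazyStatementIterator` and its `submitted` list are hypothetical, and
"submitting" here just means recording the statement.)

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of Flink or of FLIP-84.
class LazyStatementIterator implements Iterator<String> {

    private final Iterator<String> statements;

    // Records which statements were actually submitted, for illustration.
    final List<String> submitted = new ArrayList<>();

    LazyStatementIterator(List<String> statements) {
        this.statements = statements.iterator();
    }

    @Override
    public boolean hasNext() {
        return statements.hasNext();
    }

    @Override
    public String next() {
        // Nothing is submitted until the caller asks for the next result.
        String statement = statements.next();
        submitted.add(statement); // stand-in for an async submission
        return "submitted: " + statement;
    }

    public static void main(String[] args) {
        LazyStatementIterator results = new LazyStatementIterator(
            Arrays.asList("CREATE TABLE t1 (...)", "INSERT INTO t2 SELECT * FROM t1"));
        System.out.println("before next(): " + results.submitted.size());
        System.out.println(results.next());
        System.out.println("after next(): " + results.submitted.size());
    }
}
```

A caller that never calls `next()` submits nothing, which is the flip side of
letting the caller decide when each statement goes to the cluster.
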
> >> >>>>>>>>
> >> >>>>>>>> 3. The role of TableResult and result retrieval in general
> >> >>>>>>>>
> >> >>>>>>>> `TableResult` is similar to `JobClient`. Instead of returning a
> >> >>>>>>>> `CompletableFuture` of something, it is a concrete util class where
> >> >>>>>>>> some methods have the behavior of completable future (e.g. collect(),
> >> >>>>>>>> print()) and some are already completed (getTableSchema(),
> >> >>>>>>>> getResultKind()).
> >> >>>>>>>>
> >> >>>>>>>> `StatementSet#execute()` returns a single `TableResult` because the
> >> >>>>>>>> order is undefined in a set and all statements have the same schema.
> >> >>>>>>>> Its `collect()` will return a row for each executed `INSERT INTO` in
> >> >>>>>>>> the order of statement definition.
> >> >>>>>>>>
> >> >>>>>>>> For simple `SELECT * FROM ...`, the query execution might block until
> >> >>>>>>>> `collect()` is called to pull buffered rows from the job (from
> >> >>>>>>>> socket/REST API, whatever we will use in the future). We can say that
> >> >>>>>>>> a statement finished successfully when the `collect#Iterator#hasNext`
> >> >>>>>>>> has returned false.
> >> >>>>>>>>
> >> >>>>>>>> I hope this summarizes our discussion @Dawid/Aljoscha/Klou?
> >> >>>>>>>>
> >> >>>>>>>> It would be great if we can add these findings to the FLIP before we
> >> >>>>>>>> start voting.
> >> >>>>>>>>
> >> >>>>>>>> One minor thing: some `execute()` methods still throw a checked
> >> >>>>>>>> exception; can we remove that from the FLIP? Also the above mentioned
> >> >>>>>>>> `Iterator#next()` would trigger an execution without throwing a
> >> >>>>>>>> checked exception.
> >> >>>>>>>>
> >> >>>>>>>> Thanks,
> >> >>>>>>>> Timo
> >> >>>>>>>>
> >> >>>>>>>> [1] https://docs.google.com/document/d/1ueLjQWRPdLTFB_TReAyhseAX-1N3j4WYWD0F02Uau0E/edit#
> >> >>>>>>>>
> >> >>>>>>>> On 31.03.20 06:28, godfrey he wrote:
> >> >>>>>>>>> Hi, Timo & Jark
> >> >>>>>>>>>
> >> >>>>>>>>> Thanks for your explanation.
> >> >>>>>>>>> I agree with you that execution should always be async,
> >> >>>>>>>>> and the sync execution scenario can be covered by async execution.
> >> >>>>>>>>> It helps provide a unified entry point for batch and streaming.
> >> >>>>>>>>> I think we can also use sync execution for some testing.
> >> >>>>>>>>> So, I agree with you that we provide an `executeSql` method and that
> >> >>>>>>>>> it is an async method.
> >> >>>>>>>>> If we want a sync method in the future, we can add a method named
> >> >>>>>>>>> `executeSqlSync`.
> >> >>>>>>>>>
> >> >>>>>>>>> I think we've reached an agreement. I will update the document and
> >> >>>>>>>>> start the voting process.
> >> >>>>>>>>>
> >> >>>>>>>>> Best,
> >> >>>>>>>>> Godfrey
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> On Tue, Mar 31, 2020 at 12:46 AM Jark Wu<imj...@gmail.com> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> Hi,
> >> >>>>>>>>>>
> >> >>>>>>>>>> I didn't follow the full discussion.
> >> >>>>>>>>>> But I share the same concern with Timo that streaming queries
> >> >>>>>>>>>> should always be async.
> >> >>>>>>>>>> Otherwise, I can imagine it will cause a lot of confusion and
> >> >>>>>>>>>> problems if users don't keep the "sync" behavior firmly in mind
> >> >>>>>>>>>> (e.g. the client hangs).
> >> >>>>>>>>>> Besides, streaming mode still accounts for the majority of use cases
> >> >>>>>>>>>> of Flink and Flink SQL. We should put usability at a high priority.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Best,
> >> >>>>>>>>>> Jark
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> On Mon, 30 Mar 2020 at 23:27, Timo Walther<twal...@apache.org> wrote:
> >> >>>>>>>>>>> Hi Godfrey,
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> maybe I wasn't expressing my biggest concern enough in my last mail.
> >> >>>>>>>>>>> Even in a singleline and sync execution, I think that streaming
> >> >>>>>>>>>>> queries should not block the execution. Otherwise it is not possible
> >> >>>>>>>>>>> to call collect() or print() on them afterwards.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> "there are too many things need to discuss for multiline":
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> True, I don't want to solve all of them right now. But what I know is
> >> >>>>>>>>>>> that our newly introduced methods should fit into a multiline
> >> >>>>>>>>>>> execution.
> >> >>>>>>>>>>> There is no big difference between calling `executeSql(A),
> >> >>>>>>>>>>> executeSql(B)` and processing a multiline file `A;\nB;`.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I think the example that you mentioned can simply be undefined for
> >> >>>>>>>>>>> now.
> >> >>>>>>>>>>> Currently, no catalog is modifying data but just metadata. This is a
> >> >>>>>>>>>>> separate discussion.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> "result of the second statement is indeterministic":
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Sure this is indeterministic. But this is the implementer's fault and
> >> >>>>>>>>>>> we cannot forbid such pipelines.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> How about we always execute streaming queries async? It would unblock
> >> >>>>>>>>>>> executeSql() and multiline statements.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Having an `executeSqlAsync()` is useful for batch. However, I don't
> >> >>>>>>>>>>> want `sync/async` to be the new batch/stream flag. The execution
> >> >>>>>>>>>>> behavior should come from the query itself.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Regards,
> >> >>>>>>>>>>> Timo
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> On 30.03.20 11:12, godfrey he wrote:
> >> >>>>>>>>>>>> Hi Timo,
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I agree with you that streaming queries are our top priority,
> >> >>>>>>>>>>>> but I think there are too many things that need to be discussed for
> >> >>>>>>>>>>>> multiline statements, e.g.:
> >> >>>>>>>>>>>> 1. What is the behavior of mixing DDL and DML in async execution:
> >> >>>>>>>>>>>> create table t1 xxx;
> >> >>>>>>>>>>>> create table t2 xxx;
> >> >>>>>>>>>>>> insert into t2 select * from t1 where xxx;
> >> >>>>>>>>>>>> drop table t1; // t1 may be a MySQL table, the data will also be deleted.
> >> >>>>>>>>>>>> t1 is dropped while the "insert" job is running.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> 2. What is the behavior of the unified scenario in async execution
> >> >>>>>>>>>>>> (as you mentioned):
> >> >>>>>>>>>>>> INSERT INTO t1 SELECT * FROM s;
> >> >>>>>>>>>>>> INSERT INTO t2 SELECT * FROM s JOIN t1 EMIT STREAM;
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> The result of the second statement is indeterministic, because the
> >> >>>>>>>>>>>> first statement may still be running.
> >> >>>>>>>>>>>> I think we need to put a lot of effort into defining the behavior of
> >> >>>>>>>>>>>> logically related queries.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> In this FLIP, I suggest we only handle single statements, and we also
> >> >>>>>>>>>>>> introduce an async execute method, which is more important and more
> >> >>>>>>>>>>>> often used by users.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> For the sync methods (like `TableEnvironment.executeSql` and
> >> >>>>>>>>>>>> `StatementSet.execute`), the result will be returned only after the
> >> >>>>>>>>>>>> job is finished. The following methods will be introduced in this
> >> >>>>>>>>>>>> FLIP:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> /**
> >> >>>>>>>>>>>>  * Asynchronously execute the given single statement
> >> >>>>>>>>>>>>  */
> >> >>>>>>>>>>>> TableEnvironment.executeSqlAsync(String statement): TableResult
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> /**
> >> >>>>>>>>>>>>  * Asynchronously execute the dml statements as a batch
> >> >>>>>>>>>>>>  */
> >> >>>>>>>>>>>> StatementSet.executeAsync(): TableResult
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> public interface TableResult {
> >> >>>>>>>>>>>>     /**
> >> >>>>>>>>>>>>      * return JobClient for DQL and DML in async mode, else return
> >> >>>>>>>>>>>>      * Optional.empty
> >> >>>>>>>>>>>>      */
> >> >>>>>>>>>>>>     Optional<JobClient> getJobClient();
> >> >>>>>>>>>>>> }
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> what do you think?
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Best,
> >> >>>>>>>>>>>> Godfrey
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> On Thu, Mar 26, 2020 at 9:15 PM Timo Walther<twal...@apache.org> wrote:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>> Hi Godfrey,
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> executing streaming queries must be our top priority because this is
> >> >>>>>>>>>>>>> what distinguishes Flink from competitors. If we change the
> >> >>>>>>>>>>>>> execution behavior, we should think about the other cases as well to
> >> >>>>>>>>>>>>> not break the API a third time.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> I fear that just having an async execute method will not be enough
> >> >>>>>>>>>>>>> because users should be able to mix streaming and batch queries in a
> >> >>>>>>>>>>>>> unified scenario.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> If I remember it correctly, we had some discussions in the past
> >> >>>>>>>>>>>>> about what decides about the execution mode of a query. Currently, we
> >> >>>>>>>>>>>>> would like to let the query decide, not derive it from the sources.
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> So I could imagine a multiline pipeline as:
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> USE CATALOG 'mycat';
> >> >>>>>>>>>>>>> INSERT INTO t1 SELECT * FROM s;
> >> >>>>>>>>>>>>> INSERT INTO t2 SELECT * FROM s JOIN t1 EMIT STREAM;
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> For executeMultilineSql():
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> sync because regular SQL
> >> >>>>>>>>>>>>> sync because regular Batch SQL
> >> >>>>>>>>>>>>> async because Streaming SQL
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> For executeAsyncMultilineSql():
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> async because everything should be async
> >> >>>>>>>>>>>>> async because everything should be async
> >> >>>>>>>>>>>>> async because everything should be async
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> What we should not start for executeAsyncMultilineSql():
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> sync because DDL
> >> >>>>>>>>>>>>> async because everything should be async
> >> >>>>>>>>>>>>> async because everything should be async
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> What are your thoughts here?
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> Regards,
> >> >>>>>>>>>>>>> Timo
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> On 26.03.20 12:50, godfrey he wrote:
> >> >>>>>>>>>>>>>> Hi Timo,
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> I agree with you that streaming queries mostly need async
> >> >>>>>>>>>>>>>> execution.
> >> >>>>>>>>>>>>>> In fact, our original plan was to introduce only sync methods in
> >> >>>>>>>>>>>>>> this FLIP, and async methods (like "executeSqlAsync") would be
> >> >>>>>>>>>>>>>> introduced in the future, as mentioned in the appendix.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Maybe the async methods also need to be considered in this FLIP.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> I think sync methods are also useful for streaming, where they can
> >> >>>>>>>>>>>>>> be used to run bounded sources.
> >> >>>>>>>>>>>>>> Maybe we should check whether all sources are bounded in sync
> >> >>>>>>>>>>>>>> execution mode.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Also, if we block for streaming queries, we could never support
> >> >>>>>>>>>>>>>>> multiline files. Because the first INSERT INTO would block the
> >> >>>>>>>>>>>>>>> further execution.
> >> >>>>>>>>>>>>>> I agree with you, we need an async method to submit multiline files,
> >> >>>>>>>>>>>>>> and files should be restricted so that DQL and DML are always at the
> >> >>>>>>>>>>>>>> end for streaming.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Best,
> >> >>>>>>>>>>>>>> Godfrey
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> On Thu, Mar 26, 2020 at 4:29 PM Timo Walther<twal...@apache.org> wrote:
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Hi Godfrey,
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> having control over the job after submission is a requirement that
> >> >>>>>>>>>>>>>>> was requested frequently (some examples are [1], [2]). Users would
> >> >>>>>>>>>>>>>>> like to get insights about the running or completed job. Including
> >> >>>>>>>>>>>>>>> the jobId, jobGraph etc., the JobClient summarizes these properties.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> It is good to have a discussion about synchronous/asynchronous
> >> >>>>>>>>>>>>>>> submission now to have a complete execution picture.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> I thought we submit streaming queries mostly async and just wait for
> >> >>>>>>>>>>>>>>> the successful submission. If we block for streaming queries, how
> >> >>>>>>>>>>>>>>> can we collect() or print() results?
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Also, if we block for streaming queries, we could never support
> >> >>>>>>>>>>>>>>> multiline files. Because the first INSERT INTO would block the
> >> >>>>>>>>>>>>>>> further execution.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> If we decide to block entirely on streaming queries, we need the
> >> >>>>>>>>>>>>>>> async execution methods in the design already. However, I would
> >> >>>>>>>>>>>>>>> rather go for non-blocking streaming queries. Also with the `EMIT
> >> >>>>>>>>>>>>>>> STREAM` key word in mind that we might add to SQL statements soon.
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Regards,
> >> >>>>>>>>>>>>>>> Timo
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> [1]https://issues.apache.org/jira/browse/FLINK-16761
> >> >>>>>>>>>>>>>>> [2]https://issues.apache.org/jira/browse/FLINK-12214
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> On 25.03.20 16:30, godfrey he wrote:
> >> >>>>>>>>>>>>>>>> Hi Timo,
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Thanks for the update.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Regarding "multiline statement support", I'm also fine with
> >> >>>>>>>>>>>>>>>> `TableEnvironment.executeSql()` only supporting single-line
> >> >>>>>>>>>>>>>>>> statements; we can support multiline statements later (this needs
> >> >>>>>>>>>>>>>>>> more discussion).
> >> >>>>>>>>>>>>>>>> Regarding "StatementSet.explain()", I don't have strong opinions
> >> >>>>>>>>>>>>>>>> about that.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Regarding "TableResult.getJobClient()", I think it's unnecessary.
> >> >>>>>>>>>>>>>>>> The reasons are: first, many statements (e.g. DDL, show xx, use xx)
> >> >>>>>>>>>>>>>>>> will not submit a Flink job. Second, `TableEnvironment.executeSql()`
> >> >>>>>>>>>>>>>>>> and `StatementSet.execute()` are synchronous methods, so
> >> >>>>>>>>>>>>>>>> `TableResult` will be returned only after the job is finished or
> >> >>>>>>>>>>>>>>>> failed.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Regarding "whether StatementSet.execute() needs to throw
> >> >>>>>>>>>>>>>>>> exception", I think we should choose a unified way to tell whether
> >> >>>>>>>>>>>>>>>> the execution is successful. If `TableResult` contains an ERROR kind
> >> >>>>>>>>>>>>>>>> (non-runtime exception), users need to not only check the result but
> >> >>>>>>>>>>>>>>>> also catch the runtime exception in their code. Alternatively,
> >> >>>>>>>>>>>>>>>> `StatementSet.execute()` does not throw any exception (including
> >> >>>>>>>>>>>>>>>> runtime exceptions), and all exception messages are in the result.
> >> >>>>>>>>>>>>>>>> I prefer "StatementSet.execute() needs to throw exception".
> >> >>>>>>>>>>>>>>>> cc @Jark Wu<imj...@gmail.com>
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> I will update the agreed parts to the document first.
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Best,
> >> >>>>>>>>>>>>>>>> Godfrey
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>> Timo Walther<twal...@apache.org> wrote on Wed, Mar 25, 2020 at 6:51 PM:
> >> >>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Hi Godfrey,
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> thanks for starting the discussion on the mailing list. And
> >> >>>>>>>>>>>>>>>>> sorry again for the late reply to FLIP-84. I have updated
> >> >>>>>>>>>>>>>>>>> the Google doc one more time to incorporate the offline
> >> >>>>>>>>>>>>>>>>> discussions.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> From Dawid's and my view, it is fine to postpone the
> >> >>>>>>>>>>>>>>>>> multiline support to a separate method. This can be future
> >> >>>>>>>>>>>>>>>>> work even though we will need it rather soon.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> If there are no objections, I suggest updating FLIP-84
> >> >>>>>>>>>>>>>>>>> again and having another voting process.
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> Thanks,
> >> >>>>>>>>>>>>>>>>> Timo
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>> On 25.03.20 11:17, godfrey he wrote:
> >> >>>>>>>>>>>>>>>>>> Hi community,
> >> >>>>>>>>>>>>>>>>>> Timo, Fabian and Dawid have some feedback about FLIP-84[1].
> >> >>>>>>>>>>>>>>>>>> The feedback is all about the newly introduced methods. We
> >> >>>>>>>>>>>>>>>>>> had a discussion yesterday, and most of the feedback has
> >> >>>>>>>>>>>>>>>>>> been agreed upon. Here are the conclusions:
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> *1. about proposed methods in `TableEnvironment`:*
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> the original proposed methods:
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> TableEnvironment.createDmlBatch(): DmlBatch
> >> >>>>>>>>>>>>>>>>>> TableEnvironment.executeStatement(String statement): ResultTable
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> the new proposed methods:
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> // we should not use abbreviations in the API, and the term
> >> >>>>>>>>>>>>>>>>>> // "Batch" is easily confused with batch/streaming processing
> >> >>>>>>>>>>>>>>>>>> TableEnvironment.createStatementSet(): StatementSet
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> // every method that takes SQL should have `Sql` in its name
> >> >>>>>>>>>>>>>>>>>> // supports multiline statement ???
> >> >>>>>>>>>>>>>>>>>> TableEnvironment.executeSql(String statement): TableResult
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> // new methods. supports explaining DQL and DML
> >> >>>>>>>>>>>>>>>>>> TableEnvironment.explainSql(String statement, ExplainDetail... details): String
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> *2. about proposed related classes:*
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> the original proposed classes:
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> interface DmlBatch {
> >> >>>>>>>>>>>>>>>>>>             void addInsert(String insert);
> >> >>>>>>>>>>>>>>>>>>             void addInsert(String targetPath, Table table);
> >> >>>>>>>>>>>>>>>>>>             ResultTable execute() throws Exception;
> >> >>>>>>>>>>>>>>>>>>             String explain(boolean extended);
> >> >>>>>>>>>>>>>>>>>> }
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> public interface ResultTable {
> >> >>>>>>>>>>>>>>>>>>             TableSchema getResultSchema();
> >> >>>>>>>>>>>>>>>>>>             Iterable<Row> getResultRows();
> >> >>>>>>>>>>>>>>>>>> }
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> the new proposed classes:
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> interface StatementSet {
> >> >>>>>>>>>>>>>>>>>>             // every method that takes SQL should have `Sql` in its name
> >> >>>>>>>>>>>>>>>>>>             // return StatementSet instance for fluent programming
> >> >>>>>>>>>>>>>>>>>>             addInsertSql(String statement): StatementSet
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             // return StatementSet instance for fluent programming
> >> >>>>>>>>>>>>>>>>>>             addInsert(String tablePath, Table table): StatementSet
> >> >>>>>>>>>>>>>>>>>>             // new method. support overwrite mode
> >> >>>>>>>>>>>>>>>>>>             addInsert(String tablePath, Table table, boolean overwrite): StatementSet
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             explain(): String
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             // new method. supports adding more details for the result
> >> >>>>>>>>>>>>>>>>>>             explain(ExplainDetail... extraDetails): String
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             // throw exception ???
> >> >>>>>>>>>>>>>>>>>>             execute(): TableResult
> >> >>>>>>>>>>>>>>>>>> }
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> interface TableResult {
> >> >>>>>>>>>>>>>>>>>>             getTableSchema(): TableSchema
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             // avoid custom parsing of an "OK" row in programming
> >> >>>>>>>>>>>>>>>>>>             getResultKind(): ResultKind
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             // instead of `get` make it explicit that this might be
> >> >>>>>>>>>>>>>>>>>>             // triggering an expensive operation
> >> >>>>>>>>>>>>>>>>>>             collect(): Iterable<Row>
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>             // for fluent programming
> >> >>>>>>>>>>>>>>>>>>             print(): Unit
> >> >>>>>>>>>>>>>>>>>> }
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> enum ResultKind {
> >> >>>>>>>>>>>>>>>>>>             SUCCESS, // for DDL, DCL and statements with a simple "OK"
> >> >>>>>>>>>>>>>>>>>>             SUCCESS_WITH_CONTENT, // rows with important content are available (DML, DQL)
> >> >>>>>>>>>>>>>>>>>> }
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> *3. new proposed methods in `Table`*
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> `Table.insertInto()` will be deprecated, and the following
> >> >>>>>>>>>>>>>>>>>> methods are introduced:
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> Table.executeInsert(String tablePath): TableResult
> >> >>>>>>>>>>>>>>>>>> Table.executeInsert(String tablePath, boolean overwrite): TableResult
> >> >>>>>>>>>>>>>>>>>> Table.explain(ExplainDetail... details): String
> >> >>>>>>>>>>>>>>>>>> Table.execute(): TableResult
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> There are two issues that need further discussion: one is
> >> >>>>>>>>>>>>>>>>>> whether `TableEnvironment.executeSql(String statement):
> >> >>>>>>>>>>>>>>>>>> TableResult` needs to support multiline statements (or
> >> >>>>>>>>>>>>>>>>>> whether `TableEnvironment` needs to support multiline
> >> >>>>>>>>>>>>>>>>>> statements), and the other is whether
> >> >>>>>>>>>>>>>>>>>> `StatementSet.execute()` needs to throw an exception.
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> Please refer to the feedback document [2] for the details.
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> Any suggestions are warmly welcome!
> >> >>>>>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>>>>> [1]
> >> >>>>>>>>>>>>>>>>>> https://wiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> >> >>>>>>>>>>>>>>>>>> [2]
> >> >>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1ueLjQWRPdLTFB_TReAyhseAX-1N3j4WYWD0F02Uau0E/edit
> >> >>>>>>>>>>>>>>>>>> Best,
> >> >>>>>>>>>>>>>>>>>> Godfrey
> >> >>>>>>>>>>>>>>>>>>
> >> >>
> >> >>
> >> >
> >>
> >>
>
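
For readers following the quoted summary, here is a minimal sketch of how the fluent `StatementSet` design and the "execute() throws an exception" option discussed there could fit together. It is not Flink code: every class below is a hypothetical in-memory stand-in that only borrows the names from the proposal (`StatementSet`, `TableResult`, `ResultKind`), rows are plain strings, and the class name `StatementSetSketch` and the INSERT check are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class StatementSetSketch {

    /** Mirrors the two result kinds listed in the quoted summary. */
    enum ResultKind { SUCCESS, SUCCESS_WITH_CONTENT }

    /** Hypothetical, simplified stand-in for the proposed TableResult. */
    static final class TableResult {
        private final ResultKind kind;
        private final List<String> rows;

        TableResult(ResultKind kind, List<String> rows) {
            this.kind = kind;
            this.rows = rows;
        }

        // Lets callers branch on the kind instead of parsing an "OK" row.
        ResultKind getResultKind() {
            return kind;
        }

        // Named collect() rather than get() to signal a possibly costly call.
        Iterable<String> collect() {
            return rows;
        }
    }

    /** Hypothetical in-memory stand-in for the proposed StatementSet. */
    static final class StatementSet {
        private final List<String> statements = new ArrayList<>();

        // Returns this instance so that calls can be chained fluently.
        StatementSet addInsertSql(String statement) {
            // Assumption for this sketch: only INSERT statements are accepted.
            if (!statement.trim().regionMatches(true, 0, "INSERT", 0, 6)) {
                throw new IllegalArgumentException("Not an INSERT statement: " + statement);
            }
            statements.add(statement);
            return this;
        }

        // Lists the buffered statements; a real planner would print a plan.
        String explain() {
            return String.join("\n", statements);
        }

        // Failure is signalled by an exception, not by an ERROR result kind.
        TableResult execute() {
            if (statements.isEmpty()) {
                throw new IllegalStateException("No statements to execute");
            }
            List<String> rows = new ArrayList<>(statements);
            statements.clear();
            return new TableResult(ResultKind.SUCCESS_WITH_CONTENT, rows);
        }
    }

    public static void main(String[] args) {
        TableResult result = new StatementSet()
                .addInsertSql("INSERT INTO sink1 SELECT * FROM src")
                .addInsertSql("INSERT INTO sink2 SELECT * FROM src")
                .execute();
        System.out.println(result.getResultKind());
        for (String row : result.collect()) {
            System.out.println(row);
        }
    }
}
```

With this shape, callers have a single way to detect failure (a thrown exception), which is the unified behaviour argued for in the quoted reply; the alternative of returning an ERROR kind would force them to check the result and catch runtime exceptions as well.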
