Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

2019-11-07 Thread Dawid Wysakowicz
Hi,

Thank you for the comments Kostas, Timo, Aljoscha. I also like the
pipeline/execution naming. I tried to apply most of your suggestions
Aljoscha.

There are a few cases where I did not. You mentioned a few options that
are already present, and I planned to reuse the existing options
(latencyTrackingInterval, setParallelism, etc.).

I would still expose the two options from your "Maybe don’t expose"
section. They are currently exposed in the Table API module (the initial
motivation of this FLIP was to enable passing the config from the Table
module). Moreover, I think it is important for users to have some way to
configure the Kryo serializers.
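
To make this concrete, here is a minimal sketch of what configuring a
pipeline through a Configuration object could look like (the configure()
entry point and the exact option keys are assumptions based on the FLIP
discussion, not final API):

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ConfigureFromConfigurationSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Hypothetical string keys following the pipeline/execution naming
        // discussed below; they are illustrative, not agreed-upon options.
        conf.setString("pipeline.auto-watermark-interval", "200 ms");
        conf.setString("execution.buffer-timeout", "100 ms");

        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        // Proposed entry point (assumption): apply all recognized options
        // onto the environment in one call.
        env.configure(conf, ConfigureFromConfigurationSketch.class.getClassLoader());
    }
}
{code}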

I updated the FLIP's wiki page and will start voting on it.

Best,

Dawid

On 18/10/2019 17:19, Aljoscha Krettek wrote:
> Hi,
>
> In general, I’m also for “execution" compared to just “exec”. For some of 
> these options, though, I’m wondering whether “pipeline.” or 
> “job.” makes more sense. Over time, a lot of things have accumulated 
> in ExecutionConfig but a lot of them are not execution related, I think. For 
> example, auto-type-registration would make more sense as 
> “pipeline.auto-type-registration”. For some other options, I think we should 
> consider not exposing them via the configuration if we don’t think that we 
> want to have them in the long term.
>
> I’ll try to categorise what I think:
>
> Don’t expose:
>  - defaultInputDependencyConstraint (I think this is an internal flag for the 
> Blink runner)
>  - executionMode (I think this is also Blink internals)
>  - printProgressDuringExecution (I don’t know if this flag still does 
> anything)
>
> Maybe don’t expose:
>  - defaultKryoSerializerClasses
>  - setGlobalJobParameters (if we expose it it should be “pipeline”)
>
> pipeline/job:
>  - autoTypeRegistration
>  - autoWatermarkInterval
>  - closureCleaner
>  - disableGenericTypes
>  - enableAutoGeneratedUIDs
>  - forceAvro
>  - forceKryo
>  - setMaxParallelism
>  - setParallelism
>  - objectReuse (this one is hard, could be execution)
>  - registeredKryoTypes
>  - registeredPojoTypes
>  - timeCharacteristic
>  - isChainingEnabled
>  - cachedFile
>
> execution:
>  - latencyTrackingInterval
>  - setRestartStrategy
>  - taskCancellationIntervalMillis
>  - taskCancellationTimeoutMillis
>  - bufferTimeout
>
> checkpointing: (this might be “execution.checkpointing”)
>  - useSnapshotCompression
>  - defaultStateBackend
>
> What do you think?
>
> Best,
> Aljoscha
>
>
>> On 17. Oct 2019, at 09:32, Timo Walther  wrote:
>>
>> Sounds good to me.
>>
>> Thanks,
>>
>> Timo
>>
>>
>> On 17.10.19 09:30, Kostas Kloudas wrote:
>>> Hi Timo,
>>>
>>> I agree that distinguishing between "executor" and "execution" when
>>> scanning through a configuration file can be difficult. These names
>>> were mainly influenced by the fact that FLIP-73 introduced the
>>> "Executor".
>>> In addition, I agree that "deployment" or "deploy" sound like good
>>> alternatives. Between the two, I would go with "deployment" (although
>>> I like "deploy" more, as it is more imperative) for the simple
>>> reason that we do not use verbs anywhere else (I think) in config
>>> options.
>>>
>>> Now for the "exec" or "execution", personally I like the longer
>>> version as it is clearer.
>>>
>>> So, to summarise, I would vote for "deployment", "execution", and
>>> "pipeline" for job invariants, like the jars.
>>>
>>> What do you think?
>>>
>>> Cheers,
>>> Kostas
>>>
>>> On Wed, Oct 16, 2019 at 5:28 PM Timo Walther  wrote:
 Hi Kostas,

 can we still discuss the naming of the properties? For me, having
 "execution" and "executor" as prefixes might be confusing in the future
 and difficult to identify if you scan through a list of properties.

 How about `deployment` and `execution`? Or `deployer` and `exec`?

 Regards,
 Timo

 On 16.10.19 16:31, Kostas Kloudas wrote:
> Hi all,
>
> Thanks for opening the discussion!
>
> I like the idea, so +1 from my side, and actually this is aligned with
> our intentions for the FLIP-73 effort.
>
> For the naming convention of the parameters introduced in the FLIP, my
> proposal would be to have the full word "execution" instead of the
> shorter "exec".
> The reason for this, is that in the context of FLIP-73, we are also
> planning to introduce some new configuration parameters and the
> convention we
> are currently using is the following:
>
> pipeline.***: for job parameters that will not change between
> executions of the same job, e.g. the jar location
> executor.***: for parameters relevant to the instantiation of the
> correct executor, e.g. YARN, detached, etc
> execution.***: for parameters that are relevant to a specific
> execution of a given pipeline, e.g. parallelism or savepoint settings
>
> I understand that sometimes the boundaries may not be that clear for a
> parameter but I hope 

[VOTE] FLIP-59: Enable execution configuration from Configuration object

2019-11-07 Thread Dawid Wysakowicz
Hello,

please vote for FLIP-59.


The discussion thread can be found here:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-td32359.html


This vote will be open for at least 72 hours and requires consensus to
be accepted.

Best,
Dawid





[jira] [Created] (FLINK-14651) Set default value of config option jobmanager.scheduler to "ng"

2019-11-07 Thread Gary Yao (Jira)
Gary Yao created FLINK-14651:


 Summary: Set default value of config option jobmanager.scheduler 
to "ng"
 Key: FLINK-14651
 URL: https://issues.apache.org/jira/browse/FLINK-14651
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Gary Yao
Assignee: Gary Yao
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-07 Thread Rui Li
I see, thanks for the clarification. In the current implementation, it seems
to be just a duplicate of the comment. So I'd prefer not to display it for
DESCRIBE DATABASE, because 1) users have no control over the content and 2)
it's totally redundant. We can add it in the future when we come up with
something more meaningful. What do you think?

On Thu, Nov 7, 2019 at 3:54 PM Terry Wang  wrote:

> Hi Rui~
>
> Description of the database is obtained from the
> `CatalogDatabase#getDescription()` method, which is implemented by
> CatalogDatabaseImpl. Users don’t need to specify the description.
>
> Best,
> Terry Wang
>
>
>
> > On Nov 7, 2019, at 15:40, Rui Li wrote:
> >
> > Thanks Terry for driving this forward.
> > Got one question about DESCRIBE DATABASE: the results display comment and
> > description of a database. While comment can be specified when a database
> > is created, I don't see how users can specify description of the
> database?
> >
> > On Thu, Nov 7, 2019 at 4:16 AM Bowen Li  wrote:
> >
> >> Thanks.
> >>
> >> As Terry and I discussed offline yesterday, we added a new section to
> >> explain the detailed implementation plan.
> >>
> >> +1 (binding) from me.
> >>
> >> Bowen
> >>
> >> On Tue, Nov 5, 2019 at 6:33 PM Terry Wang  wrote:
> >>
> >>> Hi Bowen:
> >>> Thanks for your feedback.
> >>> Your opinion convinced me and I just remove the section about catalog
> >>> create statement and also remove `DBPROPERTIES` `PROPERTIES` from alter
> >>> DDLs.
> >>> Open to more comments or votes :) !
> >>>
> >>> Best,
> >>> Terry Wang
> >>>
> >>>
> >>>
>  On Nov 6, 2019, at 07:22, Bowen Li wrote:
> 
>  Hi Terry,
> 
>  I went over the FLIP in detail again. The FLIP mostly LGTM. A couple
> >>> issues:
> 
>  - since we don't plan to support catalog ddl, can you remove them from
> >> the
>  FLIP?
>  - I found there are some discrepancies in proposed database and table
> >>> DDLs.
>  For db ddl, the create db syntax proposes specifying k-v properties
>  following "WITH". However, alter db ddl comes with a keyword
> >>> "DBPROPERTIES":
> 
>  CREATE  DATABASE [ IF NOT EXISTS ] [ catalogName.] dataBaseName [
> >> COMMENT
>  database_comment ]
>  [*WITH *( name=value [, name=value]*)]
> 
> 
>  ALTER  DATABASE  [ catalogName.] dataBaseName SET *DBPROPERTIES* (
>  name=value [, name=value]*)
> 
> 
>    IIUIC, are you borrowing syntax from Hive? Note that Hive's db
> >> create
>  ddl comes with "DBPROPERTIES" though - "CREATE (DATABASE|SCHEMA) [IF
> >> NOT
>  EXISTS] database_name ...  [*WITH DBPROPERTIES* (k=v, ...)];" [1]
> 
>   The same applies to table ddl. The proposed alter table ddl comes
> >> with
>  "SET *PROPERTIES* (...)", however, Flink's existing table create ddl
> >>> since
>  1.9 [2] doesn't have "PROPERTIES" keyword. As opposed to Hive's
> syntax,
>  both create and alter table ddl comes with "TBLPROPERTIES" [1].
> 
>   I feel it's better to be consistent among our DDLs. One option is to
>  just remove the "PROPERTIES" and "DBPROPERTIES" keywords in proposed
> >>> syntax.
> 
>  [1]
> >> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
>  [2]
> 
> >>>
> >>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html#specifying-a-ddl
> 
>  On Tue, Nov 5, 2019 at 12:54 PM Peter Huang <
> >> huangzhenqiu0...@gmail.com>
>  wrote:
> 
> > +1 for the enhancement.
> >
> > On Tue, Nov 5, 2019 at 11:04 AM Xuefu Z  wrote:
> >
> >> +1 to the long missing feature in Flink SQL.
> >>
> >> On Tue, Nov 5, 2019 at 6:32 AM Terry Wang 
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I would like to start the vote for FLIP-69[1] which is discussed
> and
> >>> reached consensus in the discussion thread[2].
> >>>
> >>> The vote will be open for at least 72 hours. I'll try to close it
> by
> >>> 2019-11-08 14:30 UTC, unless there is an objection or not enough
> >>> votes.
> >>>
> >>> [1]
> >>>
> >>
> >
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement
> >>> <
> >>>
> >>
> >
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement
> 
> >>> [2]
> >>>
> >>
> >
> >>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-69-Flink-SQL-DDL-Enhancement-td33090.html
> >>> <
> >>>
> >>
> >
> >>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-69-Flink-SQL-DDL-Enhancement-td33090.html
> 
> >>> Best,
> >>> Terry Wang
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Xuefu Zhang
> >>
> >> "In Honey We Trust!"
> >>
> >
> >>>
> >>>
> >>
> >
> >
> > --
> > Best regards!
> > Rui Li
>
>

-- 
Best regards!
Rui Li


HadoopInputFormat Custom Partitioning

2019-11-07 Thread Dominik Wosiński
Hey,
I wanted to ask whether *HadoopInputFormat* currently supports a custom
partitioning scheme. Say I have 200 files in HDFS, each having the
partitioning key in its name: can we at the moment use HadoopInputFormat to
distribute reading to multiple TaskManagers using that key?
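
For context, a minimal sketch of how reading via the Hadoop compatibility
wrapper looks today (path and types are illustrative; by default, splits are
handed to subtasks by Flink's input split assigner, not by any key in the
file name, which is exactly the question):

{code:java}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.HadoopInputs;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.TextInputFormat;

public class HadoopReadSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Reads all files under the (illustrative) HDFS path; split-to-subtask
        // assignment is decided by the default input split assigner.
        DataSet<Tuple2<LongWritable, Text>> lines = env.createInput(
            HadoopInputs.readHadoopFile(
                new TextInputFormat(), LongWritable.class, Text.class,
                "hdfs:///data/partitioned-files"));

        lines.print();
    }
}
{code}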


Best Regards,
Dom.


Re: [VOTE] FLIP-59: Enable execution configuration from Configuration object

2019-11-07 Thread tison
Hi Dawid,

I'm afraid you listed the wrong FLIP page. Although the content is
FLIP-59, it directs to FLIP-67.

Best,
tison.


Dawid Wysakowicz wrote on Thu, Nov 7, 2019, at 5:04 PM:

> Hello,
>
> please vote for FLIP-59.
>
>
> The discussion thread can be found here:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-td32359.html
>
>
> This vote will be open for at least 72 hours and requires consensus to be
> accepted.
>
> Best,
> Dawid
>


Re: [VOTE] FLIP-59: Enable execution configuration from Configuration object

2019-11-07 Thread Dawid Wysakowicz
Thank you tison. You are right. I did not update the hyperlinks. Sorry
for that. Once again then:

please vote for FLIP-59:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object


The discussion thread can be found here
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-td32359.html

This vote will be open for at least 72 hours and requires consensus to
be accepted.

Best, Dawid

On 07/11/2019 10:29, tison wrote:
> Hi Dawid,
>
> I'm afraid that you list the wrong FLIP page. Although the content is
> FLIP-59 but it directs to FLIP-67.
>
> Best,
> tison.
>
>
> Dawid Wysakowicz wrote on Thu, Nov 7, 2019, at 5:04 PM:
>
>> Hello,
>>
>> please vote for FLIP-59.
>>
>>
>> The discussion thread can be found here:
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-td32359.html
>>
>>
>> This vote will be open for at least 72 hours and requires consensus to be
>> accepted.
>>
>> Best,
>> Dawid
>>





Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-07 Thread Terry Wang
Hi Rui~
What you suggested makes sense; I will remove description and detailed
description from `DESCRIBE DATABASE`.
Open to more comments and votes :)

Best,
Terry Wang



> On Nov 7, 2019, at 17:15, Rui Li wrote:
> 
> I see, thanks for the clarification. In current implementation, it seems
> just a duplicate of comment. So I'd prefer not to display it for DESCRIBE
> DATABASE, because 1) users have no control over the content and 2) it's
> totally redundant. We can add it in the future when we come up with
> something more meaningful. What do you think?
> 
> On Thu, Nov 7, 2019 at 3:54 PM Terry Wang  wrote:
> 
>> Hi Rui~
>> 
>> Description of the database is obtained from
>> `CatalogDatabase#getDescription()` method, which is implemented by
>> CatalogDatabaseImpl. Users don’t need to specify the description.
>> 
>> Best,
>> Terry Wang
>> 
>> 
>> 
>>> On Nov 7, 2019, at 15:40, Rui Li wrote:
>>> 
>>> Thanks Terry for driving this forward.
>>> Got one question about DESCRIBE DATABASE: the results display comment and
>>> description of a database. While comment can be specified when a database
>>> is created, I don't see how users can specify description of the
>> database?
>>> 
>>> On Thu, Nov 7, 2019 at 4:16 AM Bowen Li  wrote:
>>> 
 Thanks.
 
 As Terry and I discussed offline yesterday, we added a new section to
 explain the detailed implementation plan.
 
 +1 (binding) from me.
 
 Bowen
 
 On Tue, Nov 5, 2019 at 6:33 PM Terry Wang  wrote:
 
> Hi Bowen:
> Thanks for your feedback.
> Your opinion convinced me and I just remove the section about catalog
> create statement and also remove `DBPROPERTIES` `PROPERTIES` from alter
> DDLs.
> Open to more comments or votes :) !
> 
> Best,
> Terry Wang
> 
> 
> 
>> On Nov 6, 2019, at 07:22, Bowen Li wrote:
>> 
>> Hi Terry,
>> 
>> I went over the FLIP in detail again. The FLIP mostly LGTM. A couple
> issues:
>> 
>> - since we don't plan to support catalog ddl, can you remove them from
 the
>> FLIP?
>> - I found there are some discrepancies in proposed database and table
> DDLs.
>> For db ddl, the create db syntax proposes specifying k-v properties
>> following "WITH". However, alter db ddl comes with a keyword
> "DBPROPERTIES":
>> 
>> CREATE  DATABASE [ IF NOT EXISTS ] [ catalogName.] dataBaseName [
 COMMENT
>> database_comment ]
>> [*WITH *( name=value [, name=value]*)]
>> 
>> 
>> ALTER  DATABASE  [ catalogName.] dataBaseName SET *DBPROPERTIES* (
>> name=value [, name=value]*)
>> 
>> 
>>  IIUIC, are you borrowing syntax from Hive? Note that Hive's db
 create
>> ddl comes with "DBPROPERTIES" though - "CREATE (DATABASE|SCHEMA) [IF
 NOT
>> EXISTS] database_name ...  [*WITH DBPROPERTIES* (k=v, ...)];" [1]
>> 
>> The same applies to table ddl. The proposed alter table ddl comes
 with
>> "SET *PROPERTIES* (...)", however, Flink's existing table create ddl
> since
>> 1.9 [2] doesn't have "PROPERTIES" keyword. As opposed to Hive's
>> syntax,
>> both create and alter table ddl comes with "TBLPROPERTIES" [1].
>> 
>> I feel it's better to be consistent among our DDLs. One option is to
>> just remove the "PROPERTIES" and "DBPROPERTIES" keywords in proposed
> syntax.
>> 
>> [1]
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
>> [2]
>> 
> 
 
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html#specifying-a-ddl
>> 
>> On Tue, Nov 5, 2019 at 12:54 PM Peter Huang <
 huangzhenqiu0...@gmail.com>
>> wrote:
>> 
>>> +1 for the enhancement.
>>> 
>>> On Tue, Nov 5, 2019 at 11:04 AM Xuefu Z  wrote:
>>> 
 +1 to the long missing feature in Flink SQL.
 
 On Tue, Nov 5, 2019 at 6:32 AM Terry Wang 
 wrote:
 
> Hi all,
> 
> I would like to start the vote for FLIP-69[1] which is discussed
>> and
> reached consensus in the discussion thread[2].
> 
> The vote will be open for at least 72 hours. I'll try to close it
>> by
> 2019-11-08 14:30 UTC, unless there is an objection or not enough
> votes.
> 
> [1]
> 
 
>>> 
> 
 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement
> <
> 
 
>>> 
> 
 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement
>> 
> [2]
> 
 
>>> 
> 
 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-69-Flink-SQL-DDL-Enhancement-td33090.html
> <
> 
 
>>> 
> 
 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-69-Flink-SQL-DDL-Enhancement-td33090.html
>> 
> Best,

[jira] [Created] (FLINK-14652) Refactor checkpointing related parts into one place on task side

2019-11-07 Thread Yun Tang (Jira)
Yun Tang created FLINK-14652:


 Summary: Refactor checkpointing related parts into one place on 
task side
 Key: FLINK-14652
 URL: https://issues.apache.org/jira/browse/FLINK-14652
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Reporter: Yun Tang
 Fix For: 1.10.0


As suggested by [~sewen] in the [review for 
PR-8693|https://github.com/apache/flink/pull/8693#issuecomment-542834147], it 
would be worthwhile to refactor all checkpointing parts into a single place on 
the task side.

This issue focuses on refactoring these parts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

2019-11-07 Thread Yu Li
Thanks for the comments.

bq. I think the perf e2e test suites will also need to be designed as
supporting running on both standalone env and distributed env. will be
helpful for developing & evaluating the perf.
Agreed and noted down; the benchmark will be executable in standalone
mode. On the other hand, we plan to check the results in distributed mode
for the daily run, to better reflect network costs.

Best Regards,
Yu


On Mon, 4 Nov 2019 at 10:00, OpenInx  wrote:

> > The test cases are written in java and scripts in python. We propose a
> separate directory/module in parallel with flink-end-to-end-tests, with the
> > name of flink-end-to-end-perf-tests.
>
> Glad to see that the newly introduced e2e test will be written in Java.
> because  I'm re-working on the existed e2e tests suites from BASH scripts
> to Java test cases so that we can support more external system , such as
> running the testing job on yarn+flink, docker+flink, standalone+flink,
> distributed kafka cluster etc.
> BTW, I think the perf e2e test suites will also need to be designed as
> supporting running on both standalone env and distributed env. will be
> helpful
> for developing & evaluating the perf.
> Thanks.
>
> On Mon, Nov 4, 2019 at 9:31 AM aihua li  wrote:
>
> > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > statebackend.
> > I think there should be some special scenarios to test checkpoint and
> > statebackend, which will be discussed and added in the release-1.11
> >
> > > On Nov 2, 2019, at 12:13 AM, Yun Tang wrote:
> > >
> > > By the way, do you think it's worthy to add a checkpoint mode which
> just
> > disable checkpoint to run end-to-end jobs? And when will stage2 and
> stage3
> > be discussed in more details?
> >
> >
>


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

2019-11-07 Thread Yu Li
Thanks for the suggestion Jingsong!

I've added a stage for adding more metrics in FLIP document, please check
and let me know if any further concerns. Thanks.

Best Regards,
Yu


On Mon, 4 Nov 2019 at 17:37, Jingsong Li  wrote:

> +1 for the idea. Thanks Yu for driving this.
> Just curious about that can we collect the metrics about Job scheduling and
> task launch. the speed of this part is also important.
> We can add tests for watch it too.
>
> Look forward to more batch test support.
>
> Best,
> Jingsong Lee
>
> On Mon, Nov 4, 2019 at 10:00 AM OpenInx  wrote:
>
> > > The test cases are written in java and scripts in python. We propose a
> > separate directory/module in parallel with flink-end-to-end-tests, with
> the
> > > name of flink-end-to-end-perf-tests.
> >
> > Glad to see that the newly introduced e2e test will be written in Java.
> > because  I'm re-working on the existed e2e tests suites from BASH scripts
> > to Java test cases so that we can support more external system , such as
> > running the testing job on yarn+flink, docker+flink, standalone+flink,
> > distributed kafka cluster etc.
> > BTW, I think the perf e2e test suites will also need to be designed as
> > supporting running on both standalone env and distributed env. will be
> > helpful
> > for developing & evaluating the perf.
> > Thanks.
> >
> > On Mon, Nov 4, 2019 at 9:31 AM aihua li  wrote:
> >
> > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > statebackend.
> > > I think there should be some special scenarios to test checkpoint and
> > > statebackend, which will be discussed and added in the release-1.11
> > >
> > > > On Nov 2, 2019, at 12:13 AM, Yun Tang wrote:
> > > >
> > > > By the way, do you think it's worthy to add a checkpoint mode which
> > just
> > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > stage3
> > > be discussed in more details?
> > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


[jira] [Created] (FLINK-14653) Job-related errors in snapshotState do not result in job failure

2019-11-07 Thread Maximilian Michels (Jira)
Maximilian Michels created FLINK-14653:
--

 Summary: Job-related errors in snapshotState do not result in job 
failure
 Key: FLINK-14653
 URL: https://issues.apache.org/jira/browse/FLINK-14653
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Maximilian Michels


When users override {{snapshotState}}, they might include logic there which is 
crucial for the correctness of their application, e.g. finalizing a transaction 
and buffering the results of that transaction, or flushing events to an 
external store. Exceptions occurring there should lead to failing the job.

Currently, users must make sure to throw a {{Throwable}} that is not an 
{{Exception}}, because any {{Exception}} will be caught by the task and 
reported as a checkpointing error, when it could in fact be an application error.
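
A minimal sketch of the pattern in question (the sink class and its flush
logic are made up for illustration):

{code:java}
import java.io.IOException;

import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class TransactionalSink implements SinkFunction<String>, CheckpointedFunction {

    @Override
    public void invoke(String value, Context context) {
        // buffer the record as part of an open transaction (omitted)
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // Application-critical logic: finalize the transaction / flush to the
        // external store. Thrown as a plain Exception, a failure here is caught
        // by the task and reported as a checkpoint error rather than failing
        // the job, even though it may be an application error.
        flushPendingTransaction();
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // no state to restore in this sketch
    }

    private void flushPendingTransaction() throws IOException {
        // may fail with an error that should arguably fail the whole job
    }
}
{code}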

It would be helpful to update the documentation and introduce a special 
exception that can be thrown for job-related failures, e.g. 
{{ApplicationError}} or similar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-83: Flink End-to-end Performance Testing Framework

2019-11-07 Thread Yu Li
Thanks for the comments Biao!

bq. It seems this proposal is separated into several stages. Is there a
more detailed plan?
Good point! For stage one, we'd like to try introducing the benchmark first,
so we can guard the release (hopefully starting from 1.10). For the other
stages, we don't have a detailed plan yet, but we will add child FLIPs as we
move on and open new discussions/votes separately. I have updated the
FLIP document to better reflect this; please check it and let me know what
you think. Thanks.

Best Regards,
Yu


On Tue, 5 Nov 2019 at 10:16, Biao Liu  wrote:

> Thanks Yu for bringing this topic.
>
> +1 for this proposal. Glad to have an e2e performance testing.
>
> It seems this proposal is separated into several stages. Is there a more
> detailed plan?
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Mon, 4 Nov 2019 at 19:54, Congxian Qiu  wrote:
>
> > +1 for this idea.
> >
> > Currently, we have the micro benchmark for flink, which can help us find
> > the regressions. And I think the e2e jobs performance testing can also
> help
> > us to cover more scenarios.
> >
> > Best,
> > Congxian
> >
> >
> > Jingsong Li wrote on Mon, Nov 4, 2019, at 5:37 PM:
> >
> > > +1 for the idea. Thanks Yu for driving this.
> > > Just curious about that can we collect the metrics about Job scheduling
> > and
> > > task launch. the speed of this part is also important.
> > > We can add tests for watch it too.
> > >
> > > Look forward to more batch test support.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Mon, Nov 4, 2019 at 10:00 AM OpenInx  wrote:
> > >
> > > > > The test cases are written in java and scripts in python. We
> propose
> > a
> > > > separate directory/module in parallel with flink-end-to-end-tests,
> with
> > > the
> > > > > name of flink-end-to-end-perf-tests.
> > > >
> > > > Glad to see that the newly introduced e2e test will be written in
> Java.
> > > > because  I'm re-working on the existed e2e tests suites from BASH
> > scripts
> > > > to Java test cases so that we can support more external system , such
> > as
> > > > running the testing job on yarn+flink, docker+flink,
> standalone+flink,
> > > > distributed kafka cluster etc.
> > > > BTW, I think the perf e2e test suites will also need to be designed
> as
> > > > supporting running on both standalone env and distributed env. will
> be
> > > > helpful
> > > > for developing & evaluating the perf.
> > > > Thanks.
> > > >
> > > > On Mon, Nov 4, 2019 at 9:31 AM aihua li 
> wrote:
> > > >
> > > > > In stage1, the checkpoint mode isn't disabled,and uses heap as the
> > > > > statebackend.
> > > > > I think there should be some special scenarios to test checkpoint
> and
> > > > > statebackend, which will be discussed and added in the release-1.11
> > > > >
> > > > > > On Nov 2, 2019, at 12:13 AM, Yun Tang wrote:
> > > > > >
> > > > > > By the way, do you think it's worthy to add a checkpoint mode
> which
> > > > just
> > > > > disable checkpoint to run end-to-end jobs? And when will stage2 and
> > > > stage3
> > > > > be discussed in more details?
> > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>


[jira] [Created] (FLINK-14654) Fix the arguments number mismatching with placeholders in log statements

2019-11-07 Thread Yun Tang (Jira)
Yun Tang created FLINK-14654:


 Summary: Fix the arguments number mismatching with placeholders in 
log statements
 Key: FLINK-14654
 URL: https://issues.apache.org/jira/browse/FLINK-14654
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.9.1
Reporter: Yun Tang
 Fix For: 1.10.0


As the official Flink [java code 
style|https://flink.apache.org/contributing/code-style-and-quality-java.html#preconditions-and-log-statements]
 guide suggests, we should use the correct log statement format. However, there 
exist 13 files in the current master branch where the number of arguments does 
not match the placeholders in the log statements.

The error looks like:
{code:java}
LOG.warn("Failed to read native metric %s from RocksDB", property, e);
{code}
and the correct format should be
{code:java}
LOG.warn("Failed to read native metric {} from RocksDB.", property, e);
{code}
The other errors look like
{code:java}
LOG.warn("Could not find method implementations in the shaded jar. Exception: 
{}", e);
{code}
and the correct format should be
{code:java}
LOG.warn("Could not find method implementations in the shaded jar.", e);
{code}
Below is the full list of files have problems in log statements.
{code:java}
flink-contrib/flink-connector-wikiedits/src/main/java/org/apache/flink/streaming/connectors/wikiedits/WikipediaEditEventIrcStream.java
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisProducer.java
flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/PipelineErrorHandler.java
flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java
flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java
flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetPojoInputFormat.java
flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetTableSource.java
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/functions/SqlFunctionUtils.java
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/operators/values/ValuesInputFormat.java
flink-end-to-end-tests/flink-connector-gcp-pubsub-emulator-tests/src/test/java/org/apache/flink/streaming/connectors/gcp/pubsub/emulator/GCloudEmulatorManager.java
flink-connectors/flink-connector-kafka/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaTestEnvironmentImpl.java
flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBNativeMetricMonitor.java{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: How long is the flink sql task state default ttl?

2019-11-07 Thread Dian Fu
It's disabled by default. 
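
For reference, a minimal sketch of enabling it with the 1.9-era Table API
(`tableEnv` is assumed to be an existing StreamTableEnvironment, and the
retention times are arbitrary examples):

{code:java}
import org.apache.flink.api.common.time.Time;
import org.apache.flink.table.api.StreamQueryConfig;

// Idle state retention is off unless configured explicitly:
StreamQueryConfig queryConfig = tableEnv.queryConfig();
queryConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24));
{code}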

BTW: You only need to send it to user ML and it's not necessary to send it to 
the dev ML.

> On Nov 7, 2019, at 3:36 PM, LakeShen wrote:
> 
> Hi community, as I know, I can use the idle state retention time to clear
> Flink SQL task state. My question is: how long is the default TTL of the
> Flink SQL task state? Thanks



[jira] [Created] (FLINK-14655) Change Type of Field jobStatusListeners from CopyOnWriteArrayList to ArrayList

2019-11-07 Thread vinoyang (Jira)
vinoyang created FLINK-14655:


 Summary: Change Type of Field jobStatusListeners from 
CopyOnWriteArrayList to ArrayList
 Key: FLINK-14655
 URL: https://issues.apache.org/jira/browse/FLINK-14655
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: vinoyang


After FLINK-11417, we made ExecutionGraph run in a single-threaded mode, so it 
is no longer plagued by concurrency issues. Therefore, we can downgrade the 
current CopyOnWriteArrayList type of jobStatusListeners to a plain ArrayList.
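
A sketch of the field change (the exact declaration in {{ExecutionGraph}} may differ):

{code:java}
// before: safe under concurrent mutation, but that safety is no longer needed
private final List<JobStatusListener> jobStatusListeners = new CopyOnWriteArrayList<>();

// after: ExecutionGraph is confined to a single thread since FLINK-11417
private final List<JobStatusListener> jobStatusListeners = new ArrayList<>();
{code}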



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14656) blink planner should convert catalog statistics to TableStats for permanent table instead of temporary table

2019-11-07 Thread godfrey he (Jira)
godfrey he created FLINK-14656:
--

 Summary: blink planner should convert catalog statistics to 
TableStats for permanent table instead of temporary table
 Key: FLINK-14656
 URL: https://issues.apache.org/jira/browse/FLINK-14656
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.9.1, 1.9.0
Reporter: godfrey he
 Fix For: 1.10.0


Currently, the blink planner converts a {{CatalogTable}} to a Calcite {{Table}} 
and converts the catalog statistics to {{TableStats}} in 
{{DatabaseCalciteSchema}}. However, the catalog statistics conversion is 
currently only done for temporary tables, which have no statistics at all; it 
should be done for permanent tables instead.
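
A rough sketch of the intended condition ({{isTemporary}} and
{{convertToTableStats}} are illustrative names; the statistics getters exist
on the {{Catalog}} interface):

{code:java}
// convert catalog statistics only for permanent tables; temporary
// tables carry no catalog statistics today
TableStats tableStats = TableStats.UNKNOWN;
if (!isTemporary) {
    CatalogTableStatistics tableStatistics = catalog.getTableStatistics(tablePath);
    CatalogColumnStatistics columnStatistics = catalog.getTableColumnStatistics(tablePath);
    tableStats = convertToTableStats(tableStatistics, columnStatistics);
}
{code}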



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14657) Generalize and move YarnConfigUtils from flink-yarn to flink-core

2019-11-07 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14657:
--

 Summary: Generalize and move YarnConfigUtils from flink-yarn to 
flink-core
 Key: FLINK-14657
 URL: https://issues.apache.org/jira/browse/FLINK-14657
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue is just about moving some utility methods from flink-yarn to 
flink-core because they could be of general interest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14658) Drop ".returns()" for TypeInformation in the DataStream API

2019-11-07 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-14658:


 Summary: Drop ".returns()" for TypeInformation in the DataStream 
API
 Key: FLINK-14658
 URL: https://issues.apache.org/jira/browse/FLINK-14658
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataStream
Reporter: Stephan Ewen
 Fix For: 2.0.0


The pattern to use {{.map(function).returns(type)}} to override the 
automatically extracted type is flawed and should not be used.

Instead, each transformation method should be overloaded to have a variant that 
accepts the return type as a second argument, in case we want to override 
the type extractor.

See [FLINK-14380] for a good example of why the {{.returns(type)}} pattern is 
broken.
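
A sketch contrasting the two styles ({{fn}} and {{MyPojo}} are illustrative;
the two-argument overload shows the proposed direction):

{code:java}
// today: type extraction already ran inside map(); returns() patches the
// produced type afterwards, which is the flawed part
DataStream<MyPojo> a = input.map(fn).returns(TypeInformation.of(MyPojo.class));

// proposed: hand the type information to the transformation up front
DataStream<MyPojo> b = input.map(fn, TypeInformation.of(MyPojo.class));
{code}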



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Stateful Functions - Contribution Details

2019-11-07 Thread Igal Shilman
Hello everyone!

Following the successful vote to accept Stateful Functions into Flink [1],
I would like to start a discussion regarding the technical aspects of the
contribution.
Once the discussion will finalize I will summarize the results into a FLIP
and bring it up to a vote.

1) External repository name - Following the discussion conclusion of [2] we
need a name for an external repository.

proposal: flink-statefun
rationale: discussed in the other thread.

2) Maven modules proposal:
2.1) group id: org.apache.flink
2.2) artifact ids: replace "stateful-functions-*" with "statefun-*".

3) Java package name: org.apache.flink.statefun.*

4) Mailing list - should we reuse the existing mailing list or have a
dedicated mailing list for stateful functions?
options:
a) Completely separate mailing list for statefun developers and users (
dev-state...@flink.apache.org and user-state...@flink.apache.org)
b) Reuse the dev and user mailing lists of Flink
c) Reuse Flink's user mailing list, but create a dedicated mailing list for
development.
d) Have a separate single list for developers and users of statefun (
state...@flink.apache.org)

proposal: (c) separate dev list: "dev-state...@flink.apache.org" and reuse
the Flink user mailing list.
rationale: It is very likely that stateful functions users would encounter
the same operational issues as regular Flink users, therefore
it might be beneficial to reuse the Flink user list.

5) separate JIRA project or just component / tag?
proposal: use separate component for statefun.

Thanks,
Igal

[1]  http://mail-archives.apache.org/mod_mbox/flink-dev/201911.mbox/browser
[2]
http://mail-archives.apache.org/mod_mbox/flink-dev/201910.mbox/%3CCANC1h_vRPWs1PnRPuDe602zhX=3j713fanz0wn2dw9pzf_t...@mail.gmail.com%3E


Re: [DISCUSS] Stateful Functions - Contribution Details

2019-11-07 Thread Chesnay Schepler

[1] Does not directly link to the voting thread.

1) I skimmed all 3 threads about the stateful functions proposal and 
could not find a rationale for the repository name; I'd appreciate a 
direct link to the relevant post.


2.1) +1 as we use o.a.f also for flink-shaded

3) +1 as it follows the existing package conventions for libraries.

4) b; I see no reason why we would isolate mailing lists when we haven't 
done so for the myriad of other components that are largely independent 
from each other (like SQL).
There are some practical issues here with having a separate dev ML, for 
example where to send FLIPs or release threads and ensuring they reach a 
large enough audience, which a dedicated ML would likely hinder.
I'm currently also assuming that builds/commits also go to the general 
flink MLs, making it even weirder if just dev were spliced out.


5) separate component, like "API / Statefun"

Personally, I'm not sold on the "statefun" name; has this been a 
discussion item in one of the other threads?


On 07/11/2019 17:10, Igal Shilman wrote:

Hello everyone!

Following the successful vote to accept Stateful Functions into Flink [1],
I would like to start a discussion regarding the technical aspects of the
contribution.
Once the discussion will finalize I will summarize the results into a FLIP
and bring it up to a vote.

1) External repository name - Following the discussion conclusion of [2] we
need a name for an external repository.

proposal: flink-statefun
rational: discussed in the other thread.

2) Maven modules proposal:
2.1) group id: org.apache.flink
2.2) artifact ids: replace "stateful-functions-*" with "statefun-*".

3) Java package name: org.apache.flink.statefun.*

4) Mailing list - should we reuse the existing mailing list or have a
dedicated mailing list for stateful functions?
options:
a) Completely separate mailing list for statefun developers and users (
dev-state...@flink.apache.org and user-state...@flink.apache.org)
b) Reuse the dev and user mailing lists of Flink
c) Reuse Flink's user mailing list, but create a dedicated mailing list for
development.
d) Have a separate single list for developers and users of statefun (
state...@flink.apache.org)

proposal: (c) separate dev list: "dev-state...@flink.apache.org" and reuse
the Flink user mailing list.
rational: It is very likely that stateful functions users would encounter
the same operational issues as regular Flink users, therefore
it might be beneficial to reuse the Flink user list.

5) separate JIRA project or just component / tag?
proposal: use separate component for statefun.

Thanks,
Igal

[1]  http://mail-archives.apache.org/mod_mbox/flink-dev/201911.mbox/browser
[2]
http://mail-archives.apache.org/mod_mbox/flink-dev/201910.mbox/%3CCANC1h_vRPWs1PnRPuDe602zhX=3j713fanz0wn2dw9pzf_t...@mail.gmail.com%3E





[jira] [Created] (FLINK-14659) add 'LOAD MODULE' and 'UNLOAD MODULE' sql commands to sql parser

2019-11-07 Thread Bowen Li (Jira)
Bowen Li created FLINK-14659:


 Summary: add 'LOAD MODULE' and 'UNLOAD MODULE' sql commands to sql 
parser
 Key: FLINK-14659
 URL: https://issues.apache.org/jira/browse/FLINK-14659
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14660) add 'SHOW MODULES' sql command

2019-11-07 Thread Bowen Li (Jira)
Bowen Li created FLINK-14660:


 Summary: add 'SHOW MODULES' sql command
 Key: FLINK-14660
 URL: https://issues.apache.org/jira/browse/FLINK-14660
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Client
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14661) rename args of setters in sql cli Environment

2019-11-07 Thread Bowen Li (Jira)
Bowen Li created FLINK-14661:


 Summary: rename args of setters in sql cli Environment 
 Key: FLINK-14661
 URL: https://issues.apache.org/jira/browse/FLINK-14661
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Client
Affects Versions: 1.10.0
Reporter: Bowen Li


Rename 

{code:java}
public void setCatalogs(List<Map<String, Object>> catalogs) { ... }

public void setTables(List<Map<String, Object>> tables) { ... }

// functions
{code}

to

{code:java}
public void setCatalogs(List<Map<String, Object>> catalogMap) { ... }

public void setTables(List<Map<String, Object>> tableMap) { ... }

// functions
{code}

to avoid name conflicts with member variables.

This is a newbie task for anyone new to the community to pick up and work on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14662) Distinguish unknown table stats and zero

2019-11-07 Thread Kurt Young (Jira)
Kurt Young created FLINK-14662:
--

 Summary: Distinguish unknown table stats and zero
 Key: FLINK-14662
 URL: https://issues.apache.org/jira/browse/FLINK-14662
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive, Table SQL / API
Reporter: Kurt Young


Currently, UNKNOWN table stats are represented with zeros, which might be 
confused with KNOWN table stats that have exactly 0 rows.

We can use -1 to represent UNKNOWN instead.
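
A small sketch of the convention (names are illustrative):

{code:java}
public final class StatsConstants {
    // -1 marks UNKNOWN; 0 then unambiguously means "known, zero rows"
    public static final long UNKNOWN = -1L;

    public static boolean isKnown(long rowCount) {
        return rowCount >= 0;
    }
}
{code}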



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14663) Distinguish unknown column stats and zero

2019-11-07 Thread Kurt Young (Jira)
Kurt Young created FLINK-14663:
--

 Summary: Distinguish unknown column stats and zero
 Key: FLINK-14663
 URL: https://issues.apache.org/jira/browse/FLINK-14663
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive, Table SQL / API
Reporter: Kurt Young


When converting from Hive stats to Flink's column stats, we don't check 
whether a column stat is really set or is just an initial value. For example:
{code:java}
LongColumnStatsData longColStats = stats.getLongStats();
return new CatalogColumnStatisticsDataLong(
  longColStats.getLowValue(),
  longColStats.getHighValue(),
  longColStats.getNumDVs(),
  longColStats.getNumNulls());
{code}
Hive's `LongColumnStatsData` actually records whether a stat has been set, 
through APIs like `isSetNumDVs()`. Since the initial values are all 0, it is 
otherwise unclear whether a value is really 0 or just an initial value.

We can use -1 to represent an UNKNOWN value for column stats.
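
A sketch of the proposed guard (only `isSetNumDVs()` is confirmed above; the
sibling `isSet*` calls are assumed to follow the same thrift-generated
convention):

{code:java}
LongColumnStatsData longColStats = stats.getLongStats();
return new CatalogColumnStatisticsDataLong(
  longColStats.isSetLowValue()  ? longColStats.getLowValue()  : -1L,
  longColStats.isSetHighValue() ? longColStats.getHighValue() : -1L,
  longColStats.isSetNumDVs()    ? longColStats.getNumDVs()    : -1L,
  longColStats.isSetNumNulls()  ? longColStats.getNumNulls()  : -1L);
{code}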



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-59: Enable execution configuration from Configuration object

2019-11-07 Thread Terry Wang
Thanks for driving this.
+1 from my side (non-binding)
Best,
Terry Wang



> On Nov 7, 2019, at 17:34, Dawid Wysakowicz wrote:
> 
> Thank you tison. You are right. I did not update the hyperlinks. Sorry
> for that. Once again then:
> 
> please vote for FLIP-59:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
> 
> 
> The discussion thread can be found here
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-td32359.html
> 
> This vote will be open for at least 72 hours and requires consensus to
> be accepted.
> 
> Best, Dawid
> 
> On 07/11/2019 10:29, tison wrote:
>> Hi Dawid,
>> 
>> I'm afraid that you list the wrong FLIP page. Although the content is
>> FLIP-59 but it directs to FLIP-67.
>> 
>> Best,
>> tison.
>> 
>> 
>> Dawid Wysakowicz wrote on Thu, Nov 7, 2019, at 5:04 PM:
>> 
>>> Hello,
>>> 
>>> please vote for FLIP-59.
>>> 
>>> 
>>> The discussion thread can be found here:
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-59-Enable-execution-configuration-from-Configuration-object-td32359.html
>>> 
>>> 
>>> This vote will be open for at least 72 hours and requires consensus to be
>>> accepted.
>>> 
>>> Best,
>>> Dawid
>>> 
> 



[jira] [Created] (FLINK-14664) Support to reference user defined functions of external catalog in computed columns

2019-11-07 Thread Danny Chen (Jira)
Danny Chen created FLINK-14664:
--

 Summary: Support to reference user defined functions of external 
catalog in computed columns
 Key: FLINK-14664
 URL: https://issues.apache.org/jira/browse/FLINK-14664
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.9.1
Reporter: Danny Chen
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-07 Thread Kurt Young
Hi all,

I think we should keep discussion of the document in the [DISCUSS] thread and
keep this vote thread purely for voting.

Otherwise, it's hard for others to collect feedback on this topic.

Best,
Kurt


On Thu, Nov 7, 2019 at 5:51 PM Terry Wang  wrote:

> Hi Rui~
> What you suggested makes sense, remove description and detailed
> description from `DESCRIBE DATABASE`.
> Open to more comments and votes :)
>
> Best,
> Terry Wang
>
>
>
> > On Nov 7, 2019, at 17:15, Rui Li wrote:
> >
> > I see, thanks for the clarification. In current implementation, it seems
> > just a duplicate of comment. So I'd prefer not to display it for DESCRIBE
> > DATABASE, because 1) users have no control over the content and 2) it's
> > totally redundant. We can add it in the future when we come up with
> > something more meaningful. What do you think?
> >
> > On Thu, Nov 7, 2019 at 3:54 PM Terry Wang  wrote:
> >
> >> Hi Rui~
> >>
> >> Description of the database is obtained from
> >> `CatalogDatabase#getDescription()` method, which is implemented by
> >> CatalogDatabaseImpl. Users don’t need to specify the description.
> >>
> >> Best,
> >> Terry Wang
> >>
> >>
> >>
> >>> On Nov 7, 2019, at 15:40, Rui Li wrote:
> >>>
> >>> Thanks Terry for driving this forward.
> >>> Got one question about DESCRIBE DATABASE: the results display comment
> and
> >>> description of a database. While comment can be specified when a
> database
> >>> is created, I don't see how users can specify description of the
> >> database?
> >>>
> >>> On Thu, Nov 7, 2019 at 4:16 AM Bowen Li  wrote:
> >>>
>  Thanks.
> 
>  As Terry and I discussed offline yesterday, we added a new section to
>  explain the detailed implementation plan.
> 
>  +1 (binding) from me.
> 
>  Bowen
> 
>  On Tue, Nov 5, 2019 at 6:33 PM Terry Wang  wrote:
> 
> > Hi Bowen:
> > Thanks for your feedback.
> > Your opinion convinced me and I just remove the section about catalog
> > create statement and also remove `DBPROPERTIES` `PROPERTIES` from
> alter
> > DDLs.
> > Open to more comments or votes :) !
> >
> > Best,
> > Terry Wang
> >
> >
> >
> >> On Nov 6, 2019, at 07:22, Bowen Li wrote:
> >>
> >> Hi Terry,
> >>
> >> I went over the FLIP in detail again. The FLIP mostly LGTM. A couple
> > issues:
> >>
> >> - since we don't plan to support catalog ddl, can you remove them
> from
>  the
> >> FLIP?
> >> - I found there are some discrepancies in proposed database and
> table
> > DDLs.
> >> For db ddl, the create db syntax proposes specifying k-v properties
> >> following "WITH". However, alter db ddl comes with a keyword
> > "DBPROPERTIES":
> >>
> >> CREATE  DATABASE [ IF NOT EXISTS ] [ catalogName.] dataBaseName [
>  COMMENT
> >> database_comment ]
> >> [*WITH *( name=value [, name=value]*)]
> >>
> >>
> >> ALTER  DATABASE  [ catalogName.] dataBaseName SET *DBPROPERTIES* (
> >> name=value [, name=value]*)
> >>
> >>
> >>  IIUIC, are you borrowing syntax from Hive? Note that Hive's db
>  create
> >> ddl comes with "DBPROPERTIES" though - "CREATE (DATABASE|SCHEMA) [IF
>  NOT
> >> EXISTS] database_name ...  [*WITH DBPROPERTIES* (k=v, ...)];" [1]
> >>
> >> The same applies to table ddl. The proposed alter table ddl comes
>  with
> >> "SET *PROPERTIES* (...)", however, Flink's existing table create ddl
> > since
> >> 1.9 [2] doesn't have "PROPERTIES" keyword. As opposed to Hive's
> >> syntax,
> >> both create and alter table ddl comes with "TBLPROPERTIES" [1].
> >>
> >> I feel it's better to be consistent among our DDLs. One option is to
> >> just remove the "PROPERTIES" and "DBPROPERTIES" keywords in proposed
> > syntax.
> >>
> >> [1]
>  https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
> >> [2]
> >>
> >
> 
> >>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html#specifying-a-ddl
> >>
> >> On Tue, Nov 5, 2019 at 12:54 PM Peter Huang <
>  huangzhenqiu0...@gmail.com>
> >> wrote:
> >>
> >>> +1 for the enhancement.
> >>>
> >>> On Tue, Nov 5, 2019 at 11:04 AM Xuefu Z  wrote:
> >>>
>  +1 to the long missing feature in Flink SQL.
> 
>  On Tue, Nov 5, 2019 at 6:32 AM Terry Wang 
>  wrote:
> 
> > Hi all,
> >
> > I would like to start the vote for FLIP-69[1] which is discussed
> >> and
> > reached consensus in the discussion thread[2].
> >
> > The vote will be open for at least 72 hours. I'll try to close it
> >> by
> > 2019-11-08 14:30 UTC, unless there is an objection or not enough
> > votes.
> >
> > [1]
> >
> 
> >>>
> >
> 
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement
> > <
> >
> >>

[jira] [Created] (FLINK-14665) Support computed column for create table statement in blink-planner

2019-11-07 Thread Danny Chen (Jira)
Danny Chen created FLINK-14665:
--

 Summary: Support computed column for create table statement in 
blink-planner
 Key: FLINK-14665
 URL: https://issues.apache.org/jira/browse/FLINK-14665
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.9.1
Reporter: Danny Chen
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP 69 - Flink SQL DDL Enhancement

2019-11-07 Thread Kurt Young
Hi,

Sorry to join this so late and thanks for proposing this FLIP. After
going through the proposal details, I would +1 for the changes.

However, the FLIP name is kind of confusing to me. It says it will do
DDL enhancement, but picks up only a few new features to do. It looks
to me like the goal and content of this FLIP are kind of random.

Each topic this FLIP touches is super big, e.g. enhancing the
alter table command. According to the SQL:2011 standard, it would contain
at least the following features:

<alter table statement> ::=
  ALTER TABLE <table name> <alter table action>
<alter table action> ::=
    <add column definition>
  | <alter column definition>
  | <drop column definition>
  | <add table constraint definition>
  | <alter table constraint definition>
  | <drop table constraint definition>
  | <add table period definition>
  | <drop table period definition>
  | <add system versioning clause>
  | <drop system versioning clause>

I'm not suggesting we do all of these at once, but I also didn't see any
future plan or goals in the FLIP describing the full picture here. We just
picked some randomly chosen features to start with.

But still I'm +1 to this FLIP since they are all good enhancements.

Best,
Kurt


On Tue, Nov 5, 2019 at 10:32 PM Terry Wang  wrote:

> Hi Bowen~
>
> We don’t intend to support create/drop catalog syntax in this FLIP; we
> may support it if there is indeed a strong desire for it.
> And I’m going to kick off a vote for this FLIP, feel free to review again.
>
> Best,
> Terry Wang
>
>
>
> > On Sep 26, 2019, at 00:44, Xuefu Z wrote:
> >
> > Actually catalogs are more of system settings than of user objects that a
> > user might create or drop constantly. Thus, it's probably sufficient to
> set
> > up catalog information in the config file, at least for now.
> >
> > Thanks,
> > Xuefu
> >
> > On Tue, Sep 24, 2019 at 7:10 PM Terry Wang <zjuwa...@gmail.com> wrote:
> >
> >> Thanks Bowen for your insightful comments, I’ll think twice and do
> >> corresponding improvement.
> >> After finished, I’ll update in this mailing thread again.
> >> Best,
> >> Terry Wang
> >>
> >>
> >>
> >>> On Sep 25, 2019, at 8:28 AM, Bowen Li wrote:
> >>>
> >>> BTW, will there be a "CREATE/DROP CATALOG" DDL?
> >>>
> >>> Though it's not SQL standard, I can see it'll be useful and handy for
> >> our end users in many cases.
> >>>
> >>> On Mon, Sep 23, 2019 at 12:28 PM Bowen Li <bowenl...@gmail.com> wrote:
> >>> Hi Terry,
> >>>
> >>> Thanks for driving the effort! I left some comments in the doc.
> >>>
> >>> AFAIU, the biggest motivation is to support DDLs in sql parser so that
> >> both Table API and SQL CLI can share the stack, despite that SQL CLI has
> >> already supported some commands itself. However, I don't see details on
> how
> >> SQL CLI would migrate and depend on sql parser, and how Table API and
> SQL
> >> CLI would actually share SQL parser. I'm not sure yet how much work that
> >> will take, just want to double check that you didn't include them
> because
> >> they are very trivial according to your estimate?
> >>>
> >>>
> >>> On Mon, Sep 16, 2019 at 1:46 AM Terry Wang <zjuwa...@gmail.com> wrote:
> >>> Hi everyone,
> >>>
> >>> In flink 1.9, we have introduced some awesome features such as complete
> >> catalog support[1] and sql ddl support[2]. These features have been a
> >> critical integration for Flink to be able to manage data and metadata
> like
> >> a classic RDBMS and make developers more easy to construct their
> >> real-time/off-line warehouse or sth similar base on flink.
> >>>
> >>> But there is still a lack of support on how Flink SQL DDL to manage
> >> metadata and data like classic RDBMS such as `alter table rename` and
> so on.
> >>>
> >>> So I’d like to kick off a discussion on enhancing Flink Sql Ddls:
> >>>
> >> https://docs.google.com/document/d/1mhZmx1h2ecfL0x8OzYD1n-nVRn4yE7pwk4jGed4k7kc/edit?usp=sharing
> >>>
> >>> In short, it:
> >>>- Add Catalog DDL enhancement support:  show catalogs / describe
> >> catalog / use catalog
> >>>- Add Database DDL enhancement support:  show databses / create
> >> database / drop database/ alter database
> >>>- Add Table DDL enhancement support:show tables/ describe
> >> table / alter table
> >>>- Add Function DDL enhancement support: show functions/ create
> >> function /drop function
> >>>
> >>> Looking forward to your opinions.
> >>>
> >>> Best,
> >>> Terry Wang
> >>>
> >>>
> >>>
> >>> [1]:https://issues.apache.org/jira/browse/FLIN

[jira] [Created] (FLINK-14666) support multiple catalog in flink table sql

2019-11-07 Thread yuemeng (Jira)
yuemeng created FLINK-14666:
---

 Summary: support multiple catalog in flink table sql
 Key: FLINK-14666
 URL: https://issues.apache.org/jira/browse/FLINK-14666
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.9.1, 1.9.0, 1.8.2, 1.8.0
Reporter: yuemeng


Currently, Calcite will only use the current catalog as the schema path to 
validate a SQL node, which may not be reasonable.

{code}
tableEnvironment.useCatalog("user_catalog");
tableEnvironment.useDatabase("user_db");

Table table = tableEnvironment.sqlQuery(
    "SELECT action, os, count(*) as cnt from music_queue_3 " +
    "group by action, os, tumble(proctime, INTERVAL '10' SECOND)");
tableEnvironment.registerTable("v1", table);

Table t2 = tableEnvironment.sqlQuery("select action, os, 1 as cnt from v1");
tableEnvironment.registerTable("v2", t2);

tableEnvironment.sqlUpdate(
    "INSERT INTO database2.kafka_table_test1 " +
    "SELECT action, os, cast(cnt as BIGINT) as cnt from v2");
{code}

Suppose the source table music_queue_3 and the sink table kafka_table_test1 
are both in the user_catalog catalog, but temporary tables or views such as 
v1 and v2 are registered in the default catalog.

When we select from the temporary table v2 and insert into our own catalog 
table database2.kafka_table_test1, SQL node validation always fails, because 
the schema path in the catalog reader is the current catalog without the 
default catalog, so the temporary table or view will never be identified.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[VOTE] FLIP-79: Flink Function DDL Support (1.10 Release Feature Only)

2019-11-07 Thread Peter Huang
Dear All,

I would like to start the vote for the 1.10 release features in FLIP-79 [1],
which have been discussed and reached consensus in the discussion thread [2].
For the advanced features, such as loading functions from remote resources and
supporting Scala/Python functions, we will have further discussions after the
1.10 release.

The vote will be open for at least 72 hours. If the voting passes, I will
close it by 2019-11-10 14:00 UTC.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-79+Flink+Function+DDL+Support
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-FLIP-79-Flink-Function-DDL-Support-td33965.html

Best Regards
Peter Huang


[jira] [Created] (FLINK-14667) flink1.9,1.8.2 run flinkSql fat jar ,can't load the right tableFactory (TableSourceFactory) for the kafka

2019-11-07 Thread chun111111 (Jira)
chun111111 created FLINK-14667:
--

 Summary: flink1.9,1.8.2 run flinkSql fat jar ,can't load the right 
tableFactory (TableSourceFactory) for the kafka 
 Key: FLINK-14667
 URL: https://issues.apache.org/jira/browse/FLINK-14667
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.9.1, 1.8.2
Reporter: chun11






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP 69 - Flink SQL DDL Enhancement

2019-11-07 Thread Terry Wang
Hi, Kurt~

Thanks for your vote and for pointing out some deficiencies of this FLIP. I’ll 
try to avoid making similar mistakes.

Best,
Terry Wang



> On Nov 8, 2019, at 11:28, Kurt Young wrote:
> 
> Hi,
> 
> Sorry to join this so late and thanks for proposing this FLIP. After
> going through the proposal details, I would +1 for the changes.
> 
> However, the FLIP name is kind of confusing to me. It says it will do
> DDL enhancement, and picks up a few new features to do. It looks
> to me like the goal and content of this FLIP are kind of random.
> 
> Each topic this FLIP touches is super big, e.g. enhancing the
> alter table command. According to the SQL 2011 standard, it would contain
> at least the following features:
> 
>  <alter table statement> ::=
>    ALTER TABLE <table name> <alter table action>
>  <alter table action> ::=
>      <add column definition>
>    | <alter column definition>
>    | <drop column definition>
>    | <add table constraint definition>
>    | <alter table constraint definition>
>    | <drop table constraint definition>
>    | <add table period definition>
>    | <drop table period definition>
>    | <add system versioning clause>
>    | <drop system versioning clause>
> 
> I'm not suggesting to do all these at once, but I also didn't see any
> future plan or goals in the FLIP to describe the full picture here. We just
> picked some randomly chosen features to start with.
> 
> But still I'm +1 to this FLIP since they are all good enhancements.
> 
> Best,
> Kurt
> 
> 
> On Tue, Nov 5, 2019 at 10:32 PM Terry Wang wrote:
> 
>> Hi Bowen~
>> 
>> We don’t intend to support CREATE/DROP CATALOG syntax in this FLIP; we
>> may support it if there is indeed a strong desire.
>> And I’m going to kick off a vote for this FLIP, feel free to review again.
>> 
>> Best,
>> Terry Wang
>> 
>> 
>> 
>>> On Sep 26, 2019, at 00:44, Xuefu Z wrote:
>>> 
>>> Actually catalogs are more of system settings than of user objects that a
>>> user might create or drop constantly. Thus, it's probably sufficient to set
>>> up catalog information in the config file, at least for now.
>>> 
>>> Thanks,
>>> Xuefu
>>> 
>>> On Tue, Sep 24, 2019 at 7:10 PM Terry Wang wrote:
>>> 
 Thanks Bowen for your insightful comments, I’ll think twice and make the
 corresponding improvements.
 Once finished, I’ll update this mailing thread again.
 Best,
 Terry Wang
 
 
 
> On Sep 25, 2019, at 8:28 AM, Bowen Li wrote:
> 
> BTW, will there be a "CREATE/DROP CATALOG" DDL?
> 
> Though it's not SQL standard, I can see it'll be useful and handy for
 our end users in many cases.
> 
> On Mon, Sep 23, 2019 at 12:28 PM Bowen Li wrote:
>
> Hi Terry,
> 
> Thanks for driving the effort! I left some comments in the doc.
> 
> AFAIU, the biggest motivation is to support DDLs in the sql parser so that
> both Table API and SQL CLI can share the stack, despite that SQL CLI has
> already supported some commands itself. However, I don't see details on how
> SQL CLI would migrate and depend on the sql parser, and how Table API and SQL
> CLI would actually share the SQL parser. I'm not sure yet how much work that
> will take, just want to double check that you didn't include them because
> they are very trivial according to your estimate?
> 
> 
> On Mon, Sep 16, 2019 at 1:46 AM Terry Wang wrote:
>
> Hi everyone,
> 
> In Flink 1.9, we have introduced some awesome features such as complete
> catalog support [1] and SQL DDL support [2]. These features have been a
> critical integration for Flink to be able to manage data and metadata like
> a classic RDBMS, and they make it easier for developers to construct their
> real-time/offline warehouses or something similar based on Flink.
> 
> But there is still a lack of support for Flink SQL DDL to manage
> metadata and data like a classic RDBMS, with statements such as `alter table rename` and so on.
> 
> So I’d like to kick off a discussion on enhancing Flink SQL DDLs:
> 
 
> https://docs.google.com/document/d/1mhZmx1h2ecfL0x8OzYD1n-nVRn4yE7pwk4jGed4k7kc/edit?usp=sharing

Re: [DISCUSS] FLIP 69 - Flink SQL DDL Enhancement

2019-11-07 Thread Kurt Young
Hi Terry,

I wouldn't say it's a mistake, and I don't have any suggestions about
the issue either. I just saw this and want to point it out to bring more attention.
Maybe someone has a good opinion on that part, and we can
discuss it and come up with some good advice for the community in
the future.

Best,
Kurt


On Fri, Nov 8, 2019 at 2:16 PM Terry Wang  wrote:

> Hi, Kurt~
>
> Thanks for your vote and for pointing out some deficiencies of this FLIP. I’ll
> try to avoid making similar mistakes.
>
> Best,
> Terry Wang
>
>
>
> > On Nov 8, 2019, at 11:28, Kurt Young wrote:
> >
> > Hi,
> >
> > Sorry to join this so late and thanks for proposing this FLIP. After
> > going through the proposal details, I would +1 for the changes.
> >
> > However, the FLIP name is kind of confusing to me. It says it will do
> > DDL enhancement, and picks up a few new features to do. It looks
> > to me like the goal and content of this FLIP are kind of random.
> >
> > Each topic this FLIP touches is super big, e.g. enhancing the
> > alter table command. According to the SQL 2011 standard, it would contain
> > at least the following features:
> >
> > <alter table statement> ::=
> >   ALTER TABLE <table name> <alter table action>
> > <alter table action> ::=
> >     <add column definition>
> >   | <alter column definition>
> >   | <drop column definition>
> >   | <add table constraint definition>
> >   | <alter table constraint definition>
> >   | <drop table constraint definition>
> >   | <add table period definition>
> >   | <drop table period definition>
> >   | <add system versioning clause>
> >   | <drop system versioning clause>
> >
> > I'm not suggesting to do all these at once, but I also didn't see any
> > future plan or goals in the FLIP to describe the full picture here. We just
> > picked some randomly chosen features to start with.
> >
> > But still I'm +1 to this FLIP since they are all good enhancements.
> >
> > Best,
> > Kurt
> >
> >
> > On Tue, Nov 5, 2019 at 10:32 PM Terry Wang wrote:
> >
> >> Hi Bowen~
> >>
> >> We don’t intend to support CREATE/DROP CATALOG syntax in this FLIP; we
> >> may support it if there is indeed a strong desire.
> >> And I’m going to kick off a vote for this FLIP, feel free to review again.
> >>
> >> Best,
> >> Terry Wang
> >>
> >>
> >>
> >>> On Sep 26, 2019, at 00:44, Xuefu Z wrote:
> >>>
> >>> Actually catalogs are more of system settings than of user objects that a
> >>> user might create or drop constantly. Thus, it's probably sufficient to set
> >>> up catalog information in the config file, at least for now.
> >>>
> >>> Thanks,
> >>> Xuefu
> >>>
> >>> On Tue, Sep 24, 2019 at 7:10 PM Terry Wang wrote:
> >>>
>  Thanks Bowen for your insightful comments, I’ll think twice and make the
>  corresponding improvements.
>  Once finished, I’ll update this mailing thread again.
>  Best,
>  Terry Wang
> 
> 
> 
> > On Sep 25, 2019, at 8:28 AM, Bowen Li wrote:
> >
> > BTW, will there be a "CREATE/DROP CATALOG" DDL?
> >
> > Though it's not SQL standard, I can see it'll be useful and handy for
>  our end users in many cases.
> >
> > On Mon, Sep 23, 2019 at 12:28 PM Bowen Li wrote:
> >
> > Hi Terry,
> >
> > Thanks for driving the effort! I left some comments in the doc.
> >
> > AFAIU, the biggest motivation is to support DDLs in the sql parser so that
> > both Table API and SQL CLI can share the stack, despite that SQL CLI has
> > already supported some commands itself. However, I don't see details on how
> > SQL CLI would migrate and depend on the sql parser, and how Table API and SQL
> > CLI would actually share the SQL parser. I'm not sure yet how much work that
> > will take, just want to double check that you didn't include them because
> > they are very trivial according to your estimate?
> >
> >
> > On Mon, Sep 16, 2019 at 1:46 AM Terry Wang wrote:
> >
> > Hi everyone,
> >
> > In Flink 1.9, we have introduced some awesome features such as complete
> > catalog support [1] and SQL DDL support [2]. These features have been a
> > critical integration for Flink to be able to manage data and metadata like
> > a classic RDBMS, and they make it easier for developers to construct their
> > real-time/offline warehouses or something similar based on Flink.
> >
> > But there is still a lack of support for Flink SQL DDL to manage
> > metadata and data like a classic RDBMS, with statements such as `alter table rename` and so on.
> >
> > So I’d like to kick off a discussion on enhancing Flink SQL DDLs:
> >
> > https://docs.google.com/document/d/1mhZmx1h2ecfL0x8OzYD1n-nVRn4yE7pwk4jGed4k7kc/edit?usp=sharing

Re: [VOTE] FLIP-69: Flink SQL DDL Enhancement

2019-11-07 Thread Kurt Young
Forgot to vote.. +1 from my side.

Best,
Kurt


On Fri, Nov 8, 2019 at 11:00 AM Kurt Young  wrote:

> Hi all,
>
> I think we should focus on discussing the document in the [DISCUSS] thread and
> keep this vote thread purely for voting.
>
> Otherwise, it's hard for others to collect feedback on this topic.
>
> Best,
> Kurt
>
>
> On Thu, Nov 7, 2019 at 5:51 PM Terry Wang  wrote:
>
>> Hi Rui~
>> What you suggested makes sense; I'll remove description and detailed
>> description from `DESCRIBE DATABASE`.
>> Open to more comments and votes :)
>>
>> Best,
>> Terry Wang
>>
>>
>>
>> > On Nov 7, 2019, at 17:15, Rui Li wrote:
>> >
>> > I see, thanks for the clarification. In the current implementation, it seems
>> > just a duplicate of the comment. So I'd prefer not to display it for DESCRIBE
>> > DATABASE, because 1) users have no control over the content and 2) it's
>> > totally redundant. We can add it in the future when we come up with
>> > something more meaningful. What do you think?
>> >
>> > On Thu, Nov 7, 2019 at 3:54 PM Terry Wang  wrote:
>> >
>> >> Hi Rui~
>> >>
>> >> The description of the database is obtained from the
>> >> `CatalogDatabase#getDescription()` method, which is implemented by
>> >> CatalogDatabaseImpl. Users don’t need to specify the description.
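For context, a minimal sketch of the shape being described (approximate,
recalled from the 1.9-era interfaces, not verbatim Flink source): the default
implementation derives the description from the comment, which is why the two
read as duplicates.

    import java.util.Optional;

    // Approximate sketch, not verbatim Flink source.
    class CatalogDatabaseSketch {
        private final String comment;

        CatalogDatabaseSketch(String comment) { this.comment = comment; }

        // The description is derived from the comment rather than set separately.
        public Optional<String> getDescription() {
            return Optional.ofNullable(comment);
        }
    }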
>> >>
>> >> Best,
>> >> Terry Wang
>> >>
>> >>
>> >>
>> >>> On Nov 7, 2019, at 15:40, Rui Li wrote:
>> >>>
>> >>> Thanks Terry for driving this forward.
>> >>> Got one question about DESCRIBE DATABASE: the results display the comment
>> >>> and description of a database. While a comment can be specified when a
>> >>> database is created, I don't see how users can specify the description of
>> >>> the database?
>> >>>
>> >>> On Thu, Nov 7, 2019 at 4:16 AM Bowen Li  wrote:
>> >>>
>>  Thanks.
>> 
>>  As Terry and I discussed offline yesterday, we added a new section to
>>  explain the detailed implementation plan.
>> 
>>  +1 (binding) from me.
>> 
>>  Bowen
>> 
>>  On Tue, Nov 5, 2019 at 6:33 PM Terry Wang wrote:
>> 
>> > Hi Bowen:
>> > Thanks for your feedback.
>> > Your opinion convinced me, and I just removed the section about the catalog
>> > create statement and also removed the `DBPROPERTIES`/`PROPERTIES` keywords
>> > from the alter DDLs.
>> > Open to more comments or votes :) !
>> >
>> > Best,
>> > Terry Wang
>> >
>> >
>> >
>> >> On Nov 6, 2019, at 07:22, Bowen Li wrote:
>> >>
>> >> Hi Terry,
>> >>
>> >> I went over the FLIP in detail again. The FLIP mostly LGTM. A couple of
>> >> issues:
>> >>
>> >> - since we don't plan to support catalog DDL, can you remove it from the
>> >> FLIP?
>> >> - I found there are some discrepancies in the proposed database and table
>> >> DDLs.
>> >> For db DDL, the create db syntax proposes specifying k-v properties
>> >> following "WITH". However, the alter db DDL comes with the keyword
>> >> "DBPROPERTIES":
>> >>
>> >> CREATE DATABASE [ IF NOT EXISTS ] [catalogName.]dataBaseName
>> >>   [ COMMENT database_comment ]
>> >>   [*WITH *( name=value [, name=value]*)]
>> >>
>> >> ALTER DATABASE [catalogName.]dataBaseName SET *DBPROPERTIES* (
>> >>   name=value [, name=value]*)
>> >>
>> >>
>> >> IIUIC, are you borrowing syntax from Hive? Note that Hive's db create
>> >> ddl comes with "DBPROPERTIES" though - "CREATE (DATABASE|SCHEMA) [IF NOT
>> >> EXISTS] database_name ...  [*WITH DBPROPERTIES* (k=v, ...)];" [1]
>> >> EXISTS] database_name ...  [*WITH DBPROPERTIES* (k=v, ...)];" [1]
>> >>
>> >> The same applies to table ddl. The proposed alter table ddl comes with
>> >> "SET *PROPERTIES* (...)"; however, Flink's existing table create ddl since
>> >> 1.9 [2] doesn't have a "PROPERTIES" keyword. This is as opposed to Hive's
>> >> syntax, where both the create and alter table ddls come with
>> >> "TBLPROPERTIES" [1].
>> >>
>> >> I feel it's better to be consistent among our DDLs. One option is to
>> >> just remove the "PROPERTIES" and "DBPROPERTIES" keywords in the proposed
>> >> syntax, as sketched after the reference links below.
>> >>
>> >> [1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
>> >> [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html#specifying-a-ddl
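For concreteness, a sketch of the harmonized pair this suggests (the WITH-style
form, with the PROPERTIES/DBPROPERTIES keywords dropped); it mirrors the
proposed syntax quoted above and was not implemented behavior at the time of
this thread:

    CREATE DATABASE [ IF NOT EXISTS ] [catalogName.]dataBaseName
      [ COMMENT database_comment ]
      [ WITH ( name=value [, name=value]*) ]

    ALTER DATABASE [catalogName.]dataBaseName SET ( name=value [, name=value]*)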
>> >>
>> >> On Tue, Nov 5, 2019 at 12:54 PM Peter Huang wrote:
>> >>
>> >>> +1 for the enhancement.
>> >>>
>> >>> On Tue, Nov 5, 2019 at 11:04 AM Xuefu Z wrote:
>> >>>
>>  +1 to the long missing feature in Flink SQL.
>> 
>>  On Tue, Nov 5, 2019 at 6:32 AM Terry Wang wrote:
>> 
>> > Hi all,
>> >
>> > I would like to start the vote for FLIP-69 [1], which has been discussed
>> > and reached consensus in the discussion thread [2].
>> >
>> > The vote will be open for at least 72 hours. I'll try to close it by

[jira] [Created] (FLINK-14668) LocalExecutor#getOrCreateExecutionContext not working as expected

2019-11-07 Thread zhangwei (Jira)
zhangwei created FLINK-14668:


 Summary: LocalExecutor#getOrCreateExecutionContext not working as 
expected
 Key: FLINK-14668
 URL: https://issues.apache.org/jira/browse/FLINK-14668
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.9.1, 1.9.0
Reporter: zhangwei
 Fix For: 1.11.0


```ExecutionContext``` [copies](https://github.com/apache/flink/blob/b0a9afdd24fb70131b1e80d46d0ca101235a4a36/flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/local/ExecutionContext.java#L134)
the given ```SessionContext``` into a new instance in the constructor, but
```SessionContext``` members do not always override ```equals```, so
```executionContext.getSessionContext().equals(session)``` is always false.
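A minimal, self-contained illustration of the reported pattern (hypothetical
simplified class, not the actual Flink source): because the constructor stores
a copy, a cache lookup that relies on ```equals``` only matches if
```SessionContext``` compares by content.

{code}
import java.util.Objects;

// Hypothetical simplified class illustrating the report, not Flink source.
class SessionContext {
    private final String sessionName;

    SessionContext(String sessionName) { this.sessionName = sessionName; }

    // The copy constructor produces a distinct instance ...
    SessionContext(SessionContext other) { this.sessionName = other.sessionName; }

    // ... so without a content-based equals()/hashCode() like this, the copied
    // instance never equals the original, and getOrCreateExecutionContext
    // always misses its cache and rebuilds the ExecutionContext.
    @Override
    public boolean equals(Object o) {
        return o instanceof SessionContext
                && Objects.equals(sessionName, ((SessionContext) o).sessionName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(sessionName);
    }
}
{code}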





Re: Flink savepoint (checkpoint) load API or debug

2019-11-07 Thread qq
Hi all,

   Thanks very much. I want to debug checkpoints with code. Below is my code.
Anyway, I am sorry, I don't understand the UT class approach.
def demo(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  env.setParallelism(1)
  env.enableCheckpointing(1)
  val checkpointConfig = env.getCheckpointConfig
  checkpointConfig.setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE)
  checkpointConfig.setMinPauseBetweenCheckpoints(5000)
  checkpointConfig.setCheckpointTimeout(5000)
  checkpointConfig.setMaxConcurrentCheckpoints(1)
  checkpointConfig.enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)
  val fsStateBackend: StateBackend = new FsStateBackend(STATE_BACKEND)
  env.setStateBackend(fsStateBackend)
  env.setRestartStrategy(RestartStrategies.fixedDelayRestart(2, 3))

  //TODO recovery my checkpoint here or run this  job from my checkpoint
  // how to run this job with checkpoint metadata ? use CheckpointCoordinator ??
  val dataStream: DataStream[String] = env.addSource(streamSource).name("mysource")
  dataStream.addSink(new MySQLSink).uid("tesCheckpoint").name("mysink")
  env.execute()
}
MySQLSink:
class MySQLSink extends RichSinkFunction[String] with CheckpointedFunction {

  private val bufferSize = 50
  private var count: AtomicInteger = _
  private var cacheData: ListBuffer[String] = ListBuffer[String]()
  private var checkpointedState: ListState[(String, ListBuffer[String])] = _

  override def open(parameters: Configuration): Unit = {
count = new AtomicInteger(0)
  }

  override def invoke(jsonData: String, context: SinkFunction.Context[_]): Unit = {
val flag = count.getAndIncrement()
val end: Long = System.currentTimeMillis()
val result = jsonData.substring(0,jsonData.length-1) + ",\"fend\":"+end+"}";
if (flag >= bufferSize) {
  cacheData += result
  saveDataList()
  cacheData.clear()
  count.set(1)
} else {
  cacheData += result
}
  }

  def saveDataList(): Unit = {

  }

  override def close(): Unit = {
super.close()
  }

  override def snapshotState(context: FunctionSnapshotContext): Unit = {
checkpointedState.clear()
    val buffer = ListBuffer[(String, ListBuffer[String])](("nlcpTestData", cacheData))
checkpointedState.addAll(buffer.toList.asJava)
  }

  override def initializeState(context: FunctionInitializationContext): Unit = {
    val listStateDesc = new ListStateDescriptor[(String, ListBuffer[String])](
      "nlcpTestData", TypeInformation.of(new TypeHint[(String, ListBuffer[String])]() {}))
val stateStore: OperatorStateStore = context.getOperatorStateStore
checkpointedState = stateStore.getListState(listStateDesc)
if (context.isRestored) {
  val data = checkpointedState.get().iterator()
  while (data.hasNext) {
cacheData ++= data.next()._2
  }
}
  }

}
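A minimal Java sketch of reading the retained checkpoint back with the State
Processor API mentioned in the quoted replies below (untested against this
exact job; the paths are placeholders, and the element type is simplified to
String, whereas the Scala tuple state above would need a matching
TypeInformation):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;

public class InspectCheckpoint {
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Point this at the directory containing the checkpoint's _metadata file.
    ExistingSavepoint savepoint = Savepoint.load(
        env,
        "hdfs:///path/to/checkpoints/chk-42",
        new FsStateBackend("hdfs:///path/to/checkpoints"));
    // uid and state name must match the job above: .uid("tesCheckpoint") and
    // the "nlcpTestData" ListStateDescriptor. NOTE: the element type must
    // match what the sink actually stored; String keeps the sketch short.
    DataSet<String> state =
        savepoint.readListState("tesCheckpoint", "nlcpTestData", Types.STRING);
    state.print();
  }
}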


> On Nov 7, 2019, at 12:03, Congxian Qiu wrote:
> 
> Hi,
> If you just want to debug, maybe you can do this in a UT class in the
> flink-runtime module :) so that you do not need to handle the dependency
> problem and the access problem.
> 
> Best,
> Congxian
> 
> 
> On Wed, Nov 6, 2019 at 3:39 PM, Jark Wu wrote:
> 
>> Btw, user questions should be asked in user@f.a.o or user-zh@f.a.o. The dev
>> ML is mainly used to discuss development.
>> 
>> Best,
>> Jark
>> 
>> On Wed, 6 Nov 2019 at 15:36, Jark Wu  wrote:
>> 
>>> Hi,
>>> 
>>> Savepoint.load(env, path) is in the State Processor API library; you should
>>> add the following dependency to your project:
>>> 
>>> <dependency>
>>>   <groupId>org.apache.flink</groupId>
>>>   <artifactId>flink-state-processor-api_2.11</artifactId>
>>>   <version>1.9.1</version>
>>> </dependency>
>>> 
>>> 
>>> You can see the documentation for more detailed instructions [1].
>>> 
>>> Best,
>>> Jark
>>> 
>>> [1]:
>>> 
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/libs/state_processor_api.html
>>> 
>>> On Wed, 6 Nov 2019 at 09:21, qq <471237...@qq.com> wrote:
>>> 
 Hi all,
   I want to load checkpoint or savepoint metadata on dev. In this case,
 I want to debug saved checkpoint metadata. I know Flink provides an
 API, Savepoint.load(env, path), but I can’t find it and can’t use it.
 Does anyone know about this? Could you help me? Thanks very much;
 
 
>> 
> 



[jira] [Created] (FLINK-14669) All hadoop-2.4.1 related nightly end-to-end tests failed on travis

2019-11-07 Thread Yu Li (Jira)
Yu Li created FLINK-14669:
-

 Summary: All hadoop-2.4.1 related nightly end-to-end tests failed 
on travis
 Key: FLINK-14669
 URL: https://issues.apache.org/jira/browse/FLINK-14669
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.10.0
Reporter: Yu Li
 Attachments: image-2019-11-08-15-02-31-268.png

As titled, all hadoop 2.4.1 tests failed in build 
https://travis-ci.org/apache/flink/builds/608709634
 !image-2019-11-08-15-02-31-268.png! 

From the log, it seems to have timed out when downloading dependencies:
{noformat}
/home/travis/flink_cache/40913/flink/docs/concepts/runtime.zh.md
/home/travis/flink_cache/40913/flink/docs/_config_dev_en.yml
/home/travis/flink_cache/40913/\n...
changes detected, packing new archive
uploading 
master/cache--linux-xenial-98cdbf5c3ae4db3a919a6366e5a96e33ecd8f19b8853a50e8f545bd43fc8164e--jdk-openjdk8.tgz
cache uploaded
travis_time:end:1b949da8:start=1573163710718943370,finish=1573163798721501362,duration=88002557992,event=cache
travis_fold:end:cache.2


Done. Your build exited with 1.
{noformat}
https://api.travis-ci.org/v3/job/608709640/log.txt


