[jira] [Created] (FLINK-16794) ClassNotFoundException caused by inappropriate use of ClassLoader.getSystemClassLoader

2020-03-26 Thread victor.jiang (Jira)
victor.jiang created FLINK-16794:


 Summary: ClassNotFoundException caused by inappropriate use of 
ClassLoader.getSystemClassLoader
 Key: FLINK-16794
 URL: https://issues.apache.org/jira/browse/FLINK-16794
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission, Runtime / REST
Affects Versions: 1.8.3, 1.8.2, 1.8.1, 1.8.0
Reporter: victor.jiang


In some containerized environments, the context classloader is not the
system classloader; a custom classloader is usually used for class isolation,
so a ClassNotFoundException may be thrown. It is recommended to use the
classloader of getClass(), of the caller, or of the current thread context instead.

The related sources are listed below:

1. flink-clients\src\main\java\org\apache\flink\client\program\ClusterClient.java (690,33):
   return getAccumulators(jobID, ClassLoader.getSystemClassLoader());
2. flink-clients\src\main\java\org\apache\flink\client\program\MiniClusterClient.java (148,33):
   return getAccumulators(jobID, ClassLoader.getSystemClassLoader());
3. flink-runtime\src\main\java\org\apache\flink\runtime\blob\BlobUtils.java (348,66):
   return (Throwable) InstantiationUtil.deserializeObject(bytes, ClassLoader.getSystemClassLoader());
4. flink-runtime\src\main\java\org\apache\flink\runtime\rest\messages\json\SerializedThrowableDeserializer.java (52,68):
   return InstantiationUtil.deserializeObject(serializedException, ClassLoader.getSystemClassLoader());
5. flink-runtime\src\main\java\org\apache\flink\runtime\rpc\messages\RemoteRpcInvocation.java (118,67):
   methodInvocation = serializedMethodInvocation.deserializeValue(ClassLoader.getSystemClassLoader());
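
For illustration, a minimal sketch of the suggested direction for the deserialization call sites above: prefer the current thread's context classloader (falling back to the class's own loader) over the system classloader. The helper class below is illustrative only, not existing Flink code.

{code:java}
import org.apache.flink.util.InstantiationUtil;

import java.io.IOException;

public final class ClassLoaderResolution {

    /**
     * Prefer the context classloader of the current thread; fall back to the
     * loader that loaded this class instead of ClassLoader.getSystemClassLoader().
     */
    private static ClassLoader resolveClassLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null ? cl : ClassLoaderResolution.class.getClassLoader();
    }

    /** Example: deserializing a Throwable as BlobUtils does, but with the resolved loader. */
    static Throwable deserializeThrowable(byte[] bytes) throws IOException, ClassNotFoundException {
        return (Throwable) InstantiationUtil.deserializeObject(bytes, resolveClassLoader());
    }
}
{code}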



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16795) End to end tests timeout on Azure

2020-03-26 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-16795:
--

 Summary: End to end tests timeout on Azure
 Key: FLINK-16795
 URL: https://issues.apache.org/jira/browse/FLINK-16795
 Project: Flink
  Issue Type: Task
  Components: Build System / Azure Pipelines, Tests
Affects Versions: 1.11.0
Reporter: Robert Metzger
Assignee: Robert Metzger


Example: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6650&view=logs&j=08866332-78f7-59e4-4f7e-49a56faa3179
 or 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6637&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5

{code}
##[error]The job running on agent Azure Pipelines 6 ran longer than the
maximum time of 200 minutes. For more information, see
https://go.microsoft.com/fwlink/?linkid=2077134
{code}
and
{code}
##[error]The operation was canceled.
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16796) Fix The Bug of Python UDTF in SQL Query

2020-03-26 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-16796:


 Summary: Fix The Bug of Python UDTF in SQL Query
 Key: FLINK-16796
 URL: https://issues.apache.org/jira/browse/FLINK-16796
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Huang Xingbo
 Fix For: 1.11.0


Executing a Python UDTF in a SQL query currently causes problems.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16798) Logs from BashJavaUtils are not properly preserved and passed into TM logs.

2020-03-26 Thread Xintong Song (Jira)
Xintong Song created FLINK-16798:


 Summary: Logs from BashJavaUtils are not properly preserved and 
passed into TM logs.
 Key: FLINK-16798
 URL: https://issues.apache.org/jira/browse/FLINK-16798
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Scripts
Affects Versions: 1.11.0
Reporter: Xintong Song
 Fix For: 1.11.0


With FLINK-15519, the TM start-up scripts capture the logs from 
{{BashJavaUtils}} and pass them into the TM JVM process via an environment 
variable. These logs are then merged with the other TM logs and written to the 
same places, respecting the user's log configuration.

This effort was broken by FLINK-15727, where the outputs from {{BashJavaUtils}} 
are thrown away, except for the resulting JVM parameters and dynamic 
configurations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16797) Flink doesn't merge multiple sinks into one DAG

2020-03-26 Thread Jeff Zhang (Jira)
Jeff Zhang created FLINK-16797:
--

 Summary: Flink doesn't merge multiple sinks into one DAG
 Key: FLINK-16797
 URL: https://issues.apache.org/jira/browse/FLINK-16797
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: Jeff Zhang


Here is the SQL I used.
{code:java}
insert into sink_kafka select status, direction, cast(event_ts/10 as 
timestamp(3)) from source_kafka where status <> 'foo';
insert into sink_kafka2 select status, direction, cast(event_ts/10 as 
timestamp(3)) from source_kafka where status <> 'foo';
 {code}
Ideally Flink should run these two SQL statements as one DAG with two sinks, but 
Flink does not merge them into one DAG.

!https://cdn-images-1.medium.com/max/1760/1*mFu6OZivrfGUgu1UVCcy6A.png!
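
For reference, a minimal sketch of how the two statements would be submitted through the Table API in 1.10 (assuming the three tables are already registered; the reporter may have used a different entry point such as the SQL client):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MultiSinkSketch {
    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // Both INSERT statements are buffered by the TableEnvironment ...
        tEnv.sqlUpdate("INSERT INTO sink_kafka SELECT status, direction, "
                + "CAST(event_ts/10 AS TIMESTAMP(3)) FROM source_kafka WHERE status <> 'foo'");
        tEnv.sqlUpdate("INSERT INTO sink_kafka2 SELECT status, direction, "
                + "CAST(event_ts/10 AS TIMESTAMP(3)) FROM source_kafka WHERE status <> 'foo'");

        // ... and translated together here. Ideally this yields one DAG that shares
        // the source scan across both sinks; per this issue, 1.10 does not merge them.
        tEnv.execute("multi-sink job");
    }
}
{code}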



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16799) Add a Hive partition limit when reading from Hive

2020-03-26 Thread Jun Zhang (Jira)
Jun Zhang created FLINK-16799:
-

 Summary: Add a Hive partition limit when reading from Hive
 Key: FLINK-16799
 URL: https://issues.apache.org/jira/browse/FLINK-16799
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Affects Versions: 1.10.0
Reporter: Jun Zhang
 Fix For: 1.11.0


Add a partition limit when reading from Hive: a query will not be executed if it 
attempts to fetch more partitions per table than the configured limit. This 
avoids accidental full table scans.
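
A rough sketch of the intended guard (the option key, class name and exception below are illustrative assumptions, not the actual implementation):

{code:java}
import java.util.List;

final class PartitionLimitGuard {

    /** Hypothetical option key; the real key and default would be defined by the Hive connector. */
    static final String MAX_PARTITIONS_KEY = "table.exec.hive.max-scanned-partitions";

    /** Fail the query before execution if it would scan more partitions than allowed. */
    static void checkPartitionLimit(String table, List<String> matchedPartitions, int limit) {
        if (limit > 0 && matchedPartitions.size() > limit) {
            throw new IllegalStateException(String.format(
                    "Query on table %s would scan %d partitions, exceeding the configured limit %d (%s).",
                    table, matchedPartitions.size(), limit, MAX_PARTITIONS_KEY));
        }
    }
}
{code}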



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16800) TypeMappingUtils#checkPhysicalLogicalTypeCompatible didn't deal with nested types

2020-03-26 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-16800:


 Summary: TypeMappingUtils#checkPhysicalLogicalTypeCompatible 
didn't deal with nested types
 Key: FLINK-16800
 URL: https://issues.apache.org/jira/browse/FLINK-16800
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.10.0
Reporter: Zhenghua Gao
 Fix For: 1.11.0


The planner uses TypeMappingUtils#checkPhysicalLogicalTypeCompatible to 
validate that the logical schema and the physical schema are compatible when 
translating a CatalogSinkModifyOperation into a Calcite relational expression. 
The validation does not handle nested types well, which can surface the 
following ValidationException:
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException:
Type ARRAY> of table field 'old'
does not match with the physical type ARRAY> of the 'old' field of the TableSource return
type.
at org.apache.flink.table.utils.TypeMappingUtils.lambda$checkPhysicalLogicalTypeCompatible$4(TypeMappingUtils.java:164)
at org.apache.flink.table.utils.TypeMappingUtils$1.defaultMethod(TypeMappingUtils.java:277)
at org.apache.flink.table.utils.TypeMappingUtils$1.defaultMethod(TypeMappingUtils.java:254)
at org.apache.flink.table.types.logical.utils.LogicalTypeDefaultVisitor.visit(LogicalTypeDefaultVisitor.java:157)
at org.apache.flink.table.types.logical.ArrayType.accept(ArrayType.java:110)
at org.apache.flink.table.utils.TypeMappingUtils.checkIfCompatible(TypeMappingUtils.java:254)
at org.apache.flink.table.utils.TypeMappingUtils.checkPhysicalLogicalTypeCompatible(TypeMappingUtils.java:160)
at org.apache.flink.table.utils.TypeMappingUtils.lambda$computeInCompositeType$8(TypeMappingUtils.java:232)
at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.apache.flink.table.utils.TypeMappingUtils.computeInCompositeType(TypeMappingUtils.java:214)
at org.apache.flink.table.utils.TypeMappingUtils.computePhysicalIndices(TypeMappingUtils.java:192)
at org.apache.flink.table.utils.TypeMappingUtils.computePhysicalIndicesOrTimeAttributeMarkers(TypeMappingUtils.java:112)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.computeIndexMapping(StreamExecTableSourceScan.scala:212)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:107)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:62)
at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlan(StreamExecTableSourceScan.scala:62)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlanInternal(StreamExecExchange.scala:84)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlanInternal(StreamExecExchange.scala:44)
at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlan(StreamExecExchange.scala:44)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLimit.translateToPlanInternal(StreamExecLimit.scala:161)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLimit.translateToPlanInternal(StreamExecLimit.scala:51)
at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLimit.translateToPlan(StreamExecLimit.scala:51)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToTransformation(StreamExecSink.scala:184)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:153)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:48)
at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58)
at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlan(StreamExecSink.scala:48)
at org.apache.flink.table.

[jira] [Created] (FLINK-16801) PostgresCatalogITCase fails with IOException

2020-03-26 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-16801:
--

 Summary: PostgresCatalogITCase fails with IOException
 Key: FLINK-16801
 URL: https://issues.apache.org/jira/browse/FLINK-16801
 Project: Flink
  Issue Type: Task
  Components: Connectors / JDBC
Affects Versions: 1.11.0
Reporter: Robert Metzger


CI: 
https://travis-ci.org/github/apache/flink/jobs/666922577?utm_medium=notification&utm_source=slack

{code}
07:03:47.913 [INFO] Running 
org.apache.flink.api.java.io.jdbc.JDBCTableSourceITCase
07:03:50.588 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time 
elapsed: 16.693 s <<< FAILURE! - in 
org.apache.flink.api.java.io.jdbc.catalog.PostgresCatalogITCase
07:03:50.595 [ERROR] 
org.apache.flink.api.java.io.jdbc.catalog.PostgresCatalogITCase  Time elapsed: 
16.693 s  <<< ERROR!
java.io.IOException: Gave up waiting for server to start after 1ms
Caused by: java.sql.SQLException: connect failed
Caused by: java.net.ConnectException: Connection refused (Connection refused)
Thu Mar 26 07:03:50 UTC 2020 Thread[main,5,main] 
java.lang.NoSuchFieldException: DEV_NULL

Thu Mar 26 07:03:51 UTC 2020:
Booting Derby version The Apache Software Foundation - Apache Derby - 10.14.2.0 
- (1828579): instance a816c00e-0171-15a7-7fa7-0c06c410 
on database directory 
memory:/home/travis/build/apache/flink/flink-connectors/flink-jdbc/target/test 
with class loader sun.misc.Launcher$AppClassLoader@677327b6 
Loaded from 
file:/home/travis/.m2/repository/org/apache/derby/derby/10.14.2.0/derby-10.14.2.0.jar
java.vendor=Private Build
java.runtime.version=1.8.0_242-8u242-b08-0ubuntu3~16.04-b08
user.dir=/home/travis/build/apache/flink/flink-connectors/flink-jdbc/target
os.name=Linux
os.arch=amd64
os.version=4.15.0-1055-gcp
derby.system.home=null
derby.stream.error.field=org.apache.flink.api.java.io.jdbc.JDBCTestBase.DEV_NULL
Database Class Loader started - derby.database.classpath=''
07:03:51.916 [INFO] Running 
org.apache.flink.api.java.io.jdbc.JDBCLookupFunctionITCase
07:03:59.956 [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time 
elapsed: 12.041 s - in org.apache.flink.api.java.io.jdbc.JDBCTableSourceITCase
07:04:04.193 [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time 
elapsed: 12.275 s - in 
org.apache.flink.api.java.io.jdbc.JDBCLookupFunctionITCase
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16802) Set schema info in JobConf for Hive readers

2020-03-26 Thread Rui Li (Jira)
Rui Li created FLINK-16802:
--

 Summary: Set schema info in JobConf for Hive readers
 Key: FLINK-16802
 URL: https://issues.apache.org/jira/browse/FLINK-16802
 Project: Flink
  Issue Type: Task
  Components: Connectors / Hive
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-84 Feedback Summary

2020-03-26 Thread Timo Walther

Hi Godfrey,

having control over the job after submission is a requirement that has been 
requested frequently (some examples are [1], [2]). Users would like to 
get insights into the running or completed job. The JobClient summarizes these 
properties, including the jobId, jobGraph, etc.


It is good to have a discussion about synchronous/asynchronous 
submission now to have a complete execution picture.


I thought we submit streaming queries mostly asynchronously and just wait for 
the successful submission. If we block for streaming queries, how can we 
collect() or print() results?


Also, if we block for streaming queries, we could never support multiline 
files, because the first INSERT INTO would block any further execution.


If we decide to block entirely on streaming queries, we need the async 
execution methods in the design already. However, I would rather go for 
non-blocking streaming queries, also with the `EMIT STREAM` keyword in 
mind that we might add to SQL statements soon.


Regards,
Timo

[1] https://issues.apache.org/jira/browse/FLINK-16761
[2] https://issues.apache.org/jira/browse/FLINK-12214

On 25.03.20 16:30, godfrey he wrote:

Hi Timo,

Thanks for the update.

Regarding "multiline statement support": I'm also fine with
`TableEnvironment.executeSql()` only supporting single statements, and we
can support multiline statements later (this needs more discussion).

Regarding "StatementSet.explain()": I don't have strong opinions about
that.

Regarding "TableResult.getJobClient()": I think it's unnecessary. The
reason is: first, many statements (e.g. DDL, SHOW xx, USE xx) will not
submit a Flink job. Second, `TableEnvironment.executeSql()` and
`StatementSet.execute()` are synchronous methods, so `TableResult` will be
returned only after the job has finished or failed.

Regarding "whether StatementSet.execute() needs to throw exceptions": I
think we should choose a unified way to tell whether the execution was
successful. If `TableResult` contains an ERROR kind (for non-runtime
exceptions), users need to not only check the result but also catch runtime
exceptions in their code. Alternatively, `StatementSet.execute()` could throw
no exception at all (including runtime exceptions) and put all exception
messages in the result. I prefer "StatementSet.execute() needs to throw
exceptions". cc @Jark Wu

I will update the agreed parts to the document first.

Best,
Godfrey


Timo Walther  wrote on Wed, Mar 25, 2020 at 6:51 PM:


Hi Godfrey,

thanks for starting the discussion on the mailing list. And sorry again
for the late reply to FLIP-84. I have updated the Google doc one more
time to incorporate the offline discussions.

  From Dawid's and my view, it is fine to postpone the multiline support
to a separate method. This can be future work even though we will need
it rather soon.

If there are no objections, I suggest updating FLIP-84 again and
starting another voting process.

Thanks,
Timo


On 25.03.20 11:17, godfrey he wrote:

Hi community,
Timo, Fabian, and Dawid have some feedback about FLIP-84 [1]. The feedback
is all about the newly introduced methods. We had a discussion yesterday, and
most of the feedback has been agreed upon. Here are the conclusions:

*1. about proposed methods in `TableEnvironment`:*

the original proposed methods:

TableEnvironment.createDmlBatch(): DmlBatch
TableEnvironment.executeStatement(String statement): ResultTable

the new proposed methods:

// we should not use abbreviations in the API, and the term "Batch" is
easily confused with batch/streaming processing
TableEnvironment.createStatementSet(): StatementSet

// every method that takes SQL should have `Sql` in its name
// supports multiline statement ???
TableEnvironment.executeSql(String statement): TableResult

// new methods. supports explaining DQL and DML
TableEnvironment.explainSql(String statement, ExplainDetail... details):
String


*2. about proposed related classes:*

the original proposed classes:

interface DmlBatch {
  void addInsert(String insert);
  void addInsert(String targetPath, Table table);
  ResultTable execute() throws Exception ;
  String explain(boolean extended);
}

public interface ResultTable {
  TableSchema getResultSchema();
  Iterable getResultRows();
}

the new proposed classes:

interface StatementSet {
  // every method that takes SQL should have `Sql` in its name
  // return StatementSet instance for fluent programming
  addInsertSql(String statement): StatementSet

  // return StatementSet instance for fluent programming
  addInsert(String tablePath, Table table): StatementSet

  // new method. support overwrite mode
  addInsert(String tablePath, Table table, boolean overwrite):
StatementSet

  explain(): String

  // new method. supports adding more details for the result
  explain(ExplainDetail... extraDetails): String

  // throw exception ???
  execute(): TableResult
}

interface TableResult {
  getTableSchema(): TableSchema


[jira] [Created] (FLINK-16803) Need to make sure partition inherit table spec when writing to Hive tables

2020-03-26 Thread Rui Li (Jira)
Rui Li created FLINK-16803:
--

 Summary: Need to make sure partition inherit table spec when 
writing to Hive tables
 Key: FLINK-16803
 URL: https://issues.apache.org/jira/browse/FLINK-16803
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Timo Walther

Thanks for the update Danny. +1 for this proposal.

Regards,
Timo

On 26.03.20 04:51, Danny Chan wrote:

Thanks everyone who engaged in this discussion ~

Our goal is "Supports Dynamic Table Options for Flink SQL". After an
offline discussion with Kurt, Timo and Dawid, we have made the final
conclusion, here is the summary:


- Use comment-style syntax to specify the dynamic table options: "/*+
*OPTIONS*(k1='v1', k2='v2') */" (a usage sketch follows below)
- Constrain the option keys: options that may bring in security problems
should not be allowed, e.g. the Kafka connector's zookeeper endpoint URL and
topic name
- Use a white-list to control the allowed options for each connector, which
is safer for future extension
- Allow enabling/disabling this feature globally
- Implement based on the current code base first, and when FLIP-95 is
checked in, re-implement this feature based on the new interfaces
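
As referenced in the first point above, a minimal sketch of how the proposed hint
syntax might be used from the Table API (the table name and option key below are
placeholders, and the hint is only accepted once this feature is implemented and
enabled):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class DynamicOptionsHintSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // kafka_table must be registered beforehand; the OPTIONS hint overrides
        // one of its table options just for this query.
        Table result = tEnv.sqlQuery(
                "SELECT id, name FROM kafka_table "
                + "/*+ OPTIONS('connector.startup-mode'='earliest-offset') */");
        result.printSchema();
    }
}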

Any suggestions are appreciated ~

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL

Best,
Danny Chan

Jark Wu  wrote on Wed, Mar 18, 2020 at 10:38 PM:


Hi everyone,

Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid it
doesn't solve the problems but increases some development and learning
burdens.

# increase development and learning burden

According to the discussion so far, we want to support overriding a subset
of options in hints which doesn't affect semantics.
With `supportedHintOptions`, it's up to the connector developers to
decide which options will not affect semantics and can be hint options.
However, the question is how to distinguish whether an option will *affect
semantics*. What happens if an option affects semantics but is provided as a
hint option?
 From my point of view, it's not easy to distinguish. For example,
"format.ignore-parse-error" can be a very useful dynamic option, but it
affects semantics, because the result is different (null vs exception).
As another example, the "connector.lookup.cache.*" options are also very
useful for tuning jobs, but they also affect the job results. I can
come up with many more useful options that may affect semantics.

I can see the community ending up in endless discussions around "can this
option be a hint option?" and "will this option affect semantics?".
You can also see that we already have different opinions on
"ignore-parse-error". Those discussions are a waste of time! That's not what
users want!
The problem is that users need these options and HOW do we expose them?
We should focus on that.

Then there could be two endings in the future:
1) we compromise on usability, drop the rule that hints don't affect
semantics, and allow all the useful options in the hints list.
2) we stick to the rule, and users will find this a stumbling feature which
doesn't solve their problems.
 And they will be surprised why this option can't be set, but another
one can. *semantics* is hard for users to understand.

# doesn't solve the problems

I think the purpose of this FLIP is to allow users to quickly override some
connectors' properties to tune their jobs.
However, `supportedHintOptions` is off track. It only allows a subset of
options, and for users it's not *clear* which subset is allowed.

Besides, I'm not sure `supportedHintOptions` can work well for all cases.
How could you support kafka properties (`connector.properties.*`) as hint
options? Some kafka properties may affect semantics (bootstrap.servers),
some may not (max.poll.records). Besides, I think it's not possible to list
all the possible kafka properties [1].

In summary, IMO, `supportedHintOptions`:
(1) increases the complexity of developing a connector,
(2) confuses users about which options can be used in hints and which cannot,
so they have to check the docs again and again, and
(3) doesn't solve the problems which we want to solve with this FLIP.

I think we should avoid introducing partial solutions. Otherwise, we
will be stuck in a loop of introduce new API -> deprecate API ->
introduce new API.

I am personally in favor of an explicit WITH syntax after the table as part
of the query, which was mentioned by Kurt before, e.g. SELECT * FROM T
WITH('key' = 'value').
It allows users to dynamically set options which can affect semantics. It
would be very flexible in solving users' problems so far.

Best,
Jark

[1]: https://kafka.apache.org/documentation/#consumerconfigs

On Wed, 18 Mar 2020 at 21:44, Danny Chan  wrote:


My POC for the hint options merge is here [1].

Personally, I have no strong objections to splitting the hints from the
CatalogTable; the only con is a more complex implementation, but the
concept is clearer, and I have updated the wiki.

I think it would be nice if we could support the format "ignore-parse-error"
option key. The CSV source already has such a key [2] and we can use that in
the supportedHintOptions; for the common CSV and JSON formats, we can also
provide support. This is the

Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Kurt Young
Hi Danny,

Thanks for the updates. I have 2 comments regarding the latest document:

1) I think we also need `*supportedHintOptions*` for  `*TableFormatFactory*`
2) IMO "dynamic-table-options.enabled" should belong to `
*OptimizerConfigOptions*`

Best,
Kurt


On Thu, Mar 26, 2020 at 4:40 PM Timo Walther  wrote:

> Thanks for the update Danny. +1 for this proposal.
>
> Regards,
> Timo
>
> On 26.03.20 04:51, Danny Chan wrote:
> > Thanks everyone who engaged in this discussion ~
> >
> > Our goal is "Supports Dynamic Table Options for Flink SQL". After an
> > offline discussion with Kurt, Timo and Dawid, we have made the final
> > conclusion, here is the summary:
> >
> >
> > - Use comment style syntax to specify the dynamic table options: "/*+
> > *OPTIONS*(k1='v1', k2='v2') */"
> > - Have constraint on the options keys: the options that may bring in
> > security problems should not be allowed, i.e. Kafka connector
> zookeeper
> > endpoint URL and topic name
> > - Use white-list to control the allowed options for each connector,
> > which is more safe for future extention
> > - We allow to enable/disable this feature globally
> > - Implement based on the current code base first, and when FLIP-95 is
> > checked in, implement this feature based on new interface
> >
> > Any suggestions are appreciated ~
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> >
> > Best,
> > Danny Chan
> >
> > Jark Wu  于2020年3月18日周三 下午10:38写道:
> >
> >> Hi everyone,
> >>
> >> Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid it
> >> doesn't solve the problems but increases some development and learning
> >> burdens.
> >>
> >> # increase development and learning burden
> >>
> >> According to the discussion so far, we want to support overriding a
> subset
> >> of options in hints which doesn't affect semantics.
> >> With the `supportedHintOptions`, it's up to the connector developers to
> >> decide which options will not affect semantics, and to be hint options.
> >> However, the question is how to distinguish whether an option will
> *affect
> >> semantics*? What happens if an option will affect semantics but
> provided as
> >> hint options?
> >>  From my point of view, it's not easy to distinguish. For example, the
> >> "format.ignore-parse-error" can be a very useful dynamic option but that
> >> will affect semantic, because the result is different (null vs
> exception).
> >> Another example, the "connector.lookup.cache.*" options are also very
> >> useful to tune jobs, however, it will also affect the job results. I can
> >> come up many more useful options but may affect semantics.
> >>
> >> I can see that the community will under endless discussion around "can
> this
> >> option to be a hint option?",  "wether this option will affect
> semantics?".
> >> You can also find that we already have different opinions on
> >> "ignore-parse-error". Those discussion is a waste of time! That's not
> what
> >> users want!
> >> The problem is user need this, this, this options and HOW to expose
> them?
> >> We should focus on that.
> >>
> >> Then there could be two endings in the future:
> >> 1) compromise on the usability, we drop the rule that hints don't affect
> >> semantics, allow all the useful options in the hints list.
> >> 2) stick on the rule, users will find this is a stumbling feature which
> >> doesn't solve their problems.
> >>  And they will be surprised why this option can't be set, but the
> other
> >> could. *semantic* is hard to be understood by users.
> >>
> >> # doesn't solve the problems
> >>
> >> I think the purpose of this FLIP is to allow users to quickly override
> some
> >> connectors' properties to tune their jobs.
> >> However, `supportedHintOptions` is off track. It only allows a subset
> >> options and for the users it's not *clear* which subset is allowed.
> >>
> >> Besides, I'm not sure `supportedHintOptions` can work well for all
> cases.
> >> How could you support kafka properties (`connector.properties.*`) as
> hint
> >> options? Some kafka properties may affect semantics (bootstrap.servers),
> >> some may not (max.poll.records). Besides, I think it's not possible to
> list
> >> all the possible kafka properties [1].
> >>
> >> In summary, IMO, `supportedHintOptions`
> >> (1) it increase the complexity to develop a connector
> >> (2) it confuses users which options can be used in hint, which are not,
> >> they have to check the docs again and again.
> >> (3) it doesn't solve the problems which we want to solve by this FLIP.
> >>
> >> I think we should avoid introducing some partial solutions. Otherwise,
> we
> >> will be stuck in a loop that introduce new API -> deprecate API ->
> >> introduce new API
> >>
> >> I personally in favor of an explicit WITH syntax after the table as a
> part
> >> of the query which is mentioned by Kurt before, e.g. SELECT * from T
> >> WITH('k

[jira] [Created] (FLINK-16804) Deprecate String based Expression DSL in Table

2020-03-26 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-16804:


 Summary: Deprecate String based Expression DSL in Table
 Key: FLINK-16804
 URL: https://issues.apache.org/jira/browse/FLINK-16804
 Project: Flink
  Issue Type: Sub-task
Reporter: Dawid Wysakowicz






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-26 Thread Yangze Guo
Thanks for the suggestion, @Stephan, @Becket and @Xintong.

I've updated the FLIP accordingly. I did not add a
ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
which takes responsibility for all relevant operations on both the RM
and TM sides.
After rethinking how to decouple the management of external resources
from the TaskExecutor, I think we can do the same thing on the
ResourceManager side. We do not need to add specific allocation
logic to the ResourceManager each time we add a specific external
resource.
- For Yarn, the ExternalResourceDriver edits the container request.
- For Kubernetes, the ExternalResourceDriver could provide a decorator for
the TM pod.

In this way, just like with MetricReporter, we allow users to define their own
custom ExternalResourceDriver. It is more extensible and fits the
separation of concerns. A rough interface sketch is given below; for more
details, please take a look at [1].
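
For illustration only, a rough sketch of what such a pluggable driver could look
like; the interface name comes from this thread, but the method names and
signatures below are assumptions rather than the FLIP-108 design:

import java.util.Set;

public interface ExternalResourceDriver {

    /** Marker for the information handed to operators (e.g. a GPU index). */
    interface ExternalResourceInfo {}

    /**
     * RM side: attach the external resource to the container/pod request,
     * e.g. edit the Yarn container request or decorate the K8s TM pod spec.
     */
    void applyToResourceRequest(Object requestBuilder, long amount);

    /**
     * TM side: discover which concrete resources were assigned to this
     * TaskExecutor and expose them to operators / user code.
     */
    Set<? extends ExternalResourceInfo> retrieveResourceInfo(long amount);
}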

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink

Best,
Yangze Guo

On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen  wrote:
>
> This sounds good to go ahead from my side.
>
> I like the approach that Becket suggested - in that case the core
> abstraction that everyone would need to understand would be "external
> resource allocation" and the "ResourceInfoProvider", and the GPU specific
> code would be a specific implementation only known to that component that
> allocates the external resource. That fits the separation of concerns well.
>
> I also understand that it should not be over-engineered in the first
> version, so some simplification makes sense, and then gradually expand from
> there.
>
> So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> side.
>
> On Mon, Mar 23, 2020 at 6:55 AM Xintong Song  wrote:
>
> > Thanks for the comments, Stephan & Becket.
> >
> > @Stephan
> >
> > I see your concern, and I completely agree with you that we should first
> > think about the "library" / "plugin" / "extension" style if possible.
> >
> > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > although it looks that it would belong to the slot then. Is that what we
> > > are doing here?
> >
> >
> > In the current proposal, we do not have the GPUs sliced and assigned to
> > slots, because it could be problematic without dynamic slot allocation.
> > E.g., the number of GPUs might not be evenly divisible by the number of
> > slots.
> >
> > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > then, we might still need a TM level GPUManager (or ResourceProvider like
> > Becket suggested). For memory, in each slot we can simply request the
> > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > should be assigned. For GPU, and potentially other resources like FPGA, we
> > need to explicitly specify which GPU (index) should be used. Therefore, we
> > need some component at the TM level to coordinate which slot uses which
> > GPU.
> >
> > IMO, unless we say Flink will not support slot-level GPU slicing at least
> > in the foreseeable future, I don't see a good way to avoid touching the TM
> > core. To that end, I think Becket's suggestion points to a good direction,
> > that supports more features (GPU, FPGA, etc.) with less coupling to the TM
> > core (only needs to understand the general interfaces). The detailed
> > implementation for specific resource types can even be encapsulated as a
> > library.
> >
> > @Becket
> >
> > Thanks for sharing your thought on the final state. Despite the details how
> > the interfaces should look like, I think this is a really good abstraction
> > for supporting general resource types.
> >
> > I'd like to further clarify that, the following three things are all that
> > the "Flink core" needs to understand.
> >
> >- The *amount* of resource, for scheduling. Actually, we already have
> >the Resource class in ResourceProfile and ResourceSpec for extended
> >resource. It's just not really used.
> >- The *info*, that Flink provides to the operators / user codes.
> >- The *provider*, which generates the info based on the amount.
> >
> > The "core" does not need to understand the specific implementation details
> > of the above three. They can even be implemented in a 3rd-party library.
> > Similar to how we allow users to define their custom MetricReporter.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin  wrote:
> >
> > > Thanks for the comment, Stephan.
> > >
> > >   - If everything becomes a "core feature", it will make the project hard
> > > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > style
> > > > where possible helps.
> > >
> > >
> > > Completely agree. It is much more important to design a mechanism than
> > > focusing on a specific case. Here is what I am thinking to fully support
> > > custom resource management:
> > > 1. On the JM / RM side, use Resource

Re: [DISCUSS] FLIP-95: New TableSource and TableSink interfaces

2020-03-26 Thread Timo Walther

Hi Becket,

Regarding "PushDown/NestedPushDown which is internal to optimizer":

Those concepts cannot be entirely internal to the optimizer; at some 
point the optimizer needs to pass them into the connector-specific code. 
This code will then convert them to, e.g., Parquet expressions. So there must 
be some interface that takes a SQL expression and converts it to 
connector-specific code. This interface between planner and connector is 
modelled by the SupportsXXX interfaces. And you are right: if developers don't 
care, they don't need to implement those optional interfaces, but they will 
not get performant connectors.


Regarding "Table connector can work with the above two mechanism":

A table connector needs three mechanisms that are represented in the 
current design (a rough sketch follows the list below).


1. a stateless discovery interface (Factory) that can convert 
ConfigOptions to a stateful factory interface 
(DynamicTableSource/DynamicTableSink)


2. a stateful factory interface (DynamicTableSource/DynamicTableSink) 
that receives concepts from the optimizer (watermarks, filters, 
projections) and produces runtime classes such as your 
`ExpressionToParquetFilter`


3. runtime interfaces that are generated from the stateful factory; all 
the factories that you mentioned can be used in `getScanRuntimeProvider`.
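
As referenced above, a rough sketch of the three mechanisms for a hypothetical
Parquet source; the class names, the "implements" relationships in the comments,
and the placeholder types are illustrative assumptions, not the FLIP-95 API:

import java.util.ArrayList;
import java.util.List;

// 1. Stateless discovery: turns catalog/config options into a stateful source.
class ParquetTableFactory /* would implement the Factory interface */ {
    ParquetTableSource create(String path) {
        return new ParquetTableSource(path);
    }
}

// 2. Stateful factory: receives optimizer concepts (filters, projections) via
//    the optional SupportsXXX interfaces and keeps them until translation.
class ParquetTableSource /* would implement DynamicTableSource + SupportsFilterPushDown */ {
    private final String path;
    private final List<Object> pushedFilters = new ArrayList<>();

    ParquetTableSource(String path) {
        this.path = path;
    }

    void applyFilters(List<Object> sqlExpressions) {
        // This is where SQL expressions would be converted into Parquet predicates.
        pushedFilters.addAll(sqlExpressions);
    }

    // 3. Runtime provider: produces the actual runtime reader, configured with
    //    everything the optimizer has pushed down (placeholder return type here).
    String getScanRuntimeProvider() {
        return "runtime reader for " + path + " with " + pushedFilters.size() + " filters";
    }
}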


Regarding "connector developers just need to know how to write an 
ExpressionToParquetFilter":


This is the entire purpose of DynamicTableSource/DynamicTableSink: 
bridging between SQL concepts and connector-specific concepts. 
That is the tricky part: how to get from a SQL concept to a 
connector concept.


Regards,
Timo


On 26.03.20 04:46, Becket Qin wrote:

Hi Timo,

Thanks for the reply. I totally agree that there must be something new
added to the connector in order to make it work for SQL / Table. My concern
is mostly over what they should be, and how to add them. To be honest, I
was kind of lost when looking at the interfaces such as
DataStructureConverter, RuntimeConverter and their internal context. Also I
believe most connector developers do not care about the concept of
"PushDown" / "NestedPushDown" which is internal to optimizer and not even
exposed to SQL writers.

Therefore I am trying to see if we can:
A) Keep those additions minimal for connector developers if they don't
have to know the details.
B) Expose as few high-level concepts as possible. More specifically, try to
speak the connector language and expose the general mechanism instead of
binding it to use-case semantics.

If we can achieve the above two goals, we could avoid adding unnecessary
burden to the connector developers, and also make the connectors more
generic.

It might be worth thinking about what additional work is necessary for
connector developers. Here is what I am thinking of; please correct me if
I miss something.

1. A Factory interface that allows a high-level use case, in this case
SQL, to find a matching source using the service provider mechanism.
2. A way for the high-level use case to specify the plugins that are
supported by the underlying DataStream Source.

If the Table connector can work with the above two mechanisms, maybe we can make
some slight modifications to the interfaces in the current FLIP.

- A *SourceFactory* which extends the Factory interface in the FLIP,
with one more method:
   - *Source getSource();*
- Some decorative interfaces to the SourceFactory such as:
   - *FilterFactory>*, with the
   following method
  - T getFilter();
   - *ProjectorFactory>*, with
   the following method.
  - T getProjector();
   - *DeserializerFactory*

With this set of API, a ParquetTableSourceFactory may become:

class ParquetTableSourceFactory implements
SourceFactory,
DeserializerFactory,
FilterFactory {
     @Override
     ParquetSource getSource() { ... }

     @Override
     ExpressionToParquetFilter getFilterSupplier() { ... };
}

The ExpressionToParquetFilter will have an *applyPredicate(Expression)*
method.

I know it does not look like a perfect interface from the pure SQL
perspective. And I am not even sure if this would meet all the requirements
for SQL, but the benefit is that the connector developers just need to know
how to write an ExpressionToParquetFilter in order to make it work for
Table, without having to understand the entire SQL concept.

Thanks,

Jiangjie (Becket) Qin



On Wed, Mar 25, 2020 at 5:57 PM Timo Walther  wrote:


Hi Becket,

Let me clarify a few things first: Historically we thought of Table
API/SQL as a library on top of DataStream API. Similar to Gelly or CEP.
We used TypeInformation in Table API to integrate nicely with DataStream
API. However, the last years have shown that SQL is not just a library.
It is an entire ecosystem that defines data types, submission behavior,
execution behavior, and highly optimized SerDes. SQL is a way to declare
data processing end-to-end such that the planner 

Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Jark Wu
Hi Danny,

Regarding the `supportedHintOptions()` interface, I suggest using the
inverted version, `unsupportedHintOptions()`,
because I think the disallowed list is much smaller.
In addition, it's hard to list all the properties under
`connector.properties.*`,
but we know `connector.properties.bootstrap.servers` and
`connector.properties.zookeeper.connect` are the only security-sensitive options.

Best,
Jark

On Thu, 26 Mar 2020 at 16:47, Kurt Young  wrote:

> Hi Danny,
>
> Thanks for the updates. I have 2 comments regarding to latest document:
>
> 1) I think we also need `*supportedHintOptions*` for
> `*TableFormatFactory*`
> 2) IMO "dynamic-table-options.enabled" should belong to `
> *OptimizerConfigOptions*`
>
> Best,
> Kurt
>
>
> On Thu, Mar 26, 2020 at 4:40 PM Timo Walther  wrote:
>
> > Thanks for the update Danny. +1 for this proposal.
> >
> > Regards,
> > Timo
> >
> > On 26.03.20 04:51, Danny Chan wrote:
> > > Thanks everyone who engaged in this discussion ~
> > >
> > > Our goal is "Supports Dynamic Table Options for Flink SQL". After an
> > > offline discussion with Kurt, Timo and Dawid, we have made the final
> > > conclusion, here is the summary:
> > >
> > >
> > > - Use comment style syntax to specify the dynamic table options:
> "/*+
> > > *OPTIONS*(k1='v1', k2='v2') */"
> > > - Have constraint on the options keys: the options that may bring
> in
> > > security problems should not be allowed, i.e. Kafka connector
> > zookeeper
> > > endpoint URL and topic name
> > > - Use white-list to control the allowed options for each connector,
> > > which is more safe for future extention
> > > - We allow to enable/disable this feature globally
> > > - Implement based on the current code base first, and when FLIP-95
> is
> > > checked in, implement this feature based on new interface
> > >
> > > Any suggestions are appreciated ~
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> > >
> > > Best,
> > > Danny Chan
> > >
> > > Jark Wu  于2020年3月18日周三 下午10:38写道:
> > >
> > >> Hi everyone,
> > >>
> > >> Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid
> it
> > >> doesn't solve the problems but increases some development and learning
> > >> burdens.
> > >>
> > >> # increase development and learning burden
> > >>
> > >> According to the discussion so far, we want to support overriding a
> > subset
> > >> of options in hints which doesn't affect semantics.
> > >> With the `supportedHintOptions`, it's up to the connector developers
> to
> > >> decide which options will not affect semantics, and to be hint
> options.
> > >> However, the question is how to distinguish whether an option will
> > *affect
> > >> semantics*? What happens if an option will affect semantics but
> > provided as
> > >> hint options?
> > >>  From my point of view, it's not easy to distinguish. For example, the
> > >> "format.ignore-parse-error" can be a very useful dynamic option but
> that
> > >> will affect semantic, because the result is different (null vs
> > exception).
> > >> Another example, the "connector.lookup.cache.*" options are also very
> > >> useful to tune jobs, however, it will also affect the job results. I
> can
> > >> come up many more useful options but may affect semantics.
> > >>
> > >> I can see that the community will under endless discussion around "can
> > this
> > >> option to be a hint option?",  "wether this option will affect
> > semantics?".
> > >> You can also find that we already have different opinions on
> > >> "ignore-parse-error". Those discussion is a waste of time! That's not
> > what
> > >> users want!
> > >> The problem is user need this, this, this options and HOW to expose
> > them?
> > >> We should focus on that.
> > >>
> > >> Then there could be two endings in the future:
> > >> 1) compromise on the usability, we drop the rule that hints don't
> affect
> > >> semantics, allow all the useful options in the hints list.
> > >> 2) stick on the rule, users will find this is a stumbling feature
> which
> > >> doesn't solve their problems.
> > >>  And they will be surprised why this option can't be set, but the
> > other
> > >> could. *semantic* is hard to be understood by users.
> > >>
> > >> # doesn't solve the problems
> > >>
> > >> I think the purpose of this FLIP is to allow users to quickly override
> > some
> > >> connectors' properties to tune their jobs.
> > >> However, `supportedHintOptions` is off track. It only allows a subset
> > >> options and for the users it's not *clear* which subset is allowed.
> > >>
> > >> Besides, I'm not sure `supportedHintOptions` can work well for all
> > cases.
> > >> How could you support kafka properties (`connector.properties.*`) as
> > hint
> > >> options? Some kafka properties may affect semantics
> (bootstrap.servers),
> > >> some may not (max.poll.records). Besides, I think it's not possible to
> > list
> > >> all the po

Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Danny Chan
Thanks Kurt for the suggestion ~

In my opinion:
- There is no need for TableFormatFactory#supportedHintOptions, because all
the format options can be configured dynamically; they have no security
issues
- Dynamic table options are not an optimization; from my side they are more
like an execution behavior

Kurt Young  wrote on Thu, Mar 26, 2020 at 4:47 PM:

> Hi Danny,
>
> Thanks for the updates. I have 2 comments regarding to latest document:
>
> 1) I think we also need `*supportedHintOptions*` for
> `*TableFormatFactory*`
> 2) IMO "dynamic-table-options.enabled" should belong to `
> *OptimizerConfigOptions*`
>
> Best,
> Kurt
>
>
> On Thu, Mar 26, 2020 at 4:40 PM Timo Walther  wrote:
>
> > Thanks for the update Danny. +1 for this proposal.
> >
> > Regards,
> > Timo
> >
> > On 26.03.20 04:51, Danny Chan wrote:
> > > Thanks everyone who engaged in this discussion ~
> > >
> > > Our goal is "Supports Dynamic Table Options for Flink SQL". After an
> > > offline discussion with Kurt, Timo and Dawid, we have made the final
> > > conclusion, here is the summary:
> > >
> > >
> > > - Use comment style syntax to specify the dynamic table options:
> "/*+
> > > *OPTIONS*(k1='v1', k2='v2') */"
> > > - Have constraint on the options keys: the options that may bring
> in
> > > security problems should not be allowed, i.e. Kafka connector
> > zookeeper
> > > endpoint URL and topic name
> > > - Use white-list to control the allowed options for each connector,
> > > which is more safe for future extention
> > > - We allow to enable/disable this feature globally
> > > - Implement based on the current code base first, and when FLIP-95
> is
> > > checked in, implement this feature based on new interface
> > >
> > > Any suggestions are appreciated ~
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> > >
> > > Best,
> > > Danny Chan
> > >
> > > Jark Wu  于2020年3月18日周三 下午10:38写道:
> > >
> > >> Hi everyone,
> > >>
> > >> Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid
> it
> > >> doesn't solve the problems but increases some development and learning
> > >> burdens.
> > >>
> > >> # increase development and learning burden
> > >>
> > >> According to the discussion so far, we want to support overriding a
> > subset
> > >> of options in hints which doesn't affect semantics.
> > >> With the `supportedHintOptions`, it's up to the connector developers
> to
> > >> decide which options will not affect semantics, and to be hint
> options.
> > >> However, the question is how to distinguish whether an option will
> > *affect
> > >> semantics*? What happens if an option will affect semantics but
> > provided as
> > >> hint options?
> > >>  From my point of view, it's not easy to distinguish. For example, the
> > >> "format.ignore-parse-error" can be a very useful dynamic option but
> that
> > >> will affect semantic, because the result is different (null vs
> > exception).
> > >> Another example, the "connector.lookup.cache.*" options are also very
> > >> useful to tune jobs, however, it will also affect the job results. I
> can
> > >> come up many more useful options but may affect semantics.
> > >>
> > >> I can see that the community will under endless discussion around "can
> > this
> > >> option to be a hint option?",  "wether this option will affect
> > semantics?".
> > >> You can also find that we already have different opinions on
> > >> "ignore-parse-error". Those discussion is a waste of time! That's not
> > what
> > >> users want!
> > >> The problem is user need this, this, this options and HOW to expose
> > them?
> > >> We should focus on that.
> > >>
> > >> Then there could be two endings in the future:
> > >> 1) compromise on the usability, we drop the rule that hints don't
> affect
> > >> semantics, allow all the useful options in the hints list.
> > >> 2) stick on the rule, users will find this is a stumbling feature
> which
> > >> doesn't solve their problems.
> > >>  And they will be surprised why this option can't be set, but the
> > other
> > >> could. *semantic* is hard to be understood by users.
> > >>
> > >> # doesn't solve the problems
> > >>
> > >> I think the purpose of this FLIP is to allow users to quickly override
> > some
> > >> connectors' properties to tune their jobs.
> > >> However, `supportedHintOptions` is off track. It only allows a subset
> > >> options and for the users it's not *clear* which subset is allowed.
> > >>
> > >> Besides, I'm not sure `supportedHintOptions` can work well for all
> > cases.
> > >> How could you support kafka properties (`connector.properties.*`) as
> > hint
> > >> options? Some kafka properties may affect semantics
> (bootstrap.servers),
> > >> some may not (max.poll.records). Besides, I think it's not possible to
> > list
> > >> all the possible kafka properties [1].
> > >>
> > >> In summary, IMO, `supportedHintOptions`
> > >> (1) it increase the comp

Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Danny Chan
Thanks Jark for the feedback ~

I actually had an offline discussion with Timo, and we think the black-list
approach carries an implicit risk as new table options are added: a black-list
means all newly introduced options are configurable dynamically by default, and
if the user forgets to add an option to the black-list, that would
be a risk. What do you think about this @Timo ?

Jark Wu  wrote on Thu, Mar 26, 2020 at 5:29 PM:

> Hi Danny,
>
> Regarding to `supportedHintOptions()` interface, I suggest to use the
> inverted version, `unsupportedHintOptions()`.
> Because I think the disallowed list is much smaller.
> In addition, it's hard to list all the properties under
> `connector.properties.*`.
> But we know `connector.properties.bootstrap.servers` and
> `connector.properties.zookeeper.connect` are the only security options.
>
> Best,
> Jark
>
> On Thu, 26 Mar 2020 at 16:47, Kurt Young  wrote:
>
> > Hi Danny,
> >
> > Thanks for the updates. I have 2 comments regarding to latest document:
> >
> > 1) I think we also need `*supportedHintOptions*` for
> > `*TableFormatFactory*`
> > 2) IMO "dynamic-table-options.enabled" should belong to `
> > *OptimizerConfigOptions*`
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Mar 26, 2020 at 4:40 PM Timo Walther  wrote:
> >
> > > Thanks for the update Danny. +1 for this proposal.
> > >
> > > Regards,
> > > Timo
> > >
> > > On 26.03.20 04:51, Danny Chan wrote:
> > > > Thanks everyone who engaged in this discussion ~
> > > >
> > > > Our goal is "Supports Dynamic Table Options for Flink SQL". After an
> > > > offline discussion with Kurt, Timo and Dawid, we have made the final
> > > > conclusion, here is the summary:
> > > >
> > > >
> > > > - Use comment style syntax to specify the dynamic table options:
> > "/*+
> > > > *OPTIONS*(k1='v1', k2='v2') */"
> > > > - Have constraint on the options keys: the options that may bring
> > in
> > > > security problems should not be allowed, i.e. Kafka connector
> > > zookeeper
> > > > endpoint URL and topic name
> > > > - Use white-list to control the allowed options for each
> connector,
> > > > which is more safe for future extention
> > > > - We allow to enable/disable this feature globally
> > > > - Implement based on the current code base first, and when
> FLIP-95
> > is
> > > > checked in, implement this feature based on new interface
> > > >
> > > > Any suggestions are appreciated ~
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> > > >
> > > > Best,
> > > > Danny Chan
> > > >
> > > > Jark Wu  于2020年3月18日周三 下午10:38写道:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid
> > it
> > > >> doesn't solve the problems but increases some development and
> learning
> > > >> burdens.
> > > >>
> > > >> # increase development and learning burden
> > > >>
> > > >> According to the discussion so far, we want to support overriding a
> > > subset
> > > >> of options in hints which doesn't affect semantics.
> > > >> With the `supportedHintOptions`, it's up to the connector developers
> > to
> > > >> decide which options will not affect semantics, and to be hint
> > options.
> > > >> However, the question is how to distinguish whether an option will
> > > *affect
> > > >> semantics*? What happens if an option will affect semantics but
> > > provided as
> > > >> hint options?
> > > >>  From my point of view, it's not easy to distinguish. For example,
> the
> > > >> "format.ignore-parse-error" can be a very useful dynamic option but
> > that
> > > >> will affect semantic, because the result is different (null vs
> > > exception).
> > > >> Another example, the "connector.lookup.cache.*" options are also
> very
> > > >> useful to tune jobs, however, it will also affect the job results. I
> > can
> > > >> come up many more useful options but may affect semantics.
> > > >>
> > > >> I can see that the community will under endless discussion around
> "can
> > > this
> > > >> option to be a hint option?",  "wether this option will affect
> > > semantics?".
> > > >> You can also find that we already have different opinions on
> > > >> "ignore-parse-error". Those discussion is a waste of time! That's
> not
> > > what
> > > >> users want!
> > > >> The problem is user need this, this, this options and HOW to expose
> > > them?
> > > >> We should focus on that.
> > > >>
> > > >> Then there could be two endings in the future:
> > > >> 1) compromise on the usability, we drop the rule that hints don't
> > affect
> > > >> semantics, allow all the useful options in the hints list.
> > > >> 2) stick on the rule, users will find this is a stumbling feature
> > which
> > > >> doesn't solve their problems.
> > > >>  And they will be surprised why this option can't be set, but
> the
> > > other
> > > >> could. *semantic* is hard to be understood by users.
> > > >>
> > > >> # doesn't s

[DISCUSS] FLIP-118: Improve Flink’s ID system

2020-03-26 Thread Yangze Guo
Hi everyone,

We would like to start a discussion thread on "FLIP-118: Improve
Flink’s ID system"[1].

This FLIP mainly discusses the following issues, targeting better readability
of IDs in logs and easier debugging for users in case of failures:

- Enhance the readability of the string literals of IDs. Most of them
are hash codes, e.g. ExecutionAttemptID, which do not provide much
meaningful information and are hard for users to recognize and compare.
- Log the IDs' lineage information to make debugging more convenient.
Currently, the logs do not always show the lineage information
between IDs. Finding out the relationships between entities identified by
given IDs is a common demand, e.g., which slot (AllocationID) is assigned
to satisfy which slot request (SlotRequestID). Without such lineage
information, it is currently impossible to track the end-to-end lifecycle
of an Execution or a Task, which makes debugging difficult.

Key changes proposed in the FLIP are as follows (a toy illustration of a more
readable ID follows below):

- Add location information to distributed components
- Add topology information to graph components
- Log the IDs' lineage information
- Expose the identifiers of distributed components to users
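
As mentioned above, a toy illustration (not the FLIP-118 design) of how an ID string
carrying location and lineage information could look compared to an opaque hash code;
all names below are made up:

final class ReadableExecutionId {
    private final String jobVertexName; // topology information
    private final int subtaskIndex;     // location information
    private final int attemptNumber;    // lineage within the execution

    ReadableExecutionId(String jobVertexName, int subtaskIndex, int attemptNumber) {
        this.jobVertexName = jobVertexName;
        this.subtaskIndex = subtaskIndex;
        this.attemptNumber = attemptNumber;
    }

    @Override
    public String toString() {
        // e.g. "MyMap_2_0" instead of a 32-character hash code
        return jobVertexName + "_" + subtaskIndex + "_" + attemptNumber;
    }
}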

Please find more details in the FLIP wiki document [1]. Looking forward to
your feedback.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148643521

Best,
Yangze Guo


Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Jingsong Li
Thanks Danny for the update.

+1 to "dynamic-table-options.enabled" belonging to
`*OptimizerConfigOptions*`.
Hints look like an optimizer concern in my opinion. The optimizer does affect
execution, but it is still the optimizer and not directly related to execution.

+1 to `unsupportedHintOptions`: we already list all options in
`requiredContext` and `supportedProperties`; listing them again in the hints
is inconvenient.

Best,
Jingsong Lee

On Thu, Mar 26, 2020 at 5:36 PM Danny Chan  wrote:

> Thanks Jark for the feedback ~
>
> I actually had an offline discussion with Timo, and we think the black-list
> options have an implicit risk with the growing set of new table options: a
> black-list there means all newly introduced options are configurable
> dynamically by default, and if the user forgets to add an option to the
> black-list, that would be a risk. What do you think about this @Timo ?
>
> Jark Wu  于2020年3月26日周四 下午5:29写道:
>
> > Hi Danny,
> >
> > Regarding to `supportedHintOptions()` interface, I suggest to use the
> > inverted version, `unsupportedHintOptions()`.
> > Because I think the disallowed list is much smaller.
> > In addition, it's hard to list all the properties under
> > `connector.properties.*`.
> > But we know `connector.properties.bootstrap.servers` and
> > `connector.properties.zookeeper.connect` are the only security options.
> >
> > Best,
> > Jark
> >
> > On Thu, 26 Mar 2020 at 16:47, Kurt Young  wrote:
> >
> > > Hi Danny,
> > >
> > > Thanks for the updates. I have 2 comments regarding to latest document:
> > >
> > > 1) I think we also need `*supportedHintOptions*` for
> > > `*TableFormatFactory*`
> > > 2) IMO "dynamic-table-options.enabled" should belong to `
> > > *OptimizerConfigOptions*`
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Thu, Mar 26, 2020 at 4:40 PM Timo Walther 
> wrote:
> > >
> > > > Thanks for the update Danny. +1 for this proposal.
> > > >
> > > > Regards,
> > > > Timo
> > > >
> > > > On 26.03.20 04:51, Danny Chan wrote:
> > > > > Thanks everyone who engaged in this discussion ~
> > > > >
> > > > > Our goal is "Supports Dynamic Table Options for Flink SQL". After
> an
> > > > > offline discussion with Kurt, Timo and Dawid, we have made the
> final
> > > > > conclusion, here is the summary:
> > > > >
> > > > >
> > > > > - Use comment style syntax to specify the dynamic table
> options:
> > > "/*+
> > > > > *OPTIONS*(k1='v1', k2='v2') */"
> > > > > - Have constraint on the options keys: the options that may
> bring
> > > in
> > > > > security problems should not be allowed, i.e. Kafka connector
> > > > zookeeper
> > > > > endpoint URL and topic name
> > > > > - Use white-list to control the allowed options for each
> > connector,
> > > > > which is safer for future extension
> > > > > - We allow to enable/disable this feature globally
> > > > > - Implement based on the current code base first, and when
> > FLIP-95
> > > is
> > > > > checked in, implement this feature based on new interface
> > > > >
> > > > > Any suggestions are appreciated ~
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > >
> > > > > Jark Wu  于2020年3月18日周三 下午10:38写道:
> > > > >
> > > > >> Hi everyone,
> > > > >>
> > > > >> Sorry, but I'm not sure about the `supportedHintOptions`. I'm
> afraid
> > > it
> > > > >> doesn't solve the problems but increases some development and
> > learning
> > > > >> burdens.
> > > > >>
> > > > >> # increase development and learning burden
> > > > >>
> > > > >> According to the discussion so far, we want to support overriding
> a
> > > > subset
> > > > >> of options in hints which doesn't affect semantics.
> > > > >> With the `supportedHintOptions`, it's up to the connector
> developers
> > > to
> > > > >> decide which options will not affect semantics, and to be hint
> > > options.
> > > > >> However, the question is how to distinguish whether an option will
> > > > *affect
> > > > >> semantics*? What happens if an option will affect semantics but
> > > > provided as
> > > > >> hint options?
> > > > >>  From my point of view, it's not easy to distinguish. For example,
> > the
> > > > >> "format.ignore-parse-error" can be a very useful dynamic option
> but
> > > that
> > > > >> will affect semantic, because the result is different (null vs
> > > > exception).
> > > > >> Another example, the "connector.lookup.cache.*" options are also
> > very
> > > > >> useful to tune jobs, however, it will also affect the job
> results. I
> > > can
> > > > >> come up many more useful options but may affect semantics.
> > > > >>
> > > > >> I can see that the community will be under endless discussion around
> > > > >> "can this option be a hint option?", "whether this option will affect
> > > > >> semantics?".
> > > > >> You can also find that we already have different opinions on
> > > > >> "ignore-parse-error".

Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Timo Walther

Hi everyone,

it is not only about security concerns. Hint options should be 
well-defined. We had a couple of people that were concerned about 
changing the semantics with a concept that is called "hint". These 
options are more like "debugging options" while someone is developing a 
connector or using a notebook to quickly produce some rows.
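
As a concrete illustration, the hint syntax agreed on in this thread would be
used roughly like this. This is only a minimal sketch: it assumes a table
`orders` registered beforehand, and whether the option key is accepted depends
on the connector's whitelist and the global enable flag.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class DynamicOptionsHintSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().build());

        // assumes `orders` was registered via DDL beforehand; the option key is
        // just an example of a "debugging" option and may be rejected by the whitelist
        Table result = tEnv.sqlQuery(
                "SELECT id, amount "
              + "FROM orders /*+ OPTIONS('format.ignore-parse-error'='true') */");

        System.out.println(tEnv.explain(result));
    }
}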


The final pipeline should use a temporary table instead. I suggest to 
use a whitelist and force people to think about what should be exposed 
as a hint. By default, no option should be exposed. It is better to be 
conservative here.


Regards,
Timo


On 26.03.20 10:31, Danny Chan wrote:

Thanks Kurt for the suggestion ~

In my opinion:
- There is no need for TableFormatFactory#supportedHintOptions because all
the format options can be configured dynamically; they have no security
issues
- Dynamic table options are not an optimization, they are more like an
execution behavior from my side

Kurt Young  于2020年3月26日周四 下午4:47写道:


Hi Danny,

Thanks for the updates. I have 2 comments regarding to latest document:

1) I think we also need `*supportedHintOptions*` for
`*TableFormatFactory*`
2) IMO "dynamic-table-options.enabled" should belong to `
*OptimizerConfigOptions*`

Best,
Kurt


On Thu, Mar 26, 2020 at 4:40 PM Timo Walther  wrote:


Thanks for the update Danny. +1 for this proposal.

Regards,
Timo

On 26.03.20 04:51, Danny Chan wrote:

Thanks everyone who engaged in this discussion ~

Our goal is "Supports Dynamic Table Options for Flink SQL". After an
offline discussion with Kurt, Timo and Dawid, we have made the final
conclusion, here is the summary:


 - Use comment style syntax to specify the dynamic table options:
   "/*+ *OPTIONS*(k1='v1', k2='v2') */"
 - Have constraints on the option keys: options that may bring in
   security problems should not be allowed, e.g. the Kafka connector's
   zookeeper endpoint URL and topic name
 - Use a white-list to control the allowed options for each connector,
   which is safer for future extension
 - We allow enabling/disabling this feature globally
 - Implement based on the current code base first, and when FLIP-95 is
   checked in, implement this feature based on the new interface

Any suggestions are appreciated ~

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL


Best,
Danny Chan

Jark Wu  于2020年3月18日周三 下午10:38写道:


Hi everyone,

Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid it
doesn't solve the problems but increases some development and learning
burdens.

# increase development and learning burden

According to the discussion so far, we want to support overriding a subset
of options in hints which doesn't affect semantics.
With the `supportedHintOptions`, it's up to the connector developers to
decide which options will not affect semantics and can be hint options.
However, the question is how to distinguish whether an option will *affect
semantics*? What happens if an option will affect semantics but is
provided as a hint option?
  From my point of view, it's not easy to distinguish. For example, the
"format.ignore-parse-error" can be a very useful dynamic option but it
will affect semantics, because the result is different (null vs exception).
Another example: the "connector.lookup.cache.*" options are also very
useful to tune jobs; however, they will also affect the job results. I can
come up with many more useful options that may affect semantics.

I can see that the community will be under endless discussion around "can
this option be a hint option?", "whether this option will affect
semantics?".
You can also find that we already have different opinions on
"ignore-parse-error". Those discussions are a waste of time! That's not
what users want!
The problem is that users need this, this, and this option, and HOW to
expose them? We should focus on that.

Then there could be two endings in the future:
1) compromise on the usability, we drop the rule that hints don't affect
semantics, and allow all the useful options in the hints list.
2) stick to the rule, users will find this is a stumbling feature which
doesn't solve their problems.
  And they will be surprised that this option can't be set, but the
other can. *semantics* is hard for users to understand.

# doesn't solve the problems

I think the purpose of this FLIP is to allow users to quickly override
some connectors' properties to tune their jobs.
However, `supportedHintOptions` is off track. It only allows a subset of
options and for the users it's not *clear* which subset is allowed.

Besides, I'm not sure `supportedHintOptions` can work well for all cases.
How could you support Kafka properties (`connector.properties.*`) as hint
options? Some Kafka properties may affect semantics (bootstrap.servers),
some may not (max.poll.records). Besides, I think it's not possible to
list all the possible Kafka properties [1].

In sum

Re: [VOTE] FLIP-102: Add More Metrics to TaskManager

2020-03-26 Thread Till Rohrmann
Thanks for updating the FLIP Yadong.

What is the difference between managedMemory and managedMemoryTotal
and networkMemory and networkMemoryTotal in the REST response? If they are
duplicates, then we might be able to remove one.

Apart from that, the proposal looks good to me.

Pulling also Andrey in to hear his opinion about the representation of the
memory components.

Cheers,
Till

On Thu, Mar 19, 2020 at 11:37 AM Yadong Xie  wrote:

> Hi all
>
> I have updated the design of the metric page and FLIP doc, please let me
> know what you think about it
>
> FLIP-102:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager
> POC web:
>
> http://101.132.122.69:8081/web/#/task-manager/8e1f1beada3859ee8e46d0960bb1da18/metrics
>
> Till Rohrmann  于2020年2月27日周四 下午10:27写道:
>
> > Thinking a bit more about the problem whether to report the aggregated
> > memory statistics or the individual slot statistics, I think reporting it
> > on a per slot basis won't work nicely together with FLIP-56 (dynamic slot
> > allocation). The problem is that with FLIP-56, we will no longer have
> > dedicated slots. The number of slots might change over the lifetime of a
> > TaskExecutor. Hence, it won't be easy to generate a metric path for every
> > slot which are furthermore also ephemeral. So maybe, the more general and
> > easier solution would be to report the overall memory usage of a
> > TaskExecutor even though it means to do some aggregation on the
> > TaskExecutor.
> >
> > Concerning the JVM limit: Isn't it mainly the code cache? If we display
> > this value, then we should explain what exactly it means. I fear that
> most
> > users won't understand what JVM limit actually means.
> >
> > Cheers,
> > Till
> >
> > On Wed, Feb 26, 2020 at 11:15 AM Yadong Xie  wrote:
> >
> > > Hi Till
> > >
> > > Thanks a lot for your response
> > >
> > > > 2. I'm not entirely sure whether I would split the memory ...
> > >
> > > Split the memory display comes from the 'ancient' design of the web, it
> > is
> > > ok for me to change it following total/heap/managed/network/direct/jvm
> > > overhead/mapped sequence
> > >
> > > > 3. Displaying the memory configurations...
> > >
> > > I agree with you that it is not a very nice way, but the hierarchical
> > > relationship of configurations is too complex and hard to display in
> the
> > > other ways (I have tried)
> > >
> > > if anyone has a better idea, please feel free to help me
> > >
> > >
> > > > 4. What does JVM limit mean in Non-heap.JVM-Overhead?
> > >
> > > JVM limit is "non-heap max metric minus metaspace configuration" as
> > > @Xintong
> > > Song  replyed in this mail thread
> > >
> > >
> > > Till Rohrmann  于2020年2月25日周二 下午6:58写道:
> > >
> > > > Thanks for creating this FLIP Yadong. I think your proposal makes it
> > much
> > > > easier for the user to understand what's happening on Flink
> > > TaskManager's.
> > > >
> > > > I have some comments:
> > > >
> > > > 1. Some of the newly introduced metrics involve computations on the
> > > > TaskManager. I would like to avoid additional computations introduced
> > by
> > > > metrics as much as possible because metrics should not affect the
> > system.
> > > > In particular, total memory sizes which are configured should not be
> > > > derived computationally (getManagedMemoryTotal, getTotalMemorySize).
> > For
> > > > the currently available memory sizes (e.g. getManagedMemoryUsed), one
> > > could
> > > > think about reporting them on a per slot basis and to do the
> > aggregation
> > > on
> > > > the client side. Of course, this would increase the size of the
> > response
> > > > payload.
> > > >
> > > > 2. I'm not entirely sure whether I would split the memory display
> into
> > > JVM
> > > > memory and non JVM memory as you've done it int the POC. From a
> user's
> > > > perspective, one could start displaying the total process memory. The
> > > next
> > > > three most important metrics are the heap, managed memory and network
> > > > buffer usage, I guess. If one is interested in more details, one
> could
> > > then
> > > > display the remaining direct memory usage, the JVM overhead (I'm not
> > sure
> > > > whether I would call this non-heap though) and the mapped memory.
> > > >
> > > > 3. Displaying the memory configurations in three nested boxes does
> not
> > > look
> > > > so nice to me. I'm not sure how else one could display it, though.
> > > >
> > > > 4. What does JVM limit mean in Non-heap.JVM-Overhead?
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Feb 25, 2020 at 8:19 AM Yadong Xie 
> > wrote:
> > > >
> > > > > Hi Xintong
> > > > > thanks for your advice, the POC web and the FLIP doc was updated
> now
> > > > > here is the new link:
> > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/web/#/task-manager/7e7cf0293645c8537caab915c829aa73/metrics
> > > > >
> > > > >
> > > > > Xintong Song  于2020年2月21日周五 下午12:00写道:
> > > > >
> > > > > > >
> > > > > > > 1

[jira] [Created] (FLINK-16805) StreamingKafkaITCase fails with "Could not instantiate instance using default factory."

2020-03-26 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-16805:
--

 Summary: StreamingKafkaITCase fails with "Could not instantiate 
instance using default factory."
 Key: FLINK-16805
 URL: https://issues.apache.org/jira/browse/FLINK-16805
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Tests
Reporter: Robert Metzger


CI: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6654&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=27d1d645-cbce-54e2-51c4-d8b45fe24607

{code}
2020-03-26T08:17:42.8881925Z [INFO]  T E S T S
2020-03-26T08:17:42.8882791Z [INFO] 
---
2020-03-26T08:17:43.6840472Z [INFO] Running 
org.apache.flink.tests.util.kafka.StreamingKafkaITCase
2020-03-26T08:17:43.6933052Z [ERROR] Tests run: 3, Failures: 0, Errors: 3, 
Skipped: 0, Time elapsed: 0.006 s <<< FAILURE! - in 
org.apache.flink.tests.util.kafka.StreamingKafkaITCase
2020-03-26T08:17:43.6934567Z [ERROR] testKafka[0: 
kafka-version:0.10.2.0](org.apache.flink.tests.util.kafka.StreamingKafkaITCase) 
 Time elapsed: 0.004 s  <<< ERROR!
2020-03-26T08:17:43.6935170Z java.lang.RuntimeException: Could not instantiate 
instance using default factory.
2020-03-26T08:17:43.6935702Zat 
org.apache.flink.tests.util.kafka.StreamingKafkaITCase.(StreamingKafkaITCase.java:72)
2020-03-26T08:17:43.6936024Z 
2020-03-26T08:17:43.6936691Z [ERROR] testKafka[1: 
kafka-version:0.11.0.2](org.apache.flink.tests.util.kafka.StreamingKafkaITCase) 
 Time elapsed: 0 s  <<< ERROR!
2020-03-26T08:17:43.6937288Z java.lang.RuntimeException: Could not instantiate 
instance using default factory.
2020-03-26T08:17:43.6937789Zat 
org.apache.flink.tests.util.kafka.StreamingKafkaITCase.(StreamingKafkaITCase.java:72)
2020-03-26T08:17:43.6938113Z 
2020-03-26T08:17:43.6938890Z [ERROR] testKafka[2: 
kafka-version:2.2.0](org.apache.flink.tests.util.kafka.StreamingKafkaITCase)  
Time elapsed: 0 s  <<< ERROR!
2020-03-26T08:17:43.6939646Z java.lang.RuntimeException: Could not instantiate 
instance using default factory.
2020-03-26T08:17:43.6940153Zat 
org.apache.flink.tests.util.kafka.StreamingKafkaITCase.(StreamingKafkaITCase.java:72)
2020-03-26T08:17:43.6940485Z 
2020-03-26T08:17:44.0270048Z [INFO] 
2020-03-26T08:17:44.0270457Z [INFO] Results:
2020-03-26T08:17:44.0270649Z [INFO] 
2020-03-26T08:17:44.0270863Z [ERROR] Errors: 
2020-03-26T08:17:44.0271847Z [ERROR]   StreamingKafkaITCase.:72 » 
Runtime Could not instantiate instance using ...
2020-03-26T08:17:44.0272651Z [ERROR]   StreamingKafkaITCase.:72 » 
Runtime Could not instantiate instance using ...
2020-03-26T08:17:44.0273487Z [ERROR]   StreamingKafkaITCase.:72 » 
Runtime Could not instantiate instance using ...
2020-03-26T08:17:44.0274218Z [INFO] 
2020-03-26T08:17:44.0274517Z [ERROR] Tests run: 3, Failures: 0, Errors: 3, 
Skipped: 0
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [Dev Blog] Migrating Flink's CI Infrastructure from Travis CI to Azure Pipelines

2020-03-26 Thread Till Rohrmann
Thanks Robert for this dev blog post. It's a good read.

Cheers,
Till

On Mon, Mar 23, 2020 at 10:24 PM Arvid Heise  wrote:

> Thank you Robert! (also thanks for incorporating my feedback so swiftly)
>
> On Mon, Mar 23, 2020 at 8:54 PM Seth Wiesman  wrote:
>
> > Very interesting! No questions but thank you for taking the initiative to
> > put out the first dev blog.
> >
> > Seth
> >
> > > On Mar 23, 2020, at 5:14 AM, Robert Metzger 
> wrote:
> > >
> > > Hi all,
> > >
> > > I have just published the first post to the dev blog:
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/2020/03/22/Migrating+Flink%27s+CI+Infrastructure+from+Travis+CI+to+Azure+Pipelines
> > > .
> > >
> > > I'm looking forward to your feedback and questions on the article :)
> > >
> > > Best,
> > > Robert
> >
>


Re: [DISCUSS] FLIP-95: New TableSource and TableSink interfaces

2020-03-26 Thread Becket Qin
Hi Timo,

Regarding "connector developers just need to know how to write an
> ExpressionToParquetFilter":
>


This is the entire purpose of the DynamicTableSource/DynamicTableSink.
> The bridging between SQL concepts and connector specific concepts.
> Because this is the tricky part. How to get from a SQL concept to a
> connctor concept.


Maybe it is just a naming issue depending on whether one is looking upward
from the Connectors perspective, or looking downward from the SQL
perspective. If we agree that the connectors should provide a semantic-free
API to the high-level use cases, it seems we should follow the former path.
And if there are one or two APIs that the connector developers have to
understand in order to support Table / SQL, I think we can just address
them case by case, instead of wrapping the entire low level source API
with a set of new concepts.

Correct me if I am wrong: can we tell the following story to a connector
developer and get all the TableSource functionality to work?

To provide a TableSource from a Source, one just needs to know two more
concepts: *Row* and *Expression*. The work to create a TableSource is the
following:
1. A connector developer can write three classes in order to build a table
source:

   - Deserializer (Must-have)
   - PredicateTranslator (optional, only
   applicable if the Source is a FilterableSource)
   - PredicateTranslator (optional, only
   applicable if the Source is a ProjectableSource)

2. In order to let the table source be discoverable, one needs to provide a
Factory, and that Factory provides the following as a bundle:

   - The Source itself (Must-have)
   - The Deserializer (Must-have)
   - PredicateTranslator (optional, only
   applicable when the Factory is a FilterFactory)
   - PredicateTranslator (optional, only
   applicable when the Factory is a ProjectorFactory)

3. The Deserializer may implement one more decorative interface to
further convert the record after deserialization.

   - withMapFunction;

Note that the above description only requires the connector developer to
understand Expression and Row. If this works, it is much easier to explain
than throwing a full set of new concepts. More importantly, it is way more
generic. For example, if we change Row to Coordinates, and Expression to
Area, we easily get a Source for a Spatial Database.
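
A purely hypothetical sketch of the interfaces this story implies; none of
these exist in Flink, the names and signatures are made up only to illustrate
the proposal.

import java.util.Optional;

// Hypothetical stand-ins; RAW/ROW/EXPRESSION are placeholders for the concrete
// record, row, and predicate types of a given connector.
interface Deserializer<RAW, ROW> {
    ROW deserialize(RAW record) throws Exception;
}

interface PredicateTranslator<EXPRESSION, CONNECTOR_FILTER> {
    CONNECTOR_FILTER translate(EXPRESSION predicate); // e.g. SQL Expression -> Parquet filter
}

interface TableSourceBundleFactory<RAW, ROW, EXPRESSION> {
    Object createSource();                        // the FLIP-27 Source itself (must-have)
    Deserializer<RAW, ROW> createDeserializer();  // must-have
    // optional: only present if the underlying Source is filterable / projectable
    default Optional<PredicateTranslator<EXPRESSION, ?>> filterTranslator() { return Optional.empty(); }
    default Optional<PredicateTranslator<EXPRESSION, ?>> projectionTranslator() { return Optional.empty(); }
}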


One thing I want to call out is that while the old SourceFunction and
InputFormat are concrete implementations that do the actual IO work, the
Source API in FLIP-27 is kind of a Factory by itself already. So if
we can push the decorative interfaces from the TableFactory layer to the
Source layer, it will help unify the experience for DataStream and Table
Source. This will also align with our goal of letting the DataStream Source
provide a semantic-free API that can be used by different high-level APIs.


BTW, Jark suggested that we can probably have an offline call to accelerate
the discussion. I think it is a good idea. Can we do that?

Thanks,

Jiangjie (Becket) Qin


On Thu, Mar 26, 2020 at 5:28 PM Timo Walther  wrote:

> Hi Becket,
>
> Regarding "PushDown/NestedPushDown which is internal to optimizer":
>
> Those concepts cannot be entirely internal to the optimizer, at some
> point the optimizer needs to pass them into the connector specific code.
> This code will then convert it to e.g. Parque expressions. So there must
> be some interface that takes SQL Expression and converts to connector
> specific code. This interface between planner and connector is modelled
> by the SupportsXXX interfaces. And you are right, if developers don't
> care, they don't need to implement those optional interfaces but will
> not get performant connectors.
>
> Regarding "Table connector can work with the above two mechanism":
>
> A table connector needs three mechanisms that are represented in the
> current design.
>
> 1. a stateless discovery interface (Factory) that can convert
> ConfigOptions to a stateful factory interface
> (DynamicTableSource/DynamicTableSink)
>
> 2. a stateful factory interface (DynamicTableSource/DynamicTableSink)
> that receives concepts from the optimizer (watermarks, filters,
> projections) and produces runtime classes such as your
> `ExpressionToParquetFilter`
>
> 3. runtime interfaces that are generated from the stateful factory; all
> the factories that you mentioned can be used in `getScanRuntimeProvider`.
>
> Regarding "connector developers just need to know how to write an
> ExpressionToParquetFilter":
>
> This is the entire purpose of the DynamicTableSource/DynamicTableSink.
> The bridging between SQL concepts and connector specific concepts.
> Because this is the tricky part. How to get from a SQL concept to a
> connctor concept.
>
> Regards,
> Timo
>
>
> On 26.03.20 04:46, Becket Qin wrote:
> > Hi Timo,
> >
> > Thanks for the reply. I totally agree that there must be something new
> > added to the connector in order to make it work for SQL / Table. My
> concern
> > is most

[jira] [Created] (FLINK-16806) Implement Input selection for MultipleInputStreamOperator

2020-03-26 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-16806:
--

 Summary: Implement Input selection for MultipleInputStreamOperator
 Key: FLINK-16806
 URL: https://issues.apache.org/jira/browse/FLINK-16806
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataStream, Runtime / Task
Affects Versions: 1.10.0
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.11.0


After FLINK-16060, support for {{MultipleInputStreamOperator}} is incomplete. 
After defining a new base class for the {{StreamOperator}} (FLINK-16316) that 
would be suitable to use with {{MultipleInputStreamOperator}}, we can provide 
support for the missing features in the {{MultipleInputStreamOperator}}, like:
* keyed state support
* processing {{Watermark}}
* processing {{LatencyMarker}}
* {{StreamStatus}}
* input selection



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Switch to Azure Pipelines as the primary CI tool / switch off Travis

2020-03-26 Thread Till Rohrmann
Thanks for driving this effort Robert. I'd be in favour of disabling Travis
for PRs once AZP is decently stable.

Cheers,
Till

On Wed, Mar 25, 2020 at 8:28 PM Robert Metzger  wrote:

> Thank you for your responses.
>
> @Yu Li: In the current master, the log upload always fails, if the e2e job
> failed. I just merged a PR that fixes this issue [1]. The problem was not
> really the network stability, rather a problem with the interaction of the
> jobs in the pipeline (the e2e job did not set the right variables for the
> log upload)
> Secondly, you are looking at the report of the "flink-ci.flink" pipeline,
> where pull requests are build. Naturally, pull request builds fail all the
> time, because the PRs are not yet perfect.
>
> "flink-ci.flink-master" is the right pipeline to look at:
>
> https://dev.azure.com/rmetzger/Flink/_pipeline/analytics/stageawareoutcome?definitionId=8&contextType=build
> We have a fairly high number of failures there, because we currently have
> some issues downloading the maven artifacts [2]. I'm working already with
> Chesnay on merging a fix for that.
>
>
> [1]
>
> https://github.com/apache/flink/commit/1c86b8b9dd05615a3b2600984db738a9bf388259
> [2]https://issues.apache.org/jira/browse/FLINK-16720
>
>
>
> On Wed, Mar 25, 2020 at 4:48 PM Chesnay Schepler 
> wrote:
>
> > The easiest way to disable travis for pushes is to remove all builds
> > from the .travis.yml with a push/pr condition.
> >
> > On 25/03/2020 15:03, Robert Metzger wrote:
> > > Thank you for the feedback so far.
> > >
> > > Responses to the items Chesnay raised:
> > >
> > > - by virtue of maintaining the past 2 releases we will have to maintain
> > any
> > >> Travis infrastructure as long as 1.10 is supported, i.e., until 1.12
> > >>
> > > Okay. I wasn't sure about the exact policy there.
> > >
> > >
> > >> - the azure setup doesn't appear to be equivalent yet since the java
> e2e
> > >> profile isn't setting the hadoop switch (-Pe2e-hadoop), as a result of
> > >> which SQLClientKafkaITCase isn't run
> > >>
> > > I filed a ticket to address this:
> > > https://issues.apache.org/jira/browse/FLINK-16778
> > >
> > >
> > >> - the nightly scripts still seems to be using a maven version other
> than
> > >> 3.2.5; from today on master:
> > >> 2020-03-25T05:31:52.7412964Z [INFO] <
> > >> org.apache.flink:flink-end-to-end-tests-common-kafka >
> > >> 2020-03-25T05:31:52.7413854Z [INFO] Building
> > >> flink-end-to-end-tests-common-kafka 1.11-SNAPSHOT [39/46]
> > >> 2020-03-25T05:31:52.7414689Z [INFO] [
> > jar
> > >> ]-
> > >> 2020-03-25T05:31:52.7518360Z [INFO]
> > >> 2020-03-25T05:31:52.7519770Z [INFO] ---
> > maven-checkstyle-plugin:2.17:check
> > >> (validate) @ flink-end-to-end-tests-common-kafka ---
> > >>
> > > I'm planning to address this as part of
> > > https://issues.apache.org/jira/browse/FLINK-16411, where I work on
> > > centralizing all mvn invocations.
> > >
> > >
> > >> - there is no real benefit in retiring the travis support in CiBot;
> the
> > >> important part is whether Travis is run or not for pull requests.
> > >>  From what I can tell though azure seems to be working fine for pull
> > >> requests, so +1 to at least disable the travis PR runs.
> > >
> > > So we disable Travis for https://github.com/flink-ci/flink ? I will do
> > it
> > > once there are no new concerns and above tickets are resolved.
> > >
> > > What about disabling travis for master pushes? (e.g. removing the
> > > .travis.yml file from master)?
> > >
> > >
> > > @Dian:
> > > Thanks a lot for your feedback.
> > >
> > > - The report of Azure is still not viewable[1] (I noticed that Hequn
> has
> > >> also reported this issue in another thread). This is very useful
> > >> information.
> > >
> > > You are referring to the emails send to builds@f.a.o right?
> > > I have reported this both as a bug [1] and a feature request [2] to
> > Azure.
> > > But I don't believe they will resolve this issue anytime soon.
> > > Azure has a notifications API that we could use to build a service
> that
> > > sends emails to that list, but I feel that this is really a waste of
> > time.
> > > The URL in the link even contains the ID of the build. We would just
> need
> > > to extract this ID and generate the appropriate URL. I will try to
> > directly
> > > reach the product management of AZP, maybe I can get some attention
> from
> > > there.
> > >
> > >
> > >
> > > [1]
> > >
> >
> https://developercommunity.visualstudio.com/content/problem/957778/third-parties-are-unable-to-access-notification-li.html?childToView=960403#comment-960403
> > > [2]
> > >
> >
> https://developercommunity.visualstudio.com/content/idea/960472/third-parties-are-unable-to-access-notification-li-1.html
> > >
> > >
> > >
> > > On Wed, Mar 25, 2020 at 10:34 AM Chesnay Schepler 
> > > wrote:
> > >
> > >> It was left out since it adds significant additional complexity and
> the
> > >> value is dubiou

Re: [DISCUSS] Switch to Azure Pipelines as the primary CI tool / switch off Travis

2020-03-26 Thread Yu Li
Thanks for the clarification Robert.

Since the first step plan is to replace the travis PR runs, I checked all
PR builds from 2020-01-01 (PR#10735-11526) [1], and below is the result:

* Travis FAILURE: 298
* Travis SUCCESS: 649 (68.5%)
* Azure FAILURE: 420
* Azure SUCCESS: 571 (57.6%)

Since the patch for each run is equivalent for Travis and Azure, there
seems to be a slightly higher failure rate (~10%) when running on Azure.

However, with the just-merged fix for uploading logs (FLINK-16480), I
believe the success rate of Azure can compete with Travis now (uploading
files contributes to 20% of the failures according to the report [2]).

So I'm +1 to disable travis runs according to the numbers.

Best Regards,
Yu

[1]
https://github.com/apache/flink/pulls?q=is%3Apr+created%3A%3E%3D2020-01-01
[2]
https://dev.azure.com/rmetzger/Flink/_pipeline/analytics/stageawareoutcome?definitionId=4

On Thu, 26 Mar 2020 at 03:28, Robert Metzger  wrote:

> Thank you for your responses.
>
> @Yu Li: In the current master, the log upload always fails, if the e2e job
> failed. I just merged a PR that fixes this issue [1]. The problem was not
> really the network stability, rather a problem with the interaction of the
> jobs in the pipeline (the e2e job did not set the right variables for the
> log upload)
> Secondly, you are looking at the report of the "flink-ci.flink" pipeline,
> where pull requests are built. Naturally, pull request builds fail all the
> time, because the PRs are not yet perfect.
>
> "flink-ci.flink-master" is the right pipeline to look at:
>
> https://dev.azure.com/rmetzger/Flink/_pipeline/analytics/stageawareoutcome?definitionId=8&contextType=build
> We have a fairly high number of failures there, because we currently have
> some issues downloading the maven artifacts [2]. I'm working already with
> Chesnay on merging a fix for that.
>
>
> [1]
>
> https://github.com/apache/flink/commit/1c86b8b9dd05615a3b2600984db738a9bf388259
> [2]https://issues.apache.org/jira/browse/FLINK-16720
>
>
>
> On Wed, Mar 25, 2020 at 4:48 PM Chesnay Schepler 
> wrote:
>
> > The easiest way to disable travis for pushes is to remove all builds
> > from the .travis.yml with a push/pr condition.
> >
> > On 25/03/2020 15:03, Robert Metzger wrote:
> > > Thank you for the feedback so far.
> > >
> > > Responses to the items Chesnay raised:
> > >
> > > - by virtue of maintaining the past 2 releases we will have to maintain
> > any
> > >> Travis infrastructure as long as 1.10 is supported, i.e., until 1.12
> > >>
> > > Okay. I wasn't sure about the exact policy there.
> > >
> > >
> > >> - the azure setup doesn't appear to be equivalent yet since the java
> e2e
> > >> profile isn't setting the hadoop switch (-Pe2e-hadoop), as a result of
> > >> which SQLClientKafkaITCase isn't run
> > >>
> > > I filed a ticket to address this:
> > > https://issues.apache.org/jira/browse/FLINK-16778
> > >
> > >
> > >> - the nightly scripts still seems to be using a maven version other
> than
> > >> 3.2.5; from today on master:
> > >> 2020-03-25T05:31:52.7412964Z [INFO] <
> > >> org.apache.flink:flink-end-to-end-tests-common-kafka >
> > >> 2020-03-25T05:31:52.7413854Z [INFO] Building
> > >> flink-end-to-end-tests-common-kafka 1.11-SNAPSHOT [39/46]
> > >> 2020-03-25T05:31:52.7414689Z [INFO] [
> > jar
> > >> ]-
> > >> 2020-03-25T05:31:52.7518360Z [INFO]
> > >> 2020-03-25T05:31:52.7519770Z [INFO] ---
> > maven-checkstyle-plugin:2.17:check
> > >> (validate) @ flink-end-to-end-tests-common-kafka ---
> > >>
> > > I'm planning to address this as part of
> > > https://issues.apache.org/jira/browse/FLINK-16411, where I work on
> > > centralizing all mvn invocations.
> > >
> > >
> > >> - there is no real benefit in retiring the travis support in CiBot;
> the
> > >> important part is whether Travis is run or not for pull requests.
> > >>  From what I can tell though azure seems to be working fine for pull
> > >> requests, so +1 to at least disable the travis PR runs.
> > >
> > > So we disable Travis for https://github.com/flink-ci/flink ? I will do
> > it
> > > once there are no new concerns and above tickets are resolved.
> > >
> > > What about disabling travis for master pushes? (e.g. removing the
> > > .travis.yml file from master)?
> > >
> > >
> > > @Dian:
> > > Thanks a lot for your feedback.
> > >
> > > - The report of Azure is still not viewable[1] (I noticed that Hequn
> has
> > >> also reported this issue in another thread). This is very useful
> > >> information.
> > >
> > > You are referring to the emails send to builds@f.a.o right?
> > > I have reported this both as a bug [1] and a feature request [2] to
> > Azure.
> > > But I don't believe they will resolve this issue anytime soon.
> > > Azure has a notifications API that we could use to build a service
> that
> > > sends emails to that list, but I feel that this is really a waste of
> > time.
> > > The URL in the link even

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-26 Thread Stephan Ewen
Nice, thanks a lot!

On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo  wrote:

> Thanks for the suggestion, @Stephan, @Becket and @Xintong.
>
> I've updated the FLIP accordingly. I do not add a
> ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> which takes the responsibility of all relevant operations on both RM
> and TM sides.
> After a rethink about decoupling the management of external resources
> from TaskExecutor, I think we could do the same thing on the
> ResourceManager side. We do not need to add a specific allocation
> logic to the ResourceManager each time we add a specific external
> resource.
> - For Yarn, we need the ExternalResourceDriver to edit the
> containerRequest.
> - For Kubernetes, ExternalResourceDriver could provide a decorator for
> the TM pod.
>
> In this way, just like MetricReporter, we allow users to define their
> custom ExternalResourceDriver. It is more extensible and fits the
> separation of concerns. For more details, please take a look at [1].
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
>
> Best,
> Yangze Guo
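
To make the quoted ExternalResourceDriver idea a bit more tangible, here is a
rough sketch of such a driver; the interface name follows the updated FLIP, but
the methods below are assumptions made for illustration, not the proposed API.

import java.util.Set;

public interface ExternalResourceDriverSketch {

    // TM side: discover the concrete units (e.g. GPU indices) backing the
    // configured amount and expose them to operators.
    Set<? extends ExternalResourceInfoSketch> retrieveResourceInfo(long amount) throws Exception;

    // The RM side would additionally hook into deployment, e.g. by editing the
    // Yarn container request or decorating the Kubernetes TaskManager pod;
    // that part is deliberately omitted here.

    interface ExternalResourceInfoSketch {
        String getProperty(String key); // e.g. "index" -> "0" for the first GPU
    }
}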
>
> On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen  wrote:
> >
> > This sounds good to go ahead from my side.
> >
> > I like the approach that Becket suggested - in that case the core
> > abstraction that everyone would need to understand would be "external
> > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > code would be a specific implementation only known to that component that
> > allocates the external resource. That fits the separation of concerns
> well.
> >
> > I also understand that it should not be over-engineered in the first
> > version, so some simplification makes sense, and then gradually expand
> from
> > there.
> >
> > So +1 to go ahead with what was suggested above (Xintong / Becket) from
> my
> > side.
> >
> > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song 
> wrote:
> >
> > > Thanks for the comments, Stephan & Becket.
> > >
> > > @Stephan
> > >
> > > I see your concern, and I completely agree with you that we should
> first
> > > think about the "library" / "plugin" / "extension" style if possible.
> > >
> > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > although it looks that it would belong to the slot then. Is that
> what we
> > > > are doing here?
> > >
> > >
> > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > slots, because it could be problematic without dynamic slot allocation.
> > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > slots.
> > >
> > > I think it makes sense to eventually have the GPUs assigned to slots.
> Even
> > > then, we might still need a TM level GPUManager (or ResourceProvider
> like
> > > Becket suggested). For memory, in each slot we can simply request the
> > > amount of memory, leaving it to JVM / OS to decide which memory
> (address)
> > > should be assigned. For GPU, and potentially other resources like
> FPGA, we
> > > need to explicitly specify which GPU (index) should be used.
> Therefore, we
> > > need some component at the TM level to coordinate which slot uses which
> > > GPU.
> > >
> > > IMO, unless we say Flink will not support slot-level GPU slicing at
> least
> > > in the foreseeable future, I don't see a good way to avoid touching
> the TM
> > > core. To that end, I think Becket's suggestion points to a good
> direction,
> > > that supports more features (GPU, FPGA, etc.) with less coupling to
> the TM
> > > core (only needs to understand the general interfaces). The detailed
> > > implementation for specific resource types can even be encapsulated as
> a
> > > library.
> > >
> > > @Becket
> > >
> > > Thanks for sharing your thought on the final state. Despite the
> details how
> > > the interfaces should look like, I think this is a really good
> abstraction
> > > for supporting general resource types.
> > >
> > > I'd like to further clarify that, the following three things are all
> that
> > > the "Flink core" needs to understand.
> > >
> > >- The *amount* of resource, for scheduling. Actually, we already
> have
> > >the Resource class in ResourceProfile and ResourceSpec for extended
> > >resource. It's just not really used.
> > >- The *info*, that Flink provides to the operators / user codes.
> > >- The *provider*, which generates the info based on the amount.
> > >
> > > The "core" does not need to understand the specific implementation
> details
> > > of the above three. They can even be implemented in a 3rd-party
> library.
> > > Similar to how we allow users to define their custom MetricReporter.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin 
> wrote:
> > >
> > > > Thanks for the comment, Stephan.
> > > >
> > > >   - If everything becomes a "core feature", it will make the project
> hard
> > > > > to develop in the fu

Re: [DISCUSS] FLIP-84 Feedback Summary

2020-03-26 Thread godfrey he
Hi Timo,

I agree with you that streaming queries mostly need async execution.
In fact, our original plan was to introduce only sync methods in this FLIP;
async methods (like "executeSqlAsync") would be introduced in the future,
as mentioned in the appendix.

Maybe the async methods also need to be considered in this FLIP.

I think sync methods are also useful for streaming, e.g. they can be used to
run bounded sources.
Maybe we should check whether all sources are bounded in sync execution
mode.

>Also, if we block for streaming queries, we could never support
> multiline files. Because the first INSERT INTO would block the further
> execution.
Agree with you; we need an async method to submit multiline files, and such
files should be restricted so that the DQL and DML statements are always at
the end for streaming.
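
For context, a sketch of how the proposed methods could be used; the method
names follow this thread's proposal, the table definitions are placeholders,
and whether execute() blocks for streaming jobs is exactly the open question.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class Flip84UsageSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().build());

        // DDL: no Flink job is submitted, the call returns immediately
        tEnv.executeSql("CREATE TABLE src (id INT) WITH ('connector' = '...')");
        tEnv.executeSql("CREATE TABLE dst (id INT) WITH ('connector' = '...')");

        // bundle several INSERTs into one job
        StatementSet set = tEnv.createStatementSet();
        set.addInsertSql("INSERT INTO dst SELECT id FROM src");
        TableResult result = set.execute(); // sync vs. async for streaming: see the discussion above
    }
}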

Best,
Godfrey

Timo Walther  于2020年3月26日周四 下午4:29写道:

> Hi Godfrey,
>
> having control over the job after submission is a requirement that was
> requested frequently (some examples are [1], [2]). Users would like to
> get insights about the running or completed job. Including the jobId,
> jobGraph etc., the JobClient summarizes these properties.
>
> It is good to have a discussion about synchronous/asynchronous
> submission now to have a complete execution picture.
>
> I thought we submit streaming queries mostly async and just wait for the
> successful submission. If we block for streaming queries, how can we
> collect() or print() results?
>
> Also, if we block for streaming queries, we could never support
> multiline files. Because the first INSERT INTO would block the further
> execution.
>
> If we decide to block entirely on streaming queries, we need the async
> execution methods in the design already. However, I would rather go for
> non-blocking streaming queries. Also with the `EMIT STREAM` key word in
> mind that we might add to SQL statements soon.
>
> Regards,
> Timo
>
> [1] https://issues.apache.org/jira/browse/FLINK-16761
> [2] https://issues.apache.org/jira/browse/FLINK-12214
>
> On 25.03.20 16:30, godfrey he wrote:
> > Hi Timo,
> >
> > Thanks for the updating.
> >
> > Regarding to "multiline statement support", I'm also fine that
> > `TableEnvironment.executeSql()` only supports single line statement, and
> we
> > can support multiline statement later (needs more discussion about this).
> >
> > Regarding to "StatementSet.explian()", I don't have strong opinions about
> > that.
> >
> > Regarding to "TableResult.getJobClient()", I think it's unnecessary. The
> > reason is: first, many statements (e.g. DDL, show xx, use xx)  will not
> > submit a Flink job. second, `TableEnvironment.executeSql()` and
> > `StatementSet.execute()` are synchronous method, `TableResult` will be
> > returned only after the job is finished or failed.
> >
> > Regarding to "whether StatementSet.execute() needs to throw exception", I
> > think we should choose a unified way to tell whether the execution is
> > successful. If `TableResult` contains ERROR kind (non-runtime exception),
> > users need to not only check the result but also catch the runtime
> > exception in their code. or `StatementSet.execute()` does not throw any
> > exception (including runtime exception), all exception messages are in
> the
> > result.  I prefer "StatementSet.execute() needs to throw exception". cc
> @Jark
> > Wu 
> >
> > I will update the agreed parts to the document first.
> >
> > Best,
> > Godfrey
> >
> >
> > Timo Walther  于2020年3月25日周三 下午6:51写道:
> >
> >> Hi Godfrey,
> >>
> >> thanks for starting the discussion on the mailing list. And sorry again
> >> for the late reply to FLIP-84. I have updated the Google doc one more
> >> time to incorporate the offline discussions.
> >>
> >>   From Dawid's and my view, it is fine to postpone the multiline support
> >> to a separate method. This can be future work even though we will need
> >> it rather soon.
> >>
> >> If there are no objections, I suggest to update the FLIP-84 again and
> >> have another voting process.
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 25.03.20 11:17, godfrey he wrote:
> >>> Hi community,
> >>> Timo, Fabian and Dawid have some feedbacks about FLIP-84[1]. The
> >> feedbacks
> >>> are all about new introduced methods. We had a discussion yesterday,
> and
> >>> most of feedbacks have been agreed upon. Here is the conclusions:
> >>>
> >>> *1. about proposed methods in `TableEnvironment`:*
> >>>
> >>> the original proposed methods:
> >>>
> >>> TableEnvironment.createDmlBatch(): DmlBatch
> >>> TableEnvironment.executeStatement(String statement): ResultTable
> >>>
> >>> the new proposed methods:
> >>>
> >>> // we should not use abbreviations in the API, and the term "Batch" is
> >>> easily confused with batch/streaming processing
> >>> TableEnvironment.createStatementSet(): StatementSet
> >>>
> >>> // every method that takes SQL should have `Sql` in its name
> >>> // supports multiline statement ???
> >>> TableEnvironment.executeSql(String statement): TableResult
> >>>
> >>> // 

[jira] [Created] (FLINK-16807) Improve reporting of errors during resource initialization

2020-03-26 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16807:


 Summary: Improve reporting of errors during resource initialization
 Key: FLINK-16807
 URL: https://issues.apache.org/jira/browse/FLINK-16807
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


The factories for resources currently return Optionals for handling failed 
instantiations, which are an expected occurrence. The factories themselves 
only log the exception.

This has the downside that no error is encoded in the optional, so if no 
resource can be instantiated we cannot enrich the final exception in any way.
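
A self-contained illustration of the limitation (the factory and resource type 
below are made up):

{code}
import java.util.Optional;

class ResourceFactoryExample {
    interface Resource {}

    static Optional<Resource> tryCreate() {
        try {
            throw new IllegalStateException("docker not available"); // simulated failure
        } catch (Exception e) {
            // the cause is only logged here and is lost to the caller
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<Resource> r = tryCreate();
        // the caller cannot enrich its own error message with the original cause
        System.out.println(r.isPresent() ? "resource ready" : "no resource available (reason unknown)");
    }
}
{code}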



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16808) Consolidated logging in FactoryUtils

2020-03-26 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16808:


 Summary: Consolidated logging in FactoryUtils
 Key: FLINK-16808
 URL: https://issues.apache.org/jira/browse/FLINK-16808
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


Resource factories currently follow a convention for how failed/successful 
instantiation should be logged, but we can enforce this to be handled 
consistently in {{FactoryUtils}} with FLINK-16805.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16809) Add Caller Context in Flink

2020-03-26 Thread zhuqi (Jira)
zhuqi created FLINK-16809:
-

 Summary: Add Caller Context in Flink
 Key: FLINK-16809
 URL: https://issues.apache.org/jira/browse/FLINK-16809
 Project: Flink
  Issue Type: Improvement
Reporter: zhuqi


Spark and Hive already have the Caller Context to meet the HDFS job audit 
requirement.

I think Flink should also add it. In our cluster, Flink jobs may put big 
pressure on HDFS, and a caller context would make it easier to find the root job.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-95: New TableSource and TableSink interfaces

2020-03-26 Thread Dawid Wysakowicz
Hi Becket,

Generally I don't think connector developers should bother with
understanding any of the SQL concepts.

I am not sure if we understand "connector developer" the same way. Let
me describe how I see the process of writing a new source (that can be
used in both Table & DataStream API)

1. Connector developer writes a Source that deals with the actual
reading and deserializing (preferably with a pluggable
format/deserializer). The result of that step should be something like:

FilesystemSource
    .path(...)
    .format(ParquetFormat
        .filterPredicate(/* parquet specific filter */)
        .project(/* parquet specific projection */)
        .map(...))
    .watermarkAssigner(...)

This is useful for DataStream and we can and want to use this in the
Table API. Those interfaces shouldn't accept any *Translators though. It
makes no sense because internally they are not dealing with e.g. the
Expression. They should accept already created predicates.

We are not designing anything at that level. This is what we expect from
FLIP-27.

2. Then we need to have a DynamicTableSource with different abilities
that can create e.g. the parquet filter or projection from expressions.
I think this is what you also describe in your second point. And this is
what we are designing in the FLIP. Bear in mind that e.g. Deserializer
will be created out of multiple SQL concepts: regular schema/computed
columns/possibly projections etc., each applied at different planning
stages.

All of those interfaces serve the purpose of configuring the
DynamicTableSource so that it is able to instantiate the Source with
proper configuration. In other words it is a factory for the source that
you can configure with SQL concepts. In turn this Factory will call
another factory from point 1.
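
A heavily simplified, self-contained sketch of these two layers; the stand-in
types below are NOT the FLIP-95 interfaces, they only show where the
SQL-to-connector translation happens before the point-1 builder is called.

import java.util.List;

public class TwoLayerSketch {

    static final class SqlExpression {            // stand-in for a planner expression
        final String text;
        SqlExpression(String text) { this.text = text; }
    }

    static final class FilesystemDynamicSource {  // stand-in for the SQL-facing factory (point 2)
        private String parquetPredicate = "true";

        void applyFilters(List<SqlExpression> filters) {
            // translate SQL expressions into the connector concept (e.g. a Parquet predicate)
            StringBuilder sb = new StringBuilder();
            for (SqlExpression f : filters) {
                if (sb.length() > 0) {
                    sb.append(" AND ");
                }
                sb.append(f.text);
            }
            parquetPredicate = sb.toString();
        }

        String createRuntimeSource() {
            // the point-1 builder only ever sees the already-translated filter
            return "FilesystemSource.format(ParquetFormat.filterPredicate(" + parquetPredicate + "))";
        }
    }

    public static void main(String[] args) {
        FilesystemDynamicSource source = new FilesystemDynamicSource();
        source.applyFilters(List.of(new SqlExpression("amount > 10")));
        System.out.println(source.createRuntimeSource());
    }
}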

I don't see a potential for unifying factories across different high-level
APIs. Take your example of a Spatial Database that operates on
Coordinates and Area (even though those would rather be modeled as SQL
types and we would still operate on Rows, but just for the sake of the
example): in that respect there is no point in having a
PushDownComputedColumns interface in the factory for the spatial database.

Best,

Dawid


On 26/03/2020 11:47, Becket Qin wrote:
> Hi Timo,
>
> Regarding "connector developers just need to know how to write an
>> ExpressionToParquetFilter":
>>
>
> This is the entire purpose of the DynamicTableSource/DynamicTableSink.
>> The bridging between SQL concepts and connector specific concepts.
>> Because this is the tricky part. How to get from a SQL concept to a
>> connctor concept.
>
> Maybe it is just a naming issue depending on whether one is looking upward
> from the Connectors perspective, or looking downward from the SQL
> perspective. If we agree that the connectors should provide semantic free
> API to the high level use cases, it seems we should follow the former path.
> And if there are one or two APIs that the connector developers have to
> understand in order to support Table / SQL, I think we can just address
> them case by case, instead of wrapping the entire low level source API
> with a set of new concepts.
>
> Correct me if I am wrong, can we tell the following story to a connector
> developer and get a all the TableSource functionality work?
>
> To provide a TableSource from a Source, one just need to know two more
> concepts: *Row* and *Expression*. The work to create a TableSource are
> following:
> 1. A connector developer can write three classes in order to build a table
> source:
>
>- Deserializer (Must-have)
>- PredicateTranslator (optional, only
>applicable if the Source is a FilterableSource)
>- PredicateTranslator (optional, only
>applicable if the Source is a ProjectableSource)
>
> 2. In order to let the table source be discoverable, one need to provide a
> Factory, and that Factory provides the following as a bundle:
>
>- The Source itself (Must-have)
>- The Deserializer (Must-have)
>- PredicateTranslator (optional, only
>applicable when the Factory is a FilterFactory)
>- PredicateTranslator (optional, only
>applicable when the Factory is a ProjectorFactory)
>
> 3. The Deserializer may implement one more decorative interfaces to
> further convert the record after deserialization.
>
>- withMapFunction;
>
> Note that the above description only require the connector developer to
> understand Expression and Row. If this works, It is much easier to explain
> than throwing a full set of new concepts. More importantly, it is way more
> generic. For example, If we change Row to Coordinates, and Expression to
> Area, we easily get a Source for a Spatial Database.
>
>
> One thing I want to call out is that while the old SourceFunction and
> InputFormat are concrete implementations that does the actual IO work. The
> Source API in FLIP-27 itself is kind of a Factory by itself already. So if
> we can push th

Re: [DISCUSS] FLIP-84 Feedback Summary

2020-03-26 Thread Timo Walther

Hi Godfrey,

executing streaming queries must be our top priority because this is 
what distinguishes Flink from competitors. If we change the execution 
behavior, we should think about the other cases as well to not break the 
API a third time.


I fear that just having an async execute method will not be enough 
because users should be able to mix streaming and batch queries in a 
unified scenario.


If I remember it correctly, we had some discussions in the past about 
what decides about the execution mode of a query. Currently, we would 
like to let the query decide, not derive it from the sources.


So I could imagine a multiline pipeline as:

USE CATALOG 'mycat';
INSERT INTO t1 SELECT * FROM s;
INSERT INTO t2 SELECT * FROM s JOIN t1 EMIT STREAM;

For executeMultilineSql():

sync because regular SQL
sync because regular Batch SQL
async because Streaming SQL

For executeAsyncMultilineSql():

async because everything should be async
async because everything should be async
async because everything should be async

What we should not do for executeAsyncMultilineSql():

sync because DDL
async because everything should be async
async because everything should be async

What are your thoughts here?

Regards,
Timo


On 26.03.20 12:50, godfrey he wrote:

Hi Timo,

I agree with you that streaming queries mostly need async execution.
In fact, our original plan is only introducing sync methods in this FLIP,
and async methods (like "executeSqlAsync") will be introduced in the future
which is mentioned in the appendix.

Maybe the async methods also need to be considered in this FLIP.

I think sync methods is also useful for streaming which can be used to run
bounded source.
Maybe we should check whether all sources are bounded in sync execution
mode.


Also, if we block for streaming queries, we could never support
multiline files. Because the first INSERT INTO would block the further
execution.

agree with you, we need async method to submit multiline files,
and files should be limited that the DQL and DML should be always in the
end for streaming.

Best,
Godfrey

Timo Walther  于2020年3月26日周四 下午4:29写道:


Hi Godfrey,

having control over the job after submission is a requirement that was
requested frequently (some examples are [1], [2]). Users would like to
get insights about the running or completed job. Including the jobId,
jobGraph etc., the JobClient summarizes these properties.

It is good to have a discussion about synchronous/asynchronous
submission now to have a complete execution picture.

I thought we submit streaming queries mostly async and just wait for the
successful submission. If we block for streaming queries, how can we
collect() or print() results?

Also, if we block for streaming queries, we could never support
multiline files. Because the first INSERT INTO would block the further
execution.

If we decide to block entirely on streaming queries, we need the async
execution methods in the design already. However, I would rather go for
non-blocking streaming queries. Also with the `EMIT STREAM` key word in
mind that we might add to SQL statements soon.

Regards,
Timo

[1] https://issues.apache.org/jira/browse/FLINK-16761
[2] https://issues.apache.org/jira/browse/FLINK-12214

On 25.03.20 16:30, godfrey he wrote:

Hi Timo,

Thanks for the updating.

Regarding to "multiline statement support", I'm also fine that
`TableEnvironment.executeSql()` only supports single line statement, and

we

can support multiline statement later (needs more discussion about this).

Regarding to "StatementSet.explian()", I don't have strong opinions about
that.

Regarding to "TableResult.getJobClient()", I think it's unnecessary. The
reason is: first, many statements (e.g. DDL, show xx, use xx)  will not
submit a Flink job. second, `TableEnvironment.executeSql()` and
`StatementSet.execute()` are synchronous method, `TableResult` will be
returned only after the job is finished or failed.

Regarding to "whether StatementSet.execute() needs to throw exception", I
think we should choose a unified way to tell whether the execution is
successful. If `TableResult` contains ERROR kind (non-runtime exception),
users need to not only check the result but also catch the runtime
exception in their code. or `StatementSet.execute()` does not throw any
exception (including runtime exception), all exception messages are in

the

result.  I prefer "StatementSet.execute() needs to throw exception". cc

@Jark

Wu 

I will update the agreed parts to the document first.

Best,
Godfrey


Timo Walther  于2020年3月25日周三 下午6:51写道:


Hi Godfrey,

thanks for starting the discussion on the mailing list. And sorry again
for the late reply to FLIP-84. I have updated the Google doc one more
time to incorporate the offline discussions.

   From Dawid's and my view, it is fine to postpone the multiline support
to a separate method. This can be future work even though we will need
it rather soon.

If there are no objections, I suggest to update the FLIP-84 

Re: [DISCUSS] Drop Bucketing Sink

2020-03-26 Thread Robert Metzger
By the way: I have disabled the test in the Hadoop 2.4.1 build, so my
original problem is resolved.

I'm not convinced that posting on the user@ ml is the right approach.
The last few questions to the user@ list asking for feedback have not
really been answered. I believe that we have deprecated the sink for long
enough, also we seem to have a common understanding of missing features we
need to resolve first.
Last but not least, we can always port the bucketing sink to another
repository (apache bahir) and keep compatibility with later Flink versions
there if a lot of users complain once it has been removed.


On Tue, Mar 17, 2020 at 10:35 AM Kostas Kloudas  wrote:

> Thanks Robert for all this,
>
> I think that we should also post a thread in the user ML so that users
> can also comment on the topic.
>
> What do you think?
>
> Kostas
>
> On Mon, Mar 16, 2020 at 12:27 PM Robert Metzger 
> wrote:
> >
> > Thank you all for your feedback.
> >
> > I will try to fix the test then (or disable it).
> >
> > Here's a ticket for dropping the BucketingSink:
> > https://issues.apache.org/jira/browse/FLINK-16616 Please mark whatever
> we
> > consider necessary as a "depends on" ticket.
> > @David / @Seth: Where are the tickets depending on FLIP-46 listed? Can
> one
> > of you add them to FLINK-16616
> >
> >
> > On Fri, Mar 13, 2020 at 11:32 AM Guowei Ma  wrote:
> >
> > > +1 to drop it.
> > >
> > > To Jingsong :
> > > we are planning to implement the orc StreamingFileSink in 1.11.
> > > I think users could also reference the old BucketingSink from the old
> version.
> > >
> > > Best,
> > > Guowei
> > >
> > >
> > > Jingsong Li  于2020年3月13日周五 上午10:07写道:
> > >
> > > > Hi Robert,
> > > >
> > > > +1 to drop it but maybe not 1.11.
> > > >
> > > > ORC has not been supported on StreamingFileSink. I have seen lots of
> > > users
> > > > run ORC in the bucketing sink.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > On Fri, Mar 13, 2020 at 1:11 AM Seth Wiesman 
> > > wrote:
> > > >
> > > > > Sorry, I meant FLIP-46.
> > > > >
> > > > > Seth
> > > > >
> > > > > On Thu, Mar 12, 2020 at 11:52 AM Seth Wiesman  >
> > > > wrote:
> > > > >
> > > > > > I agree with David, I think FLIP-49 needs to be prioritized for
> 1.11
> > > if
> > > > > we
> > > > > > want to drop the bucketing sink.
> > > > > >
> > > > > > Seth
> > > > > >
> > > > > > On Thu, Mar 12, 2020 at 10:53 AM David Anderson <
> da...@ververica.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > >> The BucketingSink is still somewhat widely used, I think in part
> > > > because
> > > > > >> of
> > > > > >> shortcomings in the StreamingFileSink.
> > > > > >>
> > > > > >> I would hope that in tandem with removing the bucketing sink we
> > > could
> > > > > also
> > > > > >> address some of these issues. I'm thinking in particular of
> issues
> > > > that
> > > > > >> are
> > > > > >> waiting on FLIP-46 [1].
> > > > > >>
> > > > > >> Removing the bucketing sink will go down better, in my opinion,
> if
> > > > it's
> > > > > >> coupled with progress on some of the open StreamingFileSink
> tickets.
> > > > > >>
> > > > > >> Best,
> > > > > >> David
> > > > > >>
> > > > > >> [1]
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-46%3A+Graceful+Shutdown+Handling+by+UDFs
> > > > > >>
> > > > > >>
> > > > > >> On Thu, Mar 12, 2020 at 4:27 PM Zhijiang <
> > > wangzhijiang...@aliyun.com
> > > > > >> .invalid>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Thanks for driving this discussion, Robert!
> > > > > >> >
> > > > > >> > This e2e test really fails frequently.  +1 to drop bucketing
> sink,
> > > > it
> > > > > is
> > > > > >> > not worth paying more efforts since deprecated.
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > Zhijiang
> > > > > >> >
> > > > > >> >
> > > > > >> >
> --
> > > > > >> > From:Jeff Zhang 
> > > > > >> > Send Time:2020 Mar. 12 (Thu.) 23:17
> > > > > >> > To:dev 
> > > > > >> > Subject:Re: [DISCUSS] Drop Bucketing Sink
> > > > > >> >
> > > > > >> > +1, dropping deprecated api is always necessary for a
> sustainable
> > > > > >> project.
> > > > > >> >
> > > > > >> > Kostas Kloudas  于2020年3月12日周四 下午11:06写道:
> > > > > >> >
> > > > > >> > > Hi Robert,
> > > > > >> > >
> > > > > >> > > +1 for dropping the BucketingSink.
> > > > > >> > > In any case, it has not been maintained for quite some time.
> > > > > >> > >
> > > > > >> > > Cheers,
> > > > > >> > > Kostas
> > > > > >> > >
> > > > > >> > > On Thu, Mar 12, 2020 at 3:41 PM Robert Metzger <
> > > > rmetz...@apache.org
> > > > > >
> > > > > >> > > wrote:
> > > > > >> > > >
> > > > > >> > > > Hi all,
> > > > > >> > > >
> > > > > >> > > > I'm currently investigating a failing end to end test for
> the
> > > > > >> bucketing
> > > > > >> > > > sink [1].
> > > > > >> > > > The bucketing sink has been deprecated in the 1.9 release
> [2],
> > > > > >> because
> > > > > >> > we
> > > > > >> >

Re: [DISCUSS] FLIP-95: New TableSource and TableSink interfaces

2020-03-26 Thread Becket Qin
Hi Timo and Dawid,

Thanks for the patient explanation. I just had a phone call with Kurt and
Jark. I do see there are a few abstractions that we only see the use case
in SQL so far. Therefore, while a Source abstraction that can be shared
across use cases with different semantics is theoretically useful, doing
that may not bring us much value at this point. So I am convinced that it
doesn't have to be done right now and I have no further concern with the
design in the current FLIP.

Again, really appreciate the patient discussion! I learned quite a bit from
it.

Cheers,

Jiangjie (Becket) Qin

On Thu, Mar 26, 2020 at 8:58 PM Dawid Wysakowicz 
wrote:

> Hi Becket,
>
> Generally I don't think connector developers should bother with
> understanding any of the SQL concepts.
>
> I am not sure if we understand "connector developer" the same way. Let me
> describe how I see the process of writing a new source (that can be used in
> both Table & DataStream API)
>
> 1. Connector developer writes a Source that deals with the actual reading
> and deserializing (preferably with a pluggable format/deserializer). The
> result of that step should be something like:
>
> FilesystemSource
>
> .path(...)
>
> .format(ParquetFormat
>
> .filterPredicate(/* parquet specific filter */)
>
> .project(/* parquet specific projection */)
>
> .map(...))
>
> .watermarkAssigner(...)
>
>
> This is useful for DataStream and we can and want to use this in the Table
> API. Those interfaces shouldn't accept any *Translators, though. That makes
> no sense because internally they do not deal with, e.g., Expressions.
> They should accept already created predicates.
>
> We are not designing anything at that level. This we expect from FLIP-27
>
> 2. Then we need to have a DynamicTableSource with different abilities that
> can create e.g. the parquet filter or projection from expressions. I think
> this is what you also describe in your second point. And this is what we
> are designing in the FLIP. Bear in mind that e.g. Deserializer will be
> created out of multiple SQL concepts: regular schema/computed
> columns/possibly projections etc., each applied at different planning
> stages.
>
> All of those interfaces serve the purpose of configuring the
> DynamicTableSource so that it is able to instantiate the Source with proper
> configuration. In other words it is a factory for the source that you can
> configure with SQL concepts. In turn this Factory will call another factory
> from point 1.
>
> I don't see a potential for unifying factories across different high level
> APIs. Taking your example with Spatial Database that operates on
> Coordinates and Area (even though those would rather be modeled as SQL
> types and we would still operate on Rows, but just for the sake of the
> example). In that respect there is no point in having a
> PushDownComputedColumns interface in the factory for the spatial database.
>
> Best,
>
> Dawid
>
>
> On 26/03/2020 11:47, Becket Qin wrote:
>
> Hi Timo,
>
> Regarding "connector developers just need to know how to write an
>
> ExpressionToParquetFilter":
>
>
> This is the entire purpose of the DynamicTableSource/DynamicTableSink.
>
> The bridging between SQL concepts and connector specific concepts.
> Because this is the tricky part. How to get from a SQL concept to a
> connector concept.
>
> Maybe it is just a naming issue depending on whether one is looking upward
> from the Connectors perspective, or looking downward from the SQL
> perspective. If we agree that the connectors should provide semantic free
> API to the high level use cases, it seems we should follow the former path.
> And if there are one or two APIs that the connector developers have to
> understand in order to support Table / SQL, I think we can just address
> them case by case, instead of wrapping the entire low level source API
> with a set of new concepts.
>
> Correct me if I am wrong, can we tell the following story to a connector
> developer and get all the TableSource functionality to work?
>
> To provide a TableSource from a Source, one just need to know two more
> concepts: *Row* and *Expression*. The work to create a TableSource are
> following:
> 1. A connector developer can write three classes in order to build a table
> source:
>
>- Deserializer (Must-have)
>- PredicateTranslator (optional, only
>applicable if the Source is a FilterableSource)
>- PredicateTranslator (optional, only
>applicable if the Source is a ProjectableSource)
>
> 2. In order to let the table source be discoverable, one need to provide a
> Factory, and that Factory provides the following as a bundle:
>
>- The Source itself (Must-have)
>- The Deserializer (Must-have)
>- PredicateTranslator (optional, only
>applicable when the Factory is a FilterFactory)
>- PredicateTranslator (optional, only
>applicable when the Factory is a ProjectorFactory)
>

Re: [DISCUSS] FLIP-95: New TableSource and TableSink interfaces

2020-03-26 Thread Timo Walther

Hi Becket,

thanks for your feedback and the healthy discussion.

I think the connector story will keep many of us busy for some time. It 
would be great if concepts from SQL can positively influence the design of 
Source/Sink abstractions. Especially, we should think about some guidelines 
for how to design a connector against a semantic-free API, as Dawid pointed 
out in his last email. We should not aim to develop 
SQL-specific/SQL-only runtime connectors.


@all: If there are no objections, I would like to start a voting thread 
by tomorrow. So this is the last call to give feedback for FLIP-95.


Thanks everyone,
Timo


On 26.03.20 14:56, Becket Qin wrote:

Hi Timo and Dawid,

Thanks for the patient explanation. I just had a phone call with Kurt and
Jark. I do see there are a few abstractions that we only see the use case
in SQL so far. Therefore, while a Source abstraction that can be shared
across use cases with different semantics is theoretically useful, doing
that may not bring us much value at this point. So I am convinced that it
doesn't have to be done right now and I have no further concern with the
design in the current FLIP.

Again, really appreciate the patient discussion! I learned quite a bit from
it.

Cheers,

Jiangjie (Becket) Qin

On Thu, Mar 26, 2020 at 8:58 PM Dawid Wysakowicz 
wrote:


Hi Becket,

Generally I don't think connector developers should bother with
understanding any of the SQL concepts.

I am not sure if we understand "connector developer" the same way. Let me
describe how I see the process of writing a new source (that can be used in
both Table & DataStream API)

1. Connector developer writes a Source that deals with the actual reading
and deserializing (preferably with a pluggable format/deserializer). The
result of that step should be something like:

FilesystemSource

 .path(...)

 .format(ParquetFormat

 .filterPredicate(/* parquet specific filter */)

 .project(/* parquet specific projection */)

 .map(...))

 .watermarkAssigner(...)


This is useful for DataStream and we can and want to use this in the Table
API. Those interfaces shouldn't accept any *Translators, though. That makes
no sense because internally they do not deal with, e.g., Expressions.
They should accept already created predicates.

We are not designing anything at that level. This we expect from FLIP-27

2. Then we need to have a DynamicTableSource with different abilities that
can create e.g. the parquet filter or projection from expressions. I think
this is what you also describe in your second point. And this is what we
are designing in the FLIP. Bear in mind that e.g. Deserializer will be
created out of multiple SQL concepts: regular schema/computed
columns/possibly projections etc., each applied at different planning
stages.

All of those interfaces serve the purpose of configuring the
DynamicTableSource so that it is able to instantiate the Source with proper
configuration. In other words it is a factory for the source that you can
configure with SQL concepts. In turn this Factory will call another factory
from point 1.
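
To make this concrete, a rough sketch (all names are illustrative and follow
the draft interfaces only loosely; `ParquetFilter`, `toParquetFilter` and the
runtime provider are placeholders):

// Sketch only: a SQL-facing source that translates planner Expressions into a
// connector-specific predicate and then builds the runtime Source from point 1.
public class FileSystemDynamicSource implements ScanTableSource, SupportsFilterPushDown {

    private ParquetFilter pushedFilter; // placeholder for the connector-specific predicate

    @Override
    public Result applyFilters(List<ResolvedExpression> filters) {
        // SQL concept (ResolvedExpression) -> connector concept (ParquetFilter)
        this.pushedFilter = toParquetFilter(filters); // placeholder translation helper
        return Result.of(filters, Collections.emptyList());
    }

    @Override
    public ScanRuntimeProvider getScanRuntimeProvider(ScanContext context) {
        // Configure the semantic-free runtime source with the already translated predicate.
        return SourceProvider.of(
                FilesystemSource.path("...")
                        .format(ParquetFormat.filterPredicate(pushedFilter))
                        .build());
    }

    // getChangelogMode(), copy(), asSummaryString() etc. omitted in this sketch
}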

I don't see a potential for unifying factories across different high level
APIs. Taking your example with Spatial Database that operates on
Coordinates and Area (even though those would rather be modeled as SQL
types and we would still operate on Rows, but just for the sake of the
example). In that respect there is no point in having a
PushDownComputedColumns interface in the factory for the spatial database.

Best,

Dawid


On 26/03/2020 11:47, Becket Qin wrote:

Hi Timo,

Regarding "connector developers just need to know how to write an

ExpressionToParquetFilter":


This is the entire purpose of the DynamicTableSource/DynamicTableSink.

The bridging between SQL concepts and connector specific concepts.
Because this is the tricky part. How to get from a SQL concept to a
connector concept.

Maybe it is just a naming issue depending on whether one is looking upward
from the Connectors perspective, or looking downward from the SQL
perspective. If we agree that the connectors should provide semantic free
API to the high level use cases, it seems we should follow the former path.
And if there are one or two APIs that the connector developers have to
understand in order to support Table / SQL, I think we can just address
them case by case, instead of wrapping the entire low level source API
with a set of new concepts.

Correct me if I am wrong, can we tell the following story to a connector
developer and get all the TableSource functionality to work?

To provide a TableSource from a Source, one just need to know two more
concepts: *Row* and *Expression*. The work to create a TableSource are
following:
1. A connector developer can write three classes in order to build a table
source:

- Deserializer (Must-have)
- PredicateTranslator (optional, only
applicable if the Source is a FilterableSource

Re: [Discuss] FLINK-16039 Add API method to get last element in session window

2020-03-26 Thread Dawid Wysakowicz
Hi Manas,

First of all I think your understanding of how the session windows work
is correct.

I tend to slightly disagree that the end for a session window is wrong.
It is my personal opinion though. I see it this way that a TimeWindow in
case of a session window is the session itself. The session always ends
after a period of inactivity. Take a user session on a webpage. Such a
session does not end/isn't brought down at the time of the last event. It
is closed after a period of inactivity. In such a scenario I think the
behavior of the session window is correct.

Moreover, you can achieve what you are describing with an aggregate[1]
function. You can easily maintain the largest timestamp seen for a window.
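
For example, a minimal sketch with the DataStream API (assuming an input
stream `events` of Tuple2<timestamp, key>):

events
    .keyBy(e -> e.f1)
    .window(EventTimeSessionWindows.withGap(Time.seconds(2)))
    .aggregate(new AggregateFunction<Tuple2<Long, String>, Long, Long>() {
        @Override
        public Long createAccumulator() {
            return Long.MIN_VALUE;
        }

        @Override
        public Long add(Tuple2<Long, String> event, Long maxSoFar) {
            // keep the largest event timestamp seen in this session
            return Math.max(maxSoFar, event.f0);
        }

        @Override
        public Long getResult(Long maxSoFar) {
            return maxSoFar; // timestamp of the last element in the session
        }

        @Override
        public Long merge(Long a, Long b) {
            return Math.max(a, b);
        }
    });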

Lastly, I think the overall feeling in the community is that we are very
skeptical towards extending the Windows API. From what I've heard and
experienced, the ProcessFunction[2] is a much better principle to build
custom solutions upon, as it is in fact easier to control and even
understand. That said, I am rather against introducing that change.

Best,

Dawid

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#aggregatefunction

[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#process-function-low-level-operations

On 13/03/2020 09:46, Manas Kale wrote:
> Hi all,
> I would like to start a discussion on this feature request (JIRA link).
> 
>
> Consider the events :
>
> [1, event], [2, event]
>
> where first element is event timestamp in seconds and second element is
> event code/name.
>
> Also consider that an Event time session window with inactivityGap = 2
> seconds is acting on above stream.
>
> When the first event arrives, a session window should be created that is
> [1,1].
>
> When the second event arrives, a new session window should be created that
> is [2,2]. Since this falls within firstWindowTimestamp+inactivityGap, it
> should be merged into session window [1,2] and  [2,2] should be deleted.
>
>
> *This is my understanding of how session windows are created. Please
> correct me if wrong.*
> However, Flink does not follow such a definition of windows semantically.
> If I call the  getEnd() method of the TimeWindow() class, I get back
> timestamp + inactivityGap.
>
> For the above example, after processing the first element, I would get 1 +
> 2 = 3 seconds as the window "end".
>
> The actual window end should be the timestamp 1, which is the last event in
> the session window.
>
> A solution would be to change the "end" definition of all windows, but I
> suppose this would be breaking and would need some debate.
>
> Therefore, I propose an intermediate solution : add a new API method that
> keeps track of the last element added in the session window.
>
> If there is agreement on this, I would like to start drafting a change
> document and implement this.
>



signature.asc
Description: OpenPGP digital signature


[DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Tzu-Li (Gordon) Tai
Hi Flink devs,

As part of a Stateful Functions release, we would like to publish Stateful
Functions Docker images to Dockerhub as an official image.

Some background context on Stateful Function images, for those who are not
familiar with the project yet:

   - Stateful Function images are built on top of the Flink official
   images, with additional StateFun dependencies being added.
   You can take a look at the scripts we currently use to build the images
   locally for development purposes [1].
   - They are quite important for user experience, since building a Docker
   image is the recommended go-to deployment mode for StateFun user
   applications [2].


A prerequisite for all of this is to first decide where we host the
Stateful Functions Dockerfiles,
before we can proceed with the process of requesting a new official image
repository at Dockerhub.

We’re proposing to create a new dedicated repo for this purpose,
with the name `apache/flink-statefun-docker`.

While we did initially consider integrating the StateFun Dockerfiles to be
hosted together with the Flink ones in the existing `apache/flink-docker`
repo, we had the following concerns:

   - In general, it is a convention that each official Dockerhub image is
   backed by a dedicated source repo hosting the Dockerfiles.
   - The `apache/flink-docker` repo already has quite a few dedicated
   tooling and CI smoke tests specific for the Flink images.
   - Flink and StateFun have separate versioning schemes and independent
   release cycles. A new Flink release does not necessarily require a
   “lock-step” to release new StateFun images as well.
   - Considering the above all-together, and the fact that creating a new
   repo is rather low-effort, having a separate repo would probably make more
   sense here.


What do you think?

Cheers,
Gordon

[1]
https://github.com/apache/flink-statefun/blob/master/tools/docker/build-stateful-functions.sh
[2]
https://ci.apache.org/projects/flink/flink-statefun-docs-master/deployment-and-operations/packaging.html


Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Stephan Ewen
+1 to a separate repository.

It seems to be best practice in the docker community.
And since it does not add overhead, why not go with the best practice?

Best,
Stephan


On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi Flink devs,
>
> As part of a Stateful Functions release, we would like to publish Stateful
> Functions Docker images to Dockerhub as an official image.
>
> Some background context on Stateful Function images, for those who are not
> familiar with the project yet:
>
>- Stateful Function images are built on top of the Flink official
>images, with additional StateFun dependencies being added.
>You can take a look at the scripts we currently use to build the images
>locally for development purposes [1].
>- They are quite important for user experience, since building a Docker
>image is the recommended go-to deployment mode for StateFun user
>applications [2].
>
>
> A prerequisite for all of this is to first decide where we host the
> Stateful Functions Dockerfiles,
> before we can proceed with the process of requesting a new official image
> repository at Dockerhub.
>
> We’re proposing to create a new dedicated repo for this purpose,
> with the name `apache/flink-statefun-docker`.
>
> While we did initially consider integrating the StateFun Dockerfiles to be
> hosted together with the Flink ones in the existing `apache/flink-docker`
> repo, we had the following concerns:
>
>- In general, it is a convention that each official Dockerhub image is
>backed by a dedicated source repo hosting the Dockerfiles.
>- The `apache/flink-docker` repo already has quite a few dedicated
>tooling and CI smoke tests specific for the Flink images.
>- Flink and StateFun have separate versioning schemes and independent
>release cycles. A new Flink release does not necessarily require a
>“lock-step” to release new StateFun images as well.
>- Considering the above all-together, and the fact that creating a new
>repo is rather low-effort, having a separate repo would probably make
> more
>sense here.
>
>
> What do you think?
>
> Cheers,
> Gordon
>
> [1]
>
> https://github.com/apache/flink-statefun/blob/master/tools/docker/build-stateful-functions.sh
> [2]
>
> https://ci.apache.org/projects/flink/flink-statefun-docs-master/deployment-and-operations/packaging.html
>


Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Igal Shilman
+1 for a separate repository.

Thanks,
Igal

On Thursday, March 26, 2020, Stephan Ewen  wrote:

> +1 to a separate repository.
>
> It seems to be best practice in the docker community.
> And since it does not add overhead, why not go with the best practice?
>
> Best,
> Stephan
>
>
> On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi Flink devs,
> >
> > As part of a Stateful Functions release, we would like to publish
> Stateful
> > Functions Docker images to Dockerhub as an official image.
> >
> > Some background context on Stateful Function images, for those who are
> not
> > familiar with the project yet:
> >
> >- Stateful Function images are built on top of the Flink official
> >images, with additional StateFun dependencies being added.
> >You can take a look at the scripts we currently use to build the
> images
> >locally for development purposes [1].
> >- They are quite important for user experience, since building a
> Docker
> >image is the recommended go-to deployment mode for StateFun user
> >applications [2].
> >
> >
> > A prerequisite for all of this is to first decide where we host the
> > Stateful Functions Dockerfiles,
> > before we can proceed with the process of requesting a new official image
> > repository at Dockerhub.
> >
> > We’re proposing to create a new dedicated repo for this purpose,
> > with the name `apache/flink-statefun-docker`.
> >
> > While we did initially consider integrating the StateFun Dockerfiles to
> be
> > hosted together with the Flink ones in the existing `apache/flink-docker`
> > repo, we had the following concerns:
> >
> >- In general, it is a convention that each official Dockerhub image is
> >backed by a dedicated source repo hosting the Dockerfiles.
> >- The `apache/flink-docker` repo already has quite a few dedicated
> >tooling and CI smoke tests specific for the Flink images.
> >- Flink and StateFun have separate versioning schemes and independent
> >release cycles. A new Flink release does not necessarily require a
> >“lock-step” to release new StateFun images as well.
> >- Considering the above all-together, and the fact that creating a new
> >repo is rather low-effort, having a separate repo would probably make
> > more
> >sense here.
> >
> >
> > What do you think?
> >
> > Cheers,
> > Gordon
> >
> > [1]
> >
> > https://github.com/apache/flink-statefun/blob/master/
> tools/docker/build-stateful-functions.sh
> > [2]
> >
> > https://ci.apache.org/projects/flink/flink-statefun-
> docs-master/deployment-and-operations/packaging.html
> >
>


Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Yu Li
+1 to use a dedicated repository. All reasons listed in the proposal make
sense to me.

Best Regards,
Yu


On Thu, 26 Mar 2020 at 23:56, Igal Shilman  wrote:

> +1 for a separate repository.
>
> Thanks,
> Igal
>
> On Thursday, March 26, 2020, Stephan Ewen  wrote:
>
> > +1 to a separate repository.
> >
> > It seems to be best practice in the docker community.
> > And since it does not add overhead, why not go with the best practice?
> >
> > Best,
> > Stephan
> >
> >
> > On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai  >
> > wrote:
> >
> > > Hi Flink devs,
> > >
> > > As part of a Stateful Functions release, we would like to publish
> > Stateful
> > > Functions Docker images to Dockerhub as an official image.
> > >
> > > Some background context on Stateful Function images, for those who are
> > not
> > > familiar with the project yet:
> > >
> > >- Stateful Function images are built on top of the Flink official
> > >images, with additional StateFun dependencies being added.
> > >You can take a look at the scripts we currently use to build the
> > images
> > >locally for development purposes [1].
> > >- They are quite important for user experience, since building a
> > Docker
> > >image is the recommended go-to deployment mode for StateFun user
> > >applications [2].
> > >
> > >
> > > A prerequisite for all of this is to first decide where we host the
> > > Stateful Functions Dockerfiles,
> > > before we can proceed with the process of requesting a new official
> image
> > > repository at Dockerhub.
> > >
> > > We’re proposing to create a new dedicated repo for this purpose,
> > > with the name `apache/flink-statefun-docker`.
> > >
> > > While we did initially consider integrating the StateFun Dockerfiles to
> > be
> > > hosted together with the Flink ones in the existing
> `apache/flink-docker`
> > > repo, we had the following concerns:
> > >
> > >- In general, it is a convention that each official Dockerhub image
> is
> > >backed by a dedicated source repo hosting the Dockerfiles.
> > >- The `apache/flink-docker` repo already has quite a few dedicated
> > >tooling and CI smoke tests specific for the Flink images.
> > >- Flink and StateFun have separate versioning schemes and
> independent
> > >release cycles. A new Flink release does not necessarily require a
> > >“lock-step” to release new StateFun images as well.
> > >- Considering the above all-together, and the fact that creating a
> new
> > >repo is rather low-effort, having a separate repo would probably
> make
> > > more
> > >sense here.
> > >
> > >
> > > What do you think?
> > >
> > > Cheers,
> > > Gordon
> > >
> > > [1]
> > >
> > > https://github.com/apache/flink-statefun/blob/master/
> > tools/docker/build-stateful-functions.sh
> > > [2]
> > >
> > > https://ci.apache.org/projects/flink/flink-statefun-
> > docs-master/deployment-and-operations/packaging.html
> > >
> >
>


Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Ufuk Celebi
+1.

The repo creation process is a light-weight, automated process on the ASF
side. When Patrick Lucas contributed docker-flink back to the Flink
community (as flink-docker), there was virtually no overhead in creating
the repository. Reusing build scripts should still be possible at the cost
of some duplication which is fine imo.

– Ufuk

On Thu, Mar 26, 2020 at 4:18 PM Stephan Ewen  wrote:
>
> +1 to a separate repository.
>
> It seems to be best practice in the docker community.
> And since it does not add overhead, why not go with the best practice?
>
> Best,
> Stephan
>
>
> On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai 
wrote:
>>
>> Hi Flink devs,
>>
>> As part of a Stateful Functions release, we would like to publish
Stateful
>> Functions Docker images to Dockerhub as an official image.
>>
>> Some background context on Stateful Function images, for those who are
not
>> familiar with the project yet:
>>
>>- Stateful Function images are built on top of the Flink official
>>images, with additional StateFun dependencies being added.
>>You can take a look at the scripts we currently use to build the
images
>>locally for development purposes [1].
>>- They are quite important for user experience, since building a
Docker
>>image is the recommended go-to deployment mode for StateFun user
>>applications [2].
>>
>>
>> A prerequisite for all of this is to first decide where we host the
>> Stateful Functions Dockerfiles,
>> before we can proceed with the process of requesting a new official image
>> repository at Dockerhub.
>>
>> We’re proposing to create a new dedicated repo for this purpose,
>> with the name `apache/flink-statefun-docker`.
>>
>> While we did initially consider integrating the StateFun Dockerfiles to
be
>> hosted together with the Flink ones in the existing `apache/flink-docker`
>> repo, we had the following concerns:
>>
>>- In general, it is a convention that each official Dockerhub image is
>>backed by a dedicated source repo hosting the Dockerfiles.
>>- The `apache/flink-docker` repo already has quite a few dedicated
>>tooling and CI smoke tests specific for the Flink images.
>>- Flink and StateFun have separate versioning schemes and independent
>>release cycles. A new Flink release does not necessarily require a
>>“lock-step” to release new StateFun images as well.
>>- Considering the above all-together, and the fact that creating a new
>>repo is rather low-effort, having a separate repo would probably make
more
>>sense here.
>>
>>
>> What do you think?
>>
>> Cheers,
>> Gordon
>>
>> [1]
>>
https://github.com/apache/flink-statefun/blob/master/tools/docker/build-stateful-functions.sh
>> [2]
>>
https://ci.apache.org/projects/flink/flink-statefun-docs-master/deployment-and-operations/packaging.html


[VOTE] Apache Flink Stateful Functions Release 2.0.0, release candidate #1

2020-03-26 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Please review and vote on the release candidate #1 for the version 2.0.0 of
Apache Flink Stateful Functions,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Testing Guideline**

You can find here [1] a doc that we can use for collaborating testing
efforts.
The listed testing tasks in the doc also serve as a guideline in what to
test for this release.
If you wish to take ownership of a testing task, simply put your name down
in the "Checked by" field of the task.

**Release Overview**

As an overview, the release consists of the following:
a) Stateful Functions canonical source distribution, to be deployed to the
release repository at dist.apache.org
b) Stateful Functions Python SDK distributions to be deployed to PyPI
c) Maven artifacts to be deployed to the Maven Central Repository

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as follows,
for your review:
* All artifacts for a) and b) can be found in the corresponding dev
repository at dist.apache.org [2]
* All artifacts for c) can be found at the Apache Nexus Repository [3]

All artifacts are signed with the
key 1C1E2394D3194E1944613488F320986D35C33D6A [4]

Other links for your review:
* JIRA release notes [5]
* source code tag "release-2.0.0-rc0" [6] [7]

**Extra Remarks**

* Part of the release is also official Docker images for Stateful
Functions. This can be a separate process, since the creation of those
relies on the fact that we have distribution jars already deployed to
Maven. I will follow-up with this after these artifacts are officially
released.
In the meantime, there is this discussion [8] ongoing about where to host
the StateFun Dockerfiles.
* The Flink Website and blog post is also being worked on (by Marta) as
part of the release, to incorporate the new Stateful Functions project. We
can follow up with a link to those changes afterwards in this vote thread,
but that would not block you to test and cast your votes already.

**Vote Duration**

The vote will be open for at least 72 hours *(target end date is next
Tuesday, March 31).*
It is adopted by majority approval, with at least 3 PMC affirmative votes.

Thanks,
Gordon

[1]
https://docs.google.com/document/d/1P9yjwSbPQtul0z2AXMnVolWQbzhxs68suJvzR6xMjcs/edit?usp=sharing
[2] https://dist.apache.org/repos/dist/dev/flink/flink-statefun-2.0.0-rc1/
[3] https://repository.apache.org/content/repositories/orgapacheflink-1339/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878
[6]
https://gitbox.apache.org/repos/asf?p=flink-statefun.git;a=commit;h=ebd7ca866f7d11fa43c7a5bb36861ee1b24b0980
[7] https://github.com/apache/flink-statefun/tree/release-2.0.0-rc1
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Creating-a-new-repo-to-host-Stateful-Functions-Dockerfiles-td39342.html

TIP: You can create a `settings.xml` file with these contents:

"""

  
flink-statefun-2.0.0
  
  

  flink-statefun-2.0.0
  

  flink-statefun-2.0.0
  
https://repository.apache.org/content/repositories/orgapacheflink-1339/



  archetype
  
https://repository.apache.org/content/repositories/orgapacheflink-1339/


  

  

"""

And reference that in your maven commands via `--settings
path/to/settings.xml`.
This is useful for creating a quickstart based on the staged release and
for building against the staged jars.


Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-26 Thread Till Rohrmann
Hi everyone,

I'm a bit late to the party. I think the current proposal looks good.

Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
would suggest to not include the decorator calls for Kubernetes and Yarn in
the base interface. Instead I would suggest to segregate the deployment
specific decorator calls into separate interfaces. That way an
ExternalResourceDriver does not have to support all deployments from the
very beginning. Moreover, some resources might not be supported by a
specific deployment target and the natural way to express this would be to
not implement the respective deployment specific interface.

Moreover, having void
addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
in the ExternalResourceDriver interface would require Hadoop on Flink's
classpath whenever the external resource driver is being used.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
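
Just to illustrate the segregation idea (purely a sketch; the interface and
method names here are hypothetical and follow the FLIP draft only loosely):

// Deployment-agnostic base interface.
public interface ExternalResourceDriver {
    Set<ExternalResourceInfo> retrieveResourceInfo(long amount);
}

// Optional Yarn-specific ability; only this interface pulls in Hadoop classes.
public interface YarnExternalResourceDriver extends ExternalResourceDriver {
    void addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest);
}

// Optional Kubernetes-specific ability.
public interface KubernetesExternalResourceDriver extends ExternalResourceDriver {
    FlinkPod decorateTaskManagerPod(FlinkPod taskManagerPod);
}

A driver that only supports Kubernetes would then simply not implement the
Yarn interface, and no Yarn/Hadoop classes would be required on the classpath.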

Cheers,
Till

On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen  wrote:

> Nice, thanks a lot!
>
> On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo  wrote:
>
> > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> >
> > I've updated the FLIP accordingly. I do not add a
> > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > which takes the responsibility of all relevant operations on both RM
> > and TM sides.
> > After a rethink about decoupling the management of external resources
> > from TaskExecutor, I think we could do the same thing on the
> > ResourceManager side. We do not need to add a specific allocation
> > logic to the ResourceManager each time we add a specific external
> > resource.
> > - For Yarn, we need the ExternalResourceDriver to edit the
> > containerRequest.
> > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > the TM pod.
> >
> > In this way, just like MetricReporter, we allow users to define their
> > custom ExternalResourceDriver. It is more extensible and fits the
> > separation of concerns. For more details, please take a look at [1].
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen  wrote:
> > >
> > > This sounds good to go ahead from my side.
> > >
> > > I like the approach that Becket suggested - in that case the core
> > > abstraction that everyone would need to understand would be "external
> > > resource allocation" and the "ResourceInfoProvider", and the GPU
> specific
> > > code would be a specific implementation only known to that component
> that
> > > allocates the external resource. That fits the separation of concerns
> > well.
> > >
> > > I also understand that it should not be over-engineered in the first
> > > version, so some simplification makes sense, and then gradually expand
> > from
> > > there.
> > >
> > > So +1 to go ahead with what was suggested above (Xintong / Becket) from
> > my
> > > side.
> > >
> > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song 
> > wrote:
> > >
> > > > Thanks for the comments, Stephan & Becket.
> > > >
> > > > @Stephan
> > > >
> > > > I see your concern, and I completely agree with you that we should
> > first
> > > > think about the "library" / "plugin" / "extension" style if possible.
> > > >
> > > > If GPUs are sliced and assigned during scheduling, there may be
> reason,
> > > > > although it looks that it would belong to the slot then. Is that
> > what we
> > > > > are doing here?
> > > >
> > > >
> > > > In the current proposal, we do not have the GPUs sliced and assigned
> to
> > > > slots, because it could be problematic without dynamic slot
> allocation.
> > > > E.g., the number of GPUs might not be evenly divisible by the number
> of
> > > > slots.
> > > >
> > > > I think it makes sense to eventually have the GPUs assigned to slots.
> > Even
> > > > then, we might still need a TM level GPUManager (or ResourceProvider
> > like
> > > > Becket suggested). For memory, in each slot we can simply request the
> > > > amount of memory, leaving it to JVM / OS to decide which memory
> > (address)
> > > > should be assigned. For GPU, and potentially other resources like
> > FPGA, we
> > > > need to explicitly specify which GPU (index) should be used.
> > Therefore, we
> > > > need some component at the TM level to coordinate which slot uses
> which
> > > > GPU.
> > > >
> > > > IMO, unless we say Flink will not support slot-level GPU slicing at
> > least
> > > > in the foreseeable future, I don't see a good way to avoid touching
> > the TM
> > > > core. To that end, I think Becket's suggestion points to a good
> > direction,
> > > > that supports more features (GPU, FPGA, etc.) with less coupling to
> > the TM
> > > > core (only needs to understand the general interfaces). The detailed
> > > > implementation for specific resource types can even be encapsulated
> as
> > a
> > > > library.
> > > >
> >

Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Till Rohrmann
+1 for a separate repository.

Cheers,
Till

On Thu, Mar 26, 2020 at 5:13 PM Ufuk Celebi  wrote:

> +1.
>
> The repo creation process is a light-weight, automated process on the ASF
> side. When Patrick Lucas contributed docker-flink back to the Flink
> community (as flink-docker), there was virtually no overhead in creating
> the repository. Reusing build scripts should still be possible at the cost
> of some duplication which is fine imo.
>
> – Ufuk
>
> On Thu, Mar 26, 2020 at 4:18 PM Stephan Ewen  wrote:
> >
> > +1 to a separate repository.
> >
> > It seems to be best practice in the docker community.
> > And since it does not add overhead, why not go with the best practice?
> >
> > Best,
> > Stephan
> >
> >
> > On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai  >
> wrote:
> >>
> >> Hi Flink devs,
> >>
> >> As part of a Stateful Functions release, we would like to publish
> Stateful
> >> Functions Docker images to Dockerhub as an official image.
> >>
> >> Some background context on Stateful Function images, for those who are
> not
> >> familiar with the project yet:
> >>
> >>- Stateful Function images are built on top of the Flink official
> >>images, with additional StateFun dependencies being added.
> >>You can take a look at the scripts we currently use to build the
> images
> >>locally for development purposes [1].
> >>- They are quite important for user experience, since building a
> Docker
> >>image is the recommended go-to deployment mode for StateFun user
> >>applications [2].
> >>
> >>
> >> A prerequisite for all of this is to first decide where we host the
> >> Stateful Functions Dockerfiles,
> >> before we can proceed with the process of requesting a new official
> image
> >> repository at Dockerhub.
> >>
> >> We’re proposing to create a new dedicated repo for this purpose,
> >> with the name `apache/flink-statefun-docker`.
> >>
> >> While we did initially consider integrating the StateFun Dockerfiles to
> be
> >> hosted together with the Flink ones in the existing
> `apache/flink-docker`
> >> repo, we had the following concerns:
> >>
> >>- In general, it is a convention that each official Dockerhub image
> is
> >>backed by a dedicated source repo hosting the Dockerfiles.
> >>- The `apache/flink-docker` repo already has quite a few dedicated
> >>tooling and CI smoke tests specific for the Flink images.
> >>- Flink and StateFun have separate versioning schemes and independent
> >>release cycles. A new Flink release does not necessarily require a
> >>“lock-step” to release new StateFun images as well.
> >>- Considering the above all-together, and the fact that creating a
> new
> >>repo is rather low-effort, having a separate repo would probably make
> more
> >>sense here.
> >>
> >>
> >> What do you think?
> >>
> >> Cheers,
> >> Gordon
> >>
> >> [1]
> >>
>
> https://github.com/apache/flink-statefun/blob/master/tools/docker/build-stateful-functions.sh
> >> [2]
> >>
>
> https://ci.apache.org/projects/flink/flink-statefun-docs-master/deployment-and-operations/packaging.html
>


[DISCUSS] FLIP-119: Pipelined Region Scheduling

2020-03-26 Thread Gary Yao
Hi community,

In the past releases, we have been working on refactoring Flink's scheduler
with the goal of making the scheduler extensible [1]. We have rolled out
most of the intended refactoring in Flink 1.10, and we think it is now time
to leverage our newly introduced abstractions to implement a new resource
optimized scheduling strategy: Pipelined Region Scheduling.

This scheduling strategy aims at:

* avoidance of resource deadlocks when running batch jobs

* tunable with respect to resource consumption and throughput

More details can be found in the Wiki [2]. We are looking forward to your
feedback.

Best,

Zhu Zhu & Gary

[1] https://issues.apache.org/jira/browse/FLINK-10429

[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling


Re: [DISCUSS] FLIP-118: Improve Flink’s ID system

2020-03-26 Thread Till Rohrmann
Hi Yangze,

thanks for creating this FLIP. I think it is a very good improvement
helping our users and ourselves to better understand what's going on in
Flink.

Creating the ResourceIDs with host information/pod name is a good idea.

Also deriving ExecutionGraph IDs from their superset ID is a good idea.

The InstanceID is used for fencing purposes. I would not make it a
composition of the ResourceID + a monotonically increasing number. The
problem is that in case of a RM failure the InstanceIDs would start from 0
again and this could lead to collisions.

Logging more information on how the different runtime IDs are correlated is
also a good idea.

Two other ideas for simplifying the ids are the following:

* The SlotRequestID was introduced because the SlotPool was a separate
RpcEndpoint a while ago. With this no longer being the case I think we
could remove the SlotRequestID and replace it with the AllocationID.
* Instead of creating new SlotRequestIDs for multi task slots one could
derive them from the SlotRequestID used for requesting the underlying
AllocatedSlot.

Given that the slot sharing logic will most likely be reworked with the
pipelined region scheduling, we might be able to resolve these two points
as part of the pipelined region scheduling effort.

Cheers,
Till

On Thu, Mar 26, 2020 at 10:51 AM Yangze Guo  wrote:

> Hi everyone,
>
> We would like to start a discussion thread on "FLIP-118: Improve
> Flink’s ID system"[1].
>
> This FLIP mainly discusses the following issues, targeting to enhance the
> readability of IDs in logs and to help users debug in case of failures:
>
> - Enhance the readability of the string literals of IDs. Most of them
> are hashcodes, e.g. ExecutionAttemptID, which do not provide much
> meaningful information and are hard to recognize and compare for
> users.
> - Log the ID’s lineage information to make debugging more convenient.
> Currently, the log fails to always show the lineage information
> between IDs. Finding out relationships between entities identified by
> given IDs is a common demand, e.g., which slot (AllocationID) is
> assigned to satisfy which slot request (SlotRequestID). In the absence of
> such lineage information, it’s impossible to track the end-to-end
> lifecycle of an Execution or a Task now, which makes debugging
> difficult.
>
> Key changes proposed in the FLIP are as follows:
>
> - Add location information to distributed components
> - Add topology information to graph components
> - Log the ID’s lineage information
> - Expose the identifier of distributing component to user
>
> Please find more details in the FLIP wiki document [1]. Looking forward to
> your feedbacks.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148643521
>
> Best,
> Yangze Guo
>


[jira] [Created] (FLINK-16810) add back PostgresCatalogITCase

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16810:


 Summary: add back PostgresCatalogITCase
 Key: FLINK-16810
 URL: https://issues.apache.org/jira/browse/FLINK-16810
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / JDBC
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16811) introduce JDBCRowConverter

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16811:


 Summary: introduce JDBCRowConverter
 Key: FLINK-16811
 URL: https://issues.apache.org/jira/browse/FLINK-16811
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0


Currently, when JDBCInputFormat converts a JDBC result set row to a Flink Row, it 
doesn't check the type returned from the JDBC result set. The problem is that the object 
from the JDBC result set doesn't always match the corresponding type in the relational 
DB. E.g. a short column in Postgres actually returns an Integer via JDBC. And 
such a mismatch can be DB-dependent.

 

Thus, we introduce a JDBCRowConverter interface to convert a DB-specific row from 
JDBC to a Flink Row. Each DB should implement its own row converter.
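
A possible shape of the interface (sketch only; names and signature are not final):

{code:java}
// Converts the current row of a DB-specific JDBC ResultSet into a Flink Row.
public interface JDBCRowConverter extends Serializable {
    Row convert(ResultSet resultSet, Row reuse) throws SQLException;
}

// Example: Postgres returns SMALLINT columns as Integer, so narrow explicitly.
public class PostgresRowConverter implements JDBCRowConverter {
    @Override
    public Row convert(ResultSet resultSet, Row reuse) throws SQLException {
        Object raw = resultSet.getObject(1);
        reuse.setField(0, raw instanceof Integer ? ((Integer) raw).shortValue() : raw);
        return reuse;
    }
}
{code}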

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16812) introduce PostgresRowConverter

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16812:


 Summary: introduce PostgresRowConverter
 Key: FLINK-16812
 URL: https://issues.apache.org/jira/browse/FLINK-16812
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / JDBC
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0


per https://issues.apache.org/jira/browse/FLINK-16811



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16813) JDBCInputFormat doesn't correctly map Short

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16813:


 Summary:  JDBCInputFormat doesn't correctly map Short
 Key: FLINK-16813
 URL: https://issues.apache.org/jira/browse/FLINK-16813
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / JDBC
Affects Versions: 1.10.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0


Currently, when JDBCInputFormat converts a JDBC result set row to a Flink Row, it 
doesn't check the type returned from the JDBC result set. The problem is that the object 
from the JDBC result set doesn't always match the corresponding type in the relational 
DB. E.g. a short column in Postgres actually returns an Integer via JDBC.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16814) StringUtils.arrayToString() doesn't convert byte[] correctly

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16814:


 Summary: StringUtils.arrayToString() doesn't convert byte[] 
correctly
 Key: FLINK-16814
 URL: https://issues.apache.org/jira/browse/FLINK-16814
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.10.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0


StringUtils.arrayToString() doesn't convert byte[] correctly. It uses 
Arrays.toString(), but it should instead construct a new String from the byte[].
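
For example (sketch):

{code:java}
byte[] bytes = "hello".getBytes(StandardCharsets.UTF_8);
String wrong = Arrays.toString(bytes);                    // "[104, 101, 108, 108, 111]" -- current behavior
String right = new String(bytes, StandardCharsets.UTF_8); // "hello" -- expected conversion
{code}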



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16815) add e2e tests for reading from postgres with JDBCTableSource and PostgresCatalog

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16815:


 Summary: add e2e tests for reading from postgres with 
JDBCTableSource and PostgresCatalog
 Key: FLINK-16815
 URL: https://issues.apache.org/jira/browse/FLINK-16815
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / JDBC
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16816) planner doesn't parse timestamp array correctly

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16816:


 Summary: planner doesn't parse timestamp array correctly
 Key: FLINK-16816
 URL: https://issues.apache.org/jira/browse/FLINK-16816
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner, Table SQL / Runtime
Reporter: Bowen Li
Assignee: Kurt Young
 Fix For: 1.11.0


planner doesn't parse timestamp array correctly.

 

Repro: 

In an input format (like JDBCInputFormat)'s \{{nextRecord(Row)}} API
 # when setting a timestamp datum as java.sql.Timestamp, it works fine
 # when setting an array of timestamp datums as java.sql.Timestamp[], it breaks 
and below is the stack trace

 
{code:java}
Caused by: java.lang.ClassCastException: java.sql.Timestamp cannot be cast to 
java.time.LocalDateTime
at 
org.apache.flink.table.dataformat.DataFormatConverters$LocalDateTimeConverter.toInternalImpl(DataFormatConverters.java:748)
at 
org.apache.flink.table.dataformat.DataFormatConverters$ObjectArrayConverter.toBinaryArray(DataFormatConverters.java:1110)
at 
org.apache.flink.table.dataformat.DataFormatConverters$ObjectArrayConverter.toInternalImpl(DataFormatConverters.java:1093)
at 
org.apache.flink.table.dataformat.DataFormatConverters$ObjectArrayConverter.toInternalImpl(DataFormatConverters.java:1068)
at 
org.apache.flink.table.dataformat.DataFormatConverters$DataFormatConverter.toInternal(DataFormatConverters.java:344)
at 
org.apache.flink.table.dataformat.DataFormatConverters$RowConverter.toInternalImpl(DataFormatConverters.java:1377)
at 
org.apache.flink.table.dataformat.DataFormatConverters$RowConverter.toInternalImpl(DataFormatConverters.java:1365)
at 
org.apache.flink.table.dataformat.DataFormatConverters$DataFormatConverter.toInternal(DataFormatConverters.java:344)
at SourceConversion$1.processElement(Unknown Source)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:714)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:689)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:669)
at 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:52)
at 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:30)
at 
org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
at 
org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:93)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:208)
{code}

It seems that the planner runtime handles java.sql.Timestamp in these two cases 
differently.
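
A possible workaround in the input format until this is fixed (sketch only, 
assuming the JDBC driver returns Timestamp[] for the array column):

{code:java}
// Single TIMESTAMP field: java.sql.Timestamp is accepted.
reuse.setField(0, resultSet.getTimestamp(1));

// TIMESTAMP ARRAY goes through LocalDateTimeConverter, so convert the elements first.
Timestamp[] raw = (Timestamp[]) resultSet.getArray(2).getArray();
LocalDateTime[] converted = Arrays.stream(raw)
        .map(Timestamp::toLocalDateTime)
        .toArray(LocalDateTime[]::new);
reuse.setField(1, converted);
{code}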



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16817) StringUtils.arrayToString() doesn't convert byte[][] correctly

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16817:


 Summary: StringUtils.arrayToString() doesn't convert byte[][] 
correctly
 Key: FLINK-16817
 URL: https://issues.apache.org/jira/browse/FLINK-16817
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Creating a new repo to host Stateful Functions Dockerfiles

2020-03-26 Thread Hequn Cheng
+1 for a separate repository.
The dedicated `flink-docker` repo works fine now. We can do it similarly.

Best,
Hequn

On Fri, Mar 27, 2020 at 1:16 AM Till Rohrmann  wrote:

> +1 for a separate repository.
>
> Cheers,
> Till
>
> On Thu, Mar 26, 2020 at 5:13 PM Ufuk Celebi  wrote:
>
> > +1.
> >
> > The repo creation process is a light-weight, automated process on the ASF
> > side. When Patrick Lucas contributed docker-flink back to the Flink
> > community (as flink-docker), there was virtually no overhead in creating
> > the repository. Reusing build scripts should still be possible at the
> cost
> > of some duplication which is fine imo.
> >
> > – Ufuk
> >
> > On Thu, Mar 26, 2020 at 4:18 PM Stephan Ewen  wrote:
> > >
> > > +1 to a separate repository.
> > >
> > > It seems to be best practice in the docker community.
> > > And since it does not add overhead, why not go with the best practice?
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > > On Thu, Mar 26, 2020 at 4:15 PM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org
> > >
> > wrote:
> > >>
> > >> Hi Flink devs,
> > >>
> > >> As part of a Stateful Functions release, we would like to publish
> > Stateful
> > >> Functions Docker images to Dockerhub as an official image.
> > >>
> > >> Some background context on Stateful Function images, for those who are
> > not
> > >> familiar with the project yet:
> > >>
> > >>- Stateful Function images are built on top of the Flink official
> > >>images, with additional StateFun dependencies being added.
> > >>You can take a look at the scripts we currently use to build the
> > images
> > >>locally for development purposes [1].
> > >>- They are quite important for user experience, since building a
> > Docker
> > >>image is the recommended go-to deployment mode for StateFun user
> > >>applications [2].
> > >>
> > >>
> > >> A prerequisite for all of this is to first decide where we host the
> > >> Stateful Functions Dockerfiles,
> > >> before we can proceed with the process of requesting a new official
> > image
> > >> repository at Dockerhub.
> > >>
> > >> We’re proposing to create a new dedicated repo for this purpose,
> > >> with the name `apache/flink-statefun-docker`.
> > >>
> > >> While we did initially consider integrating the StateFun Dockerfiles
> to
> > be
> > >> hosted together with the Flink ones in the existing
> > `apache/flink-docker`
> > >> repo, we had the following concerns:
> > >>
> > >>- In general, it is a convention that each official Dockerhub image
> > is
> > >>backed by a dedicated source repo hosting the Dockerfiles.
> > >>- The `apache/flink-docker` repo already has quite a few dedicated
> > >>tooling and CI smoke tests specific for the Flink images.
> > >>- Flink and StateFun have separate versioning schemes and
> independent
> > >>release cycles. A new Flink release does not necessarily require a
> > >>“lock-step” to release new StateFun images as well.
> > >>- Considering the above all-together, and the fact that creating a
> > new
> > >>repo is rather low-effort, having a separate repo would probably
> make
> > more
> > >>sense here.
> > >>
> > >>
> > >> What do you think?
> > >>
> > >> Cheers,
> > >> Gordon
> > >>
> > >> [1]
> > >>
> >
> >
> https://github.com/apache/flink-statefun/blob/master/tools/docker/build-stateful-functions.sh
> > >> [2]
> > >>
> >
> >
> https://ci.apache.org/projects/flink/flink-statefun-docs-master/deployment-and-operations/packaging.html
> >
>


[jira] [Created] (FLINK-16818) Optimize data skew when Flink writes data to a Hive dynamic partition table

2020-03-26 Thread Jun Zhang (Jira)
Jun Zhang created FLINK-16818:
-

 Summary: Optimize data skew when Flink writes data to a Hive dynamic 
partition table
 Key: FLINK-16818
 URL: https://issues.apache.org/jira/browse/FLINK-16818
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Affects Versions: 1.10.0
Reporter: Jun Zhang
 Fix For: 1.11.0


I read the source table data of Hive through Flink SQL and then write into a 
target Hive table. The target table is partitioned. When the data of one 
partition is particularly large, data skew occurs, resulting in a particularly 
long execution time.

With the default configuration and the same SQL, Hive on Spark takes five 
minutes, while Flink takes about 40 minutes.

example:

 
{code:java}
-- the schema of myparttable
CREATE TABLE myparttable (
  name string,
  age int
) PARTITIONED BY (
  type string,
  day string
);

INSERT OVERWRITE myparttable SELECT name, age, type, day FROM sourcetable;
{code}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Apache Flink Stateful Functions Release 2.0.0, release candidate #1

2020-03-26 Thread Tzu-Li (Gordon) Tai
Also, here is the documentation for Stateful Functions for those who were
wondering:
master - https://ci.apache.org/projects/flink/flink-statefun-docs-master/
release-2.0 -
https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/

This is not yet visible directly from the Flink website, since the efforts
for incorporating Stateful Functions in the website is still ongoing.

On Fri, Mar 27, 2020 at 12:48 AM Tzu-Li (Gordon) Tai 
wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #0 for the version 2.0.0
> of Apache Flink Stateful Functions,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> **Testing Guideline**
>
> You can find here [1] a doc that we can use for collaborating testing
> efforts.
> The listed testing tasks in the doc also serve as a guideline in what to
> test for this release.
> If you wish to take ownership of a testing task, simply put your name down
> in the "Checked by" field of the task.
>
> **Release Overview**
>
> As an overview, the release consists of the following:
> a) Stateful Functions canonical source distribution, to be deployed to the
> release repository at dist.apache.org
> b) Stateful Functions Python SDK distributions to be deployed to PyPI
> c) Maven artifacts to be deployed to the Maven Central Repository
>
> **Staging Areas to Review**
>
> The staging areas containing the above mentioned artifacts are as follows,
> for your review:
> * All artifacts for a) and b) can be found in the corresponding dev
> repository at dist.apache.org [2]
> * All artifacts for c) can be found at the Apache Nexus Repository [3]
>
> All artifacts are signed with the
> key 1C1E2394D3194E1944613488F320986D35C33D6A [4]
>
> Other links for your review:
> * JIRA release notes [5]
> * source code tag "release-2.0.0-rc0" [6] [7]
>
> **Extra Remarks**
>
> * Part of the release is also official Docker images for Stateful
> Functions. This can be a separate process, since the creation of those
> relies on the fact that we have distribution jars already deployed to
> Maven. I will follow-up with this after these artifacts are officially
> released.
> In the meantime, there is this discussion [8] ongoing about where to host
> the StateFun Dockerfiles.
> * The Flink Website and blog post is also being worked on (by Marta) as
> part of the release, to incorporate the new Stateful Functions project. We
> can follow up with a link to those changes afterwards in this vote thread,
> but that would not block you to test and cast your votes already.
>
> **Vote Duration**
>
> The vote will be open for at least 72 hours *(target end date is next
> Tuesday, April 31).*
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Gordon
>
> [1]
> https://docs.google.com/document/d/1P9yjwSbPQtul0z2AXMnVolWQbzhxs68suJvzR6xMjcs/edit?usp=sharing
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-statefun-2.0.0-rc1/
> [3]
> https://repository.apache.org/content/repositories/orgapacheflink-1339/
> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> [5]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878
> [6]
> https://gitbox.apache.org/repos/asf?p=flink-statefun.git;a=commit;h=ebd7ca866f7d11fa43c7a5bb36861ee1b24b0980
> [7] https://github.com/apache/flink-statefun/tree/release-2.0.0-rc1
> [8]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Creating-a-new-repo-to-host-Stateful-Functions-Dockerfiles-td39342.html
>
> TIP: You can create a `settings.xml` file with these contents:
>
> """
> 
>   
> flink-statefun-2.0.0
>   
>   
> 
>   flink-statefun-2.0.0
>   
> 
>   flink-statefun-2.0.0
>   
> https://repository.apache.org/content/repositories/orgapacheflink-1339/
> 
> 
> 
>   archetype
>   
> https://repository.apache.org/content/repositories/orgapacheflink-1339/
> 
> 
>   
> 
>   
> 
> """
>
> And reference that in you maven commands via `--settings
> path/to/settings.xml`.
> This is useful for creating a quickstart based on the staged release and
> for building against the staged jars.
>


Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-26 Thread Xintong Song
Thanks for updating the FLIP, Yangze.

I agree with Till that we probably want to separate the K8s/Yarn decorator
calls. Users can still configure one driver class, and we can use
`instanceof` to check whether the driver implemented K8s/Yarn specific
interfaces.

Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
(`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
code. It gives user code more access than is needed for defining an external
resource, which might cause problems. Instead, I would suggest an interface
method like `getYarn/KubernetesExternalResource()` that returns a `Map` of the
resource key/value pairs, and assemble them into the `ContainerRequest` / `Pod`
in the Yarn/KubernetesResourceManager.

Thank you~

Xintong Song
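
To make that concrete, here is a rough sketch of what the segregated interfaces
could look like. All names, signatures, and the map's value type are only
assumptions for illustration, not a proposal of the final API:

import java.util.Map;

// Deployment-agnostic part of the driver (illustrative only).
interface ExternalResourceDriver {
    // e.g. TaskExecutor-side discovery of the external resource.
}

// Implemented only by drivers that support Yarn deployments.
interface YarnExternalResourceDriver extends ExternalResourceDriver {
    // Key/value pairs the YarnResourceManager assembles into the ContainerRequest.
    Map<String, String> getYarnExternalResource();
}

// Implemented only by drivers that support Kubernetes deployments.
interface KubernetesExternalResourceDriver extends ExternalResourceDriver {
    // Key/value pairs the KubernetesResourceManager assembles into the TM Pod spec.
    Map<String, String> getKubernetesExternalResource();
}

The resource managers would then check the driver with `instanceof` before
applying it, as described above.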



On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann  wrote:

> Hi everyone,
>
> I'm a bit late to the party. I think the current proposal looks good.
>
> Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> would suggest to not include the decorator calls for Kubernetes and Yarn in
> the base interface. Instead I would suggest to segregate the deployment
> specific decorator calls into separate interfaces. That way an
> ExternalResourceDriver does not have to support all deployments from the
> very beginning. Moreover, some resources might not be supported by a
> specific deployment target and the natural way to express this would be to
> not implement the respective deployment specific interface.
>
> Moreover, having void
> addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> in the ExternalResourceDriver interface would require Hadoop on Flink's
> classpath whenever the external resource driver is being used.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
>
> Cheers,
> Till
>
> On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen  wrote:
>
> > Nice, thanks a lot!
> >
> > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo  wrote:
> >
> > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > >
> > > I've updated the FLIP accordingly. I do not add a
> > > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > > which takes the responsibility of all relevant operations on both RM
> > > and TM sides.
> > > After a rethink about decoupling the management of external resources
> > > from TaskExecutor, I think we could do the same thing on the
> > > ResourceManager side. We do not need to add a specific allocation
> > > logic to the ResourceManager each time we add a specific external
> > > resource.
> > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > containerRequest.
> > > - For Kubenetes, ExternalResourceDriver could provide a decorator for
> > > the TM pod.
> > >
> > > In this way, just like MetricReporter, we allow users to define their
> > > custom ExternalResourceDriver. It is more extensible and fits the
> > > separation of concerns. For more details, please take a look at [1].
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen  wrote:
> > > >
> > > > This sounds good to go ahead from my side.
> > > >
> > > > I like the approach that Becket suggested - in that case the core
> > > > abstraction that everyone would need to understand would be "external
> > > > resource allocation" and the "ResourceInfoProvider", and the GPU
> > specific
> > > > code would be a specific implementation only known to that component
> > that
> > > > allocates the external resource. That fits the separation of concerns
> > > well.
> > > >
> > > > I also understand that it should not be over-engineered in the first
> > > > version, so some simplification makes sense, and then gradually
> expand
> > > from
> > > > there.
> > > >
> > > > So +1 to go ahead with what was suggested above (Xintong / Becket)
> from
> > > my
> > > > side.
> > > >
> > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song 
> > > wrote:
> > > >
> > > > > Thanks for the comments, Stephan & Becket.
> > > > >
> > > > > @Stephan
> > > > >
> > > > > I see your concern, and I completely agree with you that we should
> > > first
> > > > > think about the "library" / "plugin" / "extension" style if
> possible.
> > > > >
> > > > > If GPUs are sliced and assigned during scheduling, there may be
> > reason,
> > > > > > although it looks that it would belong to the slot then. Is that
> > > what we
> > > > > > are doing here?
> > > > >
> > > > >
> > > > > In the current proposal, we do not have the GPUs sliced and
> assigned
> > to
> > > > > slots, because it could be problematic without dynamic slot
> > allocation.
> > > > > E.g., the number of GPUs might not be evenly divisible by the
> number
> > of
> > > > > slots.
> > > > >
> > > > > I think it makes sense to eventually have the GPUs assigned to
> slots.
> > > Even
> > > > > then, we might still need a TM level GPUManager (

[VOTE] FLIP-115: Filesystem connector in Table

2020-03-26 Thread Jingsong Li
Hi everyone,

I'd like to start the vote for FLIP-115 [1], which introduces a Filesystem
table factory in the Table API. This FLIP was discussed in the thread [2].

The vote will be open for at least 72 hours. Unless there is an objection,
I will try to close it by March 30, 2020 03:00 UTC if we have received
sufficient votes.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-115-Filesystem-connector-in-Table-td38870.html

Best
Jingsong Lee


Re: [DISCUSS] FLIP-108: Add GPU support in Flink

2020-03-26 Thread Yangze Guo
Thanks for the feedback, @Till and @Xintong.

Regarding separating the interface, I'm also +1 with it.

Regarding the resource allocation interface, true, it's dangerous to
give user code too much access. Changing the return type to a `Map` makes
sense to me. AFAIK, it is compatible with all the first-party supported
resources for Yarn/Kubernetes, and it could also free us from the potential
dependency issue.

Best,
Yangze Guo

On Fri, Mar 27, 2020 at 10:42 AM Xintong Song  wrote:
>
> Thanks for updating the FLIP, Yangze.
>
> I agree with Till that we probably want to separate the K8s/Yarn decorator
> calls. Users can still configure one driver class, and we can use
> `instanceof` to check whether the driver implemented K8s/Yarn specific
> interfaces.
>
> Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> (`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
> code. It gives user code more access than is needed for defining an external
> resource, which might cause problems. Instead, I would suggest an interface
> method like `getYarn/KubernetesExternalResource()` that returns a `Map` of the
> resource key/value pairs, and assemble them into the `ContainerRequest` / `Pod`
> in the Yarn/KubernetesResourceManager.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann  wrote:
>
> > Hi everyone,
> >
> > I'm a bit late to the party. I think the current proposal looks good.
> >
> > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > would suggest to not include the decorator calls for Kubernetes and Yarn in
> > the base interface. Instead I would suggest to segregate the deployment
> > specific decorator calls into separate interfaces. That way an
> > ExternalResourceDriver does not have to support all deployments from the
> > very beginning. Moreover, some resources might not be supported by a
> > specific deployment target and the natural way to express this would be to
> > not implement the respective deployment specific interface.
> >
> > Moreover, having void
> > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > classpath whenever the external resource driver is being used.
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> >
> > Cheers,
> > Till
> >
> > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen  wrote:
> >
> > > Nice, thanks a lot!
> > >
> > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo  wrote:
> > >
> > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > >
> > > > I've updated the FLIP accordingly. I do not add a
> > > > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > > > which takes the responsibility of all relevant operations on both RM
> > > > and TM sides.
> > > > After a rethink about decoupling the management of external resources
> > > > from TaskExecutor, I think we could do the same thing on the
> > > > ResourceManager side. We do not need to add a specific allocation
> > > > logic to the ResourceManager each time we add a specific external
> > > > resource.
> > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > containerRequest.
> > > > - For Kubenetes, ExternalResourceDriver could provide a decorator for
> > > > the TM pod.
> > > >
> > > > In this way, just like MetricReporter, we allow users to define their
> > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > separation of concerns. For more details, please take a look at [1].
> > > >
> > > > [1]
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen  wrote:
> > > > >
> > > > > This sounds good to go ahead from my side.
> > > > >
> > > > > I like the approach that Becket suggested - in that case the core
> > > > > abstraction that everyone would need to understand would be "external
> > > > > resource allocation" and the "ResourceInfoProvider", and the GPU
> > > specific
> > > > > code would be a specific implementation only known to that component
> > > that
> > > > > allocates the external resource. That fits the separation of concerns
> > > > well.
> > > > >
> > > > > I also understand that it should not be over-engineered in the first
> > > > > version, so some simplification makes sense, and then gradually
> > expand
> > > > from
> > > > > there.
> > > > >
> > > > > So +1 to go ahead with what was suggested above (Xintong / Becket)
> > from
> > > > my
> > > > > side.
> > > > >
> > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song 
> > > > wrote:
> > > > >
> > > > > > Thanks for the comments, Stephan & Becket.
> > > > > >
> > > > > > @Stephan
> > > > > >
> > > > > > I see your concern, and I completely agree with you that we should
> > > > first
> > > > > > think about the "library" / "plugin"

Re: [VOTE] FLIP-102: Add More Metrics to TaskManager

2020-03-26 Thread Xintong Song
Sorry for the late response.

I have shared my suggestions with Yadong & Lining offline. I think it would
be better to also post them here, for the public record.

   - I'm not sure about displaying Total Process Memory Used. Currently, we
   do not have a good way to monitor all memory footprints of the process.
   Metrics for some native memory usages (e.g., thread stack) are absent.
   Displaying a partial used memory size could be confusing for users.
   - I would suggest merging the current Mapped Memory metrics into Direct
   Memory. The metrics are retrieved from the MXBeans for the direct buffer
   pool and the mapped buffer pool, and both pools are accounted for in
   -XX:MaxDirectMemorySize. There's no Flink configuration that can modify the
   individual pool sizes. Therefore, I think displaying the total Direct
   Memory would be good enough. Moreover, in most use cases the size of the
   mapped buffer pool is zero and users do not need to understand what Mapped
   Memory is. Expert users who do need the separate metrics for the individual
   pools can subscribe to them on their own (see the sketch below for how
   these pool figures are read from the JVM).
   - I would suggest not displaying Non-Heap Memory. Despite the name, the
   metrics (also retrieved from MXBeans) actually account for metaspace, code
   cache, and compressed class space. They do not account for all JVM native
   memory overheads, e.g., thread stacks. That means the Non-Heap Memory
   metrics do not correspond well to any of the FLIP-49 memory components; they
   account for Flink's JVM Metaspace and part of the JVM Overhead. I think this
   brings more confusion than help to users, especially primary users.


Thank you~

Xintong Song
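
For reference, a minimal sketch of how the direct and mapped buffer pool
figures mentioned above can be read from the JVM (plain JDK code, not the
FLIP-102 implementation):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class BufferPoolMetricsDemo {
    public static void main(String[] args) {
        // The JVM exposes one BufferPoolMXBean per pool, typically named "direct" and "mapped".
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.printf(
                    "%s: used=%d bytes, capacity=%d bytes, buffers=%d%n",
                    pool.getName(), pool.getMemoryUsed(), pool.getTotalCapacity(), pool.getCount());
        }
    }
}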



On Thu, Mar 26, 2020 at 6:34 PM Till Rohrmann  wrote:

> Thanks for updating the FLIP Yadong.
>
> What is the difference between managedMemory and managedMemoryTotal
> and networkMemory and networkMemoryTotal in the REST response? If they are
> duplicates, then we might be able to remove one.
>
> Apart from that, the proposal looks good to me.
>
> Pulling also Andrey in to hear his opinion about the representation of the
> memory components.
>
> Cheers,
> Till
>
> On Thu, Mar 19, 2020 at 11:37 AM Yadong Xie  wrote:
>
>> Hi all
>>
>> I have updated the design of the metric page and FLIP doc, please let me
>> know what you think about it
>>
>> FLIP-102:
>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager
>> POC web:
>>
>> http://101.132.122.69:8081/web/#/task-manager/8e1f1beada3859ee8e46d0960bb1da18/metrics
>>
>> Till Rohrmann  于2020年2月27日周四 下午10:27写道:
>>
>> > Thinking a bit more about the problem whether to report the aggregated
>> > memory statistics or the individual slot statistics, I think reporting
>> it
>> > on a per slot basis won't work nicely together with FLIP-56 (dynamic
>> slot
>> > allocation). The problem is that with FLIP-56, we will no longer have
>> > dedicated slots. The number of slots might change over the lifetime of a
>> > TaskExecutor. Hence, it won't be easy to generate a metric path for
>> every
>> > slot which are furthermore also ephemeral. So maybe, the more general
>> and
>> > easier solution would be to report the overall memory usage of a
>> > TaskExecutor even though it means to do some aggregation on the
>> > TaskExecutor.
>> >
>> > Concerning the JVM limit: Isn't it mainly the code cache? If we display
>> > this value, then we should explain what exactly it means. I fear that
>> most
>> > users won't understand what JVM limit actually means.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Wed, Feb 26, 2020 at 11:15 AM Yadong Xie 
>> wrote:
>> >
>> > > Hi Till
>> > >
>> > > Thanks a lot for your response
>> > >
>> > > > 2. I'm not entirely sure whether I would split the memory ...
>> > >
>> > > Split the memory display comes from the 'ancient' design of the web,
>> it
>> > is
>> > > ok for me to change it following total/heap/managed/network/direct/jvm
>> > > overhead/mapped sequence
>> > >
>> > > > 3. Displaying the memory configurations...
>> > >
>> > > I agree with you that it is not a very nice way, but the hierarchical
>> > > relationship of configurations is too complex and hard to display in
>> the
>> > > other ways (I have tried)
>> > >
>> > > if anyone has a better idea, please feels no hesitates to help me
>> > >
>> > >
>> > > > 4. What does JVM limit mean in Non-heap.JVM-Overhead?
>> > >
>> > > JVM limit is "non-heap max metric minus metaspace configuration" as
>> > > @Xintong
>> > > Song  replyed in this mail thread
>> > >
>> > >
>> > > Till Rohrmann  于2020年2月25日周二 下午6:58写道:
>> > >
>> > > > Thanks for creating this FLIP Yadong. I think your proposal makes it
>> > much
>> > > > easier for the user to understand what's happening on Flink
>> > > TaskManager's.
>> > > >
>> > > > I have some comments:
>> > > >
>> > > > 1. Some of the newly introduced metrics involve computations on the
>> > > > TaskManager. I would like to avoid additional computations
>> introduced
>> > by

Re: [DISCUSS] FLIP-119: Pipelined Region Scheduling

2020-03-26 Thread Yangze Guo
Thanks for driving this discussion, Zhu Zhu & Gary.

I found that the image link in this FLIP is not working well. When I
open that link, Google doc told me that I have no access privilege.
Could you take a look at that issue?

Best,
Yangze Guo

On Fri, Mar 27, 2020 at 1:38 AM Gary Yao  wrote:
>
> Hi community,
>
> In the past releases, we have been working on refactoring Flink's scheduler
> with the goal of making the scheduler extensible [1]. We have rolled out
> most of the intended refactoring in Flink 1.10, and we think it is now time
> to leverage our newly introduced abstractions to implement a new resource
> optimized scheduling strategy: Pipelined Region Scheduling.
>
> This scheduling strategy aims at:
>
> * avoidance of resource deadlocks when running batch jobs
>
> * tunable with respect to resource consumption and throughput
>
> More details can be found in the Wiki [2]. We are looking forward to your
> feedback.
>
> Best,
>
> Zhu Zhu & Gary
>
> [1] https://issues.apache.org/jira/browse/FLINK-10429
>
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling


[jira] [Created] (FLINK-16819) Got KryoException while using UDAF in flink1.9

2020-03-26 Thread Xingxing Di (Jira)
Xingxing Di created FLINK-16819:
---

 Summary: Got KryoException while using UDAF in flink1.9
 Key: FLINK-16819
 URL: https://issues.apache.org/jira/browse/FLINK-16819
 Project: Flink
  Issue Type: Bug
  Components: API / Type Serialization System, Table SQL / Planner
Affects Versions: 1.9.1
 Environment: Flink1.9.1

Apache hadoop 2.7.2
Reporter: Xingxing Di


Recently, we have been trying to upgrade online *sql jobs* from Flink 1.7 to 
Flink 1.9. Most jobs work fine, but some jobs got KryoExceptions.

We found that a UDAF will trigger this exception. By the way, we are using the 
blink planner.

Here is the full stack trace:
2020-03-27 11:46:55
com.esotericsoftware.kryo.KryoException: java.lang.IndexOutOfBoundsException: 
Index: 104, Size: 2
Serialization trace:
seed (java.util.Random)
gen (com.tdunning.math.stats.AVLTreeDigest)
at 
com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at 
com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
at 
com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at 
com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
at 
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:346)
at 
org.apache.flink.util.InstantiationUtil.deserializeFromByteArray(InstantiationUtil.java:536)
at 
org.apache.flink.table.dataformat.BinaryGeneric.getJavaObjectFromBinaryGeneric(BinaryGeneric.java:86)
at 
org.apache.flink.table.dataformat.DataFormatConverters$GenericConverter.toExternalImpl(DataFormatConverters.java:628)
at 
org.apache.flink.table.dataformat.DataFormatConverters$GenericConverter.toExternalImpl(DataFormatConverters.java:633)
at 
org.apache.flink.table.dataformat.DataFormatConverters$DataFormatConverter.toExternal(DataFormatConverters.java:320)
at 
org.apache.flink.table.dataformat.DataFormatConverters$PojoConverter.toExternalImpl(DataFormatConverters.java:1293)
at 
org.apache.flink.table.dataformat.DataFormatConverters$PojoConverter.toExternalImpl(DataFormatConverters.java:1257)
at 
org.apache.flink.table.dataformat.DataFormatConverters$DataFormatConverter.toExternal(DataFormatConverters.java:302)
at GroupAggsHandler$71.setAccumulators(Unknown Source)
at 
org.apache.flink.table.runtime.operators.aggregate.GroupAggFunction.processElement(GroupAggFunction.java:151)
at 
org.apache.flink.table.runtime.operators.aggregate.GroupAggFunction.processElement(GroupAggFunction.java:43)
at 
org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:85)
at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processElement(StreamOneInputProcessor.java:164)
at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:143)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:279)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:301)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:406)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 104, Size: 2
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at 
com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:805)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:677)
at 
com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
... 26 more
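
For reference, the following is a minimal sketch of the kind of UDAF that triggers 
this (assumed shape, not the actual job code): the accumulator holds an AVLTreeDigest, 
for which Flink has no dedicated serializer, so it falls back to the generic Kryo 
serializer that throws above.

{code:java}
import com.tdunning.math.stats.AVLTreeDigest;
import org.apache.flink.table.functions.AggregateFunction;

public class PercentileUdaf extends AggregateFunction<Double, PercentileUdaf.Acc> {

    // Accumulator field with no dedicated Flink serializer -> serialized via Kryo.
    public static class Acc {
        public AVLTreeDigest digest = new AVLTreeDigest(100);
    }

    @Override
    public Acc createAccumulator() {
        return new Acc();
    }

    public void accumulate(Acc acc, Double value) {
        if (value != null) {
            acc.digest.add(value);
        }
    }

    @Override
    public Double getValue(Acc acc) {
        return acc.digest.quantile(0.95);
    }
}
{code}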
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16820) support reading array of timestamp, date, and time in JDBCTableSource

2020-03-26 Thread Bowen Li (Jira)
Bowen Li created FLINK-16820:


 Summary: support reading array of timestamp, date, and time in 
JDBCTableSource
 Key: FLINK-16820
 URL: https://issues.apache.org/jira/browse/FLINK-16820
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / JDBC
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16821) Run Kubernetes test failed with invalid named "minikube"

2020-03-26 Thread Zhijiang (Jira)
Zhijiang created FLINK-16821:


 Summary: Run Kubernetes test failed with invalid named "minikube"
 Key: FLINK-16821
 URL: https://issues.apache.org/jira/browse/FLINK-16821
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes, Tests
Reporter: Zhijiang


This is the test run 
[https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6702&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5]

Log output
{code:java}
2020-03-27T00:07:38.9666021Z Running 'Run Kubernetes test'
2020-03-27T00:07:38.956Z 
==
2020-03-27T00:07:38.9677101Z TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-38967103614
2020-03-27T00:07:41.7529865Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-27T00:07:41.7721475Z Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT
2020-03-27T00:07:41.8208394Z Docker version 19.03.8, build afacb8b7f0
2020-03-27T00:07:42.4793914Z docker-compose version 1.25.4, build 8d51620a
2020-03-27T00:07:42.5359301Z Installing minikube ...
2020-03-27T00:07:42.5494076Z   % Total% Received % Xferd  Average Speed   
TimeTime Time  Current
2020-03-27T00:07:42.5494729Z  Dload  Upload   
Total   SpentLeft  Speed
2020-03-27T00:07:42.5498136Z 
2020-03-27T00:07:42.6214887Z   0 00 00 0  0  0 
--:--:-- --:--:-- --:--:-- 0
2020-03-27T00:07:43.3467750Z   0 00 00 0  0  0 
--:--:-- --:--:-- --:--:-- 0
2020-03-27T00:07:43.3469636Z 100 52.0M  100 52.0M0 0  65.2M  0 
--:--:-- --:--:-- --:--:-- 65.2M
2020-03-27T00:07:43.4262625Z * There is no local cluster named "minikube"
2020-03-27T00:07:43.4264438Z   - To fix this, run: minikube start
2020-03-27T00:07:43.4282404Z Starting minikube ...
2020-03-27T00:07:43.7749694Z * minikube v1.9.0 on Ubuntu 16.04
2020-03-27T00:07:43.7761742Z * Using the none driver based on user configuration
2020-03-27T00:07:43.7762229Z X The none driver requires conntrack to be 
installed for kubernetes version 1.18.0
2020-03-27T00:07:43.8202161Z * There is no local cluster named "minikube"
2020-03-27T00:07:43.8203353Z   - To fix this, run: minikube start
2020-03-27T00:07:43.8568899Z * There is no local cluster named "minikube"
2020-03-27T00:07:43.8570685Z   - To fix this, run: minikube start
2020-03-27T00:07:43.8583793Z Command: start_kubernetes_if_not_running failed. 
Retrying...
2020-03-27T00:07:48.9017252Z * There is no local cluster named "minikube"
2020-03-27T00:07:48.9019347Z   - To fix this, run: minikube start
2020-03-27T00:07:48.9031515Z Starting minikube ...
2020-03-27T00:07:49.0612601Z * minikube v1.9.0 on Ubuntu 16.04
2020-03-27T00:07:49.0616688Z * Using the none driver based on user configuration
2020-03-27T00:07:49.0620173Z X The none driver requires conntrack to be 
installed for kubernetes version 1.18.0
2020-03-27T00:07:49.1040676Z * There is no local cluster named "minikube"
2020-03-27T00:07:49.1042353Z   - To fix this, run: minikube start
2020-03-27T00:07:49.1453522Z * There is no local cluster named "minikube"
2020-03-27T00:07:49.1454594Z   - To fix this, run: minikube start
2020-03-27T00:07:49.1468436Z Command: start_kubernetes_if_not_running failed. 
Retrying...
2020-03-27T00:07:54.1907713Z * There is no local cluster named "minikube"
2020-03-27T00:07:54.1909876Z   - To fix this, run: minikube start
2020-03-27T00:07:54.1921479Z Starting minikube ...
2020-03-27T00:07:54.3388738Z * minikube v1.9.0 on Ubuntu 16.04
2020-03-27T00:07:54.3395499Z * Using the none driver based on user configuration
2020-03-27T00:07:54.3396443Z X The none driver requires conntrack to be 
installed for kubernetes version 1.18.0
2020-03-27T00:07:54.3824399Z * There is no local cluster named "minikube"
2020-03-27T00:07:54.3837652Z   - To fix this, run: minikube start
2020-03-27T00:07:54.4203902Z * There is no local cluster named "minikube"
2020-03-27T00:07:54.4204895Z   - To fix this, run: minikube start
2020-03-27T00:07:54.4217866Z Command: start_kubernetes_if_not_running failed. 
Retrying...
2020-03-27T00:07:59.4235917Z Command: start_kubernetes_if_not_running failed 3 
times.
2020-03-27T00:07:59.4236459Z Could not start minikube. Aborting...
2020-03-27T00:07:59.8439850Z The connection to the server localhost:8080 was 
refused - did you specify the right host or port?
2020-03-27T00:07:59.8939088Z The connection to the server localhost:8080 was 
refused - did you specify the right host or port?
2020-03-27T00:07:59.9515679Z The connection to the server localhost:8080 was 
refused - did you specify the right host or port?
2020-03-27T00:07:59.9528463Z Stopping minikube ...
2020-03-27T00:07:59.9

Re: [Discuss] FLINK-16039 Add API method to get last element in session window

2020-03-26 Thread Manas Kale
Hi Dawid,
Thank you for the response, I see your point. I was perhaps thinking only
from the perspective of my use case where I think such a definition makes
sense and did not account for the general case.

Regards,
Manas
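
P.S. For anyone reading along: here is a minimal sketch of the aggregate-based
approach Dawid suggests below, assuming the keyed input has already been mapped
to its event timestamp (a long); the class name is made up for illustration.

import org.apache.flink.api.common.functions.AggregateFunction;

// Keeps the largest event timestamp seen in the window, i.e. the last element of the session.
class LastTimestampAggregate implements AggregateFunction<Long, Long, Long> {

    @Override
    public Long createAccumulator() {
        return Long.MIN_VALUE;
    }

    @Override
    public Long add(Long eventTimestamp, Long accumulator) {
        return Math.max(eventTimestamp, accumulator);
    }

    @Override
    public Long getResult(Long accumulator) {
        return accumulator;
    }

    @Override
    public Long merge(Long a, Long b) {
        return Math.max(a, b);
    }
}

It would be used as, e.g.,
stream.keyBy(...).window(EventTimeSessionWindows.withGap(Time.seconds(2))).aggregate(new LastTimestampAggregate()).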


On Thu, Mar 26, 2020 at 8:40 PM Dawid Wysakowicz 
wrote:

> Hi Manas,
>
> First of all I think your understanding of how the session windows work
> is correct.
>
> I tend to slightly disagree that the end for a session window is wrong.
> It is my personal opinion though. I see it this way that a TimeWindow in
> case of a session window is the session itself. The session always ends
> after a period of inactivity. Take a user session on a webpage. Such a
> session does not end/isn't brought down at the time of a last event. It
> is closed after a period of inactivity. In such scenario I think the
> behavior of the session window is correct.
>
> Moreover you can achieve what you are describing with an aggregate[1]
> function. You can easily maintain the biggest number seen for a window.
>
> Lastly, I think the overall feeling in the community is that we are very
> skeptical towards extending the Windows API. From what I've heard and
> experienced the ProcessFunction[2] is a much better principle to build
> custom solutions upon, as in fact its easier to control and even
> understand. That said I am rather against introducing that change.
>
> Best,
>
> Dawid
>
> [1]
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#aggregatefunction
>
> [2]
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#process-function-low-level-operations
>
> On 13/03/2020 09:46, Manas Kale wrote:
> > Hi all,
> > I would like to start a discussion on this feature request (JIRA link).
> > 
> >
> > Consider the events :
> >
> > [1, event], [2, event]
> >
> > where first element is event timestamp in seconds and second element is
> > event code/name.
> >
> > Also consider that an Event time session window with inactivityGap = 2
> > seconds is acting on above stream.
> >
> > When the first event arrives, a session window should be created that is
> > [1,1].
> >
> > When the second event arrives, a new session window should be created
> that
> > is [2,2]. Since this falls within firstWindowTimestamp+inactivityGap, it
> > should be merged into session window [1,2] and  [2,2] should be deleted.
> >
> >
> > *This is my understanding of how session windows are created. Please
> > correct me if wrong.*
> > However, Flink does not follow such a definition of windows semantically.
> > If I call the  getEnd() method of the TimeWindow() class, I get back
> > timestamp + inactivityGap.
> >
> > For the above example, after processing the first element, I would get 1
> +
> > 2 = 3 seconds as the window "end".
> >
> > The actual window end should be the timestamp 1, which is the last event
> in
> > the session window.
> >
> > A solution would be to change the "end" definition of all windows, but I
> > suppose this would be breaking and would need some debate.
> >
> > Therefore, I propose an intermediate solution : add a new API method that
> > keeps track of the last element added in the session window.
> >
> > If there is agreement on this, I would like to start drafting a change
> > document and implement this.
> >
>
>


Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Danny Chan
Thanks everyone for the feedback ~

- For whether the global config option belongs to `ExecutionConfigOptions` or
`OptimizerConfigOptions`, I have no strong objections; switching
to `OptimizerConfigOptions` is okay to me and I have updated the WIKI
- For whether to use a white-list or black-list, I agree with Timo, so black-list

I would fire a Vote if there are no other objections soon, thanks ~

Timo Walther  于2020年3月26日周四 下午6:31写道:

> Hi everyone,
>
> it is not only about security concerns. Hint options should be
> well-defined. We had a couple of people that were concerned about
> changing the semantics with a concept that is called "hint". These
> options are more like "debugging options" while someone is developing a
> connector or using a notebook to quickly produce some rows.
>
> The final pipeline should use a temporary table instead. I suggest to
> use a whitelist and force people to think about what should be exposed
> as a hint. By default, no option should be exposed. It is better to be
> conservative here.
>
> Regards,
> Timo
>
>
> On 26.03.20 10:31, Danny Chan wrote:
> > Thanks Kurt for the suggestion ~
> >
> > In my opinion:
> > - There is no need for TableFormatFactory#supportedHintOptions because
> all
> > the format options can be configured dynamically, they have no security
> > issues
> > - Dynamic table options is not an optimization, it is more like an
> > execution behavior from my side
> >
> > Kurt Young  于2020年3月26日周四 下午4:47写道:
> >
> >> Hi Danny,
> >>
> >> Thanks for the updates. I have 2 comments regarding to latest document:
> >>
> >> 1) I think we also need `*supportedHintOptions*` for
> >> `*TableFormatFactory*`
> >> 2) IMO "dynamic-table-options.enabled" should belong to `
> >> *OptimizerConfigOptions*`
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Thu, Mar 26, 2020 at 4:40 PM Timo Walther 
> wrote:
> >>
> >>> Thanks for the update Danny. +1 for this proposal.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>> On 26.03.20 04:51, Danny Chan wrote:
>  Thanks everyone who engaged in this discussion ~
> 
>  Our goal is "Supports Dynamic Table Options for Flink SQL". After an
>  offline discussion with Kurt, Timo and Dawid, we have made the final
>  conclusion, here is the summary:
> 
> 
>   - Use comment style syntax to specify the dynamic table options:
> >> "/*+
>   *OPTIONS*(k1='v1', k2='v2') */"
>   - Have constraint on the options keys: the options that may bring
> >> in
>   security problems should not be allowed, i.e. Kafka connector
> >>> zookeeper
>   endpoint URL and topic name
>   - Use white-list to control the allowed options for each
> connector,
>   which is more safe for future extention
>   - We allow to enable/disable this feature globally
>   - Implement based on the current code base first, and when
> FLIP-95
> >> is
>   checked in, implement this feature based on new interface
> 
>  Any suggestions are appreciated ~
> 
>  [1]
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> 
>  Best,
>  Danny Chan
> 
>  Jark Wu  于2020年3月18日周三 下午10:38写道:
> 
> > Hi everyone,
> >
> > Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid
> >> it
> > doesn't solve the problems but increases some development and
> learning
> > burdens.
> >
> > # increase development and learning burden
> >
> > According to the discussion so far, we want to support overriding a
> >>> subset
> > of options in hints which doesn't affect semantics.
> > With the `supportedHintOptions`, it's up to the connector developers
> >> to
> > decide which options will not affect semantics, and to be hint
> >> options.
> > However, the question is how to distinguish whether an option will
> >>> *affect
> > semantics*? What happens if an option will affect semantics but
> >>> provided as
> > hint options?
> >   From my point of view, it's not easy to distinguish. For example,
> the
> > "format.ignore-parse-error" can be a very useful dynamic option but
> >> that
> > will affect semantic, because the result is different (null vs
> >>> exception).
> > Another example, the "connector.lookup.cache.*" options are also very
> > useful to tune jobs, however, it will also affect the job results. I
> >> can
> > come up many more useful options but may affect semantics.
> >
> > I can see that the community will under endless discussion around
> "can
> >>> this
> > option to be a hint option?",  "wether this option will affect
> >>> semantics?".
> > You can also find that we already have different opinions on
> > "ignore-parse-error". Those discussion is a waste of time! That's not
> >>> what
> > users want!
> > The problem is user need this, this, this options and HOW to expose
> >>> them?
> > We s

Re: [DISCUSS]FLIP-113: Support SQL and planner hints

2020-03-26 Thread Danny Chan
Sorry, i meant white-list ~

Danny Chan  于2020年3月27日周五 下午12:40写道:

> Thanks everyone for the feedback ~
>
> - For whether the global config option belongs to `ExecutionConfigOptions` or
> `OptimizerConfigOptions`, I have no strong objections; switching
> to `OptimizerConfigOptions` is okay to me and I have updated the WIKI
> - For whether to use a white-list or black-list, I agree with Timo, so black-list
>
> I would fire a Vote if there are no other objections soon, thanks ~
>
> Timo Walther  于2020年3月26日周四 下午6:31写道:
>
>> Hi everyone,
>>
>> it is not only about security concerns. Hint options should be
>> well-defined. We had a couple of people that were concerned about
>> changing the semantics with a concept that is called "hint". These
>> options are more like "debugging options" while someone is developing a
>> connector or using a notebook to quickly produce some rows.
>>
>> The final pipeline should use a temporary table instead. I suggest to
>> use a whitelist and force people to think about what should be exposed
>> as a hint. By default, no option should be exposed. It is better to be
>> conservative here.
>>
>> Regards,
>> Timo
>>
>>
>> On 26.03.20 10:31, Danny Chan wrote:
>> > Thanks Kurt for the suggestion ~
>> >
>> > In my opinion:
>> > - There is no need for TableFormatFactory#supportedHintOptions because
>> all
>> > the format options can be configured dynamically, they have no security
>> > issues
>> > - Dynamic table options is not an optimization, it is more like an
>> > execution behavior from my side
>> >
>> > Kurt Young  于2020年3月26日周四 下午4:47写道:
>> >
>> >> Hi Danny,
>> >>
>> >> Thanks for the updates. I have 2 comments regarding to latest document:
>> >>
>> >> 1) I think we also need `*supportedHintOptions*` for
>> >> `*TableFormatFactory*`
>> >> 2) IMO "dynamic-table-options.enabled" should belong to `
>> >> *OptimizerConfigOptions*`
>> >>
>> >> Best,
>> >> Kurt
>> >>
>> >>
>> >> On Thu, Mar 26, 2020 at 4:40 PM Timo Walther 
>> wrote:
>> >>
>> >>> Thanks for the update Danny. +1 for this proposal.
>> >>>
>> >>> Regards,
>> >>> Timo
>> >>>
>> >>> On 26.03.20 04:51, Danny Chan wrote:
>>  Thanks everyone who engaged in this discussion ~
>> 
>>  Our goal is "Supports Dynamic Table Options for Flink SQL". After an
>>  offline discussion with Kurt, Timo and Dawid, we have made the final
>>  conclusion, here is the summary:
>> 
>> 
>>   - Use comment style syntax to specify the dynamic table options:
>> >> "/*+
>>   *OPTIONS*(k1='v1', k2='v2') */"
>>   - Have constraint on the options keys: the options that may
>> bring
>> >> in
>>   security problems should not be allowed, i.e. Kafka connector
>> >>> zookeeper
>>   endpoint URL and topic name
>>   - Use white-list to control the allowed options for each
>> connector,
>>   which is more safe for future extention
>>   - We allow to enable/disable this feature globally
>>   - Implement based on the current code base first, and when
>> FLIP-95
>> >> is
>>   checked in, implement this feature based on new interface
>> 
>>  Any suggestions are appreciated ~
>> 
>>  [1]
>> 
>> >>>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
>> 
>>  Best,
>>  Danny Chan
>> 
>>  Jark Wu  于2020年3月18日周三 下午10:38写道:
>> 
>> > Hi everyone,
>> >
>> > Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid
>> >> it
>> > doesn't solve the problems but increases some development and
>> learning
>> > burdens.
>> >
>> > # increase development and learning burden
>> >
>> > According to the discussion so far, we want to support overriding a
>> >>> subset
>> > of options in hints which doesn't affect semantics.
>> > With the `supportedHintOptions`, it's up to the connector developers
>> >> to
>> > decide which options will not affect semantics, and to be hint
>> >> options.
>> > However, the question is how to distinguish whether an option will
>> >>> *affect
>> > semantics*? What happens if an option will affect semantics but
>> >>> provided as
>> > hint options?
>> >   From my point of view, it's not easy to distinguish. For example,
>> the
>> > "format.ignore-parse-error" can be a very useful dynamic option but
>> >> that
>> > will affect semantic, because the result is different (null vs
>> >>> exception).
>> > Another example, the "connector.lookup.cache.*" options are also
>> very
>> > useful to tune jobs, however, it will also affect the job results. I
>> >> can
>> > come up many more useful options but may affect semantics.
>> >
>> > I can see that the community will under endless discussion around
>> "can
>> >>> this
>> > option to be a hint option?",  "wether this option will affect
>> >>> semantics?".
>> > You can also find that we already have different op

Re: [DISCUSS] FLIP-119: Pipelined Region Scheduling

2020-03-26 Thread Zhu Zhu
Thanks for reporting this, Yangze.
I have updated the permissions on those images. Everyone is able to view
them now.

Thanks,
Zhu Zhu

Yangze Guo  于2020年3月27日周五 上午11:25写道:

> Thanks for driving this discussion, Zhu Zhu & Gary.
>
> I found that the image link in this FLIP is not working well. When I
> open that link, Google doc told me that I have no access privilege.
> Could you take a look at that issue?
>
> Best,
> Yangze Guo
>
> On Fri, Mar 27, 2020 at 1:38 AM Gary Yao  wrote:
> >
> > Hi community,
> >
> > In the past releases, we have been working on refactoring Flink's
> scheduler
> > with the goal of making the scheduler extensible [1]. We have rolled out
> > most of the intended refactoring in Flink 1.10, and we think it is now
> time
> > to leverage our newly introduced abstractions to implement a new resource
> > optimized scheduling strategy: Pipelined Region Scheduling.
> >
> > This scheduling strategy aims at:
> >
> > * avoidance of resource deadlocks when running batch jobs
> >
> > * tunable with respect to resource consumption and throughput
> >
> > More details can be found in the Wiki [2]. We are looking forward to your
> > feedback.
> >
> > Best,
> >
> > Zhu Zhu & Gary
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-10429
> >
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling
>


Re: SerializableHadoopConfiguration

2020-03-26 Thread Sivaprasanna
Till, Stephan, & others,

I created a discuss thread a few days back. Attaching the link here.
I'd appreciate it if you could take a look.
https://lists.apache.org/thread.html/rf885987160bede5911a7f61923307a6d5ae07f850da0a90555728e5f%40%3Cdev.flink.apache.org%3E

Please let me know if you want me to improve/edit the content to make it
better.

Thanks,
Sivaprasanna

On Tue, Mar 17, 2020 at 8:22 PM Sivaprasanna 
wrote:

> Hi Till,
>
> Sure. I'll take a look and start a discuss thread soon.
>
> Thanks,
> Sivaprasanna
>
> On Mon, Mar 16, 2020 at 4:01 PM Till Rohrmann 
> wrote:
>
>> Hi Sivaprasanna,
>>
>> do you want to collect the set of Hadoop utility classes which could be
>> moved to a flink-hadoop-utils module and start a discuss thread about it?
>> I
>> think this could be a first good step into cleaning up the module
>> structure
>> a bit.
>>
>> Cheers,
>> Till
>>
>> On Fri, Mar 6, 2020 at 7:27 AM Sivaprasanna 
>> wrote:
>>
>> > That also makes sense but that, I believe, would be a breaking/major
>> > change. If we are okay with merging them together, we can name something
>> > like "flink-hadoop-compress" since SequenceFile is also a Hadoop format
>> and
>> > the existing "flink-compress" module, as of now, deals with Hadoop based
>> > compression.
>> >
>> > On Fri, Mar 6, 2020 at 1:33 AM João Boto  wrote:
>> >
>> > > We could merge the two modules into one?
>> > > sequence-files its another way of compressing files..
>> > >
>> > >
>> > > On 2020/03/05 13:02:46, Sivaprasanna 
>> wrote:
>> > > > Hi Stephen,
>> > > >
>> > > > I guess it is a valid point to have something like
>> > 'flink-hadoop-utils'.
>> > > > Maybe a [DISCUSS] thread can be started to understand what the
>> > community
>> > > > thinks?
>> > > >
>> > > > On Thu, Mar 5, 2020 at 4:22 PM Stephan Ewen 
>> wrote:
>> > > >
>> > > > > Do we have more cases of "common Hadoop Utils"?
>> > > > >
>> > > > > If yes, does it make sense to create a "flink-hadoop-utils" module
>> > with
>> > > > > exactly such classes? It would have an optional dependency on
>> > > > > "flink-shaded-hadoop".
>> > > > >
>> > > > > On Wed, Mar 4, 2020 at 9:12 AM Till Rohrmann <
>> trohrm...@apache.org>
>> > > wrote:
>> > > > >
>> > > > > > Hi Sivaprasanna,
>> > > > > >
>> > > > > > we don't upload the source jars for the flink-shaded modules.
>> > > However you
>> > > > > > can build them yourself and install by cloning the flink-shaded
>> > > > > repository
>> > > > > > [1] and then call `mvn package -Dshade-sources`.
>> > > > > >
>> > > > > > [1] https://github.com/apache/flink-shaded
>> > > > > >
>> > > > > > Cheers,
>> > > > > > Till
>> > > > > >
>> > > > > > On Tue, Mar 3, 2020 at 6:29 PM Sivaprasanna <
>> > > sivaprasanna...@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > BTW, can we leverage flink-shaded-hadoop-2? Reason why I ask,
>> if
>> > > any
>> > > > > > Flink
>> > > > > > > module is going to use Hadoop in any way, it will most
>> probably
>> > > include
>> > > > > > > flink-shaded-hadoop-2 as a dependency.
>> > > > > > > However, flink-shaded modules don't have any source files. Is
>> > that
>> > > a
>> > > > > > strict
>> > > > > > > convention that the community follows?
>> > > > > > >
>> > > > > > > -
>> > > > > > > Sivaprasanna
>> > > > > > >
>> > > > > > > On Tue, Mar 3, 2020 at 10:48 PM Sivaprasanna <
>> > > > > sivaprasanna...@gmail.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi Arvid,
>> > > > > > > >
>> > > > > > > > Thanks for the quick reply. Yes, it actually makes sense to
>> > avoid
>> > > > > > Hadoop
>> > > > > > > > dependencies from getting into Flink's core modules but I
>> also
>> > > wonder
>> > > > > > if
>> > > > > > > it
>> > > > > > > > will be an overkill to add flink-hadoop-fs as a dependency
>> just
>> > > > > because
>> > > > > > > we
>> > > > > > > > want to use a utility class from that module.
>> > > > > > > >
>> > > > > > > > -
>> > > > > > > > Sivaprasanna
>> > > > > > > >
>> > > > > > > > On Tue, Mar 3, 2020 at 4:17 PM Arvid Heise <
>> > ar...@ververica.com>
>> > > > > > wrote:
>> > > > > > > >
>> > > > > > > >> Hi Sivaprasanna,
>> > > > > > > >>
>> > > > > > > >> we actually want to remove Hadoop from all core modules,
>> so we
>> > > could
>> > > > > > not
>> > > > > > > >> place it in some very common place like flink-core.
>> > > > > > > >>
>> > > > > > > >> But I think the module flink-hadoop-fs could be a fitting
>> > place.
>> > > > > > > >>
>> > > > > > > >> On Tue, Mar 3, 2020 at 11:25 AM Sivaprasanna <
>> > > > > > sivaprasanna...@gmail.com
>> > > > > > > >
>> > > > > > > >> wrote:
>> > > > > > > >>
>> > > > > > > >> > Hi
>> > > > > > > >> >
>> > > > > > > >> > The flink-sequence-file module has a class named
>> > > > > > > >> > SerializableHadoopConfiguration[1] which is nothing but a
>> > > wrapper
>> > > > > > > class
>> > > > > > > >> for
>> > > > > > > >> > Hadoop Configuration. I believe this class can be moved
>> to a
>> > > > > common
>> > > > > > > >> module
>> > > > > > > >> > s

[jira] [Created] (FLINK-16822) The config set by SET command does not work

2020-03-26 Thread godfrey he (Jira)
godfrey he created FLINK-16822:
--

 Summary: The config set by SET command does not work
 Key: FLINK-16822
 URL: https://issues.apache.org/jira/browse/FLINK-16822
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.10.0
Reporter: godfrey he
 Fix For: 1.11.0


Users can add or change properties for execution behavior through the SET 
command in the SQL client, e.g. {{SET execution.parallelism=10}}, {{SET 
table.optimizer.join-reorder-enabled=true}}. But the {{table.xx}} configs can't 
change the TableEnvironment behavior, because properties set from the CLI are 
not propagated into the TableEnvironment's table config.
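
For context, a minimal sketch (Flink 1.10 API, assumed wiring) of where such a 
property must end up for the planner to see it; the SQL client's SET handling 
would need to write into this TableConfig:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SetPropagationSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // What "SET table.optimizer.join-reorder-enabled=true" should effectively do:
        tEnv.getConfig().getConfiguration()
                .setString("table.optimizer.join-reorder-enabled", "true");
    }
}
{code}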



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16823) The function TIMESTAMPDIFF doesn't produce the expected result

2020-03-26 Thread Adam N D DENG (Jira)
Adam N D DENG created FLINK-16823:
-

 Summary: The function TIMESTAMPDIFF doesn't produce the expected 
result
 Key: FLINK-16823
 URL: https://issues.apache.org/jira/browse/FLINK-16823
 Project: Flink
  Issue Type: Bug
Reporter: Adam N D DENG
 Attachments: image-2020-03-27-13-50-51-955.png

For example,

in MySQL the SQL below returns 6, but in Flink the output is 5:

SELECT timestampdiff (MONTH, TIMESTAMP '2019-09-01 00:00:00',TIMESTAMP 
'2020-03-01 00:00:00' )

 

!image-2020-03-27-13-50-51-955.png!
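
For comparison, the month difference computed with plain Java time (not Flink) for 
the same two timestamps matches the MySQL result of 6:

{code:java}
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class TimestampDiffCheck {
    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2019, 9, 1, 0, 0, 0);
        LocalDateTime end = LocalDateTime.of(2020, 3, 1, 0, 0, 0);

        // Prints 6: 2019-09-01 to 2020-03-01 spans exactly six whole months.
        System.out.println(ChronoUnit.MONTHS.between(start, end));
    }
}
{code}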

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Apache Flink Stateful Functions Release 2.0.0, release candidate #1

2020-03-26 Thread Tzu-Li (Gordon) Tai
-1

I already discovered that the source distribution NOTICE file is missing a
mention of font-awesome.
Its source is bundled under "docs/page/font-awesome/fonts" and is
licensed under the SIL OFL 1.1 license, which makes it a requirement that it be
listed in the NOTICE file.

I'll open a new RC2 with only the changes to the source NOTICE and LICENSE
files.



On Fri, Mar 27, 2020 at 10:37 AM Tzu-Li (Gordon) Tai 
wrote:

> Also, here is the documentation for Stateful Functions for those who were
> wondering:
> master - https://ci.apache.org/projects/flink/flink-statefun-docs-master/
> release-2.0 -
> https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/
>
> This is not yet visible directly from the Flink website, since the efforts
> for incorporating Stateful Functions in the website is still ongoing.
>
> On Fri, Mar 27, 2020 at 12:48 AM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Hi everyone,
>>
>> Please review and vote on the release candidate #0 for the version 2.0.0
>> of Apache Flink Stateful Functions,
>> as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>> **Testing Guideline**
>>
>> You can find here [1] a doc that we can use for collaborating testing
>> efforts.
>> The listed testing tasks in the doc also serve as a guideline in what to
>> test for this release.
>> If you wish to take ownership of a testing task, simply put your name
>> down in the "Checked by" field of the task.
>>
>> **Release Overview**
>>
>> As an overview, the release consists of the following:
>> a) Stateful Functions canonical source distribution, to be deployed to
>> the release repository at dist.apache.org
>> b) Stateful Functions Python SDK distributions to be deployed to PyPI
>> c) Maven artifacts to be deployed to the Maven Central Repository
>>
>> **Staging Areas to Review**
>>
>> The staging areas containing the above mentioned artifacts are as
>> follows, for your review:
>> * All artifacts for a) and b) can be found in the corresponding dev
>> repository at dist.apache.org [2]
>> * All artifacts for c) can be found at the Apache Nexus Repository [3]
>>
>> All artifacts are signed with the
>> key 1C1E2394D3194E1944613488F320986D35C33D6A [4]
>>
>> Other links for your review:
>> * JIRA release notes [5]
>> * source code tag "release-2.0.0-rc0" [6] [7]
>>
>> **Extra Remarks**
>>
>> * Part of the release is also official Docker images for Stateful
>> Functions. This can be a separate process, since the creation of those
>> relies on the fact that we have distribution jars already deployed to
>> Maven. I will follow-up with this after these artifacts are officially
>> released.
>> In the meantime, there is this discussion [8] ongoing about where to host
>> the StateFun Dockerfiles.
>> * The Flink Website and blog post is also being worked on (by Marta) as
>> part of the release, to incorporate the new Stateful Functions project. We
>> can follow up with a link to those changes afterwards in this vote thread,
>> but that would not block you to test and cast your votes already.
>>
>> **Vote Duration**
>>
>> The vote will be open for at least 72 hours *(target end date is next
>> Tuesday, April 31).*
>> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> Gordon
>>
>> [1]
>> https://docs.google.com/document/d/1P9yjwSbPQtul0z2AXMnVolWQbzhxs68suJvzR6xMjcs/edit?usp=sharing
>> [2]
>> https://dist.apache.org/repos/dist/dev/flink/flink-statefun-2.0.0-rc1/
>> [3]
>> https://repository.apache.org/content/repositories/orgapacheflink-1339/
>> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
>> [5]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878
>> [6]
>> https://gitbox.apache.org/repos/asf?p=flink-statefun.git;a=commit;h=ebd7ca866f7d11fa43c7a5bb36861ee1b24b0980
>> [7] https://github.com/apache/flink-statefun/tree/release-2.0.0-rc1
>> [8]
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Creating-a-new-repo-to-host-Stateful-Functions-Dockerfiles-td39342.html
>>
>> TIP: You can create a `settings.xml` file with these contents:
>>
>> """
>> 
>>   
>> flink-statefun-2.0.0
>>   
>>   
>> 
>>   flink-statefun-2.0.0
>>   
>> 
>>   flink-statefun-2.0.0
>>   
>> https://repository.apache.org/content/repositories/orgapacheflink-1339/
>> 
>> 
>> 
>>   archetype
>>   
>> https://repository.apache.org/content/repositories/orgapacheflink-1339/
>> 
>> 
>>   
>> 
>>   
>> 
>> """
>>
>> And reference that in you maven commands via `--settings
>> path/to/settings.xml`.
>> This is useful for creating a quickstart based on the staged release and
>> for building against the staged jars.
>>
>