[GitHub] flink issue #2265: [FLINK-3097] [table] Add support for custom functions in ...

2016-07-19 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2265
  
No, there is no FLIP about it. I think a discussion in JIRA or in this PR 
should be enough. That's why I haven't documented it yet.  I was inspired by 
your 
[document](https://docs.google.com/document/d/1KMUzvBAWSyQ39T8MyxUi0zNHyvLUnyGMPA7_RLSDpFw/edit).
 You are right, `ScalarFunction` has many internal functions, but they are not 
exposed to the user; only two methods can be overridden. 
An interface is not enough, as it might sometimes be necessary to override 
`getReturnType` and `getParameterType`.
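As a rough illustration of why an abstract class fits better than an interface here, the sketch below uses hypothetical stand-ins (`ScalarFunctionSketch`, `HashCodeFunction`, and the hook names from the comment are assumptions, not the merged Flink API): the base class can ship a default implementation of the type hooks, and a subclass overrides them only when default inference is not enough.

```java
// Hypothetical stand-in for the abstract base class discussed in this PR.
// The hook names (getReturnType/getParameterTypes) follow the comment above;
// they are not necessarily the names in the final Flink API.
abstract class ScalarFunctionSketch {
    // An abstract class can provide a default (e.g. reflective inference over
    // the eval method); a pre-Java-8 interface could not carry this logic.
    public Class<?> getReturnType() {
        return Object.class; // default: defer to later type inference
    }
    public Class<?>[] getParameterTypes() {
        return new Class<?>[0];
    }
}

// A user-defined function that overrides the hooks, e.g. because the
// default inference would not produce the desired types.
class HashCodeFunction extends ScalarFunctionSketch {
    public int eval(String s) {
        return s.hashCode();
    }
    @Override
    public Class<?> getReturnType() { return Integer.class; }
    @Override
    public Class<?>[] getParameterTypes() { return new Class<?>[] { String.class }; }
}

public class UdfSketch {
    public static void main(String[] args) {
        HashCodeFunction f = new HashCodeFunction();
        System.out.println(f.getReturnType().getSimpleName());
        System.out.println(f.eval("text"));
    }
}
```

With an interface, every user function would have to implement both hooks; the abstract class keeps the common case down to overriding `eval` alone.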


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3097) Add support for custom functions in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383717#comment-15383717
 ] 

ASF GitHub Bot commented on FLINK-3097:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2265
  
No, there is no FLIP about it. I think a discussion in JIRA or in this PR 
should be enough. That's why I haven't documented it yet.  I was inspired by 
your 
[document](https://docs.google.com/document/d/1KMUzvBAWSyQ39T8MyxUi0zNHyvLUnyGMPA7_RLSDpFw/edit).
 You are right, `ScalarFunction` has many internal functions, but they are not 
exposed to the user; only two methods can be overridden. 
An interface is not enough, as it might sometimes be necessary to override 
`getReturnType` and `getParameterType`.


> Add support for custom functions in Table API
> -
>
> Key: FLINK-3097
> URL: https://issues.apache.org/jira/browse/FLINK-3097
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently, the Table API has a very limited set of built-in functions. 
> Support for custom functions can solve this problem. Adding a custom row 
> function could look like this:
> {code}
> TableEnvironment tableEnv = new TableEnvironment();
> RowFunction rf = new RowFunction() {
> @Override
> public String call(Object[] args) {
> return ((String) args[0]).trim();
> }
> };
> tableEnv.getConfig().registerRowFunction("TRIM", rf,
> BasicTypeInfo.STRING_TYPE_INFO);
> DataSource<Tuple1<String>> input = env.fromElements(
> new Tuple1<>(" 1 "));
> Table table = tableEnv.fromDataSet(input);
> Table result = table.select("TRIM(f0)");
> {code}
> This feature is also necessary as part of FLINK-2099.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4183) Move checking for StreamTableEnvironment into validation layer

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383738#comment-15383738
 ] 

ASF GitHub Bot commented on FLINK-4183:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2221
  
I will merge this later today if there are no objections...


> Move checking for StreamTableEnvironment into validation layer
> --
>
> Key: FLINK-4183
> URL: https://issues.apache.org/jira/browse/FLINK-4183
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Minor
>
> Some operators check the environment in `table.scala` instead of doing this 
> during the validation phase.





[GitHub] flink issue #2221: [FLINK-4183] [table] Move checking for StreamTableEnviron...

2016-07-19 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2221
  
I will merge this later today if there are no objections...




[jira] [Created] (FLINK-4232) Flink executable does not return correct pid

2016-07-19 Thread David Moravek (JIRA)
David Moravek created FLINK-4232:


 Summary: Flink executable does not return correct pid
 Key: FLINK-4232
 URL: https://issues.apache.org/jira/browse/FLINK-4232
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.0.3
Reporter: David Moravek
Priority: Minor


E.g. when using supervisor, the pid returned by ./bin/flink is the pid of the 
shell executable instead of the Java process.
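The usual way a launcher script ends up reporting its own pid is by running the JVM as a child; the common fix is to `exec` the target so the shell's process image is replaced. Whether the FLINK-4232 patch uses exactly this approach is an assumption here; the demo below uses `sleep` as a stand-in for the Java process.

```shell
# Demo: why a wrapper script reports the wrong pid, and how `exec` fixes it.
# (Illustration only; `sleep` stands in for the JVM.)

wrapper=$(mktemp)
cat > "$wrapper" <<'EOF'
#!/bin/sh
# Without `exec`, the command would run as a child and $! in the caller
# would be the shell's pid. With `exec`, the shell is replaced by the
# target command, so the captured pid belongs to the real process.
exec sleep 5
EOF
chmod +x "$wrapper"

"$wrapper" &
pid=$!
sleep 1                      # give the script a moment to exec
ps -o comm= -p "$pid"        # reports the target command, not a shell
kill "$pid"
rm -f "$wrapper"
```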





[GitHub] flink pull request #2268: [FLINK-4232] Make sure ./bin/flink returns correct...

2016-07-19 Thread dmvk
GitHub user dmvk opened a pull request:

https://github.com/apache/flink/pull/2268

[FLINK-4232] Make sure ./bin/flink returns correct pid



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dmvk/flink FLINK-4232

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2268


commit 17fa3e1e9a3350c843abceed9611218fe0bf287a
Author: David Moravek 
Date:   2016-07-19T08:15:06Z

[FLINK-4232] Make sure ./bin/flink returns correct pid






[jira] [Commented] (FLINK-4232) Flink executable does not return correct pid

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383770#comment-15383770
 ] 

ASF GitHub Bot commented on FLINK-4232:
---

GitHub user dmvk opened a pull request:

https://github.com/apache/flink/pull/2268

[FLINK-4232] Make sure ./bin/flink returns correct pid



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dmvk/flink FLINK-4232

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2268


commit 17fa3e1e9a3350c843abceed9611218fe0bf287a
Author: David Moravek 
Date:   2016-07-19T08:15:06Z

[FLINK-4232] Make sure ./bin/flink returns correct pid




> Flink executable does not return correct pid
> 
>
> Key: FLINK-4232
> URL: https://issues.apache.org/jira/browse/FLINK-4232
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: David Moravek
>Priority: Minor
>
> E.g. when using supervisor, the pid returned by ./bin/flink is the pid of the 
> shell executable instead of the Java process.





[GitHub] flink issue #1989: FLINK-3901 - Added CsvRowInputFormat

2016-07-19 Thread fpompermaier
Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/1989
  
Hi @twalthr, any news about this? Are you going to merge this PR in the 
upcoming 1.1 release?
If you were waiting for feedback, you have my +1 on your strategy: go 
ahead and convert the RowCsvInputFormat class to Scala and keep the test code 
in Java!




[jira] [Commented] (FLINK-3901) Create a RowCsvInputFormat to use as default CSV IF in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383819#comment-15383819
 ] 

ASF GitHub Bot commented on FLINK-3901:
---

Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/1989
  
Hi @twalthr, any news about this? Are you going to merge this PR in the 
upcoming 1.1 release?
If you were waiting for feedback, you have my +1 on your strategy: go 
ahead and convert the RowCsvInputFormat class to Scala and keep the test code 
in Java!


> Create a RowCsvInputFormat to use as default CSV IF in Table API
> 
>
> Key: FLINK-3901
> URL: https://issues.apache.org/jira/browse/FLINK-3901
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.0.2
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>Priority: Minor
>  Labels: csv, null-values, row, tuple
>
> At the moment the Table API reads CSVs using the TupleCsvInputFormat, which 
> has the big limitations of a 25-field maximum and no null handling.
> A new InputFormat producing Row objects is necessary to avoid those limitations.





[jira] [Commented] (FLINK-4223) Rearrange scaladoc and javadoc for Scala API

2016-07-19 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383846#comment-15383846
 ] 

Maximilian Michels commented on FLINK-4223:
---

Seems like a duplicate of FLINK-3710.

> Rearrange scaladoc and javadoc for Scala API
> 
>
> Key: FLINK-4223
> URL: https://issues.apache.org/jira/browse/FLINK-4223
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Chiwan Park
>Priority: Minor
>  Labels: easyfix, newbie
>
> Currently, some of the documentation for the Scala APIs (Gelly Scala API, 
> FlinkML, Streaming Scala API) is written as javadoc rather than scaladoc.





[jira] [Closed] (FLINK-4223) Rearrange scaladoc and javadoc for Scala API

2016-07-19 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-4223.
-
Resolution: Duplicate

> Rearrange scaladoc and javadoc for Scala API
> 
>
> Key: FLINK-4223
> URL: https://issues.apache.org/jira/browse/FLINK-4223
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Chiwan Park
>Priority: Minor
>  Labels: easyfix, newbie
>
> Currently, some of the documentation for the Scala APIs (Gelly Scala API, 
> FlinkML, Streaming Scala API) is written as javadoc rather than scaladoc.





[GitHub] flink issue #2265: [FLINK-3097] [table] Add support for custom functions in ...

2016-07-19 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/2265
  
Yes, you are right. I'm just a little concerned about the class name of 
`ScalarFunction`, haha.

In addition, I think the Java Table API should use `table.select("hashCode(text)");`, 
which is better. Assuming the eval function takes two or more parameters, 
`"udf(a,b)"` still works and stays consistent with the Scala Table API and SQL 
syntax.




[jira] [Commented] (FLINK-3097) Add support for custom functions in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383858#comment-15383858
 ] 

ASF GitHub Bot commented on FLINK-3097:
---

Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/2265
  
Yes, you are right. I'm just a little concerned about the class name of 
`ScalarFunction`, haha.

In addition, I think the Java Table API should use `table.select("hashCode(text)");`, 
which is better. Assuming the eval function takes two or more parameters, 
`"udf(a,b)"` still works and stays consistent with the Scala Table API and SQL 
syntax.


> Add support for custom functions in Table API
> -
>
> Key: FLINK-3097
> URL: https://issues.apache.org/jira/browse/FLINK-3097
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently, the Table API has a very limited set of built-in functions. 
> Support for custom functions can solve this problem. Adding a custom row 
> function could look like this:
> {code}
> TableEnvironment tableEnv = new TableEnvironment();
> RowFunction rf = new RowFunction() {
> @Override
> public String call(Object[] args) {
> return ((String) args[0]).trim();
> }
> };
> tableEnv.getConfig().registerRowFunction("TRIM", rf,
> BasicTypeInfo.STRING_TYPE_INFO);
> DataSource<Tuple1<String>> input = env.fromElements(
> new Tuple1<>(" 1 "));
> Table table = tableEnv.fromDataSet(input);
> Table result = table.select("TRIM(f0)");
> {code}
> This feature is also necessary as part of FLINK-2099.





[GitHub] flink issue #2109: [FLINK-3677] FileInputFormat: Allow to specify include/ex...

2016-07-19 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2109
  
I'll try to review this as soon as possible. Maybe @kl0u or @zentol also 
want to have another look.




[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383863#comment-15383863
 ] 

ASF GitHub Bot commented on FLINK-3677:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2109
  
I'll try to review this as soon as possible. Maybe @kl0u or @zentol also 
want to have another look.


> FileInputFormat: Allow to specify include/exclude file name patterns
> 
>
> Key: FLINK-3677
> URL: https://issues.apache.org/jira/browse/FLINK-3677
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Maximilian Michels
>Assignee: Ivan Mushketyk
>Priority: Minor
>  Labels: starter
>
> It would be nice to be able to specify a regular expression to filter files.





[GitHub] flink issue #2265: [FLINK-3097] [table] Add support for custom functions in ...

2016-07-19 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2265
  
Yes, @wuchong's suggestion for the Java Table API seems more extensible. 




[jira] [Commented] (FLINK-3097) Add support for custom functions in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383869#comment-15383869
 ] 

ASF GitHub Bot commented on FLINK-3097:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2265
  
Yes, @wuchong's suggestion for the Java Table API seems more extensible. 


> Add support for custom functions in Table API
> -
>
> Key: FLINK-3097
> URL: https://issues.apache.org/jira/browse/FLINK-3097
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently, the Table API has a very limited set of built-in functions. 
> Support for custom functions can solve this problem. Adding a custom row 
> function could look like this:
> {code}
> TableEnvironment tableEnv = new TableEnvironment();
> RowFunction rf = new RowFunction() {
> @Override
> public String call(Object[] args) {
> return ((String) args[0]).trim();
> }
> };
> tableEnv.getConfig().registerRowFunction("TRIM", rf,
> BasicTypeInfo.STRING_TYPE_INFO);
> DataSource<Tuple1<String>> input = env.fromElements(
> new Tuple1<>(" 1 "));
> Table table = tableEnv.fromDataSet(input);
> Table result = table.select("TRIM(f0)");
> {code}
> This feature is also necessary as part of FLINK-2099.





[GitHub] flink issue #2249: [FLINK-4166] [CLI] Generate different namespaces for Zook...

2016-07-19 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2249
  
Thanks for the update. Could you rebase to the latest master?




[jira] [Commented] (FLINK-4166) Generate automatic different namespaces in Zookeeper for Flink applications

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383888#comment-15383888
 ] 

ASF GitHub Bot commented on FLINK-4166:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2249
  
Thanks for the update. Could you rebase to the latest master?


> Generate automatic different namespaces in Zookeeper for Flink applications
> ---
>
> Key: FLINK-4166
> URL: https://issues.apache.org/jira/browse/FLINK-4166
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.3
>Reporter: Stefan Richter
>
> We should automatically generate different namespaces per Flink application 
> in Zookeeper to avoid interference between different applications that refer 
> to the same Zookeeper entries.





[GitHub] flink pull request #2249: [FLINK-4166] [CLI] Generate different namespaces f...

2016-07-19 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/2249#discussion_r71309594
  
--- Diff: docs/setup/config.md ---
@@ -272,7 +272,9 @@ For example when running Flink on YARN on an 
environment with a restrictive fire
 
 - `recovery.zookeeper.quorum`: Defines the ZooKeeper quorum URL which is 
used to connect to the ZooKeeper cluster when the 'zookeeper' recovery mode is 
selected
 
-- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create znodes.
+- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create namespace directories.
+
+- `recovery.zookeeper.path.namespace`: (Default '/default_ns' in 
standalone mode, or the <application id> under Yarn) Defines the 
subdirectory under the root dir where the ZooKeeper recovery mode will create 
znodes. This allows isolating multiple applications on the same ZooKeeper. 
--- End diff --

I think the main problem with default namespaces in standalone mode is that 
there is no authority that reliably generates unique identifiers we could use to 
label clusters (in contrast to, e.g., application ids in Yarn). Also, the contents 
of the masters file could be the same for two different clusters running on the 
same machines, so collisions could happen. In particular, such collisions could 
be very rare and hence even more surprising to the user. Furthermore, I think 
that wanting multiple clusters in this way is exactly the case where you want to 
use Yarn or Mesos.
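To make the collision argument concrete, here is a small sketch (the heuristic `namespaceFor` is hypothetical, not proposed Flink code): any namespace derived purely from the masters file yields identical znode paths for two distinct clusters started with identical configuration.

```java
import java.util.Arrays;
import java.util.List;

public class NamespaceCollision {
    // Hypothetical heuristic: derive a ZooKeeper namespace from the
    // masters file contents. Deterministic input -> deterministic output.
    public static String namespaceFor(List<String> masters) {
        return "/flink/ns-" + Integer.toHexString(String.join(",", masters).hashCode());
    }

    public static void main(String[] args) {
        // Two independent clusters started with identical masters files
        // (e.g. two processes on the same hosts, or Docker containers).
        List<String> clusterA = Arrays.asList("host1:8081", "host2:8081");
        List<String> clusterB = Arrays.asList("host1:8081", "host2:8081");

        // Both clusters end up under the same znode path and interfere.
        System.out.println(namespaceFor(clusterA));
        System.out.println(namespaceFor(clusterB).equals(namespaceFor(clusterA)));
    }
}
```

Under YARN this problem disappears because the resource manager already issues a unique application id that can serve as the namespace.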




[jira] [Commented] (FLINK-4166) Generate automatic different namespaces in Zookeeper for Flink applications

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383892#comment-15383892
 ] 

ASF GitHub Bot commented on FLINK-4166:
---

Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/2249#discussion_r71309594
  
--- Diff: docs/setup/config.md ---
@@ -272,7 +272,9 @@ For example when running Flink on YARN on an 
environment with a restrictive fire
 
 - `recovery.zookeeper.quorum`: Defines the ZooKeeper quorum URL which is 
used to connect to the ZooKeeper cluster when the 'zookeeper' recovery mode is 
selected
 
-- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create znodes.
+- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create namespace directories.
+
+- `recovery.zookeeper.path.namespace`: (Default '/default_ns' in 
standalone mode, or the <application id> under Yarn) Defines the 
subdirectory under the root dir where the ZooKeeper recovery mode will create 
znodes. This allows isolating multiple applications on the same ZooKeeper. 
--- End diff --

I think the main problem with default namespaces in standalone mode is that 
there is no authority that reliably generates unique identifiers we could use to 
label clusters (in contrast to, e.g., application ids in Yarn). Also, the contents 
of the masters file could be the same for two different clusters running on the 
same machines, so collisions could happen. In particular, such collisions could 
be very rare and hence even more surprising to the user. Furthermore, I think 
that wanting multiple clusters in this way is exactly the case where you want to 
use Yarn or Mesos.


> Generate automatic different namespaces in Zookeeper for Flink applications
> ---
>
> Key: FLINK-4166
> URL: https://issues.apache.org/jira/browse/FLINK-4166
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.3
>Reporter: Stefan Richter
>
> We should automatically generate different namespaces per Flink application 
> in Zookeeper to avoid interference between different applications that refer 
> to the same Zookeeper entries.





[GitHub] flink pull request #2249: [FLINK-4166] [CLI] Generate different namespaces f...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2249#discussion_r71311352
  
--- Diff: docs/setup/config.md ---
@@ -272,7 +272,9 @@ For example when running Flink on YARN on an 
environment with a restrictive fire
 
 - `recovery.zookeeper.quorum`: Defines the ZooKeeper quorum URL which is 
used to connect to the ZooKeeper cluster when the 'zookeeper' recovery mode is 
selected
 
-- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create znodes.
+- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create namespace directories.
+
+- `recovery.zookeeper.path.namespace`: (Default '/default_ns' in 
standalone mode, or the <application id> under Yarn) Defines the 
subdirectory under the root dir where the ZooKeeper recovery mode will create 
znodes. This allows isolating multiple applications on the same ZooKeeper. 
--- End diff --

You're right, it would just be a heuristic for generating a unique 
namespace, but it is not actually unique, because multiple Flink instances might 
be running on the same host names (e.g. multiple processes, or Docker 
containers). So +1 for not adding something that can cause hard-to-debug 
problems. We have covered the Yarn use case, which makes it very convenient to 
set up a namespace.




[jira] [Commented] (FLINK-4166) Generate automatic different namespaces in Zookeeper for Flink applications

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383903#comment-15383903
 ] 

ASF GitHub Bot commented on FLINK-4166:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2249#discussion_r71311352
  
--- Diff: docs/setup/config.md ---
@@ -272,7 +272,9 @@ For example when running Flink on YARN on an 
environment with a restrictive fire
 
 - `recovery.zookeeper.quorum`: Defines the ZooKeeper quorum URL which is 
used to connect to the ZooKeeper cluster when the 'zookeeper' recovery mode is 
selected
 
-- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create znodes.
+- `recovery.zookeeper.path.root`: (Default '/flink') Defines the root dir 
under which the ZooKeeper recovery mode will create namespace directories.
+
+- `recovery.zookeeper.path.namespace`: (Default '/default_ns' in 
standalone mode, or the <application id> under Yarn) Defines the 
subdirectory under the root dir where the ZooKeeper recovery mode will create 
znodes. This allows isolating multiple applications on the same ZooKeeper. 
--- End diff --

You're right, it would just be a heuristic for generating a unique 
namespace, but it is not actually unique, because multiple Flink instances might 
be running on the same host names (e.g. multiple processes, or Docker 
containers). So +1 for not adding something that can cause hard-to-debug 
problems. We have covered the Yarn use case, which makes it very convenient to 
set up a namespace.


> Generate automatic different namespaces in Zookeeper for Flink applications
> ---
>
> Key: FLINK-4166
> URL: https://issues.apache.org/jira/browse/FLINK-4166
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.3
>Reporter: Stefan Richter
>
> We should automatically generate different namespaces per Flink application 
> in Zookeeper to avoid interference between different applications that refer 
> to the same Zookeeper entries.





[GitHub] flink pull request #2268: [FLINK-4232] Make sure ./bin/flink returns correct...

2016-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2268




[jira] [Resolved] (FLINK-4232) Flink executable does not return correct pid

2016-07-19 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek resolved FLINK-4232.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed in 
https://github.com/apache/flink/commit/de8406aab60ddd7ed251965fefb03290d511db13

> Flink executable does not return correct pid
> 
>
> Key: FLINK-4232
> URL: https://issues.apache.org/jira/browse/FLINK-4232
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: David Moravek
>Priority: Minor
> Fix For: 1.1.0
>
>
> E.g. when using supervisor, the pid returned by ./bin/flink is the pid of the 
> shell executable instead of the Java process.





[jira] [Commented] (FLINK-4232) Flink executable does not return correct pid

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383908#comment-15383908
 ] 

ASF GitHub Bot commented on FLINK-4232:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2268


> Flink executable does not return correct pid
> 
>
> Key: FLINK-4232
> URL: https://issues.apache.org/jira/browse/FLINK-4232
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: David Moravek
>Priority: Minor
> Fix For: 1.1.0
>
>
> E.g. when using supervisor, the pid returned by ./bin/flink is the pid of the 
> shell executable instead of the Java process.





[GitHub] flink issue #2249: [FLINK-4166] [CLI] Generate different namespaces for Zook...

2016-07-19 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/2249
  
Thanks for the review, Max! I changed the default namespace string and 
rebased onto the latest master.




[jira] [Commented] (FLINK-4166) Generate automatic different namespaces in Zookeeper for Flink applications

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384002#comment-15384002
 ] 

ASF GitHub Bot commented on FLINK-4166:
---

Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/2249
  
Thanks for the review, Max! I changed the default namespace string and 
rebased onto the latest master.


> Generate automatic different namespaces in Zookeeper for Flink applications
> ---
>
> Key: FLINK-4166
> URL: https://issues.apache.org/jira/browse/FLINK-4166
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.3
>Reporter: Stefan Richter
>
> We should automatically generate different namespaces per Flink application 
> in Zookeeper to avoid interference between different applications that refer 
> to the same Zookeeper entries.





[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71322624
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

It is true that it holds the registered resources but it does not hold the 
launched containers. When a `JobManager` loses its leadership the list of 
registered workers will be cleared. In order to reconstruct the mapping 
`ResourceID --> Container`, you need this new map.
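The two-map bookkeeping described above can be sketched as follows. This is an illustrative model only: the `ContainerBookkeeping` class, the `String` stand-ins for `ResourceID` and `Container`, and the method names are assumptions, not Flink's actual `YarnFlinkResourceManager` code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: containers move from the "in launch" map to the
// "launched" map once their TaskManager registers, so the
// ResourceID -> Container mapping survives a JobManager leadership change
// that clears the list of registered workers.
public class ContainerBookkeeping {
    final Map<String, String> containersInLaunch = new HashMap<>();
    final Map<String, String> containersLaunched = new HashMap<>();

    void containerStarted(String resourceId, String container) {
        containersInLaunch.put(resourceId, container);
    }

    void taskManagerRegistered(String resourceId) {
        // Promote the container from "in launch" to "launched".
        String container = containersInLaunch.remove(resourceId);
        if (container != null) {
            containersLaunched.put(resourceId, container);
        }
    }

    // After a leadership change, the launched map still yields the container
    // for a given ResourceID even though registered workers were cleared.
    String lookupContainer(String resourceId) {
        return containersLaunched.get(resourceId);
    }
}
```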




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384005#comment-15384005
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71322624
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

It is true that it holds the registered resources but it does not hold the 
launched containers. When a `JobManager` loses its leadership the list of 
registered workers will be cleared. In order to reconstruct the mapping 
`ResourceID --> Container`, you need this new map.


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.
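For context, a log flood like the one described is exactly what exponential backoff between registration attempts is supposed to prevent. A minimal sketch of such a scheme follows; the initial delay, the cap, and the method name are illustrative assumptions, not Flink's actual configuration or implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of exponential backoff for registration retries:
// each retry doubles the pause, capped at a maximum delay.
public class RegistrationBackoff {
    static List<Long> retryDelaysMillis(long initial, long max, int attempts) {
        List<Long> delays = new ArrayList<>();
        long delay = initial;
        for (int i = 0; i < attempts; i++) {
            delays.add(delay);
            delay = Math.min(delay * 2, max); // double, but never exceed max
        }
        return delays;
    }
}
```

With, say, a 500 ms initial delay and a 4 s cap, five attempts would be spaced 500, 1000, 2000, 4000, 4000 ms apart instead of firing thousands of times per minute.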





[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71323898
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

If I'm not mistaken then there is hardly any difference between a 
registered worker and a container in launch. So in the current implementation 
it shouldn't matter much whether a container is in state "being launched" or 
"launched". Thus, it does not make much of a difference whether this message 
arrives or not.

Given that the `JobManager` does not yet use the RM to allocate new 
resources, it might actually be a good idea to regard the RM as a tool to 
notify the JM about TM failures. Everything else can be added once we actually 
need it.




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384014#comment-15384014
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71323898
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

If I'm not mistaken then there is hardly any difference between a 
registered worker and a container in launch. So in the current implementation 
it shouldn't matter much whether a container is in state "being launched" or 
"launched". Thus, it does not make much of a difference whether this message 
arrives or not.

Given that the `JobManager` does not yet use the RM to allocate new 
resources, it might actually be a good idea to regard the RM as a tool to 
notify the JM about TM failures. Everything else can be added once we actually 
need it.


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.





[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71325133
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

I just don't understand why you remove this functionality. It was not 
broken in any way. Of course, we can always add features later (that is true 
for any component) but it changes the original RM design. If we want to add 
monitoring of the pool size later on, we will have to re-add the proper 
registration at the RM.




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384026#comment-15384026
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71325133
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

I just don't understand why you remove this functionality. It was not 
broken in any way. Of course, we can always add features later (that is true 
for any component) but it changes the original RM design. If we want to add 
monitoring of the pool size later on, we will have to re-add the proper 
registration at the RM.


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.





[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384029#comment-15384029
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71325200
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

But it might be necessary to add the number of launched containers to the 
number of containers in launch in the 
`YarnFlinkResourceManager#getNumWorkersPendingRegistration`. Then the 
`checkWorkersPool` should behave correctly.
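The adjustment suggested above can be sketched as follows. All names and the arithmetic are a hedged model of the comment, not Flink's actual `YarnFlinkResourceManager` code: if launched containers are not counted alongside containers in launch, the pool check would request surplus containers.

```java
// Hypothetical model of the worker-pool check: workers that are launching
// or already launched are accounted for and must not be re-requested.
public class WorkerPoolCheck {
    final int designatedPoolSize;
    final int containersInLaunch;
    final int containersLaunched;
    final int pendingContainerRequests;

    WorkerPoolCheck(int designated, int inLaunch, int launched, int pending) {
        this.designatedPoolSize = designated;
        this.containersInLaunch = inLaunch;
        this.containersLaunched = launched;
        this.pendingContainerRequests = pending;
    }

    // Per the comment: count launched containers in addition to those
    // still in launch.
    int getNumWorkersPendingRegistration() {
        return containersInLaunch + containersLaunched;
    }

    // Number of new containers the pool check would request.
    int missingWorkers() {
        return Math.max(0, designatedPoolSize
            - getNumWorkersPendingRegistration()
            - pendingContainerRequests);
    }
}
```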


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.





[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384030#comment-15384030
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71325247
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

What would be the drawback of not cleaning this list?


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.





[GitHub] flink issue #2265: [FLINK-3097] [table] Add support for custom functions in ...

2016-07-19 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2265
  
I was also thinking a lot about the names, because we currently have many 
`Function`s in Flink. I chose `UserDefinedFunction` as the top-level function 
for all user-defined functions such as `ScalarFunction`, `TableFunction`, 
`AggregateFunction`, or whatever will come in the future.

If you have a look at the tests, you will see that the Java API supports 
both postfix and infix notation. So you can also call functions as 
`hashCode(text)` if you like.
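The hierarchy described above can be sketched in plain Java. Everything here is illustrative, not Flink's actual Table API classes: the method bodies, the `String`-based return type, and `HashCodeFunction` are assumptions used to show why an abstract base class, whose type-extraction methods can optionally be overridden, was preferred over an interface.

```java
// Hypothetical model of the user-defined function hierarchy.
abstract class UserDefinedFunction {
    // Default: derive the return type automatically; subclasses may
    // override this when reflection-based extraction is not sufficient.
    public String getReturnType() {
        return "AUTO";
    }
}

class ScalarFunction extends UserDefinedFunction {
    // Users implement an eval method in a subclass; only a couple of
    // methods are meant to be overridden.
}

class HashCodeFunction extends ScalarFunction {
    public int eval(String s) {
        return s.hashCode();
    }

    @Override
    public String getReturnType() {
        return "INT"; // explicit return type instead of automatic extraction
    }
}
```

An interface could only declare `eval`-style contracts; the abstract class lets `getReturnType` carry a sensible default while still being overridable.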




[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71325247
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

What would be the drawback of not cleaning this list?




[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71325200
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

But it might be necessary to add the number of launched containers to the 
number of containers in launch in the 
`YarnFlinkResourceManager#getNumWorkersPendingRegistration`. Then the 
`checkWorkersPool` should behave correctly.




[jira] [Commented] (FLINK-3097) Add support for custom functions in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384031#comment-15384031
 ] 

ASF GitHub Bot commented on FLINK-3097:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2265
  
I was also thinking a lot about the names, because we currently have many 
`Function`s in Flink. I chose `UserDefinedFunction` as the top-level function 
for all user-defined functions such as `ScalarFunction`, `TableFunction`, 
`AggregateFunction`, or whatever will come in the future.

If you have a look at the tests, you will see that the Java API supports 
both postfix and infix notation. So you can also call functions as 
`hashCode(text)` if you like.


> Add support for custom functions in Table API
> -
>
> Key: FLINK-3097
> URL: https://issues.apache.org/jira/browse/FLINK-3097
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently, the Table API has a very limited set of built-in functions. 
> Support for custom functions can solve this problem. Adding of a custom row 
> function could look like:
> {code}
> TableEnvironment tableEnv = new TableEnvironment();
> RowFunction rf = new RowFunction() {
> @Override
> public String call(Object[] args) {
> return ((String) args[0]).trim();
> }
> };
> tableEnv.getConfig().registerRowFunction("TRIM", rf,
> BasicTypeInfo.STRING_TYPE_INFO);
> DataSource> input = env.fromElements(
> new Tuple1<>(" 1 "));
> Table table = tableEnv.fromDataSet(input);
> Table result = table.select("TRIM(f0)");
> {code}
> This feature is also necessary as part of FLINK-2099.





[jira] [Commented] (FLINK-4166) Generate automatic different namespaces in Zookeeper for Flink applications

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384035#comment-15384035
 ] 

ASF GitHub Bot commented on FLINK-4166:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2249
  
Thank you! Merging after tests pass.


> Generate automatic different namespaces in Zookeeper for Flink applications
> ---
>
> Key: FLINK-4166
> URL: https://issues.apache.org/jira/browse/FLINK-4166
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.3
>Reporter: Stefan Richter
>
> We should automatically generate different namespaces per Flink application 
> in Zookeeper to avoid interference between different applications that refer 
> to the same Zookeeper entries.





[GitHub] flink issue #2249: [FLINK-4166] [CLI] Generate different namespaces for Zook...

2016-07-19 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2249
  
Thank you! Merging after tests pass.




[GitHub] flink issue #2078: [FLINK-2985] Allow different field names for unionAll() i...

2016-07-19 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2078
  
Thanks for the contribution @gallenvara. I will merge it now...




[jira] [Commented] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384040#comment-15384040
 ] 

ASF GitHub Bot commented on FLINK-2985:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2078
  
Thanks for the contribution @gallenvara. I will merge it now...


> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.
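The relaxed check described above can be modeled in a few lines of plain Java. The class and method names are hypothetical, not Flink's implementation: only the field types of the two sides must match, and the result takes its field names from the left side, mirroring SQL's UNION ALL.

```java
import java.util.List;

// Hypothetical model of unionAll name resolution: validate types only,
// keep the left side's field names.
public class UnionAllNaming {
    static List<String> unionAllFieldNames(
            List<String> leftNames, List<Class<?>> leftTypes,
            List<String> rightNames, List<Class<?>> rightTypes) {
        if (!leftTypes.equals(rightTypes)) {
            throw new IllegalArgumentException("field types must match");
        }
        // Names may differ between the two sides; the left side wins.
        return leftNames;
    }
}
```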





[GitHub] flink pull request #2078: [FLINK-2985] Allow different field names for union...

2016-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2078




[jira] [Commented] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384046#comment-15384046
 ] 

ASF GitHub Bot commented on FLINK-2985:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2078


> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.





[jira] [Resolved] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-07-19 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-2985.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed in 5758c91999efcf45457e4c25d4a75b13ce13e486.

> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Priority: Minor
> Fix For: 1.1.0
>
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.





[GitHub] flink issue #2257: [FLINK-4152] Allow re-registration of TMs at resource man...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2257
  
I think by imposing this contract we actually introduced problems we didn't 
have before. Thus, in order to remedy these problems for the upcoming release 
and go back a bit in the direction of the pre-RM age, I loosened the contract 
of the RM so that it can no longer reject TM registrations. This makes sense in 
my opinion, since we don't have a means to shut down orphaned TMs anyway.

So in this version, the RM's task is to ensure that at least a predefined 
set of resources is allocated and to notify the JM about a TM death (not 
strictly mandatory).

What we could actually do is to also register orphaned TMs (or ones 
registered by a different RM). Then we wouldn't have the problem that we 
allocate too many resources for a JM.




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384069#comment-15384069
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2257
  
I think by imposing this contract we actually introduced problems we didn't 
have before. Thus, in order to remedy these problems for the upcoming release 
and go back a bit in the direction of the pre-RM age, I loosened the contract 
of the RM so that it can no longer reject TM registrations. This makes sense in 
my opinion, since we don't have a means to shut down orphaned TMs anyway.

So in this version, the RM's task is to ensure that at least a predefined 
set of resources is allocated and to notify the JM about a TM death (not 
strictly mandatory).

What we could actually do is to also register orphaned TMs (or ones 
registered by a different RM). Then we wouldn't have the problem that we 
allocate too many resources for a JM.


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.





[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71327721
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

Because it added complexity that is no longer needed. The current design 
still allows us to change it later on, so I don't see a problem in removing 
it. The fact that we might need it in the future is, imho, not a good reason to 
keep unused code around.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384077#comment-15384077
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71327721
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
 ---
@@ -405,36 +374,13 @@ class JobManager(
 
   currentResourceManager match {
 case Some(rm) =>
-  val future = (rm ? decorateMessage(new 
RegisterResource(taskManager, msg)))(timeout)
-  future.onComplete {
-case scala.util.Success(response) =>
-  // the resource manager is available and answered
-  self ! response
-case scala.util.Failure(t) =>
-  t match {
-case _: TimeoutException =>
-  log.info("Attempt to register resource at 
ResourceManager timed out. Retrying")
-case _ =>
-  log.warn("Failure while asking ResourceManager for 
RegisterResource. Retrying", t)
-  }
-  // slow or unreachable resource manager, register anyway and 
let the rm reconnect
-  self ! decorateMessage(new 
RegisterResourceSuccessful(taskManager, msg))
-  self ! decorateMessage(new ReconnectResourceManager(rm))
-  }(context.dispatcher)
-
+  log.info(s"Register task manager $resourceId at the resource 
manager.")
+  rm ! decorateMessage(new RegisterResource(msg))
--- End diff --

Because it added complexity that is no longer needed. The current design 
still allows us to change it later on, so I don't see a problem in removing 
it. The fact that we might need it in the future is, imho, not a good reason to 
keep unused code around.




[jira] [Commented] (FLINK-2491) Operators are not participating in state checkpointing in some cases

2016-07-19 Thread Valentin Denisenkov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384081#comment-15384081
 ] 

Valentin Denisenkov commented on FLINK-2491:


Can you please give an ETA for this issue?

> Operators are not participating in state checkpointing in some cases
> 
>
> Key: FLINK-2491
> URL: https://issues.apache.org/jira/browse/FLINK-2491
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>Assignee: Márton Balassi
>Priority: Critical
> Fix For: 1.0.0
>
>
> While implementing a test case for the Kafka Consumer, I came across the 
> following bug:
> Consider the following topology, with the operator parallelism in parentheses:
> Source (2) --> Sink (1).
> In this setup, the {{snapshotState()}} method is called on the source, but 
> not on the Sink.
> The sink receives the generated data.
> Only one of the two sources is generating data.
> I've implemented a test case for this, you can find it here: 
> https://github.com/rmetzger/flink/blob/para_checkpoint_bug/flink-tests/src/test/java/org/apache/flink/test/checkpointing/ParallelismChangeCheckpoinedITCase.java





[GitHub] flink issue #2257: [FLINK-4152] Allow re-registration of TMs at resource man...

2016-07-19 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2257
  
I don't feel particularly great about breaking the contract between 
JobManager and ResourceManager, but it is a path we can take until we expand the 
ResourceManager's capabilities. The figures here would have to be updated as 
well: https://issues.apache.org/jira/browse/FLINK-3543

Overall, the changes look good to me. I don't think we can reach consensus 
on the registration matter. As of now, I don't want to block release fixes. So 
+1 to merge if you feel like it.




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384094#comment-15384094
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2257
  
I don't feel particularly great about breaking the contract between 
JobManager and ResourceManager, but it is a path we can take until we expand the 
ResourceManager's capabilities. The figures here would have to be updated as 
well: https://issues.apache.org/jira/browse/FLINK-3543

Overall, the changes look good to me. I don't think we can reach consensus 
on the registration matter. As of now, I don't want to block release fixes. So 
+1 to merge if you feel like it.




[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71331665
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

You tell me, since you've authored this component. I guess there was a 
reason why you clear the list of registered resources when a JM loses its 
leadership.




[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71331743
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

I basically introduced this map to not break the semantic contract of the 
`FlinkResourceManager` component.




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384104#comment-15384104
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71331743
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

I basically introduced this map to not break the semantic contract of the 
`FlinkResourceManager` component.




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384103#comment-15384103
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71331665
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

You tell me, since you've authored this component. I guess there was a 
reason why you clear the list of registered resources when a JM loses its 
leadership.




[jira] [Created] (FLINK-4233) Simplify leader election / leader session ID assignment

2016-07-19 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4233:
---

 Summary: Simplify leader election / leader session ID assignment
 Key: FLINK-4233
 URL: https://issues.apache.org/jira/browse/FLINK-4233
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.0.3
Reporter: Stephan Ewen


Currently, there are two separate actions and znodes involved in leader 
election and communication of the leader session ID and leader URL.

This leads to some quite elaborate code that tries to make sure that the leader 
session ID and leader URL always eventually converge to those of the leader.

It is simpler to just encode both the ID and the URL into an id-string that is 
attached to the leader latch znode. One would have to create a new leader latch 
each time a contender re-applies for leadership.
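A minimal sketch of the encoding the issue proposes — packing both the leader session ID and the leader URL into the single id-string attached to the leader latch znode. The space-separated format (which relies on akka URLs containing no spaces) is an assumption for illustration, not Flink's actual implementation:

```java
import java.util.UUID;

/** Illustrative codec for a combined "leader session ID + leader URL" latch-node string. */
public class LeaderInfoCodec {

    // Assumption: a single space separates the two fields; akka URLs contain no spaces.
    private static final char SEPARATOR = ' ';

    static String encode(UUID leaderSessionId, String leaderUrl) {
        return leaderSessionId + " " + leaderUrl;
    }

    static UUID decodeSessionId(String encoded) {
        return UUID.fromString(encoded.substring(0, encoded.indexOf(SEPARATOR)));
    }

    static String decodeLeaderUrl(String encoded) {
        return encoded.substring(encoded.indexOf(SEPARATOR) + 1);
    }
}
```

With this scheme, reading the latch node yields both values atomically, so the ID and URL can no longer diverge across two znodes.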





[GitHub] flink issue #2249: [FLINK-4166] [CLI] Generate different namespaces for Zook...

2016-07-19 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2249
  
Travis cache seems to be corrupted. Have you run `mvn verify` from the 
command-line?




[jira] [Commented] (FLINK-4166) Generate automatic different namespaces in Zookeeper for Flink applications

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384105#comment-15384105
 ] 

ASF GitHub Bot commented on FLINK-4166:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2249
  
Travis cache seems to be corrupted. Have you run `mvn verify` from the 
command-line?


> Generate automatic different namespaces in Zookeeper for Flink applications
> ---
>
> Key: FLINK-4166
> URL: https://issues.apache.org/jira/browse/FLINK-4166
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.3
>Reporter: Stefan Richter
>
> We should automatically generate different namespaces per Flink application 
> in Zookeeper to avoid interference between different applications that refer 
> to the same Zookeeper entries.





[GitHub] flink issue #2182: [Flink-4130] CallGenerator could generate illegal code wh...

2016-07-19 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2182
  
Thanks for the contribution @unsleepy22.
I changed the code a little bit, so that we can prevent an `if(false)` 
branch.

Will merge...




[GitHub] flink pull request #2182: [Flink-4130] CallGenerator could generate illegal ...

2016-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2182




[GitHub] flink pull request #2149: [FLINK-4084] Add configDir parameter to CliFronten...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71332501
  
--- Diff: docs/apis/cli.md ---
@@ -187,6 +187,8 @@ Action "run" compiles and runs a program.
   java.net.URLClassLoader}.
  -d,--detachedIf present, runs the job in 
detached
   mode
+--configDir The configuration directory with 
which
--- End diff --

I think there is a tab, we use only spaces for formatting here.




[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384109#comment-15384109
 ] 

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71332501
  
--- Diff: docs/apis/cli.md ---
@@ -187,6 +187,8 @@ Action "run" compiles and runs a program.
   java.net.URLClassLoader}.
  -d,--detachedIf present, runs the job in 
detached
   mode
+--configDir The configuration directory with 
which
--- End diff --

I think there is a tab, we use only spaces for formatting here.


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> experience, I would propose introducing a {{--configDir}} parameter which the 
> user can use to specify a configuration directory more easily.
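The proposed lookup order — an explicit `--configDir` flag taking precedence over the `FLINK_CONF_DIR` environment variable — could look roughly like this (a hypothetical helper, not the actual CliFrontend code):

```java
import java.util.Map;

public class ConfigDirResolver {

    /** Resolve the configuration directory: the --configDir flag wins, FLINK_CONF_DIR is the fallback. */
    static String resolve(String configDirFlag, Map<String, String> env) {
        if (configDirFlag != null && !configDirFlag.isEmpty()) {
            return configDirFlag;
        }
        return env.get("FLINK_CONF_DIR");
    }
}
```

Keeping the environment variable as a fallback preserves backwards compatibility for existing scripts.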





[GitHub] flink issue #2265: [FLINK-3097] [table] Add support for custom functions in ...

2016-07-19 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2265
  
Ah ok, that's perfect! (about infix and postfix)




[jira] [Commented] (FLINK-3097) Add support for custom functions in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384111#comment-15384111
 ] 

ASF GitHub Bot commented on FLINK-3097:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2265
  
Ah ok, that's perfect! (about infix and postfix)


> Add support for custom functions in Table API
> -
>
> Key: FLINK-3097
> URL: https://issues.apache.org/jira/browse/FLINK-3097
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Currently, the Table API has a very limited set of built-in functions. 
> Support for custom functions can solve this problem. Adding a custom row 
> function could look like:
> {code}
> TableEnvironment tableEnv = new TableEnvironment();
> RowFunction<String> rf = new RowFunction<String>() {
>     @Override
>     public String call(Object[] args) {
>         return ((String) args[0]).trim();
>     }
> };
> tableEnv.getConfig().registerRowFunction("TRIM", rf,
>     BasicTypeInfo.STRING_TYPE_INFO);
> DataSource<Tuple1<String>> input = env.fromElements(
>     new Tuple1<>(" 1 "));
> Table table = tableEnv.fromDataSet(input);
> Table result = table.select("TRIM(f0)");
> {code}
> This feature is also necessary as part of FLINK-2099.





[jira] [Commented] (FLINK-2491) Operators are not participating in state checkpointing in some cases

2016-07-19 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384113#comment-15384113
 ] 

Aljoscha Krettek commented on FLINK-2491:
-

I think we want to tackle this for the 1.2 release, which should happen roughly 
3 months after 1.1 is out. We're currently in the last stages of releasing 1.1.



[jira] [Resolved] (FLINK-4130) CallGenerator could generate illegal code when taking no operands

2016-07-19 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-4130.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed in dd53831aa00e8160e8db14d6de186ce8f1f82b92.

> CallGenerator could generate illegal code when taking no operands
> -
>
> Key: FLINK-4130
> URL: https://issues.apache.org/jira/browse/FLINK-4130
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.1.0
>Reporter: Cody
>Priority: Minor
> Fix For: 1.1.0
>
>
> In CallGenerator, when a call takes no operands, and null check is enabled, 
> it will generate code like:
> boolean isNull$17 = ;
> which will fail to compile at runtime.
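The fix can be illustrated with a toy generator: when the operand list is empty there is no null term to emit, so the generator should fall back to a constant rather than produce an empty right-hand side. This is a simplified sketch, not the actual CallGenerator code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Simplified null-check code generation for a call expression. */
public class NullCheckCodegen {

    static String genIsNullTerm(List<String> operandNullTerms) {
        // With no operands there is nothing that can be null, so emit a constant
        // instead of an empty right-hand side like "boolean isNull$17 = ;".
        if (operandNullTerms.isEmpty()) {
            return "false";
        }
        return String.join(" || ", operandNullTerms);
    }

    public static void main(String[] args) {
        System.out.println("boolean isNull$17 = "
                + genIsNullTerm(Collections.emptyList()) + ";");
        // -> boolean isNull$17 = false;
        System.out.println("boolean isNull$17 = "
                + genIsNullTerm(Arrays.asList("isNull$3", "isNull$5")) + ";");
        // -> boolean isNull$17 = isNull$3 || isNull$5;
    }
}
```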





[jira] [Commented] (FLINK-3097) Add support for custom functions in Table API

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384115#comment-15384115
 ] 

ASF GitHub Bot commented on FLINK-3097:
---

Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/2265
  
Yes, I see. That's great!




[GitHub] flink issue #2265: [FLINK-3097] [table] Add support for custom functions in ...

2016-07-19 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/2265
  
Yes, I see. That's great!




[GitHub] flink issue #2257: [FLINK-4152] Allow re-registration of TMs at resource man...

2016-07-19 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2257
  
My main concern is actually that the JM-RM interaction is not well tested. 
Thus, I fear that the more complex the code is, the more possibilities for 
mistakes there are. For example, I couldn't find a test that covers the 
`ReconnectResourceManager` functionality. Given that we're about to release 1.1 
shortly, I would go for the simplest approach. 




[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384133#comment-15384133
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2257
  
My main concern is actually that the JM-RM interaction is not well tested. 
Thus, I fear that the more complex the code is, the more possibilities for 
mistakes there are. For example, I couldn't find a test that covers the 
`ReconnectResourceManager` functionality. Given that we're about to release 1.1 
shortly, I would go for the simplest approach. 




[GitHub] flink pull request #2269: [FLINK-4190] Generalise RollingSink to work with a...

2016-07-19 Thread joshfg
GitHub user joshfg opened a pull request:

https://github.com/apache/flink/pull/2269

[FLINK-4190] Generalise RollingSink to work with arbitrary buckets

I've created a new bucketing package with a BucketingSink, which improves 
on the existing RollingSink by enabling arbitrary bucketing, rather than just 
rolling files based on system time.

The main changes to support this are:

- The Bucketer interface now takes the sink's input element as a generic 
parameter, enabling us to bucket based on attributes of the sink's input.
- While maintaining the same rolling mechanics of the existing 
implementation (e.g. rolling when the file size reaches a threshold), the sink 
implementation can now have many 'active' buckets at any point in time. The 
checkpointing mechanics have been extended to support maintaining the state of 
multiple active buckets and files, instead of just one.
- For use cases where the buckets being written to are changing over time, 
the sink now needs to determine when a bucket has become 'inactive', in order 
to flush and close the file. In the existing implementation, this is simply 
when the bucket path changes. Instead, we now determine a bucket as inactive if 
it hasn't been written to recently. To support this there are two additional 
user configurable settings: inactiveBucketCheckInterval and 
inactiveBucketThreshold.
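The inactivity check described above can be sketched like this. The setting name `inactiveBucketThreshold` follows the PR description; the bookkeeping itself is illustrative, not the actual BucketingSink code:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Tracks per-bucket last-write times and closes buckets that have gone quiet. */
public class InactiveBucketTracker {

    private final long inactiveBucketThreshold;
    // bucket path -> timestamp of the last write into that bucket
    private final Map<String, Long> lastWriteTime = new HashMap<>();

    InactiveBucketTracker(long inactiveBucketThreshold) {
        this.inactiveBucketThreshold = inactiveBucketThreshold;
    }

    void onWrite(String bucketPath, long now) {
        lastWriteTime.put(bucketPath, now);
    }

    /** Called every inactiveBucketCheckInterval: returns how many idle buckets were closed. */
    int closeInactiveBuckets(long now) {
        int closed = 0;
        Iterator<Map.Entry<String, Long>> it = lastWriteTime.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() >= inactiveBucketThreshold) {
                it.remove(); // here the real sink would flush and close the part file
                closed++;
            }
        }
        return closed;
    }
}
```

This is what lets many buckets stay open concurrently: a bucket is closed by elapsed idle time rather than by the next element pointing at a different path.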

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/joshfg/flink flink-4190

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2269.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2269


commit 2011e47de6c8b3c087772c84b4b3e44210dbe50c
Author: Josh 
Date:   2016-07-12T17:38:54Z

[FLINK-4190] Generalise RollingSink to work with arbitrary buckets






[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-07-19 Thread Josh Forman-Gornall (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384206#comment-15384206
 ] 

Josh Forman-Gornall commented on FLINK-4190:


Ok cool, thanks! I've just submitted a pull request. 

> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.
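The bucketing strategy described above can be sketched as a pluggable interface. This is a hypothetical illustration only — the `Bucketer` name, its signature, and the `Event`/`CountryBucketer` types are assumptions for the sketch, not the API of the actual pull request — showing bucketing by an element attribute rather than by system time, with several buckets open at once:

```java
import java.util.HashMap;
import java.util.Map;

public class AttributeBucketingSketch {

    /** Hypothetical strategy interface: maps an element to a bucket path. */
    interface Bucketer<T> {
        String getBucketPath(String basePath, T element);
    }

    /** Example event carrying its own bucket key. */
    static class Event {
        final String country;
        final long value;
        Event(String country, long value) { this.country = country; this.value = value; }
    }

    /** Buckets by an attribute of the element, not by the current system time. */
    static class CountryBucketer implements Bucketer<Event> {
        public String getBucketPath(String basePath, Event e) {
            return basePath + "/" + e.country;
        }
    }

    /** Keeps several buckets "open" at once -- one accumulated state per bucket path. */
    static Map<String, Long> assignAll(Iterable<Event> events, Bucketer<Event> bucketer) {
        Map<String, Long> openBuckets = new HashMap<>();
        for (Event e : events) {
            openBuckets.merge(bucketer.getBucketPath("/data", e), e.value, Long::sum);
        }
        return openBuckets;
    }
}
```

Under this model, closing a bucket can no longer be triggered by a change of the current path; an inactivity check (as in the `inactiveBucketCheckInterval`/`inactiveBucketThreshold` settings mentioned above) is needed instead.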



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2257: [FLINK-4152] Allow re-registration of TMs at resou...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71350758
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

>You tell me, since you've authored this component. 

I think there is a misunderstanding. I did author this component but it is 
not mine. I know you spent some time working on this and I just want to 
understand your motives. I actually thought you had tried this fix since you 
mentioned it in the JIRA.

I think clearing the list is simply a bug, an artifact of an old code 
design where the ResourceManager would be the central instance for TaskManager 
registration. 




[jira] [Commented] (FLINK-4230) Session Windowing IT Case

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384242#comment-15384242
 ] 

ASF GitHub Bot commented on FLINK-4230:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2263
  
Nice piece of code! I finally understood how it works... 😃 

Some remarks about the code: in some places there are method names that 
seem to stem from an initial implementation but don't match the current code 
anymore. For example, `SessionEventGeneratorDataSource.createTestStream()` 
returns a "generator" so it could be called `createGenerator()`.  Also, there 
are some unused methods (for example in `EventGeneratorFactory`) and methods 
with generated Javadoc that doesn't have any actual content. Could you please 
take another pass over the code to remove the unused methods and remove or fix 
the Javadoc? Some of the classes could also use a class-level Javadoc.

In `SessionEventGeneratorImpl`, the name `generateLateTimestamp()` might be 
a bit misleading. It just creates timestamps in the range of allowed 
timestamps. Both `InLatenessGenerator` and `AfterLatenessGenerator` use the 
method in the same way, just the behavior of `canGenerateEventAtWatermark()` 
determines whether the generated elements will be late or not. Here, a good 
comment on `canGenerateEventAtWatermark()` in the base interface might help. 
Also, it might make sense to make the testing source non-parallel. If we have 
parallelism 2 and one source regularly advances the watermark but the other 
source never advances the watermark, the elements that are generated as "late" 
by the first source are not considered late at the window operator because the 
watermark at the window operator cannot advance.
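The lateness and watermark semantics this remark relies on can be sketched in plain Java. This is not Flink's actual implementation — the method names are assumptions for illustration — but it captures the rules: an element is late once the watermark passes its window's end, it is dropped once the allowed lateness is also exceeded, and with parallel sources the operator watermark is the minimum over all inputs (which is why one stalled source keeps "late" elements from being late downstream):

```java
public class LatenessSketch {

    /** An element is "late" once the watermark has passed the end of its window. */
    static boolean isLate(long windowEnd, long watermark) {
        return watermark >= windowEnd;
    }

    /** A late element is still processed within the allowed lateness, dropped after it. */
    static boolean isDropped(long windowEnd, long allowedLateness, long watermark) {
        return watermark >= windowEnd + allowedLateness;
    }

    /** With parallel sources, the operator watermark is the minimum of all inputs. */
    static long operatorWatermark(long... sourceWatermarks) {
        long min = Long.MAX_VALUE;
        for (long w : sourceWatermarks) {
            min = Math.min(min, w);
        }
        return min;
    }
}
```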


> Session Windowing IT Case
> -
>
> Key: FLINK-4230
> URL: https://issues.apache.org/jira/browse/FLINK-4230
> Project: Flink
>  Issue Type: Test
>  Components: DataStream API, Local Runtime
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> An ITCase for Session Windows is missing that tests correct behavior under 
> several parallel sessions, with timely events, late events within and after 
> the lateness interval.





[GitHub] flink issue #2263: [FLINK-4230] [DataStreamAPI] Add Session Windowing ITCase

2016-07-19 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2263
  
Nice piece of code! I finally understood how it works... 😃 

Some remarks about the code: in some places there are method names that 
seem to stem from an initial implementation but don't match the current code 
anymore. For example, `SessionEventGeneratorDataSource.createTestStream()` 
returns a "generator" so it could be called `createGenerator()`.  Also, there 
are some unused methods (for example in `EventGeneratorFactory`) and methods 
with generated Javadoc that doesn't have any actual content. Could you please 
take another pass over the code to remove the unused methods and remove or fix 
the Javadoc? Some of the classes could also use a class-level Javadoc.

In `SessionEventGeneratorImpl`, the name `generateLateTimestamp()` might be 
a bit misleading. It just creates timestamps in the range of allowed 
timestamps. Both `InLatenessGenerator` and `AfterLatenessGenerator` use the 
method in the same way, just the behavior of `canGenerateEventAtWatermark()` 
determines whether the generated elements will be late or not. Here, a good 
comment on `canGenerateEventAtWatermark()` in the base interface might help. 
Also, it might make sense to make the testing source non-parallel. If we have 
parallelism 2 and one source regularly advances the watermark but the other 
source never advances the watermark, the elements that are generated as "late" 
by the first source are not considered late at the window operator because the 
watermark at the window operator cannot advance.




[jira] [Resolved] (FLINK-3792) RowTypeInfo equality should not depend on field names

2016-07-19 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-3792.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed as part of FLINK-2985.

> RowTypeInfo equality should not depend on field names
> -
>
> Key: FLINK-3792
> URL: https://issues.apache.org/jira/browse/FLINK-3792
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.1.0
>Reporter: Vasia Kalavri
> Fix For: 1.1.0
>
>
> Currently, two Rows with the same field types but different field names are 
> not considered equal by the Table API and SQL. This behavior might create 
> problems, e.g. it makes the following union query fail:
> {code}
> SELECT STREAM a, b, c FROM T1 UNION ALL 
> (SELECT STREAM d, e, f FROM T2 WHERE d < 3)
> {code}
> where a, b, c and d, e, f are fields of corresponding types.
> {code}
> Cannot union streams of different types: org.apache.flink.api.table.Row(a: 
> Integer, b: Long, c: String) and org.apache.flink.api.table.Row(d: Integer, 
> e: Long, f: String)
> {code}
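The fix described in the issue amounts to defining row-type equality over field types only. The sketch below is a minimal, self-contained illustration of that design choice — the `RowType` class is an assumption for the sketch, not Flink's `RowTypeInfo` — so two rows named `(a, b, c)` and `(d, e, f)` with identical field types compare equal and the union above becomes legal:

```java
import java.util.Arrays;

public class RowTypeEqualitySketch {

    /** Minimal row type: field names plus field types. */
    static class RowType {
        final String[] names;
        final Class<?>[] types;
        RowType(String[] names, Class<?>[] types) { this.names = names; this.types = types; }

        /** Equality over types only, so differently named but type-identical rows can be unioned. */
        @Override
        public boolean equals(Object o) {
            return o instanceof RowType && Arrays.equals(types, ((RowType) o).types);
        }

        /** hashCode must be consistent with equals, so it also ignores the names. */
        @Override
        public int hashCode() {
            return Arrays.hashCode(types);
        }
    }
}
```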





[jira] [Commented] (FLINK-4183) Move checking for StreamTableEnvironment into validation layer

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384245#comment-15384245
 ] 

ASF GitHub Bot commented on FLINK-4183:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2221
  
Merging...


> Move checking for StreamTableEnvironment into validation layer
> --
>
> Key: FLINK-4183
> URL: https://issues.apache.org/jira/browse/FLINK-4183
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Minor
>
> Some operators check the environment in `table.scala` instead of doing this 
> during the validation phase.





[GitHub] flink issue #2221: [FLINK-4183] [table] Move checking for StreamTableEnviron...

2016-07-19 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2221
  
Merging...




[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-07-19 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384250#comment-15384250
 ] 

Aljoscha Krettek commented on FLINK-4190:
-

Did you open the PR with the correct FLINK-4190 tag? Normally, it should show 
up here.

Could you post a link to the PR?

> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.





[GitHub] flink pull request #2221: [FLINK-4183] [table] Move checking for StreamTable...

2016-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2221




[jira] [Commented] (FLINK-4183) Move checking for StreamTableEnvironment into validation layer

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384251#comment-15384251
 ] 

ASF GitHub Bot commented on FLINK-4183:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2221


> Move checking for StreamTableEnvironment into validation layer
> --
>
> Key: FLINK-4183
> URL: https://issues.apache.org/jira/browse/FLINK-4183
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Minor
>
> Some operators check the environment in `table.scala` instead of doing this 
> during the validation phase.





[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384234#comment-15384234
 ] 

ASF GitHub Bot commented on FLINK-4152:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2257#discussion_r71350758
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting 
for it to register */
private final Map containersInLaunch;
 
+   /** The container where a TaskManager has been started and is running 
in */
+   private final Map containersLaunched;
--- End diff --

>You tell me, since you've authored this component. 

I think there is a misunderstanding. I did author this component but it is 
not mine. I know you spent some time working on this and I just want to 
understand your motives. I actually thought you had tried this fix since you 
mentioned it in the JIRA.

I think clearing the list is simply a bug, an artifact of an old code 
design where the ResourceManager would be the central instance for TaskManager 
registration. 


> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> Its logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.





[jira] [Resolved] (FLINK-4183) Move checking for StreamTableEnvironment into validation layer

2016-07-19 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-4183.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed in e85f787b280b63960e7f3add5aa8613b4ee23795.

> Move checking for StreamTableEnvironment into validation layer
> --
>
> Key: FLINK-4183
> URL: https://issues.apache.org/jira/browse/FLINK-4183
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Minor
> Fix For: 1.1.0
>
>
> Some operators check the environment in `table.scala` instead of doing this 
> during the validation phase.





[GitHub] flink issue #2263: [FLINK-4230] [DataStreamAPI] Add Session Windowing ITCase

2016-07-19 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/2263
  
Thanks a lot for the review. I agree on all your points and will address 
them.




[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384268#comment-15384268
 ] 

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71353752
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java 
---
@@ -451,4 +479,25 @@ public static InfoOptions parseInfoCommand(String[] 
args) throws CliArgsExceptio
}
}
 
+   public static MainOptions parseMainCommand(String[] args) throws 
CliArgsException {
+
+   // drop all arguments after an action
+   final List params= Arrays.asList(args);
--- End diff --

space missing


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> experience, I would propose to introduce a {{--configDir}} parameter which the 
> user can use to specify a configuration directory more easily.





[GitHub] flink pull request #2149: [FLINK-4084] Add configDir parameter to CliFronten...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71353752
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java 
---
@@ -451,4 +479,25 @@ public static InfoOptions parseInfoCommand(String[] 
args) throws CliArgsExceptio
}
}
 
+   public static MainOptions parseMainCommand(String[] args) throws 
CliArgsException {
+
+   // drop all arguments after an action
+   final List params= Arrays.asList(args);
--- End diff --

space missing




[jira] [Commented] (FLINK-4230) Session Windowing IT Case

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384274#comment-15384274
 ] 

ASF GitHub Bot commented on FLINK-4230:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2263
  
Oh, and I forgot: the check for the correct number of elements can be 
moved out of the window function and into the test itself, like this:

```
JobExecutionResult result = env.execute();

Assert.assertEquals(
(LATE_EVENTS_PER_SESSION + 1) * NUMBER_OF_SESSIONS * 
EVENTS_PER_SESSION,
result.getAccumulatorResult(SESSION_COUNTER_ON_TIME_KEY));
Assert.assertEquals(
NUMBER_OF_SESSIONS * (LATE_EVENTS_PER_SESSION * 
(LATE_EVENTS_PER_SESSION + 1) / 2),
result.getAccumulatorResult(SESSION_COUNTER_LATE_KEY));
```

Also, you can let the test class extend 
`StreamingMultipleProgramsTestBase`. This will set up a testing cluster with 
parallelism 4. You can then use this inside your test:
```
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment()
```

If you make the source non-parallel, the window operator will then run with 
parallelism 4, and counting the number of elements after the job is done will 
accumulate the counts from all parallel instances.


> Session Windowing IT Case
> -
>
> Key: FLINK-4230
> URL: https://issues.apache.org/jira/browse/FLINK-4230
> Project: Flink
>  Issue Type: Test
>  Components: DataStream API, Local Runtime
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> An ITCase for Session Windows is missing that tests correct behavior under 
> several parallel sessions, with timely events, late events within and after 
> the lateness interval.





[GitHub] flink issue #2263: [FLINK-4230] [DataStreamAPI] Add Session Windowing ITCase

2016-07-19 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2263
  
Oh, and I forgot: the check for the correct number of elements can be 
moved out of the window function and into the test itself, like this:

```
JobExecutionResult result = env.execute();

Assert.assertEquals(
(LATE_EVENTS_PER_SESSION + 1) * NUMBER_OF_SESSIONS * 
EVENTS_PER_SESSION,
result.getAccumulatorResult(SESSION_COUNTER_ON_TIME_KEY));
Assert.assertEquals(
NUMBER_OF_SESSIONS * (LATE_EVENTS_PER_SESSION * 
(LATE_EVENTS_PER_SESSION + 1) / 2),
result.getAccumulatorResult(SESSION_COUNTER_LATE_KEY));
```

Also, you can let the test class extend 
`StreamingMultipleProgramsTestBase`. This will set up a testing cluster with 
parallelism 4. You can then use this inside your test:
```
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment()
```

If you make the source non-parallel, the window operator will then run with 
parallelism 4, and counting the number of elements after the job is done will 
accumulate the counts from all parallel instances.




[GitHub] flink pull request #2149: [FLINK-4084] Add configDir parameter to CliFronten...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354207
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java 
---
@@ -451,4 +479,25 @@ public static InfoOptions parseInfoCommand(String[] 
args) throws CliArgsExceptio
}
}
 
+   public static MainOptions parseMainCommand(String[] args) throws 
CliArgsException {
+
+   // drop all arguments after an action
+   final List params= Arrays.asList(args);
+   for (String action: CliFrontend.ACTIONS) {
+   int index = params.indexOf(action);
+   if(index != -1) {
+   args = Arrays.copyOfRange(args, 0, index);
--- End diff --

I think dropping the args is not necessary if you use the parsing below.




[jira] [Commented] (FLINK-4230) Session Windowing IT Case

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384252#comment-15384252
 ] 

ASF GitHub Bot commented on FLINK-4230:
---

Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/2263
  
Thanks a lot for the review. I agree on all your points and will address 
them.


> Session Windowing IT Case
> -
>
> Key: FLINK-4230
> URL: https://issues.apache.org/jira/browse/FLINK-4230
> Project: Flink
>  Issue Type: Test
>  Components: DataStream API, Local Runtime
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> An ITCase for Session Windows is missing that tests correct behavior under 
> several parallel sessions, with timely events, late events within and after 
> the lateness interval.





[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384276#comment-15384276
 ] 

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354207
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java 
---
@@ -451,4 +479,25 @@ public static InfoOptions parseInfoCommand(String[] 
args) throws CliArgsExceptio
}
}
 
+   public static MainOptions parseMainCommand(String[] args) throws 
CliArgsException {
+
+   // drop all arguments after an action
+   final List params= Arrays.asList(args);
+   for (String action: CliFrontend.ACTIONS) {
+   int index = params.indexOf(action);
+   if(index != -1) {
+   args = Arrays.copyOfRange(args, 0, index);
--- End diff --

I think dropping the args is not necessary if you use the parsing below.


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> experience, I would propose to introduce a {{--configDir}} parameter which the 
> user can use to specify a configuration directory more easily.





[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384278#comment-15384278
 ] 

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354329
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java ---
@@ -148,4 +148,5 @@ public boolean getDetachedMode() {
public String getSavepointPath() {
return savepointPath;
}
+
--- End diff --

The changes in this file are not necessary. Could you revert?


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> experience, I would propose to introduce a {{--configDir}} parameter which the 
> user can use to specify a configuration directory more easily.





[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384279#comment-15384279
 ] 

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354413
  
--- Diff: 
flink-clients/src/test/java/org/apache/flink/client/CliFrontendMainTest.java ---
@@ -0,0 +1,39 @@
+package org.apache.flink.client;
+
+
+import org.apache.flink.client.cli.CliArgsException;
+import org.apache.flink.client.cli.CliFrontendParser;
+import org.apache.flink.client.cli.MainOptions;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.FileNotFoundException;
+import java.net.MalformedURLException;
+
+import static org.apache.flink.client.CliFrontendTestUtils.getTestJarPath;
+import static org.junit.Assert.assertEquals;
+
+public class CliFrontendMainTest {
--- End diff --

Maybe `CliFrontendMainArgsTest`?


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> exprience, I would propose to introduce a {{--configDir}} parameter which the 
> user can use to specify a configuration directory more easily.





[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384283#comment-15384283
 ] 

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354777
  
--- Diff: flink-dist/src/main/flink-bin/bin/flink ---
@@ -17,21 +17,39 @@
 # limitations under the License.
--- End diff --

You changed the file mode from 644 to 755.


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> exprience, I would propose to introduce a {{--configDir}} parameter which the 
> user can use to specify a configuration directory more easily.





[GitHub] flink pull request #2149: [FLINK-4084] Add configDir parameter to CliFronten...

2016-07-19 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354132
  
--- Diff: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java 
---
@@ -451,4 +479,25 @@ public static InfoOptions parseInfoCommand(String[] 
args) throws CliArgsExceptio
}
}
 
+	public static MainOptions parseMainCommand(String[] args) throws CliArgsException {
+
+		// drop all arguments after an action
+		final List<String> params = Arrays.asList(args);
+		for (String action : CliFrontend.ACTIONS) {
+			int index = params.indexOf(action);
+			if (index != -1) {
+				args = Arrays.copyOfRange(args, 0, index);
+				break;
+			}
+		}
+
+		try {
+			DefaultParser parser = new DefaultParser();
+			CommandLine line = parser.parse(MAIN_OPTIONS, args, false);
--- End diff --

Actually you can use `CommandLine line = parser.parse(MAIN_OPTIONS, args, true);` to halt once you reach a non-arg.
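The truncation loop in the quoted diff can be exercised in isolation. The sketch below is a hypothetical, stdlib-only stand-in (the `ACTIONS` array and class name are invented for illustration, not Flink code) that mirrors the loop's behaviour of dropping everything from the first recognised action onward:

```java
import java.util.Arrays;
import java.util.List;

public class MainArgsTruncation {
    // Hypothetical stand-in for CliFrontend.ACTIONS
    static final String[] ACTIONS = {"run", "list", "cancel"};

    // Mirrors the quoted loop: find the first recognised action and
    // keep only the arguments that precede it.
    static String[] truncateAtAction(String[] args) {
        List<String> params = Arrays.asList(args);
        for (String action : ACTIONS) {
            int index = params.indexOf(action);
            if (index != -1) {
                return Arrays.copyOfRange(args, 0, index);
            }
        }
        return args;
    }

    public static void main(String[] args) {
        String[] in = {"--configDir", "/etc/flink", "run", "job.jar"};
        // Everything from "run" onward is dropped before option parsing.
        System.out.println(Arrays.toString(truncateAtAction(in)));
    }
}
```

Note one subtlety the sketch inherits from the diff: the loop iterates over `ACTIONS` in array order, so it truncates at the first *matching entry of ACTIONS*, which is not necessarily the earliest action token in `args` when several actions appear.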


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-07-19 Thread Aljoscha Krettek (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384249#comment-15384249 ]

Aljoscha Krettek commented on FLINK-4190:
-

Did you open the PR with the correct FLINK-4190 tag? Normally, it should show 
up here.

Could you post a link to the PR?

> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.
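The generalisation described above can be illustrated with a minimal, hypothetical sketch. All names here (`Bucketer`, `BucketingSink`, `invoke`) are invented for illustration and are not the RollingSink API: the bucket is derived from an attribute of the element rather than system time, and several buckets stay open at once:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BucketingSketch {
    // Hypothetical bucketer: derives a bucket path from the element
    // itself rather than from the current system time.
    interface Bucketer<T> {
        String getBucketPath(T element);
    }

    // Keeps one open "file" (here: a list) per active bucket, so writing
    // to bucket A, then B, then A again does not close and reopen A.
    static class BucketingSink<T> {
        private final Bucketer<T> bucketer;
        final Map<String, List<T>> openBuckets = new HashMap<>();

        BucketingSink(Bucketer<T> bucketer) {
            this.bucketer = bucketer;
        }

        void invoke(T element) {
            String path = bucketer.getBucketPath(element);
            openBuckets.computeIfAbsent(path, p -> new ArrayList<>()).add(element);
        }
    }

    public static void main(String[] args) {
        // Bucket by the first character of the record: an element attribute.
        BucketingSink<String> sink = new BucketingSink<>(e -> "/base/" + e.charAt(0));
        for (String e : new String[]{"apple", "banana", "avocado"}) {
            sink.invoke(e);
        }
        System.out.println(sink.openBuckets.keySet());
    }
}
```

The third limitation (deciding when a bucket is inactive) is not modelled here; a real sink would presumably track a last-write timestamp per bucket and flush buckets that exceed an inactivity threshold, since a path change alone no longer signals closure.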





[jira] [Commented] (FLINK-4084) Add configDir parameter to CliFrontend and flink shell script

2016-07-19 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384275#comment-15384275 ]

ASF GitHub Bot commented on FLINK-4084:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/2149#discussion_r71354132
  
--- Diff: flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java ---
@@ -451,4 +479,25 @@ public static InfoOptions parseInfoCommand(String[] args) throws CliArgsExceptio
}
}
 
+	public static MainOptions parseMainCommand(String[] args) throws CliArgsException {
+
+		// drop all arguments after an action
+		final List<String> params = Arrays.asList(args);
+		for (String action : CliFrontend.ACTIONS) {
+			int index = params.indexOf(action);
+			if (index != -1) {
+				args = Arrays.copyOfRange(args, 0, index);
+				break;
+			}
+		}
+
+		try {
+			DefaultParser parser = new DefaultParser();
+			CommandLine line = parser.parse(MAIN_OPTIONS, args, false);
--- End diff --

Actually you can use `CommandLine line = parser.parse(MAIN_OPTIONS, args, true);` to halt once you reach a non-arg.
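The third argument mxm refers to is commons-cli's `stopAtNonOption` flag. Since commons-cli itself is not available here, the following hand-rolled sketch (a deliberate simplification: no option values, no error handling, names invented for illustration) shows the semantics the flag controls:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StopAtNonOption {
    // Simplified model of commons-cli's stopAtNonOption semantics:
    // with stopAtNonOption = true, parsing halts at the first token that
    // is not a recognised option, leaving the action and its arguments
    // for a later parsing stage. With false, every token is inspected.
    static List<String> parseLeadingOptions(String[] args, List<String> known,
                                            boolean stopAtNonOption) {
        List<String> parsed = new ArrayList<>();
        for (String arg : args) {
            if (known.contains(arg)) {
                parsed.add(arg);
            } else if (stopAtNonOption) {
                break; // halt at the first non-option, e.g. the "run" action
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("--configDir", "--parallelism");
        String[] cli = {"--configDir", "run", "--parallelism"};
        System.out.println(parseLeadingOptions(cli, known, true));
        System.out.println(parseLeadingOptions(cli, known, false));
    }
}
```

With `stopAtNonOption = true` only `--configDir` is consumed before `run` is reached, which makes the explicit argument-truncation loop in the diff unnecessary; with `false`, `--parallelism` after the action is also picked up (real commons-cli would instead raise an exception on the unrecognised `run` token).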


> Add configDir parameter to CliFrontend and flink shell script
> -
>
> Key: FLINK-4084
> URL: https://issues.apache.org/jira/browse/FLINK-4084
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Andrea Sella
>Priority: Minor
>
> At the moment there is no other way than the environment variable 
> FLINK_CONF_DIR to specify the configuration directory for the CliFrontend if 
> it is started via the flink shell script. In order to improve the user 
> experience, I would propose to introduce a {{--configDir}} parameter which the
> user can use to specify a configuration directory more easily.




