[GitHub] flink pull request: [Discuss] Simplify SplittableIterator interfac...

2015-01-26 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/338

[Discuss] Simplify SplittableIterator interface

While working on something, I found the SplittableIterator interface 
unnecessarily complicated.
Let me know if you agree to merge this simplification.
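The thread does not show the interface itself; as a rough illustration of the idea under discussion, a splittable iterator is one that can partition its remaining elements into disjoint sub-iterators. The following is a hedged sketch with illustrative names only, not Flink's actual API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Hedged sketch of a splittable iterator: an iterator over a numeric range
 * that can partition itself into disjoint, contiguous sub-iterators.
 * Class and method names are illustrative, not Flink's actual API.
 */
public class RangeSplittableIterator implements Iterator<Long> {
    private long current;
    private final long end; // exclusive

    public RangeSplittableIterator(long start, long end) {
        this.current = start;
        this.end = end;
    }

    @Override public boolean hasNext() { return current < end; }
    @Override public Long next()    { return current++; }

    /** Partition the remaining range into numPartitions contiguous chunks
     *  (assumes numPartitions > 0). */
    public List<RangeSplittableIterator> split(int numPartitions) {
        List<RangeSplittableIterator> parts = new ArrayList<>();
        long total = end - current;
        long base = total / numPartitions;
        long rem  = total % numPartitions;
        long start = current;
        for (int i = 0; i < numPartitions; i++) {
            long len = base + (i < rem ? 1 : 0);
            parts.add(new RangeSplittableIterator(start, start + len));
            start += len;
        }
        return parts;
    }
}
```

Splitting `[0, 10)` into 3 parts yields chunks of sizes 4, 3, and 3 that together cover every element exactly once.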


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink fix_interface

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/338.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #338


commit 4c2e3fb272c263d149e7b839fb2d9d496232a8be
Author: Robert Metzger 
Date:   2015-01-25T18:52:02Z

Simplify SplittableIterator interface

commit fc0aff770fc51f7b944c6ef1c7f17c4a79d0c2ca
Author: Robert Metzger 
Date:   2015-01-25T18:55:46Z

fix




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1433] Add HADOOP_CLASSPATH to start scr...

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/337#issuecomment-71429315
  
It depends on what the user does with the `HADOOP_CLASSPATH`.
In my understanding, it is meant as a variable for adding 3rd-party jar 
files to Hadoop. Hadoop's own jar files are added to the `CLASSPATH` variable 
in the `libexec/hadoop-config.sh` script. There, you see variables like 
`HADOOP_COMMON_LIB_JARS_DIR`, `HDFS_LIB_JARS_DIR`, `YARN_LIB_JARS_DIR`, ... 
being added to the `CLASSPATH`. In the very last step, the script appends the 
`HADOOP_CLASSPATH` variable (by default at the end of the classpath, but there 
is an additional option to put it in front).

I found that we need to add this on Google Compute Engine's Hadoop 
deployment. It has Google Cloud Storage configured by default, but that 
currently doesn't work in non-YARN setups because the Google Storage jar is not 
on our classpath. On these clusters, the `HADOOP_CLASSPATH` variable contains 
the path to the storage jar.
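The composition order described above can be sketched as a small helper; the class, method, and parameter names below are illustrative, not part of Hadoop's scripts:

```java
/**
 * Hedged sketch of the classpath composition described above: Hadoop's own
 * jars come first, and the user-supplied HADOOP_CLASSPATH is appended at the
 * end by default, with an option to prepend it instead. Names are
 * illustrative only.
 */
public class ClasspathSketch {
    public static String compose(String hadoopJars, String userCp, boolean userFirst) {
        if (userCp == null || userCp.isEmpty()) {
            return hadoopJars; // nothing user-supplied to add
        }
        return userFirst ? userCp + ":" + hadoopJars
                         : hadoopJars + ":" + userCp;
    }
}
```

For example, a GCE-style deployment that puts a storage connector into `HADOOP_CLASSPATH` would end up with that jar at the tail of the effective classpath.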




[jira] [Commented] (FLINK-1433) Add HADOOP_CLASSPATH to start scripts

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291582#comment-14291582
 ] 

ASF GitHub Bot commented on FLINK-1433:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/337#issuecomment-71429315
  
It depends on what the user does with the `HADOOP_CLASSPATH`.
In my understanding, it is meant as a variable for adding 3rd-party jar 
files to Hadoop. Hadoop's own jar files are added to the `CLASSPATH` variable 
in the `libexec/hadoop-config.sh` script. There, you see variables like 
`HADOOP_COMMON_LIB_JARS_DIR`, `HDFS_LIB_JARS_DIR`, `YARN_LIB_JARS_DIR`, ... 
being added to the `CLASSPATH`. In the very last step, the script appends the 
`HADOOP_CLASSPATH` variable (by default at the end of the classpath, but there 
is an additional option to put it in front).

I found that we need to add this on Google Compute Engine's Hadoop 
deployment. It has Google Cloud Storage configured by default, but that 
currently doesn't work in non-YARN setups because the Google Storage jar is not 
on our classpath. On these clusters, the `HADOOP_CLASSPATH` variable contains 
the path to the storage jar.


> Add HADOOP_CLASSPATH to start scripts
> -
>
> Key: FLINK-1433
> URL: https://issues.apache.org/jira/browse/FLINK-1433
> Project: Flink
>  Issue Type: Improvement
>Reporter: Robert Metzger
> Fix For: 0.8.1
>
>
> With the Hadoop file system wrapper, it's important to have access to the 
> Hadoop filesystem classes.
> The HADOOP_CLASSPATH seems to be a standard environment variable used by 
> Hadoop for such libraries.
> Deployments like Google Compute Cloud set this variable to contain the 
> "Google Cloud Storage Hadoop Wrapper". So if users want to use the Cloud 
> Storage in a non-YARN environment, we need to address this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71432009
  
I've completely changed the mechanism of setting a custom file name.




[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291603#comment-14291603
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71432009
  
I've completely changed the mechanism of setting a custom file name.


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.





[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71433679
  
Hmmm, using the String pattern seems to be much more comfortable for users, 
no?

If a user wants to have the data written out with some kind of filename 
pattern, she needs to implement a new IF and override a method instead of 
simply setting a configuration parameter.
The only thing you gain is that you can do some "fancy" arithmetic (+/- 1) 
with the task number.
Not sure if that's worth it. The former way was very elegant and clean, IMO.

If you insist on having the option to choose between 0- and 1-based 
indexing, I would vote to go with the former variant and keep both parameters.




[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291612#comment-14291612
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71433679
  
Hmmm, using the String pattern seems to be much more comfortable for users, 
no?

If a user wants to have the data written out with some kind of filename 
pattern, she needs to implement a new IF and override a method instead of 
simply setting a configuration parameter.
The only thing you gain is that you can do some "fancy" arithmetic (+/- 1) 
with the task number.
Not sure if that's worth it. The former way was very elegant and clean, IMO.

If you insist on having the option to choose between 0- and 1-based 
indexing, I would vote to go with the former variant and keep both parameters.


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.





[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71434721
  
The user who requested this feature actually asked for a custom method to 
override.

There is one more important use case for a custom method: if users want to 
have files named exactly like Hadoop's, they also need a method. Hadoop uses 
6-digit numbers, filled up with zeroes (for example part-m-01).

I'm tired of changing the pull request until everybody is happy. It's very 
inefficient and I have better stuff to do with my time. If you want, you can 
change the code once it's merged, but I have rewritten this 3 times now.
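The zero-padded naming mentioned in the comment above can be sketched with `String.format`; the method name and the exact digit width are assumptions taken from the comment, not Flink's or Hadoop's actual code:

```java
/**
 * Hedged sketch of Hadoop-style part file naming: a task number padded with
 * leading zeroes to a fixed width. The width of 6 is taken from the comment
 * above ("6-digit numbers, filled up with zeroes"); names are illustrative.
 */
public class PartFileNaming {
    public static String hadoopStyleName(int taskNumber) {
        // %06d pads the decimal task number with zeroes to 6 digits
        return String.format("part-m-%06d", taskNumber);
    }
}
```

So task 1 would produce `part-m-000001`, which a simple custom-extension config parameter alone could not express.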




[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291617#comment-14291617
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71434721
  
The user who requested this feature actually asked for a custom method to 
override.

There is one more important use case for a custom method: if users want to 
have files named exactly like Hadoop's, they also need a method. Hadoop uses 
6-digit numbers, filled up with zeroes (for example part-m-01).

I'm tired of changing the pull request until everybody is happy. It's very 
inefficient and I have better stuff to do with my time. If you want, you can 
change the code once it's merged, but I have rewritten this 3 times now.


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.





[GitHub] flink pull request: [FLINK-1295][FLINK-883] Allow to deploy 'job o...

2015-01-26 Thread rmetzger
Github user rmetzger closed the pull request at:

https://github.com/apache/flink/pull/292




[jira] [Commented] (FLINK-1295) Add option to Flink client to start a YARN session per job

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291620#comment-14291620
 ] 

ASF GitHub Bot commented on FLINK-1295:
---

Github user rmetzger closed the pull request at:

https://github.com/apache/flink/pull/292


> Add option to Flink client to start a YARN session per job
> --
>
> Key: FLINK-1295
> URL: https://issues.apache.org/jira/browse/FLINK-1295
> Project: Flink
>  Issue Type: New Feature
>  Components: YARN Client
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 0.9
>
>
> Currently, Flink users can only launch Flink on YARN as a "YARN session" 
> (meaning a long-running YARN application that can run multiple Flink jobs).
> Users have requested to extend the Flink Client to allocate YARN containers 
> only for executing a single job.
> As part of this pull request, I would suggest refactoring the YARN Client to 
> make it more modular and object-oriented.





[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71435854
  
Well, you would have saved everybody's time if you had made these 
requirements clear from the beginning. Besides, your first two versions didn't 
comply with these new requirements either...




[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291622#comment-14291622
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71435854
  
Well, you would have saved everybody's time if you had made these 
requirements clear from the beginning. Besides, your first two versions didn't 
comply with these new requirements either...


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.





[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-01-26 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/236#issuecomment-71436404
  
I will have to rework this now that the support for registering Types and 
Serializers at Kryo was merged.

The POJO subclass with tagging is slower because we do additional checks 
and lookups: Upon serialisation we perform a map lookup to check whether the 
subclass is actually a registered class. When deserialising we have to fetch 
the correct subclass serialiser from an array of subclass serialisers.
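The extra checks and lookups described above can be sketched roughly as follows; all names are illustrative, not the actual Flink implementation:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hedged sketch of the bookkeeping described above: on the write path, a map
 * lookup resolves the concrete subclass to a registered tag; on the read
 * path, the tag indexes into an array of per-subclass entries. All names are
 * illustrative, not Flink's actual code.
 */
public class SubclassRegistry {
    private final Map<Class<?>, Integer> classToTag = new HashMap<>();
    private final Class<?>[] tagToClass;

    public SubclassRegistry(Class<?>... registered) {
        tagToClass = registered;
        for (int i = 0; i < registered.length; i++) {
            classToTag.put(registered[i], i);
        }
    }

    /** Write-side lookup: map probe, fails for unregistered subclasses. */
    public int tagOf(Object value) {
        Integer tag = classToTag.get(value.getClass());
        if (tag == null) {
            throw new IllegalArgumentException("Unregistered subclass: " + value.getClass());
        }
        return tag;
    }

    /** Read-side lookup: a plain array index instead of a map probe. */
    public Class<?> classOf(int tag) {
        return tagToClass[tag];
    }
}
```

Both lookups are cheap individually, but they happen per record, which is where the measured slowdown relative to the plain POJO path would come from.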




[GitHub] flink pull request: Add support for Subclasses, Interfaces, Abstra...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/236#issuecomment-71436626
  
@aljoscha  Have a look at #316 where I took this PR, rebased it, and fixed 
some problems with Pojo types.




[jira] [Commented] (FLINK-1234) Make Hadoop2 profile default

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291628#comment-14291628
 ] 

ASF GitHub Bot commented on FLINK-1234:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/295


> Make Hadoop2 profile default
> 
>
> Key: FLINK-1234
> URL: https://issues.apache.org/jira/browse/FLINK-1234
> Project: Flink
>  Issue Type: Improvement
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 0.8
>
>
> As per mailing list discussion.





[GitHub] flink pull request: [FLINK-1234] Update documentation configuratio...

2015-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/295




[jira] [Commented] (FLINK-1419) DistributedCache doesn't preserve files for subsequent operations

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291639#comment-14291639
 ] 

ASF GitHub Bot commented on FLINK-1419:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/339

[FLINK-1419] [runtime] DC properly synchronized

Addresses the issue of files not being preserved in subsequent operations.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/incubator-flink dc_cache_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/339.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #339


commit 5c9059d3ce58d8415ce374927dd253579a5fd741
Author: zentol 
Date:   2015-01-26T10:07:53Z

[FLINK-1419] [runtime] DC properly synchronized




> DistributedCache doesn't preserve files for subsequent operations
> --
>
> Key: FLINK-1419
> URL: https://issues.apache.org/jira/browse/FLINK-1419
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.8, 0.9
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When subsequent operations want to access the same files in the DC it 
> frequently happens that the files are not created for the following operation.
> This is fairly odd, since the DC is supposed to either a) preserve files when 
> another operation kicks in within a certain time window, or b) just recreate 
> the deleted files. Neither happens.
> Increasing the time window had no effect.
> I'd like to use this issue as a starting point for a more general discussion 
> about the DistributedCache. 
> Currently:
> 1. all files reside in a common job-specific directory
> 2. are deleted during the job.
>  
> One thing that was brought up about Trait 1 is that it basically forbids 
> modification of the files, concurrent access and all. Personally, I'm not sure 
> if this is a problem. Changing it to a task-specific place solved the issue 
> though.
> I'm more concerned about Trait #2. Besides the mentioned issue, the deletion 
> is realized with the scheduler, which adds a lot of complexity to the current 
> code. (It really is a pain to work on...) 
> If we moved the deletion to the end of the job, it could be done as a clean-up 
> step in the TaskManager. With this we could reduce the DC to a 
> cacheFile(String source) method, the delete method in the TM, and throw out 
> everything else.
> Also, the current implementation implies that big files may be copied 
> multiple times. This may be undesired, depending on how big the files are.
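The reduction proposed in the description could look roughly like the following sketch; apart from cacheFile(String source), which the description itself names, all class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hedged sketch of the reduced DistributedCache surface proposed above: one
 * method to register a file, with deletion deferred to a single clean-up
 * step run by the TaskManager at the end of the job. Names other than
 * cacheFile are illustrative only.
 */
public class MinimalDistributedCache {
    private final List<String> cachedFiles = new ArrayList<>();

    /** Register a file to be copied once and kept for the whole job. */
    public void cacheFile(String source) {
        cachedFiles.add(source);
    }

    /** Clean-up step, called once when the job finishes; returns how many
     *  entries were dropped. */
    public int cleanup() {
        int removed = cachedFiles.size();
        cachedFiles.clear();
        return removed;
    }
}
```

This removes the scheduler-driven deletion entirely: files live for the job's lifetime and are reclaimed in one place.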





[GitHub] flink pull request: [FLINK-1419] [runtime] DC properly synchronize...

2015-01-26 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/339

[FLINK-1419] [runtime] DC properly synchronized

Addresses the issue of files not being preserved in subsequent operations.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/incubator-flink dc_cache_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/339.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #339


commit 5c9059d3ce58d8415ce374927dd253579a5fd741
Author: zentol 
Date:   2015-01-26T10:07:53Z

[FLINK-1419] [runtime] DC properly synchronized






[GitHub] flink pull request: [FLINK-1425] [streaming] Add scheduling of all...

2015-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/330




[jira] [Resolved] (FLINK-1425) Turn lazy operator execution off for streaming programs

2015-01-26 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-1425.

Resolution: Fixed

Fixed in ad31f61.

> Turn lazy operator execution off for streaming programs
> ---
>
> Key: FLINK-1425
> URL: https://issues.apache.org/jira/browse/FLINK-1425
> Project: Flink
>  Issue Type: Task
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Gyula Fora
>Assignee: Ufuk Celebi
>
> Streaming programs currently use the same lazy operator execution model as 
> batch programs. This makes the functionality of some operators like time 
> based windowing very awkward, since they start computing windows based on the 
> start of the operator.
> Also, one should expect streaming programs to run continuously, so there 
> is not much to gain from lazy execution.





[jira] [Commented] (FLINK-1425) Turn lazy operator execution off for streaming programs

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291641#comment-14291641
 ] 

ASF GitHub Bot commented on FLINK-1425:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/330


> Turn lazy operator execution off for streaming programs
> ---
>
> Key: FLINK-1425
> URL: https://issues.apache.org/jira/browse/FLINK-1425
> Project: Flink
>  Issue Type: Task
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Gyula Fora
>Assignee: Ufuk Celebi
>
> Streaming programs currently use the same lazy operator execution model as 
> batch programs. This makes the functionality of some operators like time 
> based windowing very awkward, since they start computing windows based on the 
> start of the operator.
> Also, one should expect streaming programs to run continuously, so there 
> is not much to gain from lazy execution.





[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71439040
  
You are right, I should have made the requirements clear in the beginning. 
I actually had a discussion with the user about which approach is best.
I took the view that a string-based method is easier for now and 
implemented it.
Then we had the whole discussion here, and I thought that, now that everybody 
is unhappy with my approach, I'd better do exactly what the user wants instead 
of going with a strong opinion.

And now I'm confused and upset. But to be realistic: we only had one user 
so far who wanted to change the filenames at all. If there is ever going to be a 
second or third user, they either have to make another contribution or override 
their input format. 

http://bikeshed.com/ (see also: 
http://en.wikipedia.org/wiki/Parkinson%27s_law_of_triviality)




[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291644#comment-14291644
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71439040
  
You are right, I should have made the requirements clear in the beginning. 
I actually had a discussion with the user about which approach is best.
I took the view that a string-based method is easier for now and 
implemented it.
Then we had the whole discussion here, and I thought that, now that everybody 
is unhappy with my approach, I'd better do exactly what the user wants instead 
of going with a strong opinion.

And now I'm confused and upset. But to be realistic: we only had one user 
so far who wanted to change the filenames at all. If there is ever going to be a 
second or third user, they either have to make another contribution or override 
their input format. 

http://bikeshed.com/ (see also: 
http://en.wikipedia.org/wiki/Parkinson%27s_law_of_triviality)


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.





[jira] [Updated] (FLINK-1392) Serializing Protobuf - issue 1

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1392:
--
Issue Type: Sub-task  (was: Bug)
Parent: FLINK-1417

> Serializing Protobuf - issue 1
> --
>
> Key: FLINK-1392
> URL: https://issues.apache.org/jira/browse/FLINK-1392
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Felix Neutatz
>Assignee: Robert Metzger
>Priority: Minor
>
> Hi, I started to experiment with Parquet using Protobuf.
> When I use the standard Protobuf class: 
> com.twitter.data.proto.tutorial.AddressBookProtos
> The code which I run, can be found here: 
> [https://github.com/FelixNeutatz/incubator-flink/blob/ParquetAtFlink/flink-addons/flink-hadoop-compatibility/src/main/java/org/apache/flink/hadoopcompatibility/mapreduce/example/ParquetProtobufOutput.java]
> I get the following exception:
> {code:xml}
> Exception in thread "main" java.lang.Exception: Deserializing the 
> InputFormat (org.apache.flink.api.java.io.CollectionInputFormat) failed: 
> Could not read the user code wrapper: Error while deserializing element from 
> collection
>   at 
> org.apache.flink.runtime.jobgraph.InputFormatVertex.initializeOnMaster(InputFormatVertex.java:60)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$5.apply(JobManager.scala:179)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1$$anonfun$applyOrElse$5.apply(JobManager.scala:172)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:172)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:34)
>   at 
> org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:27)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:27)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:52)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: 
> org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could 
> not read the user code wrapper: Error while deserializing element from 
> collection
>   at 
> org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:285)
>   at 
> org.apache.flink.runtime.jobgraph.InputFormatVertex.initializeOnMaster(InputFormatVertex.java:57)
>   ... 25 more
> Caused by: java.io.IOException: Error while deserializing element from 
> collection
>   at 
> org.apache.flink.api.java.io.CollectionInputFormat.readObject(CollectionInputFormat.java:108)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at 

[jira] [Updated] (FLINK-1395) Add Jodatime support to Kryo

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1395:
--
Issue Type: Sub-task  (was: Bug)
Parent: FLINK-1417

> Add Jodatime support to Kryo
> 
>
> Key: FLINK-1395
> URL: https://issues.apache.org/jira/browse/FLINK-1395
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Robert Metzger
>Assignee: Aljoscha Krettek
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1391) Kryo fails to properly serialize avro collection types

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1391:
--
Issue Type: Sub-task  (was: Improvement)
Parent: FLINK-1417

> Kryo fails to properly serialize avro collection types
> --
>
> Key: FLINK-1391
> URL: https://issues.apache.org/jira/browse/FLINK-1391
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 0.8, 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Before FLINK-610, Avro was the default generic serializer.
> Now, special types coming from Avro are handled by Kryo, which seems to 
> cause errors like:
> {code}
> Exception in thread "main" 
> org.apache.flink.runtime.client.JobExecutionException: 
> java.lang.NullPointerException
>   at org.apache.avro.generic.GenericData$Array.add(GenericData.java:200)
>   at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116)
>   at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22)
>   at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
>   at 
> org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:143)
>   at 
> org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:148)
>   at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:244)
>   at 
> org.apache.flink.runtime.plugable.DeserializationDelegate.read(DeserializationDelegate.java:56)
>   at 
> org.apache.flink.runtime.io.network.serialization.AdaptiveSpanningRecordDeserializer.getNextRecord(AdaptiveSpanningRecordDeserializer.java:71)
>   at 
> org.apache.flink.runtime.io.network.channels.InputChannel.readRecord(InputChannel.java:189)
>   at 
> org.apache.flink.runtime.io.network.gates.InputGate.readRecord(InputGate.java:176)
>   at 
> org.apache.flink.runtime.io.network.api.MutableRecordReader.next(MutableRecordReader.java:51)
>   at 
> org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:53)
>   at 
> org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:170)
>   at 
> org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
>   at java.lang.Thread.run(Thread.java:744)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
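The NPE in `GenericData$Array.add` above is consistent with Kryo's `CollectionSerializer` instantiating the collection generically, bypassing the constructor that sets required internal state (for `GenericData.Array`, the Avro schema). The stand-in class below is hypothetical, purely to illustrate the failure mode: a collection whose `add()` depends on constructor-initialized state breaks when created through a no-arg path, which is why such types need a type-aware serializer registered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Avro's GenericData.Array: add() relies on
// internal state (the schema) that only the real constructor initializes.
class SchemaBackedList {
    private final Object schema;          // must be non-null for add() to work
    private final List<Object> elements = new ArrayList<>();

    SchemaBackedList(Object schema) {
        if (schema == null) throw new IllegalArgumentException("schema required");
        this.schema = schema;
    }

    // No-arg path, similar to what a generic deserializer that bypasses
    // the real constructor ends up with: the schema stays null.
    private SchemaBackedList() { this.schema = null; }

    static SchemaBackedList instantiateGenerically() {
        return new SchemaBackedList();    // mimics generic instantiation
    }

    void add(Object e) {
        schema.hashCode();                // NPE here when schema is null
        elements.add(e);
    }

    int size() { return elements.size(); }
}

public class KryoAvroNpeSketch {
    public static void main(String[] args) {
        // Generic instantiation (as a schema-unaware serializer would do):
        SchemaBackedList broken = SchemaBackedList.instantiateGenerically();
        boolean npe = false;
        try {
            broken.add("x");
        } catch (NullPointerException e) {
            npe = true;                   // same shape as the GenericData$Array.add NPE
        }
        System.out.println("generic instantiation NPEs: " + npe);

        // A type-aware serializer would call the real constructor instead:
        SchemaBackedList ok = new SchemaBackedList(new Object());
        ok.add("x");
        System.out.println("schema-aware instantiation size: " + ok.size());
    }
}
```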


[jira] [Assigned] (FLINK-1395) Add Jodatime support to Kryo

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1395:
-

Assignee: Robert Metzger  (was: Aljoscha Krettek)

> Add Jodatime support to Kryo
> 
>
> Key: FLINK-1395
> URL: https://issues.apache.org/jira/browse/FLINK-1395
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1395) Add Jodatime support to Kryo

2015-01-26 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291653#comment-14291653
 ] 

Robert Metzger commented on FLINK-1395:
---

[~aljoscha] I'm going to assign the issue to myself because I'm fixing this as 
part of FLINK-1417 (see also the mailing list discussion on this).

> Add Jodatime support to Kryo
> 
>
> Key: FLINK-1395
> URL: https://issues.apache.org/jira/browse/FLINK-1395
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Robert Metzger
>Assignee: Aljoscha Krettek
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71440329
  
No worries ;-) I understand that discussing such a "trivial" feature feels 
like a waste of time. Unfortunately these are the features that are easy to 
comment on for "everybody" ;-)

Anyway, I guess we could have both methods side by side if this 
becomes an issue, i.e., add your String approach as a configuration option to 
FileOutputFormat and interpret it in the default `getDirectoryFileName()` 
method.

For now, you got my +1 for this change. It does not change the default 
behavior and adds a feature.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
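The `getDirectoryFileName()` hook mentioned above could be sketched as follows. This is an illustrative model of the discussed approach, not Flink's actual `FileOutputFormat` API; the class and field names here are assumptions.

```java
// Sketch of the discussed approach: a FileOutputFormat-style hook that
// derives the per-task file name, extended with a configurable suffix.
// Names mirror the discussion; the real Flink API may differ.
public class DirectoryFileNameSketch {
    private String fileExtension = "";    // e.g. "avro"; empty means no suffix

    public void setFileExtension(String ext) { this.fileExtension = ext; }

    // Default behavior: the task index is the file name ("1", "2", ...).
    // With an extension configured: "1.avro", "2.avro", ...
    public String getDirectoryFileName(int taskNumber) {
        return fileExtension.isEmpty()
                ? Integer.toString(taskNumber)
                : taskNumber + "." + fileExtension;
    }

    public static void main(String[] args) {
        DirectoryFileNameSketch format = new DirectoryFileNameSketch();
        System.out.println(format.getDirectoryFileName(1));   // 1
        format.setFileExtension("avro");
        System.out.println(format.getDirectoryFileName(1));   // 1.avro
    }
}
```

This keeps the default behavior unchanged and only adds a feature, matching the +1 rationale above.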


[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291656#comment-14291656
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/301#issuecomment-71440329
  
No worries ;-) I understand that discussing such a "trivial" feature feels 
like a waste of time. Unfortunately these are the features that are easy to 
comment on for "everybody" ;-)

Anyway, I guess we could have both methods side by side if this 
becomes an issue, i.e., add your String approach as a configuration option to 
FileOutputFormat and interpret it in the default `getDirectoryFileName()` 
method.

For now, you got my +1 for this change. It does not change the default 
behavior and adds a feature.


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1389] Allow changing the filenames of t...

2015-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/301


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291657#comment-14291657
 ] 

ASF GitHub Bot commented on FLINK-1389:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/301


> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
>
> A user requested the ability to name avro files with the "avro" extension.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1389) Allow setting custom file extensions for files created by the FileOutputFormat

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1389.
---
   Resolution: Fixed
Fix Version/s: 0.8.1
   0.9

Merged to master in http://git-wip-us.apache.org/repos/asf/flink/commit/268ff7a0
Merged to 0.8 in http://git-wip-us.apache.org/repos/asf/flink/commit/00117f7f

> Allow setting custom file extensions for files created by the FileOutputFormat
> --
>
> Key: FLINK-1389
> URL: https://issues.apache.org/jira/browse/FLINK-1389
> Project: Flink
>  Issue Type: New Feature
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Minor
> Fix For: 0.9, 0.8.1
>
>
> A user requested the ability to name avro files with the "avro" extension.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1303) HadoopInputFormat does not work with Scala API

2015-01-26 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291683#comment-14291683
 ] 

Aljoscha Krettek commented on FLINK-1303:
-

I think we can easily rework this to return Scala Tuples for a Scala 
HadoopInputFormat.

> HadoopInputFormat does not work with Scala API
> --
>
> Key: FLINK-1303
> URL: https://issues.apache.org/jira/browse/FLINK-1303
> Project: Flink
>  Issue Type: Bug
>  Components: Scala API
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 0.9
>
>
> It fails because the HadoopInputFormat uses the Flink Tuple2 type. For this, 
> type extraction fails at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1452) Add "flink-contrib" maven module and README.md with the rules

2015-01-26 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1452:
-

 Summary: Add "flink-contrib" maven module and README.md with the 
rules
 Key: FLINK-1452
 URL: https://issues.apache.org/jira/browse/FLINK-1452
 Project: Flink
  Issue Type: New Feature
Reporter: Robert Metzger
Assignee: Robert Metzger


I'll also create a JIRA component



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1452) Add "flink-contrib" maven module and README.md with the rules

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1452:
--
Component/s: flink-contrib

> Add "flink-contrib" maven module and README.md with the rules
> -
>
> Key: FLINK-1452
> URL: https://issues.apache.org/jira/browse/FLINK-1452
> Project: Flink
>  Issue Type: New Feature
>  Components: flink-contrib
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> I'll also create a JIRA component



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1453) Integration tests for YARN failing on OS X

2015-01-26 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1453:
-

 Summary: Integration tests for YARN failing on OS X
 Key: FLINK-1453
 URL: https://issues.apache.org/jira/browse/FLINK-1453
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger


The flink yarn tests are failing on OS X, most likely due to a port conflict:

{code}
11:59:38,870 INFO  org.eclipse.jetty.util.log   
 - jetty-0.9-SNAPSHOT
11:59:38,885 WARN  org.eclipse.jetty.util.log   
 - FAILED SelectChannelConnector@0.0.0.0:8081: java.net.BindException: Address 
already in use
11:59:38,885 WARN  org.eclipse.jetty.util.log   
 - FAILED org.eclipse.jetty.server.Server@281c7736: java.net.BindException: 
Address already in use
11:59:38,892 ERROR akka.actor.OneForOneStrategy 
 - Address already in use
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:164)
at akka.actor.ActorCell.create(ActorCell.scala:596)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at 
org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:208)
at 
org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:288)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:55)
at org.eclipse.jetty.server.Server.doStart(Server.java:254)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:55)
at 
org.apache.flink.runtime.jobmanager.web.WebInfoServer.start(WebInfoServer.java:198)
at 
org.apache.flink.runtime.jobmanager.WithWebServer$class.$init$(WithWebServer.scala:28)
at 
org.apache.flink.yarn.ApplicationMaster$$anonfun$startJobManager$2$$anon$1.(ApplicationMaster.scala:181)
at 
org.apache.flink.yarn.ApplicationMaster$$anonfun$startJobManager$2.apply(ApplicationMaster.scala:181)
at 
org.apache.flink.yarn.ApplicationMaster$$anonfun$startJobManager$2.apply(ApplicationMaster.scala:181)
at akka.actor.TypedCreatorFunctionConsumer.produce(Props.scala:343)
at akka.actor.Props.newActor(Props.scala:252)
at akka.actor.ActorCell.newActor(ActorCell.scala:552)
at akka.actor.ActorCell.create(ActorCell.scala:578)
... 10 more
{code}
The issue does not appear on Travis or on Arch Linux (however, tests are also 
failing on some Ubuntu versions)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
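The `BindException` above is the classic fixed-port problem: the web server tries to bind 8081, which another process already holds. A common test-suite fix (a general pattern, not a statement about how Flink's tests were changed) is to bind port 0 and let the OS pick a free ephemeral port:

```java
import java.net.ServerSocket;

// Minimal demonstration of the failure mode and the usual test fix:
// a fixed port (like Jetty's 8081 in the log above) can collide with another
// process, while binding to port 0 asks the OS for a free ephemeral port.
public class PortConflictSketch {
    public static void main(String[] args) throws Exception {
        try (ServerSocket first = new ServerSocket(0)) {     // OS picks a free port
            int takenPort = first.getLocalPort();
            boolean bindFailed = false;
            try (ServerSocket second = new ServerSocket(takenPort)) {
                // unreachable: the port is already bound
            } catch (java.net.BindException e) {
                bindFailed = true;                           // "Address already in use"
            }
            System.out.println("second bind failed: " + bindFailed);

            // The robust pattern for tests: bind to 0, then read the port back.
            try (ServerSocket ephemeral = new ServerSocket(0)) {
                System.out.println("ephemeral port > 0: " + (ephemeral.getLocalPort() > 0));
            }
        }
    }
}
```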


[GitHub] flink pull request: [FLINK-1369] [types] Add support for Subclasse...

2015-01-26 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/316#issuecomment-71445361
  
It's almost the same, except for the change to handle Interfaces and 
Abstract Classes with GenericTypeInfo, correct?

The part that changes the KryoSerializer must be adapted because of my 
recently merged PR that allows registering types and serializers at Kryo.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1369) The Pojo Serializers/Comparators fail when using Subclasses or Interfaces

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291700#comment-14291700
 ] 

ASF GitHub Bot commented on FLINK-1369:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/316#issuecomment-71445361
  
It's almost the same, except for the change to handle Interfaces and 
Abstract Classes with GenericTypeInfo, correct?

The part that changes the KryoSerializer must be adapted because of my 
recently merged PR that allows registering types and serializers at Kryo.


> The Pojo Serializers/Comparators fail when using Subclasses or Interfaces
> -
>
> Key: FLINK-1369
> URL: https://issues.apache.org/jira/browse/FLINK-1369
> Project: Flink
>  Issue Type: Improvement
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
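The dispatch rule discussed above (interfaces and abstract classes handled via `GenericTypeInfo`, concrete classes via the POJO path) can be sketched with plain reflection. The names below are illustrative only, not Flink's actual type-extraction API:

```java
import java.io.Serializable;
import java.lang.reflect.Modifier;

// Sketch of the type-dispatch rule under discussion: concrete classes can
// go through a field-based (POJO-style) path, while interfaces and abstract
// classes have no analyzable instance layout and fall back to a generic
// serializer (GenericTypeInfo in the discussion). Names are illustrative.
public class SerializerChoiceSketch {
    enum Strategy { POJO, GENERIC }

    static Strategy chooseStrategy(Class<?> type) {
        if (type.isInterface() || Modifier.isAbstract(type.getModifiers())) {
            return Strategy.GENERIC;
        }
        return Strategy.POJO;
    }

    static abstract class Shape {}
    static class Circle extends Shape {}

    public static void main(String[] args) {
        System.out.println(chooseStrategy(Serializable.class)); // interface -> GENERIC
        System.out.println(chooseStrategy(Shape.class));        // abstract  -> GENERIC
        System.out.println(chooseStrategy(Circle.class));       // concrete  -> POJO
    }
}
```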


[GitHub] flink pull request: [Flink-1436] refactor CLiFrontend to provide m...

2015-01-26 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/331#issuecomment-71446044
  
@rmetzger @StephanEwen I squashed the commits, merged the recent changes in 
the CliFrontend on the master, and addressed 
[FLINK-1424](https://issues.apache.org/jira/browse/FLINK-1424).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1424) bin/flink run does not recognize -c parameter anymore

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291702#comment-14291702
 ] 

ASF GitHub Bot commented on FLINK-1424:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/331#issuecomment-71446044
  
@rmetzger @StephanEwen I squashed the commits, merged the recent changes in 
the CliFrontend on the master, and addressed 
[FLINK-1424](https://issues.apache.org/jira/browse/FLINK-1424).


> bin/flink run does not recognize -c parameter anymore
> -
>
> Key: FLINK-1424
> URL: https://issues.apache.org/jira/browse/FLINK-1424
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Affects Versions: master
>Reporter: Carsten Brandt
>
> bin/flink binary does not recognize `-c` parameter anymore which specifies 
> the class to run:
> {noformat}
> $ ./flink run "/path/to/target/impro3-ws14-flink-1.0-SNAPSHOT.jar" -c 
> de.tu_berlin.impro3.flink.etl.FollowerGraphGenerator /tmp/flink/testgraph.txt 
> 1
> usage: emma-experiments-impro3-ss14-flink
>[-?]
> emma-experiments-impro3-ss14-flink: error: unrecognized arguments: '-c'
> {noformat}
> before this command worked fine and executed the job.
> I tracked it down to the following commit using `git bisect`:
> {noformat}
> 93eadca782ee8c77f89609f6d924d73021dcdda9 is the first bad commit
> commit 93eadca782ee8c77f89609f6d924d73021dcdda9
> Author: Alexander Alexandrov 
> Date:   Wed Dec 24 13:49:56 2014 +0200
> [FLINK-1027] [cli] Added support for '--' and '-' prefixed tokens in CLI 
> program arguments.
> 
> This closes #278
> :04 04 a1358e6f7fe308b4d51a47069f190a29f87fdeda 
> d6f11bbc9444227d5c6297ec908e44b9644289a9 Mflink-clients
> {noformat}
> https://github.com/apache/flink/commit/93eadca782ee8c77f89609f6d924d73021dcdda9



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
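The behavior reported above follows from how mixed client/program arguments are split: once the jar path is seen, every following token belongs to the user program, so a `-c` placed after the jar never reaches the client parser. The sketch below is a simplified model of that splitting logic, not Flink's actual CliFrontend code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the argument split behind the report: the first
// non-option token is taken as the jar, and everything after it (including
// "-c") is treated as a program argument, invisible to the client.
public class CliOrderSketch {
    public static void main(String[] args) {
        String[] argv = {"run", "job.jar", "-c", "my.Main", "input.txt"};

        List<String> clientOptions = new ArrayList<>();
        List<String> programArgs = new ArrayList<>();
        boolean jarSeen = false;
        for (String token : Arrays.asList(argv).subList(1, argv.length)) {
            if (!jarSeen && !token.startsWith("-")) {
                jarSeen = true;                 // first non-option token: the jar
            } else if (jarSeen) {
                programArgs.add(token);         // "-c" lands here, unseen by the client
            } else {
                clientOptions.add(token);
            }
        }
        System.out.println("client options: " + clientOptions);
        System.out.println("program args: " + programArgs);
    }
}
```

Under this model, passing `-c` before the jar path would make the client see it, which matches the workaround for the regression.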


[jira] [Commented] (FLINK-1352) Buggy registration from TaskManager to JobManager

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291703#comment-14291703
 ] 

ASF GitHub Bot commented on FLINK-1352:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/328#discussion_r23523589
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala
 ---
@@ -175,62 +175,79 @@ import scala.collection.JavaConverters._
   }
 
   private def tryJobManagerRegistration(): Unit = {
-registrationAttempts = 0
-import context.dispatcher
-registrationScheduler = Some(context.system.scheduler.schedule(
-  TaskManager.REGISTRATION_DELAY, TaskManager.REGISTRATION_INTERVAL,
-  self, RegisterAtJobManager))
+registrationDuration = 0 seconds
+
+registered = false
+
+context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
   }
 
   override def receiveWithLogMessages: Receive = {
 case RegisterAtJobManager => {
-  registrationAttempts += 1
+  if(!registered) {
+registrationDuration += registrationDelay
+// double delay for exponential backoff
+registrationDelay *= 2
 
-  if (registered) {
-registrationScheduler.foreach(_.cancel())
-  }
-  else if (registrationAttempts <= 
TaskManager.MAX_REGISTRATION_ATTEMPTS) {
+if (registrationDuration > maxRegistrationDuration) {
+  log.warning("TaskManager could not register at JobManager {} 
after {}.", jobManagerAkkaURL,
 
-log.info("Try to register at master {}. Attempt #{}", 
jobManagerAkkaURL,
-  registrationAttempts)
-val jobManager = context.actorSelection(jobManagerAkkaURL)
+maxRegistrationDuration)
 
-jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
-  }
-  else {
-log.error("TaskManager could not register at JobManager.");
-self ! PoisonPill
+  self ! PoisonPill
+} else if (!registered) {
+  log.info(s"Try to register at master ${jobManagerAkkaURL}. 
${registrationAttempts}. " +
+s"Attempt")
+  val jobManager = context.actorSelection(jobManagerAkkaURL)
+
+  jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
+
+  context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
+}
   }
 }
 
 case AcknowledgeRegistration(id, blobPort) => {
-  if (!registered) {
+  if(!registered) {
+finishRegistration(id, blobPort)
 registered = true
-currentJobManager = sender
-instanceID = id
-
-context.watch(currentJobManager)
-
-log.info("TaskManager successfully registered at JobManager {}.",
-  currentJobManager.path.toString)
-
-setupNetworkEnvironment()
-setupLibraryCacheManager(blobPort)
+  } else {
+if (log.isDebugEnabled) {
--- End diff --

You're right Henry, I'll remove it.


> Buggy registration from TaskManager to JobManager
> -
>
> Key: FLINK-1352
> URL: https://issues.apache.org/jira/browse/FLINK-1352
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> The JobManager's InstanceManager may refuse the registration attempt from a 
> TaskManager, because it has this taskmanager already connected, or,in the 
> future, because the TaskManager has been blacklisted as unreliable.
> Upon refused registration, the instance ID is null to signal the refusal. The 
> TaskManager reacts incorrectly to such messages, assuming successful 
> registration.
> Possible solution: JobManager sends back a dedicated "RegistrationRefused" 
> message, if the instance manager returns null as the registration result. If 
> the TaskManager receives that before being registered, it knows that the 
> registration response was lost (which should not happen on TCP and would 
> indicate a corrupt connection)
> Followup question: Does it make sense to have the TaskManager try 
> indefinitely to connect to the JobManager, with an increasing interval (from 
> seconds to minutes)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
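The retry scheme in the diff above replaces a fixed attempt count with exponential backoff bounded by a maximum total duration: the delay doubles each round, and once the summed waiting time exceeds the maximum, the TaskManager gives up. A stand-alone trace of that logic (values are illustrative; the real code drives this through Akka's scheduler):

```java
// Sketch of the registration retry scheme from the diff above: the delay
// doubles each round until the summed waiting time exceeds a maximum,
// at which point the TaskManager gives up.
public class BackoffSketch {
    public static void main(String[] args) {
        long delayMillis = 500;                 // initial registration delay
        long maxDurationMillis = 30_000;        // give up after ~30s of waiting
        long waited = 0;
        int attempts = 0;

        while (true) {
            waited += delayMillis;
            delayMillis *= 2;                   // exponential backoff
            if (waited > maxDurationMillis) {
                break;                          // real code: self ! PoisonPill
            }
            attempts++;                         // real code: send RegisterTaskManager
        }
        System.out.println("attempts before giving up: " + attempts);
        System.out.println("total wait (ms): " + waited);
    }
}
```

With a 500 ms initial delay, the delays 500, 1000, 2000, 4000, 8000 ms sum to 15.5 s after five attempts; the sixth round would push the total past 30 s, so the loop stops.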


[GitHub] flink pull request: [FLINK-1352] [runtime] Fix buggy registration ...

2015-01-26 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/328#discussion_r23523589
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala
 ---
@@ -175,62 +175,79 @@ import scala.collection.JavaConverters._
   }
 
   private def tryJobManagerRegistration(): Unit = {
-registrationAttempts = 0
-import context.dispatcher
-registrationScheduler = Some(context.system.scheduler.schedule(
-  TaskManager.REGISTRATION_DELAY, TaskManager.REGISTRATION_INTERVAL,
-  self, RegisterAtJobManager))
+registrationDuration = 0 seconds
+
+registered = false
+
+context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
   }
 
   override def receiveWithLogMessages: Receive = {
 case RegisterAtJobManager => {
-  registrationAttempts += 1
+  if(!registered) {
+registrationDuration += registrationDelay
+// double delay for exponential backoff
+registrationDelay *= 2
 
-  if (registered) {
-registrationScheduler.foreach(_.cancel())
-  }
-  else if (registrationAttempts <= 
TaskManager.MAX_REGISTRATION_ATTEMPTS) {
+if (registrationDuration > maxRegistrationDuration) {
+  log.warning("TaskManager could not register at JobManager {} 
after {}.", jobManagerAkkaURL,
 
-log.info("Try to register at master {}. Attempt #{}", 
jobManagerAkkaURL,
-  registrationAttempts)
-val jobManager = context.actorSelection(jobManagerAkkaURL)
+maxRegistrationDuration)
 
-jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
-  }
-  else {
-log.error("TaskManager could not register at JobManager.");
-self ! PoisonPill
+  self ! PoisonPill
+} else if (!registered) {
+  log.info(s"Try to register at master ${jobManagerAkkaURL}. 
${registrationAttempts}. " +
+s"Attempt")
+  val jobManager = context.actorSelection(jobManagerAkkaURL)
+
+  jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
+
+  context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
+}
   }
 }
 
 case AcknowledgeRegistration(id, blobPort) => {
-  if (!registered) {
+  if(!registered) {
+finishRegistration(id, blobPort)
 registered = true
-currentJobManager = sender
-instanceID = id
-
-context.watch(currentJobManager)
-
-log.info("TaskManager successfully registered at JobManager {}.",
-  currentJobManager.path.toString)
-
-setupNetworkEnvironment()
-setupLibraryCacheManager(blobPort)
+  } else {
+if (log.isDebugEnabled) {
--- End diff --

You're right Henry, I'll remove it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1352] [runtime] Fix buggy registration ...

2015-01-26 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/328#discussion_r23523674
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala
 ---
@@ -175,62 +175,79 @@ import scala.collection.JavaConverters._
   }
 
   private def tryJobManagerRegistration(): Unit = {
-registrationAttempts = 0
-import context.dispatcher
-registrationScheduler = Some(context.system.scheduler.schedule(
-  TaskManager.REGISTRATION_DELAY, TaskManager.REGISTRATION_INTERVAL,
-  self, RegisterAtJobManager))
+registrationDuration = 0 seconds
+
+registered = false
+
+context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
   }
 
   override def receiveWithLogMessages: Receive = {
 case RegisterAtJobManager => {
-  registrationAttempts += 1
+  if(!registered) {
+registrationDuration += registrationDelay
+// double delay for exponential backoff
+registrationDelay *= 2
 
-  if (registered) {
-registrationScheduler.foreach(_.cancel())
-  }
-  else if (registrationAttempts <= 
TaskManager.MAX_REGISTRATION_ATTEMPTS) {
+if (registrationDuration > maxRegistrationDuration) {
+  log.warning("TaskManager could not register at JobManager {} 
after {}.", jobManagerAkkaURL,
 
-log.info("Try to register at master {}. Attempt #{}", 
jobManagerAkkaURL,
-  registrationAttempts)
-val jobManager = context.actorSelection(jobManagerAkkaURL)
+maxRegistrationDuration)
 
-jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
-  }
-  else {
-log.error("TaskManager could not register at JobManager.");
-self ! PoisonPill
+  self ! PoisonPill
+} else if (!registered) {
+  log.info(s"Try to register at master ${jobManagerAkkaURL}. 
${registrationAttempts}. " +
+s"Attempt")
+  val jobManager = context.actorSelection(jobManagerAkkaURL)
+
+  jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
+
+  context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
+}
   }
 }
 
 case AcknowledgeRegistration(id, blobPort) => {
-  if (!registered) {
+  if(!registered) {
+finishRegistration(id, blobPort)
 registered = true
-currentJobManager = sender
-instanceID = id
-
-context.watch(currentJobManager)
-
-log.info("TaskManager successfully registered at JobManager {}.",
-  currentJobManager.path.toString)
-
-setupNetworkEnvironment()
-setupLibraryCacheManager(blobPort)
+  } else {
+if (log.isDebugEnabled) {
+  log.debug("The TaskManager {} is already registered at the 
JobManager {}, but received " +
+"another AcknowledgeRegistration message.", self.path, 
currentJobManager.path)
+}
+  }
+}
 
-heartbeatScheduler = Some(context.system.scheduler.schedule(
-  TaskManager.HEARTBEAT_INTERVAL, TaskManager.HEARTBEAT_INTERVAL, 
self, SendHeartbeat))
+case AlreadyRegistered(id, blobPort) =>
+  if(!registered) {
+log.warning("The TaskManager {} seems to be already registered at 
the JobManager {} even" +
+  "though it has not yet finished the registration process.", 
self.path, sender.path)
 
-profiler foreach {
-  _.tell(RegisterProfilingListener, 
JobManager.getProfiler(currentJobManager))
+finishRegistration(id, blobPort)
+registered = true
+  } else {
+// ignore AlreadyRegistered messages which arrived after 
AcknowledgeRegistration
+if(log.isDebugEnabled){
+  log.debug("The TaskManager {} has already been registered at the 
JobManager {}.",
+self.path, sender.path)
 }
+  }
 
-for (listener <- waitForRegistration) {
-  listener ! RegisteredAtJobManager
-}
+case RefuseRegistration(reason) =>
+  if(!registered) {
+log.error("The registration of task manager {} was refused by the 
job manager {} " +
+  "because {}.", self.path, jobManagerAkkaURL, reason)
 
-waitForRegistration.clear()
+// Shut task manager down
+self ! PoisonPill
+  } else {
+// ignore RefuseRegistration messages which arrived after 
AcknowledgeRegistration
+if(log.isDebugEnabled) {
---

[jira] [Commented] (FLINK-1352) Buggy registration from TaskManager to JobManager

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291708#comment-14291708
 ] 

ASF GitHub Bot commented on FLINK-1352:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/328#discussion_r23523674
  
--- Diff: 
flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala
 ---
@@ -175,62 +175,79 @@ import scala.collection.JavaConverters._
   }
 
   private def tryJobManagerRegistration(): Unit = {
-registrationAttempts = 0
-import context.dispatcher
-registrationScheduler = Some(context.system.scheduler.schedule(
-  TaskManager.REGISTRATION_DELAY, TaskManager.REGISTRATION_INTERVAL,
-  self, RegisterAtJobManager))
+registrationDuration = 0 seconds
+
+registered = false
+
+context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
   }
 
   override def receiveWithLogMessages: Receive = {
 case RegisterAtJobManager => {
-  registrationAttempts += 1
+  if(!registered) {
+registrationDuration += registrationDelay
+// double delay for exponential backoff
+registrationDelay *= 2
 
-  if (registered) {
-registrationScheduler.foreach(_.cancel())
-  }
-  else if (registrationAttempts <= 
TaskManager.MAX_REGISTRATION_ATTEMPTS) {
+if (registrationDuration > maxRegistrationDuration) {
+  log.warning("TaskManager could not register at JobManager {} 
after {}.", jobManagerAkkaURL,
 
-log.info("Try to register at master {}. Attempt #{}", 
jobManagerAkkaURL,
-  registrationAttempts)
-val jobManager = context.actorSelection(jobManagerAkkaURL)
+maxRegistrationDuration)
 
-jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
-  }
-  else {
-log.error("TaskManager could not register at JobManager.");
-self ! PoisonPill
+  self ! PoisonPill
+} else if (!registered) {
+  log.info(s"Try to register at master ${jobManagerAkkaURL}. 
${registrationAttempts}. " +
+s"Attempt")
+  val jobManager = context.actorSelection(jobManagerAkkaURL)
+
+  jobManager ! RegisterTaskManager(connectionInfo, 
hardwareDescription, numberOfSlots)
+
+  context.system.scheduler.scheduleOnce(registrationDelay, self, 
RegisterAtJobManager)
+}
   }
 }
 
 case AcknowledgeRegistration(id, blobPort) => {
-  if (!registered) {
+  if(!registered) {
+finishRegistration(id, blobPort)
 registered = true
-currentJobManager = sender
-instanceID = id
-
-context.watch(currentJobManager)
-
-log.info("TaskManager successfully registered at JobManager {}.",
-  currentJobManager.path.toString)
-
-setupNetworkEnvironment()
-setupLibraryCacheManager(blobPort)
+  } else {
+if (log.isDebugEnabled) {
+  log.debug("The TaskManager {} is already registered at the 
JobManager {}, but received " +
+"another AcknowledgeRegistration message.", self.path, 
currentJobManager.path)
+}
+  }
+}
 
-heartbeatScheduler = Some(context.system.scheduler.schedule(
-  TaskManager.HEARTBEAT_INTERVAL, TaskManager.HEARTBEAT_INTERVAL, 
self, SendHeartbeat))
+case AlreadyRegistered(id, blobPort) =>
+  if(!registered) {
+log.warning("The TaskManager {} seems to be already registered at 
the JobManager {} even" +
+  "though it has not yet finished the registration process.", 
self.path, sender.path)
 
-profiler foreach {
-  _.tell(RegisterProfilingListener, 
JobManager.getProfiler(currentJobManager))
+finishRegistration(id, blobPort)
+registered = true
+  } else {
+// ignore AlreadyRegistered messages which arrived after 
AcknowledgeRegistration
+if(log.isDebugEnabled){
+  log.debug("The TaskManager {} has already been registered at the 
JobManager {}.",
+self.path, sender.path)
 }
+  }
 
-for (listener <- waitForRegistration) {
-  listener ! RegisteredAtJobManager
-}
+case RefuseRegistration(reason) =>
+  if(!registered) {
+log.error("The registration of task manager {} was refused by the 
job manager {} " +
+  "because {}.", self.path, jobManagerAkkaURL, reason)
 
-wai
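The diff above replaces fixed-interval registration retries with exponential backoff capped by a maximum total registration duration. A minimal standalone sketch of that pattern follows; the names and numbers are illustrative, not Flink's actual API.

```java
public class BackoffRetry {
    // Returns the number of registration attempts made before the cap on total
    // waiting time is exceeded. Doubling the delay each round mirrors the
    // "double delay for exponential backoff" step in the diff.
    static int attemptsBeforeGivingUp(long initialDelayMs, long maxTotalMs) {
        long delay = initialDelayMs;
        long total = 0;
        int attempts = 0;
        while (true) {
            total += delay;
            if (total > maxTotalMs) {
                return attempts;        // corresponds to giving up (PoisonPill)
            }
            attempts++;                 // a real actor would send RegisterTaskManager here
            delay *= 2;                 // double delay for exponential backoff
        }
    }

    public static void main(String[] args) {
        // 50 + 100 + 200 = 350 <= 500, but 350 + 400 = 750 > 500: three attempts
        System.out.println(attemptsBeforeGivingUp(50, 500)); // prints 3
    }
}
```

The cap bounds how long a TaskManager keeps retrying before shutting itself down, while the doubling keeps early retries fast and later ones cheap for the JobManager.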

[GitHub] flink pull request: [FLINK-1369] [types] Add support for Subclasse...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/316#issuecomment-71447121
  
Yes, Pojo types with no internal fields would screw up the flat field 
addressing of the optimizer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1369) The Pojo Serializers/Comparators fail when using Subclasses or Interfaces

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291711#comment-14291711
 ] 

ASF GitHub Bot commented on FLINK-1369:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/316#issuecomment-71447121
  
Yes, Pojo types with no internal fields would screw up the flat field 
addressing of the optimizer.


> The Pojo Serializers/Comparators fail when using Subclasses or Interfaces
> -
>
> Key: FLINK-1369
> URL: https://issues.apache.org/jira/browse/FLINK-1369
> Project: Flink
>  Issue Type: Improvement
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1450) Add Fold operator to the Streaming api

2015-01-26 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291714#comment-14291714
 ] 

Fabian Hueske commented on FLINK-1450:
--

I believe this is equivalent (or very similar) to the Reduce operator of the 
batch API (not to be confused with the GroupReduce operator). 
If that's the case, we should call it Reduce to use equivalent terms for batch 
and streaming.

> Add Fold operator to the Streaming api
> --
>
> Key: FLINK-1450
> URL: https://issues.apache.org/jira/browse/FLINK-1450
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Gyula Fora
>Priority: Minor
>  Labels: starter
>
> The streaming API currently doesn't support a fold operator.
> This operator would work as the foldLeft method in Scala. This would allow 
> effective implementations in a lot of cases where a the simple reduce is 
> inappropriate due to different return types.
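The return-type distinction the issue describes can be seen with plain Java streams (a sketch only, not Flink's streaming API): `reduce` keeps the element type, while a fold-style accumulator may have a different type than the elements.

```java
import java.util.List;

public class FoldVsReduce {
    // reduce: the result type must match the element type (String -> String)
    static String concat(List<String> words) {
        return words.stream().reduce("", (a, b) -> a + b);
    }

    // fold: the accumulator type (Integer) differs from the element type (String),
    // here folding strings into the total of their lengths
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (acc, w) -> acc + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        List<String> words = List.of("add", "fold", "to", "streaming");
        System.out.println(concat(words));       // addfoldtostreaming
        System.out.println(totalLength(words));  // 18
    }
}
```

The second variant is what a reduce restricted to one type cannot express, which is the motivation given for a separate fold operator.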





[GitHub] flink pull request: [FLINK-1428] Update dataset_transformations.md

2015-01-26 Thread FelixNeutatz
GitHub user FelixNeutatz opened a pull request:

https://github.com/apache/flink/pull/340

[FLINK-1428] Update dataset_transformations.md



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FelixNeutatz/incubator-flink 
FelixNeutatz-patch-typos

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/340.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #340


commit 9dc2054d1421c4ba9b685d5390528ffc4a92b112
Author: FelixNeutatz 
Date:   2015-01-26T11:53:38Z

Update dataset_transformations.md






[jira] [Assigned] (FLINK-1428) Typos in Java code example for RichGroupReduceFunction

2015-01-26 Thread Felix Neutatz (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Neutatz reassigned FLINK-1428:


Assignee: Felix Neutatz

> Typos in Java code example for RichGroupReduceFunction
> --
>
> Key: FLINK-1428
> URL: https://issues.apache.org/jira/browse/FLINK-1428
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Felix Neutatz
>Assignee: Felix Neutatz
>Priority: Minor
>
> http://flink.apache.org/docs/0.7-incubating/dataset_transformations.html
> String key = null //missing ';'
> public void combine(Iterable> in,
>   Collector> out))
> --> one ')' too much
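For reference, a compilable shape of the corrected snippet: the `;` is added and the extra `)` removed. The generic parameters were lost in this archive, so the String element type and the nested Collector stand-in below are illustrative placeholders, not the documented types.

```java
import java.util.ArrayList;
import java.util.List;

public class CombineFix {
    // Stand-in for Flink's Collector; only here so the snippet compiles on its own.
    interface Collector<T> { void collect(T record); }

    static String key = null;  // the docs had "String key = null" -- the ';' was missing

    // the docs had "... Collector<...> out))" -- one ')' too many
    public static void combine(Iterable<String> in, Collector<String> out) {
        for (String s : in) {
            out.collect(s.toUpperCase());
        }
    }

    public static void main(String[] args) {
        List<String> collected = new ArrayList<>();
        combine(List.of("a", "b"), collected::add);
        System.out.println(collected);  // [A, B]
    }
}
```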





[jira] [Commented] (FLINK-1428) Typos in Java code example for RichGroupReduceFunction

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291743#comment-14291743
 ] 

ASF GitHub Bot commented on FLINK-1428:
---

GitHub user FelixNeutatz opened a pull request:

https://github.com/apache/flink/pull/340

[FLINK-1428] Update dataset_transformations.md



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FelixNeutatz/incubator-flink 
FelixNeutatz-patch-typos

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/340.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #340


commit 9dc2054d1421c4ba9b685d5390528ffc4a92b112
Author: FelixNeutatz 
Date:   2015-01-26T11:53:38Z

Update dataset_transformations.md




> Typos in Java code example for RichGroupReduceFunction
> --
>
> Key: FLINK-1428
> URL: https://issues.apache.org/jira/browse/FLINK-1428
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Felix Neutatz
>Priority: Minor
>
> http://flink.apache.org/docs/0.7-incubating/dataset_transformations.html
> String key = null //missing ';'
> public void combine(Iterable> in,
>   Collector> out))
> --> one ')' too much





[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291766#comment-14291766
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71453267
  
I've rebased and updated this PR.

Notable new stuff:
* hybrid mode removed
* documentation updated and integrated into the website
* **chaining** on the python side (map, flatmap, filter, combine)
* groupreduce/cogroup reworked - grouping done on python side
* iterators passed to UDFs are now iterable
* **lambda support**
* **test coverage** (works from IDE, maven and on travis)


> Create a general purpose framework for language bindings
> 
>
> Key: FLINK-377
> URL: https://issues.apache.org/jira/browse/FLINK-377
> Project: Flink
>  Issue Type: Improvement
>Reporter: GitHub Import
>  Labels: github-import
> Fix For: pre-apache
>
>
> A general purpose API to run operators with arbitrary binaries. 
> This will allow to run Stratosphere programs written in Python, JavaScript, 
> Ruby, Go or whatever you like. 
> We suggest using Google Protocol Buffers for data serialization. This is the 
> list of languages that currently support ProtoBuf: 
> https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns 
> Very early prototype with python: 
> https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing 
> protobuf)
> For Ruby: https://github.com/infochimps-labs/wukong
> Two new students working at Stratosphere (@skunert and @filiphaase) are 
> working on this.
> The reference binding language will be for Python, but other bindings are 
> very welcome.
> The best name for this so far is "stratosphere-lang-bindings".
> I created this issue to track the progress (and give everybody a chance to 
> comment on this)
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/377
> Created by: [rmetzger|https://github.com/rmetzger]
> Labels: enhancement, 
> Assignee: [filiphaase|https://github.com/filiphaase]
> Created at: Tue Jan 07 19:47:20 CET 2014
> State: open





[GitHub] flink pull request: [FLINK-377] [FLINK-671] Generic Interface / PA...

2015-01-26 Thread zentol
Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71453267
  
I've rebased and updated this PR.

Notable new stuff:
* hybrid mode removed
* documentation updated and integrated into the website
* **chaining** on the python side (map, flatmap, filter, combine)
* groupreduce/cogroup reworked - grouping done on python side
* iterators passed to UDFs are now iterable
* **lambda support**
* **test coverage** (works from IDE, maven and on travis)




[GitHub] flink pull request: [FLINK-1415] Akka cleanups

2015-01-26 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/319#discussion_r23526787
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/scheduler/SlotAvailabilityListener.java
 ---
@@ -21,7 +21,7 @@
 import org.apache.flink.runtime.instance.Instance;
 
 /**
- * A SlotAvailabilityListener can be notified when new {@link 
org.apache.flink.runtime.instance.AllocatedSlot}s become available
+ * A SlotAvailabilityListener can be notified when new {@link 
org.apache.flink.runtime.instance.AllocatedSlot2}s become available
--- End diff --

I guess `AllocatedSlot2` is an automatic rename leftover




[jira] [Commented] (FLINK-1415) Akka cleanups

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291769#comment-14291769
 ] 

ASF GitHub Bot commented on FLINK-1415:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/319#discussion_r23526787
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/scheduler/SlotAvailabilityListener.java
 ---
@@ -21,7 +21,7 @@
 import org.apache.flink.runtime.instance.Instance;
 
 /**
- * A SlotAvailabilityListener can be notified when new {@link 
org.apache.flink.runtime.instance.AllocatedSlot}s become available
+ * A SlotAvailabilityListener can be notified when new {@link 
org.apache.flink.runtime.instance.AllocatedSlot2}s become available
--- End diff --

I guess `AllocatedSlot2` is an automatic rename leftover


> Akka cleanups
> -
>
> Key: FLINK-1415
> URL: https://issues.apache.org/jira/browse/FLINK-1415
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> Currently, Akka has many different timeout values. From a user perspective, 
> it would be helpful to deduce all different timeouts from a single timeout 
> value. Additionally, the user should still be able to define specific values 
> for the different timeouts.
> Akka uses the akka.jobmanager.url config parameter to override the jobmanager 
> address and the port in case of a local setup. This mechanism is not safe 
> since it is exposed to the user. Thus, the mechanism should be replaced.
> The notifyExecutionStateChange method allows objects to access the internal 
> state of the TaskManager actor. This causes NullPointerExceptions when 
> shutting down the actor. This method should be removed to avoid accessing the 
> internal state of an actor by another object.
> With the latest Akka changes, the TaskManager watches the JobManager in order 
> to detect when it died or lost the connection to the TaskManager. This 
> behaviour should be tested.
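The proposed scheme (derive every timeout from one base value unless the user explicitly sets a specific one) can be sketched as follows; the configuration keys and scaling factors are illustrative assumptions, not Flink's actual options.

```java
import java.time.Duration;
import java.util.Map;

public class DerivedTimeouts {
    // An explicit per-timeout setting wins; otherwise the value is derived
    // from the single base timeout by a per-timeout scaling factor.
    static Duration timeout(Map<String, Duration> conf, String key,
                            Duration base, double factor) {
        return conf.getOrDefault(key,
                Duration.ofMillis((long) (base.toMillis() * factor)));
    }

    public static void main(String[] args) {
        Duration base = Duration.ofSeconds(10);
        // the user explicitly configured only the lookup timeout
        Map<String, Duration> conf = Map.of("lookup.timeout", Duration.ofSeconds(3));

        System.out.println(timeout(conf, "ask.timeout", base, 1.0));     // derived: PT10S
        System.out.println(timeout(conf, "startup.timeout", base, 2.0)); // derived: PT20S
        System.out.println(timeout(conf, "lookup.timeout", base, 0.5));  // explicit: PT3S
    }
}
```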





[GitHub] flink pull request: [FLINK-1415] Akka cleanups

2015-01-26 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/319#discussion_r23526836
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/scheduler/SlotSharingGroupAssignment.java
 ---
@@ -42,86 +44,82 @@
 public class SlotSharingGroupAssignment implements Serializable {
 
static final long serialVersionUID = 42L;
-   
+
private static final Logger LOG = Scheduler.LOG;
-   
+
private transient final Object lock = new Object();
-   
+
/** All slots currently allocated to this sharing group */
private final Set allSlots = new 
LinkedHashSet();
-   
+
/** The slots available per vertex type (jid), keyed by instance, to 
make them locatable */
private final Map>> 
availableSlotsPerJid = new LinkedHashMap>>();
-   
-   
+
// 

-   
-   
-   public SubSlot addNewSlotWithTask(AllocatedSlot slot, ExecutionVertex 
vertex) {
-   JobVertexID id = vertex.getJobvertexId();
-   return addNewSlotWithTask(slot, id, id);
-   }
-   
-   public SubSlot addNewSlotWithTask(AllocatedSlot slot, ExecutionVertex 
vertex, CoLocationConstraint constraint) {
-   AbstractID groupId = constraint.getGroupId();
-   return addNewSlotWithTask(slot, groupId, null);
-   }
-   
-   private SubSlot addNewSlotWithTask(AllocatedSlot slot, AbstractID 
groupId, JobVertexID vertexId) {
-   
-   final SharedSlot sharedSlot = new SharedSlot(slot, this);
-   final Instance location = slot.getInstance();
-   
+
+   public SimpleSlot addSharedSlotAndAllocateSubSlot(SharedSlot 
sharedSlot, Locality locality,
+   
AbstractID groupId, CoLocationConstraint constraint) {
--- End diff --

indentation?




[jira] [Commented] (FLINK-1415) Akka cleanups

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291770#comment-14291770
 ] 

ASF GitHub Bot commented on FLINK-1415:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/319#discussion_r23526836
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/scheduler/SlotSharingGroupAssignment.java
 ---
@@ -42,86 +44,82 @@
 public class SlotSharingGroupAssignment implements Serializable {
 
static final long serialVersionUID = 42L;
-   
+
private static final Logger LOG = Scheduler.LOG;
-   
+
private transient final Object lock = new Object();
-   
+
/** All slots currently allocated to this sharing group */
private final Set allSlots = new 
LinkedHashSet();
-   
+
/** The slots available per vertex type (jid), keyed by instance, to 
make them locatable */
private final Map>> 
availableSlotsPerJid = new LinkedHashMap>>();
-   
-   
+
// 

-   
-   
-   public SubSlot addNewSlotWithTask(AllocatedSlot slot, ExecutionVertex 
vertex) {
-   JobVertexID id = vertex.getJobvertexId();
-   return addNewSlotWithTask(slot, id, id);
-   }
-   
-   public SubSlot addNewSlotWithTask(AllocatedSlot slot, ExecutionVertex 
vertex, CoLocationConstraint constraint) {
-   AbstractID groupId = constraint.getGroupId();
-   return addNewSlotWithTask(slot, groupId, null);
-   }
-   
-   private SubSlot addNewSlotWithTask(AllocatedSlot slot, AbstractID 
groupId, JobVertexID vertexId) {
-   
-   final SharedSlot sharedSlot = new SharedSlot(slot, this);
-   final Instance location = slot.getInstance();
-   
+
+   public SimpleSlot addSharedSlotAndAllocateSubSlot(SharedSlot 
sharedSlot, Locality locality,
+   
AbstractID groupId, CoLocationConstraint constraint) {
--- End diff --

indentation?


> Akka cleanups
> -
>
> Key: FLINK-1415
> URL: https://issues.apache.org/jira/browse/FLINK-1415
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> Currently, Akka has many different timeout values. From a user perspective, 
> it would be helpful to deduce all different timeouts from a single timeout 
> value. Additionally, the user should still be able to define specific values 
> for the different timeouts.
> Akka uses the akka.jobmanager.url config parameter to override the jobmanager 
> address and the port in case of a local setup. This mechanism is not safe 
> since it is exposed to the user. Thus, the mechanism should be replaced.
> The notifyExecutionStateChange method allows objects to access the internal 
> state of the TaskManager actor. This causes NullPointerExceptions when 
> shutting down the actor. This method should be removed to avoid accessing the 
> internal state of an actor by another object.
> With the latest Akka changes, the TaskManager watches the JobManager in order 
> to detect when it died or lost the connection to the TaskManager. This 
> behaviour should be tested.





[GitHub] flink pull request: [FLINK-1428] Update dataset_transformations.md

2015-01-26 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/340#issuecomment-71453921
  
+1

Will merge this later to `master` and `release-0.8`.




[jira] [Commented] (FLINK-1428) Typos in Java code example for RichGroupReduceFunction

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291771#comment-14291771
 ] 

ASF GitHub Bot commented on FLINK-1428:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/340#issuecomment-71453921
  
+1

Will merge this later to `master` and `release-0.8`.


> Typos in Java code example for RichGroupReduceFunction
> --
>
> Key: FLINK-1428
> URL: https://issues.apache.org/jira/browse/FLINK-1428
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Felix Neutatz
>Assignee: Felix Neutatz
>Priority: Minor
>
> http://flink.apache.org/docs/0.7-incubating/dataset_transformations.html
> String key = null //missing ';'
> public void combine(Iterable> in,
>   Collector> out))
> --> one ')' too much





[GitHub] flink pull request: [FLINK-1415] Akka cleanups

2015-01-26 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/319#discussion_r23526978
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java ---
@@ -329,6 +330,15 @@ public void unregisterMemoryManager(MemoryManager 
memoryManager) {
}
}
 
+   protected void notifyExecutionStateChange(ExecutionState executionState,
+   
Throwable optionalError) {
--- End diff --

This also seems weird 




[GitHub] flink pull request: [FLINK-1330] [build] Build creates a link in t...

2015-01-26 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/333#issuecomment-71454018
  
Very nice. +1

Will merge this later.




[jira] [Commented] (FLINK-1415) Akka cleanups

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291773#comment-14291773
 ] 

ASF GitHub Bot commented on FLINK-1415:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/319#discussion_r23526978
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java ---
@@ -329,6 +330,15 @@ public void unregisterMemoryManager(MemoryManager 
memoryManager) {
}
}
 
+   protected void notifyExecutionStateChange(ExecutionState executionState,
+   
Throwable optionalError) {
--- End diff --

This also seems weird 


> Akka cleanups
> -
>
> Key: FLINK-1415
> URL: https://issues.apache.org/jira/browse/FLINK-1415
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> Currently, Akka has many different timeout values. From a user perspective, 
> it would be helpful to deduce all different timeouts from a single timeout 
> value. Additionally, the user should still be able to define specific values 
> for the different timeouts.
> Akka uses the akka.jobmanager.url config parameter to override the jobmanager 
> address and the port in case of a local setup. This mechanism is not safe 
> since it is exposed to the user. Thus, the mechanism should be replaced.
> The notifyExecutionStateChange method allows objects to access the internal 
> state of the TaskManager actor. This causes NullPointerExceptions when 
> shutting down the actor. This method should be removed to avoid accessing the 
> internal state of an actor by another object.
> With the latest Akka changes, the TaskManager watches the JobManager in order 
> to detect when it died or lost the connection to the TaskManager. This 
> behaviour should be tested.





[jira] [Commented] (FLINK-1330) Restructure directory layout

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291774#comment-14291774
 ] 

ASF GitHub Bot commented on FLINK-1330:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/333#issuecomment-71454018
  
Very nice. +1

Will merge this later.


> Restructure directory layout
> 
>
> Key: FLINK-1330
> URL: https://issues.apache.org/jira/browse/FLINK-1330
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Documentation
>Reporter: Max Michels
>Priority: Minor
>  Labels: usability
>
> When building Flink, the build results can currently be found under 
> "flink-root/flink-dist/target/flink-$FLINKVERSION-incubating-SNAPSHOT-bin/flink-$YARNVERSION-$FLINKVERSION-incubating-SNAPSHOT/".
> I think we could improve the directory layout with the following:
> - provide the bin folder in the root by default
> - let the start up and submissions scripts in bin assemble the class path
> - in case the project hasn't been build yet, inform the user
> The changes would make it easier to work with Flink from source.





[GitHub] flink pull request: [FLINK-1428] Update dataset_transformations.md

2015-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/340




[jira] [Resolved] (FLINK-1428) Typos in Java code example for RichGroupReduceFunction

2015-01-26 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-1428.

Resolution: Fixed

Fixed in 06b2acf.

> Typos in Java code example for RichGroupReduceFunction
> --
>
> Key: FLINK-1428
> URL: https://issues.apache.org/jira/browse/FLINK-1428
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Felix Neutatz
>Assignee: Felix Neutatz
>Priority: Minor
>
> http://flink.apache.org/docs/0.7-incubating/dataset_transformations.html
> String key = null //missing ';'
> public void combine(Iterable> in,
>   Collector> out))
> --> one ')' too much





[jira] [Commented] (FLINK-1428) Typos in Java code example for RichGroupReduceFunction

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291779#comment-14291779
 ] 

ASF GitHub Bot commented on FLINK-1428:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/340


> Typos in Java code example for RichGroupReduceFunction
> --
>
> Key: FLINK-1428
> URL: https://issues.apache.org/jira/browse/FLINK-1428
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Felix Neutatz
>Assignee: Felix Neutatz
>Priority: Minor
>
> http://flink.apache.org/docs/0.7-incubating/dataset_transformations.html
> String key = null //missing ';'
> public void combine(Iterable> in,
>   Collector> out))
> --> one ')' too much





[GitHub] flink pull request: [FLINK-1330] [build] Build creates a link in t...

2015-01-26 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/333#discussion_r23527422
  
--- Diff: flink-dist/pom.xml ---
@@ -436,6 +436,37 @@ under the License.



+
+   
+   
+   com.pyx4j
+   maven-junction-plugin
+   1.0.3
+   
+   
+   package
+   
+   link
+   
+   
+   
+   unlink
+   clean
+   
+   unlink
+   
+   
+   
+   
+   
+   
+   
${basedir}/../build-target
--- End diff --

Isn't `basedir` deprecated? In the next line you use `project.basedir`.




[jira] [Commented] (FLINK-1330) Restructure directory layout

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291781#comment-14291781
 ] 

ASF GitHub Bot commented on FLINK-1330:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/333#discussion_r23527422
  
--- Diff: flink-dist/pom.xml ---
@@ -436,6 +436,37 @@ under the License.



+
+   
+   
+   com.pyx4j
+   maven-junction-plugin
+   1.0.3
+   
+   
+   package
+   
+   link
+   
+   
+   
+   unlink
+   clean
+   
+   unlink
+   
+   
+   
+   
+   
+   
+   
${basedir}/../build-target
--- End diff --

Isn't `basedir` deprecated? In the next line you use `project.basedir`.


> Restructure directory layout
> 
>
> Key: FLINK-1330
> URL: https://issues.apache.org/jira/browse/FLINK-1330
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Documentation
>Reporter: Max Michels
>Priority: Minor
>  Labels: usability
>
> When building Flink, the build results can currently be found under 
> "flink-root/flink-dist/target/flink-$FLINKVERSION-incubating-SNAPSHOT-bin/flink-$YARNVERSION-$FLINKVERSION-incubating-SNAPSHOT/".
> I think we could improve the directory layout with the following:
> - provide the bin folder in the root by default
> - let the startup and submission scripts in bin assemble the class path
> - in case the project hasn't been built yet, inform the user
> The changes would make it easier to work with Flink from source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1168] Adds multi-char field delimiter s...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/264#issuecomment-71455623
  
Updated the PR and will merge once Travis completed the build.




[jira] [Commented] (FLINK-1168) Support multi-character field delimiters in CSVInputFormats

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291783#comment-14291783
 ] 

ASF GitHub Bot commented on FLINK-1168:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/264#issuecomment-71455623
  
Updated the PR and will merge once Travis completed the build.


> Support multi-character field delimiters in CSVInputFormats
> ---
>
> Key: FLINK-1168
> URL: https://issues.apache.org/jira/browse/FLINK-1168
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.7.0-incubating
>Reporter: Fabian Hueske
>Assignee: Manu Kaul
>Priority: Minor
>  Labels: starter
>
> The CSVInputFormat supports multi-char (String) line delimiters, but only 
> single-char (char) field delimiters.
> This issue proposes to add support for multi-char field delimiters.
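The core of multi-char field delimiter support is scanning for the full delimiter byte sequence instead of a single character. A minimal standalone sketch of that idea (an illustration, not the actual CsvInputFormat code):

```java
import java.util.ArrayList;
import java.util.List;

public class MultiCharSplit {

    // Split a record on a multi-character field delimiter by searching
    // for the whole delimiter sequence rather than a single char.
    static List<String> splitFields(String record, String delimiter) {
        List<String> fields = new ArrayList<>();
        int start = 0, pos;
        while ((pos = record.indexOf(delimiter, start)) >= 0) {
            fields.add(record.substring(start, pos));
            start = pos + delimiter.length();
        }
        fields.add(record.substring(start)); // trailing field
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitFields("a||b||c", "||")); // [a, b, c]
    }
}
```

The same scan works byte-wise over an input buffer, which is closer to what an input format has to do.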





[GitHub] flink pull request: [FLINK-1415] Akka cleanups

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/319#issuecomment-71455790
  
This is a huge change. I took a look over the code, but I don't have enough 
experience with the scheduler to understand these changes.

I would suggest merging this rather soon because it touches a lot of code 
due to minor Scala style changes (semicolons, removal of parentheses from 
no-arg methods, unneeded { }, and so on).
+1 for the added documentation to the classes!

The bug in FLINK-1453 would be more obvious once these changes are merged. 
That's another motivation for me to push this pull request forward ;)





[jira] [Commented] (FLINK-1415) Akka cleanups

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291785#comment-14291785
 ] 

ASF GitHub Bot commented on FLINK-1415:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/319#issuecomment-71455790
  
This is a huge change. I took a look over the code, but I don't have enough 
experience with the scheduler to understand these changes.

I would suggest merging this rather soon because it touches a lot of code 
due to minor Scala style changes (semicolons, removal of parentheses from 
no-arg methods, unneeded { }, and so on).
+1 for the added documentation to the classes!

The bug in FLINK-1453 would be more obvious once these changes are merged. 
That's another motivation for me to push this pull request forward ;)



> Akka cleanups
> -
>
> Key: FLINK-1415
> URL: https://issues.apache.org/jira/browse/FLINK-1415
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> Currently, Akka has many different timeout values. From a user perspective, 
> it would be helpful to deduce all different timeouts from a single timeout 
> value. Additionally, the user should still be able to define specific values 
> for the different timeouts.
> Akka uses the akka.jobmanager.url config parameter to override the jobmanager 
> address and the port in case of a local setup. This mechanism is not safe 
> since it is exposed to the user. Thus, the mechanism should be replaced.
> The notifyExecutionStateChange method allows objects to access the internal 
> state of the TaskManager actor. This causes NullPointerExceptions when 
> shutting down the actor. This method should be removed to avoid accessing the 
> internal state of an actor by another object.
> With the latest Akka changes, the TaskManager watches the JobManager in order 
> to detect when it died or lost the connection to the TaskManager. This 
> behaviour should be tested.
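The idea of deducing all timeouts from a single base value, while still letting the user override individual ones, can be sketched as follows (names and factors are hypothetical, not Flink's actual configuration keys):

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every specific timeout defaults to a multiple of
// one base timeout, but an explicit per-timeout override always wins.
public class TimeoutConfig {
    private final Duration base;
    private final Map<String, Duration> overrides = new HashMap<>();

    public TimeoutConfig(Duration base) { this.base = base; }

    // User explicitly sets one specific timeout.
    public void override(String key, Duration value) { overrides.put(key, value); }

    // Otherwise derive the specific timeout from the single base value.
    public Duration get(String key, double factor) {
        return overrides.getOrDefault(key,
                Duration.ofMillis((long) (base.toMillis() * factor)));
    }

    public static void main(String[] args) {
        TimeoutConfig conf = new TimeoutConfig(Duration.ofSeconds(10));
        conf.override("ask", Duration.ofSeconds(3));
        System.out.println(conf.get("ask", 1.0));     // explicit override wins
        System.out.println(conf.get("startup", 2.0)); // derived from the base
    }
}
```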





[GitHub] flink pull request: [FLINK-1344] [streaming] Added static StreamEx...

2015-01-26 Thread senorcarbone
GitHub user senorcarbone opened a pull request:

https://github.com/apache/flink/pull/341

[FLINK-1344] [streaming] Added static StreamExecutionEnvironment 
initialisation and Implicits for scala sources

This PR addresses ticket [1], adding further interoperability with Scala 
constructs. I had to add static StreamExecutionEnvironment initialisation 
to make the implicit conversion possible.

[1] https://issues.apache.org/jira/browse/FLINK-1344

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink scala-seq

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #341


commit 2ea675895d605e5c0a442171388c7be9361acf79
Author: Paris Carbone 
Date:   2015-01-23T16:23:46Z

[FLINK-1344] [streaming] [scala] Added implicits from scala seq to 
datastream and static StreamExecutionEnvironment initialization






[jira] [Commented] (FLINK-1344) Add implicit conversion from scala streams to DataStreams

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291787#comment-14291787
 ] 

ASF GitHub Bot commented on FLINK-1344:
---

GitHub user senorcarbone opened a pull request:

https://github.com/apache/flink/pull/341

[FLINK-1344] [streaming] Added static StreamExecutionEnvironment 
initialisation and Implicits for scala sources

This PR addresses ticket [1], adding further interoperability with Scala 
constructs. I had to add static StreamExecutionEnvironment initialisation 
to make the implicit conversion possible.

[1] https://issues.apache.org/jira/browse/FLINK-1344

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink scala-seq

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #341


commit 2ea675895d605e5c0a442171388c7be9361acf79
Author: Paris Carbone 
Date:   2015-01-23T16:23:46Z

[FLINK-1344] [streaming] [scala] Added implicits from scala seq to 
datastream and static StreamExecutionEnvironment initialization




> Add implicit conversion from scala streams to DataStreams
> -
>
> Key: FLINK-1344
> URL: https://issues.apache.org/jira/browse/FLINK-1344
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Paris Carbone
>Assignee: Paris Carbone
>Priority: Trivial
>
> Source definitions in the scala-api work either with a collector in a UDF, or 
> by passing a Seq that could also be a lazily generated scala Stream. To 
> encourage a purely functional coding style in the streaming scala-api while 
> also adding some interoperability with scala constructs (ie. Streams, Lists) 
> it would be nice to add an implicit conversion from Seq[T] to DataStream[T]. 
> [EDIT] The StreamExecutionEnvironment should be statically initialised to 
> allow for an implicit source definition
> (An upcoming idea would be for sinks to also support wrapping up flink 
> streams to scala streams for full interoperability with scala streaming code.)





[jira] [Updated] (FLINK-1320) Add an off-heap variant of the managed memory

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1320:
--
Assignee: Max Michels

> Add an off-heap variant of the managed memory
> -
>
> Key: FLINK-1320
> URL: https://issues.apache.org/jira/browse/FLINK-1320
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Reporter: Stephan Ewen
>Assignee: Max Michels
>Priority: Minor
>
> For (nearly) all memory that Flink accumulates (in the form of sort buffers, 
> hash tables, caching), we use a special way of representing data serialized 
> across a set of memory pages. The big work lies in the way the algorithms are 
> implemented to operate on pages, rather than on objects.
> The core class for the memory is the {{MemorySegment}}, which has all methods 
> to set and get primitives values efficiently. It is a somewhat simpler (and 
> faster) variant of a HeapByteBuffer.
> As such, it should be straightforward to create a version where the memory 
> segment is not backed by a heap byte[], but by memory allocated outside the 
> JVM, in a similar way as the NIO DirectByteBuffers, or the Netty direct 
> buffers do it.
> This may have multiple advantages:
>   - We reduce the size of the JVM heap (garbage collected) and the number and 
> size of long-lived objects. For large JVM sizes, this may improve 
> performance quite a bit. Ultimately, we would in many cases reduce the JVM size 
> to 1/3 to 1/2 and keep the remaining memory outside the JVM.
>   - We save copies when we move memory pages to disk (spilling) or through 
> the network (shuffling / broadcasting / forward piping)
> The changes required to implement this are
>   - Add an {{UnmanagedMemorySegment}} that only stores the memory address as a 
> long, and the segment size. It is initialized from a DirectByteBuffer.
>   - Allow the MemoryManager to allocate these MemorySegments, instead of the 
> current ones.
>   - Make sure that the startup scripts pick up the mode and configure the heap 
> size and the max direct memory properly.
> Since the MemorySegment is probably the most performance critical class in 
> Flink, we must take care that we do this right. The following are critical 
> considerations:
>   - If we want both solutions (heap and off-heap) to exist side-by-side 
> (configurable), we must make the base MemorySegment abstract and implement 
> two versions (heap and off-heap).
>   - To get the best performance, we need to make sure that only one class 
> gets loaded (or at least ever used), to ensure optimal JIT de-virtualization 
> and inlining.
>   - We should carefully measure the performance of both variants. From 
> previous micro benchmarks, I remember that individual byte accesses in 
> DirectByteBuffers (off-heap) were slightly slower than on-heap, any larger 
> accesses were equally good or slightly better.
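The side-by-side heap/off-heap design described above can be sketched with an abstract base class and two subclasses. This is a simplified illustration using ByteBuffer, under the assumption that a real implementation would expose many more typed accessors:

```java
import java.nio.ByteBuffer;

public class SegmentDemo {

    // Abstract base: the issue's de-virtualization concern is that the JIT
    // optimizes best when only one subclass is ever loaded/used.
    static abstract class Segment {
        abstract byte get(int index);
        abstract void put(int index, byte value);
    }

    // Heap variant: backed by a byte[] on the garbage-collected heap.
    static final class HeapSegment extends Segment {
        private final byte[] memory;
        HeapSegment(int size) { this.memory = new byte[size]; }
        byte get(int index) { return memory[index]; }
        void put(int index, byte value) { memory[index] = value; }
    }

    // Off-heap variant: backed by direct memory outside the JVM heap,
    // in the same way as NIO DirectByteBuffers.
    static final class OffHeapSegment extends Segment {
        private final ByteBuffer memory;
        OffHeapSegment(int size) { this.memory = ByteBuffer.allocateDirect(size); }
        byte get(int index) { return memory.get(index); }
        void put(int index, byte value) { memory.put(index, value); }
    }

    public static void main(String[] args) {
        for (Segment s : new Segment[] { new HeapSegment(16), new OffHeapSegment(16) }) {
            s.put(0, (byte) 42);
            System.out.println(s.get(0)); // 42 for both variants
        }
    }
}
```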





[jira] [Updated] (FLINK-1424) bin/flink run does not recognize -c parameter anymore

2015-01-26 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1424:
--
Assignee: Max Michels

> bin/flink run does not recognize -c parameter anymore
> -
>
> Key: FLINK-1424
> URL: https://issues.apache.org/jira/browse/FLINK-1424
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Affects Versions: master
>Reporter: Carsten Brandt
>Assignee: Max Michels
>
> bin/flink binary does not recognize `-c` parameter anymore which specifies 
> the class to run:
> {noformat}
> $ ./flink run "/path/to/target/impro3-ws14-flink-1.0-SNAPSHOT.jar" -c 
> de.tu_berlin.impro3.flink.etl.FollowerGraphGenerator /tmp/flink/testgraph.txt 
> 1
> usage: emma-experiments-impro3-ss14-flink
>[-?]
> emma-experiments-impro3-ss14-flink: error: unrecognized arguments: '-c'
> {noformat}
> before this command worked fine and executed the job.
> I tracked it down to the following commit using `git bisect`:
> {noformat}
> 93eadca782ee8c77f89609f6d924d73021dcdda9 is the first bad commit
> commit 93eadca782ee8c77f89609f6d924d73021dcdda9
> Author: Alexander Alexandrov 
> Date:   Wed Dec 24 13:49:56 2014 +0200
> [FLINK-1027] [cli] Added support for '--' and '-' prefixed tokens in CLI 
> program arguments.
> 
> This closes #278
> :04 04 a1358e6f7fe308b4d51a47069f190a29f87fdeda 
> d6f11bbc9444227d5c6297ec908e44b9644289a9 Mflink-clients
> {noformat}
> https://github.com/apache/flink/commit/93eadca782ee8c77f89609f6d924d73021dcdda9
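The reported behaviour is consistent with the CLI handing everything after the jar path to the user program, so a `-c` placed after the jar is no longer parsed as a CLI option. A hypothetical sketch of that parsing rule (not Flink's actual CliFrontend code):

```java
import java.util.ArrayList;
import java.util.List;

public class ArgSplit {

    // Once the jar path is seen, all remaining tokens (including "-c")
    // are treated as program arguments instead of CLI options.
    static List<List<String>> split(String[] args) {
        List<String> cliOpts = new ArrayList<>();
        List<String> programArgs = new ArrayList<>();
        boolean jarSeen = false;
        for (String a : args) {
            if (jarSeen) {
                programArgs.add(a);
            } else {
                cliOpts.add(a);
                if (a.endsWith(".jar")) jarSeen = true;
            }
        }
        return List.of(cliOpts, programArgs);
    }

    public static void main(String[] args) {
        List<List<String>> r = split(new String[] { "run", "job.jar", "-c", "my.Main" });
        System.out.println(r.get(0)); // [run, job.jar]
        System.out.println(r.get(1)); // [-c, my.Main]
    }
}
```

Under this rule, the workaround is to pass `-c` before the jar path.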





[GitHub] flink pull request: [FLINK-1330] [build] Build creates a link in t...

2015-01-26 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/333#discussion_r23528552
  
--- Diff: flink-dist/pom.xml ---
@@ -436,6 +436,37 @@ under the License.
 
 
 
+			<plugin>
+				<groupId>com.pyx4j</groupId>
+				<artifactId>maven-junction-plugin</artifactId>
+				<version>1.0.3</version>
+				<executions>
+					<execution>
+						<phase>package</phase>
+						<goals>
+							<goal>link</goal>
+						</goals>
+					</execution>
+					<execution>
+						<id>unlink</id>
+						<phase>clean</phase>
+						<goals>
+							<goal>unlink</goal>
+						</goals>
+					</execution>
+				</executions>
+				<configuration>
+					<links>
+						<link>
+							<dst>${basedir}/../build-target</dst>
--- End diff --

I couldn't find a document stating that $basedir is deprecated, but I think 
you are right in the sense that the `project` prefix is used for everything 
related to the POM of the project (I think in previous versions the now 
deprecated prefix was `pom`, and both the `version` and `basedir` properties 
are "built-ins").

We use $basedir in other places as well.




[jira] [Commented] (FLINK-1330) Restructure directory layout

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291796#comment-14291796
 ] 

ASF GitHub Bot commented on FLINK-1330:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/333#discussion_r23528552
  
--- Diff: flink-dist/pom.xml ---
@@ -436,6 +436,37 @@ under the License.
 
 
 
+			<plugin>
+				<groupId>com.pyx4j</groupId>
+				<artifactId>maven-junction-plugin</artifactId>
+				<version>1.0.3</version>
+				<executions>
+					<execution>
+						<phase>package</phase>
+						<goals>
+							<goal>link</goal>
+						</goals>
+					</execution>
+					<execution>
+						<id>unlink</id>
+						<phase>clean</phase>
+						<goals>
+							<goal>unlink</goal>
+						</goals>
+					</execution>
+				</executions>
+				<configuration>
+					<links>
+						<link>
+							<dst>${basedir}/../build-target</dst>
--- End diff --

I couldn't find a document stating that $basedir is deprecated, but I think 
you are right in the sense that the `project` prefix is used for everything 
related to the POM of the project (I think in previous versions the now 
deprecated prefix was `pom`, and both the `version` and `basedir` properties 
are "built-ins").

We use $basedir in other places as well.


> Restructure directory layout
> 
>
> Key: FLINK-1330
> URL: https://issues.apache.org/jira/browse/FLINK-1330
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Documentation
>Reporter: Max Michels
>Priority: Minor
>  Labels: usability
>
> When building Flink, the build results can currently be found under 
> "flink-root/flink-dist/target/flink-$FLINKVERSION-incubating-SNAPSHOT-bin/flink-$YARNVERSION-$FLINKVERSION-incubating-SNAPSHOT/".
> I think we could improve the directory layout with the following:
> - provide the bin folder in the root by default
> - let the startup and submission scripts in bin assemble the class path
> - in case the project hasn't been built yet, inform the user
> The changes would make it easier to work with Flink from source.





[jira] [Commented] (FLINK-1415) Akka cleanups

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291800#comment-14291800
 ] 

ASF GitHub Bot commented on FLINK-1415:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/319#issuecomment-71458216
  
I think @StephanEwen is reviewing the critical part.


> Akka cleanups
> -
>
> Key: FLINK-1415
> URL: https://issues.apache.org/jira/browse/FLINK-1415
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> Currently, Akka has many different timeout values. From a user perspective, 
> it would be helpful to deduce all different timeouts from a single timeout 
> value. Additionally, the user should still be able to define specific values 
> for the different timeouts.
> Akka uses the akka.jobmanager.url config parameter to override the jobmanager 
> address and the port in case of a local setup. This mechanism is not safe 
> since it is exposed to the user. Thus, the mechanism should be replaced.
> The notifyExecutionStateChange method allows objects to access the internal 
> state of the TaskManager actor. This causes NullPointerExceptions when 
> shutting down the actor. This method should be removed to avoid accessing the 
> internal state of an actor by another object.
> With the latest Akka changes, the TaskManager watches the JobManager in order 
> to detect when it died or lost the connection to the TaskManager. This 
> behaviour should be tested.





[GitHub] flink pull request: [FLINK-1415] Akka cleanups

2015-01-26 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/319#issuecomment-71458216
  
I think @StephanEwen is reviewing the critical part.




[GitHub] flink pull request: [FLINK-377] [FLINK-671] Generic Interface / PA...

2015-01-26 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71458700
  
Wow, great news! :-)

In general, I think we really have to do something about getting the 
changes in. The PR is growing faster than it's getting feedback. Has anybody 
looked into this and tried it out recently?




[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291803#comment-14291803
 ] 

ASF GitHub Bot commented on FLINK-377:
--

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71458700
  
Wow, great news! :-)

In general, I think we really have to do something about getting the 
changes in. The PR is growing faster than it's getting feedback. Has anybody 
looked into this and tried it out recently?


> Create a general purpose framework for language bindings
> 
>
> Key: FLINK-377
> URL: https://issues.apache.org/jira/browse/FLINK-377
> Project: Flink
>  Issue Type: Improvement
>Reporter: GitHub Import
>  Labels: github-import
> Fix For: pre-apache
>
>
> A general purpose API to run operators with arbitrary binaries. 
> This will allow to run Stratosphere programs written in Python, JavaScript, 
> Ruby, Go or whatever you like. 
> We suggest using Google Protocol Buffers for data serialization. This is the 
> list of languages that currently support ProtoBuf: 
> https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns 
> Very early prototype with python: 
> https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing 
> protobuf)
> For Ruby: https://github.com/infochimps-labs/wukong
> Two new students working at Stratosphere (@skunert and @filiphaase) are 
> working on this.
> The reference binding language will be for Python, but other bindings are 
> very welcome.
> The best name for this so far is "stratosphere-lang-bindings".
> I created this issue to track the progress (and give everybody a chance to 
> comment on this)
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/377
> Created by: [rmetzger|https://github.com/rmetzger]
> Labels: enhancement, 
> Assignee: [filiphaase|https://github.com/filiphaase]
> Created at: Tue Jan 07 19:47:20 CET 2014
> State: open





[jira] [Assigned] (FLINK-1443) Add replicated data source

2015-01-26 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-1443:


Assignee: Fabian Hueske

> Add replicated data source
> --
>
> Key: FLINK-1443
> URL: https://issues.apache.org/jira/browse/FLINK-1443
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, JobManager, Optimizer
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> This issue proposes to add support for data sources that read the same data 
> in all parallel instances. This feature can be useful, if the data is 
> replicated to all machines in a cluster and can be locally read. 
> For example, a replicated input format can be used for a broadcast join 
> without sending any data over the network.
> The following changes are necessary to achieve this:
> 1) Add a replicating InputSplitAssigner which assigns all splits to all 
> parallel instances. This also requires extending the InputSplitAssigner 
> interface to identify the exact parallel instance that requests an InputSplit 
> (currently only the hostname is provided).
> 2) Make sure that the DOP of the replicated data source is identical to the 
> DOP of its successor.
> 3) Let the optimizer know that the data is replicated and ensure that plan 
> enumeration works correctly.
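Step 1 above boils down to handing every parallel instance the complete split list. A minimal sketch with illustrative types and method names (not Flink's actual InputSplitAssigner interface):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: every subtask receives the full split list, keyed
// by its subtask index rather than by hostname, so each parallel instance
// reads all of the (locally replicated) data.
public class ReplicatingAssigner {
    private final List<Iterator<String>> perSubtask = new ArrayList<>();

    public ReplicatingAssigner(List<String> splits, int parallelism) {
        // One independent iterator over the complete split list per subtask.
        for (int i = 0; i < parallelism; i++) {
            perSubtask.add(splits.iterator());
        }
    }

    // null signals that the subtask has consumed all splits.
    public String nextSplit(int subtaskIndex) {
        Iterator<String> it = perSubtask.get(subtaskIndex);
        return it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        ReplicatingAssigner a = new ReplicatingAssigner(List.of("s0", "s1"), 2);
        // Both subtasks are handed every split.
        System.out.println(a.nextSplit(0) + " " + a.nextSplit(0)); // s0 s1
        System.out.println(a.nextSplit(1) + " " + a.nextSplit(1)); // s0 s1
    }
}
```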





[jira] [Assigned] (FLINK-1105) Add support for locally sorted output

2015-01-26 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-1105:


Assignee: Fabian Hueske

> Add support for locally sorted output
> -
>
> Key: FLINK-1105
> URL: https://issues.apache.org/jira/browse/FLINK-1105
> Project: Flink
>  Issue Type: Sub-task
>  Components: Java API
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Minor
>
> This feature will make it possible to sort the output which is sent to an 
> OutputFormat to obtain a locally sorted result.
> This feature was available in the "old" Java API and has not been ported to the 
> new Java API yet. Hence, optimizer and runtime should already have support for 
> this feature. However, the API and job generation part is missing.
> It is also a subfeature of FLINK-598 which will provide also globally sorted 
> results.





[GitHub] flink pull request: Mk amulti char delim

2015-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/247




[jira] [Commented] (FLINK-1168) Support multi-character field delimiters in CSVInputFormats

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291830#comment-14291830
 ] 

ASF GitHub Bot commented on FLINK-1168:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/264


> Support multi-character field delimiters in CSVInputFormats
> ---
>
> Key: FLINK-1168
> URL: https://issues.apache.org/jira/browse/FLINK-1168
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.7.0-incubating
>Reporter: Fabian Hueske
>Assignee: Manu Kaul
>Priority: Minor
>  Labels: starter
>
> The CSVInputFormat supports multi-char (String) line delimiters, but only 
> single-char (char) field delimiters.
> This issue proposes to add support for multi-char field delimiters.





[GitHub] flink pull request: [FLINK-1168] Adds multi-char field delimiter s...

2015-01-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/264




[jira] [Commented] (FLINK-1168) Support multi-character field delimiters in CSVInputFormats

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291831#comment-14291831
 ] 

ASF GitHub Bot commented on FLINK-1168:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/247


> Support multi-character field delimiters in CSVInputFormats
> ---
>
> Key: FLINK-1168
> URL: https://issues.apache.org/jira/browse/FLINK-1168
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.7.0-incubating
>Reporter: Fabian Hueske
>Assignee: Manu Kaul
>Priority: Minor
>  Labels: starter
>
> The CSVInputFormat supports multi-char (String) line delimiters, but only 
> single-char (char) field delimiters.
> This issue proposes to add support for multi-char field delimiters.





[jira] [Resolved] (FLINK-1168) Support multi-character field delimiters in CSVInputFormats

2015-01-26 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-1168.
--
Resolution: Fixed

Fixed with 0548a93dfc555a5403590f147d4850c730facaf6

> Support multi-character field delimiters in CSVInputFormats
> ---
>
> Key: FLINK-1168
> URL: https://issues.apache.org/jira/browse/FLINK-1168
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.7.0-incubating
>Reporter: Fabian Hueske
>Assignee: Manu Kaul
>Priority: Minor
>  Labels: starter
>
> The CSVInputFormat supports multi-char (String) line delimiters, but only 
> single-char (char) field delimiters.
> This issue proposes to add support for multi-char field delimiters.





[GitHub] flink pull request: [FLINK-1330] [build] Build creates a link in t...

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/333#issuecomment-71464164
  
-1
I think we need to add the `build-target` "directory" to the list of 
directories ignored by Apache Rat. Otherwise Rat will fail subsequent builds:
```
1 Unknown Licenses

***

Unapproved licenses:

  build-target/conf/slaves

***
```




[jira] [Commented] (FLINK-1330) Restructure directory layout

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291839#comment-14291839
 ] 

ASF GitHub Bot commented on FLINK-1330:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/333#issuecomment-71464164
  
-1
I think we need to add the `build-target` "directory" to the list of 
directories ignored by Apache Rat. Otherwise Rat will fail subsequent builds:
```
1 Unknown Licenses

***

Unapproved licenses:

  build-target/conf/slaves

***
```


> Restructure directory layout
> 
>
> Key: FLINK-1330
> URL: https://issues.apache.org/jira/browse/FLINK-1330
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Documentation
>Reporter: Max Michels
>Priority: Minor
>  Labels: usability
>
> When building Flink, the build results can currently be found under 
> "flink-root/flink-dist/target/flink-$FLINKVERSION-incubating-SNAPSHOT-bin/flink-$YARNVERSION-$FLINKVERSION-incubating-SNAPSHOT/".
> I think we could improve the directory layout with the following:
> - provide the bin folder in the root by default
> - let the startup and submission scripts in bin assemble the class path
> - in case the project hasn't been built yet, inform the user
> The changes would make it easier to work with Flink from source.





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...

2015-01-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-71465448
  
Haven't had a closer look yet, but one thing that I noticed is the naming 
of the test files. 
In the current codebase all tests are named XyzTest (or XyzITCase) instead 
of TestXyz. Not sure if it's worth changing though...




[jira] [Commented] (FLINK-1450) Add Fold operator to the Streaming api

2015-01-26 Thread Gyula Fora (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291852#comment-14291852
 ] 

Gyula Fora commented on FLINK-1450:
---

It is actually something quite different.

With reduce, you reduce two values of the same type to one.
With fold, you take an initial value of an arbitrary type and write a function 
that takes two parameters as input: one of the type of the datastream/set and one of 
the initial value's type. The function outputs a new value of this arbitrary 
type.

This is actually more similar to reduceGroup, but you cannot reduce the whole 
stream at once, so that's not applicable in the streaming API. 
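The distinction can be sketched in plain Java, independent of Flink's API (`FoldVsReduce` and `fold` are illustrative names, not Flink classes): reduce stays within one type, while fold lets the accumulator have a different type than the elements, like Scala's `foldLeft`.

```java
import java.util.List;
import java.util.function.BiFunction;

public class FoldVsReduce {

    // fold: start from an initial value of an arbitrary type R and
    // combine each element of type T into the accumulator
    static <T, R> R fold(List<T> xs, R initial, BiFunction<R, T, R> f) {
        R acc = initial;
        for (T x : xs) {
            acc = f.apply(acc, x);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4);

        // reduce: two values of the same type become one (Integer + Integer -> Integer)
        int sum = values.stream().reduce(0, Integer::sum);

        // fold: accumulator type (String) differs from the element type (Integer)
        String folded = fold(values, "n:", (acc, v) -> acc + v);

        System.out.println(sum + " " + folded); // prints "10 n:1234"
    }
}
```

With a windowed stream, the same signature shape would let a window of elements collapse into a differently-typed summary, which a type-preserving reduce cannot express.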

> Add Fold operator to the Streaming api
> --
>
> Key: FLINK-1450
> URL: https://issues.apache.org/jira/browse/FLINK-1450
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Gyula Fora
>Priority: Minor
>  Labels: starter
>
> The streaming API currently doesn't support a fold operator.
> This operator would work as the foldLeft method in Scala. This would allow 
> effective implementations in a lot of cases where a the simple reduce is 
> inappropriate due to different return types.





[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291851#comment-14291851
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-71465448
  
Haven't had a closer look yet, but one thing that I noticed is the naming 
of the test files. 
In the current codebase, all tests are named XyzTest (or XyzITCase) instead 
of TestXyz. Not sure if it's worth changing, though...


> Graph API for Flink 
> 
>
> Key: FLINK-1201
> URL: https://issues.apache.org/jira/browse/FLINK-1201
> Project: Flink
>  Issue Type: New Feature
>Reporter: Kostas Tzoumas
>Assignee: Vasia Kalavri
>
> This issue tracks the development of a Graph API/DSL for Flink.
> Until the code is pushed to the Flink repository, collaboration is happening 
> here: https://github.com/project-flink/flink-graph





[jira] [Created] (FLINK-1454) CliFrontend blocks for 100 seconds when submitting to a non-existent JobManager

2015-01-26 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1454:
-

 Summary: CliFrontend blocks for 100 seconds when submitting to a 
non-existent JobManager
 Key: FLINK-1454
 URL: https://issues.apache.org/jira/browse/FLINK-1454
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.9
Reporter: Robert Metzger


When a user tries to submit a job to a JobManager which doesn't exist at all, 
the CliFrontend blocks for 100 seconds.

Ideally, Akka would fail immediately because it cannot connect to the given hostname:port.
 
{code}
./bin/flink run ./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar -c 
foo.Baz
org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:449)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:350)
at org.apache.flink.client.program.Client.run(Client.java:242)
at 
org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:389)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:362)
at 
org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1078)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1102)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [100 
seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at 
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at 
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.flink.runtime.akka.AkkaUtils$.ask(AkkaUtils.scala:265)
at 
org.apache.flink.runtime.client.JobClient$.uploadJarFiles(JobClient.scala:169)
at 
org.apache.flink.runtime.client.JobClient.uploadJarFiles(JobClient.scala)
at org.apache.flink.client.program.Client.run(Client.java:314)
at org.apache.flink.client.program.Client.run(Client.java:296)
at org.apache.flink.client.program.Client.run(Client.java:290)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:55)
at 
org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:434)
... 6 more

The exception above occurred while trying to run your command.
{code}
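The 100-second block comes from awaiting a future that never completes because no JobManager ever answers. A Flink-independent Java sketch of that failure mode and of the fix (a shorter client-side timeout); the names here are illustrative, not Flink's or Akka's API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutDemo {

    // A future that is never completed stands in for a JobManager
    // that does not exist: the answer simply never arrives.
    static String awaitOrTimeout(long seconds) {
        CompletableFuture<String> never = new CompletableFuture<>();
        try {
            return never.get(seconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            return "timed out after " + seconds + "s";
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Fail fast with a short timeout instead of blocking for 100 seconds
        System.out.println(awaitOrTimeout(2));
    }
}
```

A connect-refused error from the transport layer would be even better than any timeout, since it identifies the real problem (no one is listening at hostname:port) immediately.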





[GitHub] flink pull request: [Flink-1436] refactor CLiFrontend to provide m...

2015-01-26 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/331#issuecomment-71467375
  
I think it would be better not to print the help if the user specified 
something incorrectly. Maybe just print the error message and a note that -h 
prints the help?

I've tried out the change, but now the message is at the very bottom of 
the output. It's now probably even harder to find.

**Bad** (see below for *Good*)

```
robert@robert-da ~/flink-workdir/flink2/build-target (git)-[flink-1436] % 
./bin/flink ./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar 

Action "run" compiles and runs a program.

  Syntax: run [OPTIONS]  
  "run" action options:
 -c,--classClass with the program entry point 
("main"
  method or "getPlan()" method. Only 
needed
  if the JAR file does not specify the 
class
  in its manifest.
 -m,--jobmanager   Address of the JobManager (master) to
  which to connect. Specify 
'yarn-cluster'
  as the JobManager to deploy a YARN 
cluster
  for the job. Use this flag to connect 
to a
  different JobManager than the one
  specified in the configuration.
 -p,--parallelismThe parallelism with which to run the
  program. Optional flag to override the
  default value specified in the
  configuration.
  Additional arguments if -m yarn-cluster is set:
 -yD Dynamic properties
 -yj,--yarnjar   Path to Flink jar file
 -yjm,--yarnjobManagerMemory Memory for JobManager Container 
[in
  MB]
 -yn,--yarncontainer Number of YARN container to 
allocate
  (=Number of Task Managers)
 -yq,--yarnquery  Display available YARN resources
  (memory, cores)
 -yqu,--yarnqueueSpecify YARN queue.
 -ys,--yarnslots Number of slots per TaskManager
 -yt,--yarnship  Ship files in the specified 
directory
  (t for transfer)
 -ytm,--yarntaskManagerMemoryMemory per TaskManager Container 
[in
  MB]

Invalid action: "./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar"
1 robert@robert-da ~/flink-workdir/flink2/build-target (git)-[flink-1436]
```

The "info" command is over-engineered, in my opinion. It contains only one 
possible option, "-e", for the execution plan. I would vote to remove the 
"info" action and call it "plan" or so. 
Or keep its "info" name and print the plan by default (this is not @mxm's 
fault .. but it would be nice to fix this with the PR)
```
 ./bin/flink info  
./examples/flink-java-examples-0.9-SNAPSHOT-WordCount.jar  

Action "info" displays information about a program.

  Syntax: info [OPTIONS]  
  "info" action options:
 -c,--classClass with the program entry point 
("main"
  method or "getPlan()" method. Only 
needed
  if the JAR file does not specify the 
class
  in its manifest.
 -e,--executionplan   Show optimized execution plan of the
  program (JSON)
 -m,--jobmanager   Address of the JobManager (master) to
  which to connect. Specify 
'yarn-cluster'
  as the JobManager to deploy a YARN 
cluster
  for the job. Use this flag to connect 
to a
  different JobManager than the one
  specified in the configuration.
 -p,--parallelismThe parallelism with which to run the
  program. Optional flag to override the
  default value specified in the
  configuration.

Error: Specify one of the above options to display information.
```

**Good**

What I liked was the error reporting when passing an invalid file as the 
jar file:
```
robert@robert-da ~/flink-workdir/flink2/build-target (git)-[flink-1436
```

[jira] [Created] (FLINK-1455) ExternalSortLargeRecordsITCase.testSortWithShortMediumAndLargeRecords: Potential Memory leak

2015-01-26 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1455:
-

 Summary: 
ExternalSortLargeRecordsITCase.testSortWithShortMediumAndLargeRecords: 
Potential Memory leak
 Key: FLINK-1455
 URL: https://issues.apache.org/jira/browse/FLINK-1455
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 0.9
Reporter: Robert Metzger
Priority: Minor


This error occurred in one of my Travis jobs: 
https://travis-ci.org/rmetzger/flink/jobs/48343022

Would be cool if somebody who knows the sorter better could verify/invalidate 
the issue.

{code}
Running org.apache.flink.runtime.operators.sort.ExternalSortLargeRecordsITCase
java.lang.RuntimeException: Error obtaining the sorted input: Thread 
'SortMerger spilling thread' terminated due to an exception: Cannot allocate 
memory
at 
org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:593)
at 
org.apache.flink.runtime.operators.sort.ExternalSortLargeRecordsITCase.testSortWithShortMediumAndLargeRecords(ExternalSortLargeRecordsITCase.java:285)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Caused by: java.io.IOException: Thread 'SortMerger spilling thread' terminated 
due to an exception: Cannot allocate memory
at 
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:770)
Caused by: java.io.IOException: Cannot allocate memory
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:205)
at 
org.apache.flink.runtime.io.disk.iomanager.SegmentWriteRequest.write(AsynchronousFileIOChannel.java:267)
at 
org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$WriterThread.run(IOManagerAsync.java:440)
Tests run: 5, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 56.089 sec <<< 
FAILURE! - in 
org.apache.flink.runtime.operators.sort.ExternalSortLargeRecordsITCase
testSortWithShortMediumAndLargeRecords(org.apache.flink.runtime.operators.sort.ExternalSortLargeRecordsITCase)
  Time elapsed: 25.84 sec  <<< FAILURE!
java.lang.AssertionError: Error obtaining the sorted input: Thread 'SortMerger 
spilling thread' terminated due to an exception: Cannot allocate memory
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.flink.runtime.operators.sort.ExternalSortLargeRecordsITCase.testSortWithShortMediumAndLargeRecords(ExternalSortLargeRecordsITCase.java:304)

testSortWithShortMediumAndLargeRecords(org.apache.flink.runtime.operators.sort.ExternalSortLargeRecordsITCase
{code}

[GitHub] flink pull request: [FLINK-1330] [build] Build creates a link in t...

2015-01-26 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/333#issuecomment-71468579
  
Very good catch.

I would address @aljoscha's comment and add the exclude. After that, I 
think it's fine to merge.

I've tested it locally and it works fine.




[jira] [Commented] (FLINK-1330) Restructure directory layout

2015-01-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291880#comment-14291880
 ] 

ASF GitHub Bot commented on FLINK-1330:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/333#issuecomment-71468579
  
Very good catch.

I would address @aljoscha's comment and add the exclude. After that, I 
think it's fine to merge.

I've tested it locally and it works fine.


> Restructure directory layout
> 
>
> Key: FLINK-1330
> URL: https://issues.apache.org/jira/browse/FLINK-1330
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Documentation
>Reporter: Max Michels
>Priority: Minor
>  Labels: usability
>
> When building Flink, the build results can currently be found under 
> "flink-root/flink-dist/target/flink-$FLINKVERSION-incubating-SNAPSHOT-bin/flink-$YARNVERSION-$FLINKVERSION-incubating-SNAPSHOT/".
> I think we could improve the directory layout with the following:
> - provide the bin folder in the root by default
> - let the start up and submissions scripts in bin assemble the class path
> - in case the project hasn't been built yet, inform the user
> The changes would make it easier to work with Flink from source.




