[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-22 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 Hello @tfournier314, I should have clarified: by documentation I meant that, apart from the docstrings you have now added, we also have to include documentation in the Flink [docs](https://github.com/apa

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-22 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 Hello @thvasilo @greghogan OK, I've updated the documentation. I'll stay tuned for any further code changes. Regards Thomas --- If your project is set up for it, you can reply to this emai

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-21 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 Hello @tfournier314, This PR is still missing documentation. After that is done a project committer will have to review it before it gets merged, which might take a while. Regards,

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-21 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan @thvasilo What's the next step? More tests and reviews?

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-16 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan OK, I've pushed the code with my tests and some modifications to the mapping. @thvasilo It seems to work perfectly!

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-15 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan Excuse my ignorance, I'm only now learning about Flink internals :) It seems like the issue here was that `partitionByRange` partitions keys in ascending order but we want the end res

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-15 Thread greghogan
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2740 `zipWithIndex` preserves the order between partitions (DataSetUtils.java:121). @tfournier314, I don't think it's a problem pushing your current code since we're still discussing the PR.
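
The ordering point above can be illustrated with a small model. This is only a sketch in plain Scala collections, not Flink's implementation: the object name `ZipWithIndexSketch` and the nested `Seq` standing in for partitions are assumptions made for illustration. It shows indices being assigned partition by partition, so an element in an earlier partition always receives a smaller index than any element in a later one.

```scala
// Hypothetical model: each inner Seq plays the role of one partition.
object ZipWithIndexSketch {
  /** Assigns consecutive indices partition by partition: a partition's first
    * index equals the combined size of all partitions that precede it. */
  def zipWithIndex[T](partitions: Seq[Seq[T]]): Seq[Seq[(Long, T)]] = {
    // Running totals of partition sizes; offsets(p) is the start index of partition p.
    val offsets = partitions.map(_.size.toLong).scanLeft(0L)(_ + _)
    partitions.zip(offsets).map { case (part, start) =>
      part.zipWithIndex.map { case (elem, i) => (start + i, elem) }
    }
  }
}
```

Under this model, if the partitions hold ascending key ranges and each partition is sorted internally, the resulting indices follow the global ascending order.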

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-15 Thread thvasilo
Github user thvasilo commented on the issue: https://github.com/apache/flink/pull/2740 Hello @tfournier314, I tested your code and it does seem that partitions are sorted only internally, which is expected, and `zipWithIndex` is AFAIK unaware of the sorted (as in key range) order of

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-14 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @greghogan I've not pushed the code yet because my tests are still incorrect. Indeed the following code: val env = ExecutionEnvironment.getExecutionEnvironment val fitData = env

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-09 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 @thvasilo @greghogan I've updated my code so that I'm streaming instead of caching with a collect(). Does that look OK to you?

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-04 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 I've changed my code so that I now have mapping: DataSet[(String, Long)]: val mapping = input .mapWith( s => (s, 1) ) .groupBy( 0 ) .reduce( (a, b) => (a._1, a.
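
The snippet above is cut off, so the following is only a rough sketch of the general idea of a string-to-index mapping built from occurrence counts. It uses plain Scala collections instead of Flink's `DataSet` API, and the names `StringIndexSketch` and `buildMapping` are hypothetical. Ordering by descending count, with ties broken alphabetically, is an assumption that the truncated message does not confirm.

```scala
// Hypothetical stand-in: a Seq[String] plays the role of the input DataSet[String].
object StringIndexSketch {
  /** Maps each distinct string to a Long index.
    * Assumption: more frequent strings receive smaller indices. */
  def buildMapping(input: Seq[String]): Map[String, Long] = {
    // Count occurrences of each distinct string (the groupBy/reduce step).
    val counts: Map[String, Int] =
      input.groupBy(identity).map { case (s, occurrences) => (s, occurrences.size) }
    counts.toSeq
      .sortBy { case (s, n) => (-n, s) } // descending count, then alphabetical
      .zipWithIndex
      .map { case ((s, _), idx) => (s, idx.toLong) }
      .toMap
  }
}
```

For example, `StringIndexSketch.buildMapping(Seq("b", "a", "b", "c", "b", "a"))` yields `Map("b" -> 0L, "a" -> 1L, "c" -> 2L)`.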

[GitHub] flink issue #2740: [FLINK-4964] [ml]

2016-11-02 Thread tfournier314
Github user tfournier314 commented on the issue: https://github.com/apache/flink/pull/2740 Yes, I've just updated the PR title