Re: Opening a discussion on FlinkML

2016-02-16 Thread Theodore Vasiloudis
So in regards to the original topic question it seems like most people prefer option 2, which is to keep the development of FlinkML inside the project, but try to bring in new committers. A lot of other interesting points have been raised here as well, and if people are interested in working on thi

Re: StateBackend

2016-02-16 Thread Stephan Ewen
I think this is actually a pretty good question. Right now, there are two different types of state backends: (1) Flink-embedded. They are independent of external services, scale out as the Flink job scales out, and are really mainly a way of storing and backing up key/value state. For exa

[jira] [Created] (FLINK-3411) Failed recovery can lead to removal of HA state

2016-02-16 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-3411: Summary: Failed recovery can lead to removal of HA state Key: FLINK-3411 URL: https://issues.apache.org/jira/browse/FLINK-3411 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-3412) Remove implicit conversions JavaStream / ScalaStream

2016-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3412: Summary: Remove implicit conversions JavaStream / ScalaStream Key: FLINK-3412 URL: https://issues.apache.org/jira/browse/FLINK-3412 Project: Flink Issue Type:

[jira] [Created] (FLINK-3413) Remove implicit Seq to DataStream conversion

2016-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3413: Summary: Remove implicit Seq to DataStream conversion Key: FLINK-3413 URL: https://issues.apache.org/jira/browse/FLINK-3413 Project: Flink Issue Type: Improvem

Re: StateBackend

2016-02-16 Thread Matthias J. Sax
Thanks for the input. Just to clarify my understanding: by "Flink-embedded [...] scale out as the Flink job scales out", you mean that each TM hosts an embedded state backend service, i.e., all those instances form a logically single but distributed backend service? How is it ensured that the state

[jira] [Created] (FLINK-3414) Add Scala API for CEP's pattern definition

2016-02-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3414: Summary: Add Scala API for CEP's pattern definition Key: FLINK-3414 URL: https://issues.apache.org/jira/browse/FLINK-3414 Project: Flink Issue Type: Improvem

[jira] [Created] (FLINK-3415) TimestampExctractor accepts negative watermarks

2016-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3415: Summary: TimestampExctractor accepts negative watermarks Key: FLINK-3415 URL: https://issues.apache.org/jira/browse/FLINK-3415 Project: Flink Issue Type: Bug

Re: Flink 1.0.0 Release Candidate 0: Please help testing

2016-02-16 Thread Stephan Ewen
Found one blocker issue during testing: - Watermark generators accept negative watermarks (FLINK-3415) On Mon, Feb 15, 2016 at 8:47 PM, Robert Metzger wrote: > Hi, > > I've now created a "preview RC" for the upcoming 1.0.0 release. > There are still some blocking issues and important pull req

Re: [ANNOUNCE] Please annotate public interfaces!

2016-02-16 Thread Robert Metzger
Thank you for taking care of this Fabian. I would like to bring your attention to the "ConfigConstants" class. I marked it as "@Public". My intention is to ensure that we don't change configuration parameters after the 1.0 release (this would break existing configuration files of users). The tool

Re: [ANNOUNCE] Please annotate public interfaces!

2016-02-16 Thread Till Rohrmann
I think the important part about the ConfigConstants is that the values don’t change. How they are represented inside of Flink does not really matter. It would be good if that could be verified automatically. Cheers, Till On Tue, Feb 16, 2016 at 2:59 PM, Robert Metzger wrote: > Thank you for

[jira] [Created] (FLINK-3416) [py] .bat files fail when path contains spaces

2016-02-16 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-3416: Summary: [py] .bat files fail when path contains spaces Key: FLINK-3416 URL: https://issues.apache.org/jira/browse/FLINK-3416 Project: Flink Issue Type

H2O integration

2016-02-16 Thread Simone Robutti
Hello everyone, here at RadicalBit we are evaluating the possibility of starting a project similar to Sparkling Water to integrate H2O with Flink. I know the subject has been discussed several times over the course of the last year, but no one seems to be working on it right now. I'm writing here to enqu

Re: H2O integration

2016-02-16 Thread Till Rohrmann
Hi Simone, as far as I know, there is nobody currently working on an H2O integration. I only looked briefly at the Sparkling Water implementation when it was released. If I remember correctly, the general idea was to start H2O from within the Executor thread and to use a special RDD to commun

[jira] [Created] (FLINK-3417) Add RocksDB StateBackendFactory and integrate with Flink Config

2016-02-16 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3417: Summary: Add RocksDB StateBackendFactory and integrate with Flink Config Key: FLINK-3417 URL: https://issues.apache.org/jira/browse/FLINK-3417 Project: Flink

[jira] [Created] (FLINK-3418) RocksDB HDFSCopyFromLocal util doesn't respect our Hadoop security configuration

2016-02-16 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3418: Summary: RocksDB HDFSCopyFromLocal util doesn't respect our Hadoop security configuration Key: FLINK-3418 URL: https://issues.apache.org/jira/browse/FLINK-3418 Proj

Re: H2O integration

2016-02-16 Thread Slim Baltagi
Hi, these two blog posts might be a good start: http://blog.h2o.ai/2014/09/how-sparkling-water-brings-h2o-to-spark/ https://databricks.com/blog/2014/06/30/sparkling-water-h20-spark.html I am also collecting here all resources related to H2O in general and Sparkling Water in particular: http://spa

[jira] [Created] (FLINK-3419) Drop partitionByHash from DataStream

2016-02-16 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3419: Summary: Drop partitionByHash from DataStream Key: FLINK-3419 URL: https://issues.apache.org/jira/browse/FLINK-3419 Project: Flink Issue Type: Improvem

[jira] [Created] (FLINK-3420) Remove "ReadTextFileWithValue" from StreamExecutionEnvironment

2016-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3420: Summary: Remove "ReadTextFileWithValue" from StreamExecutionEnvironment Key: FLINK-3420 URL: https://issues.apache.org/jira/browse/FLINK-3420 Project: Flink I

[jira] [Created] (FLINK-3421) Remove all unused ClassTag context bounds in the Streaming Scala API

2016-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3421: Summary: Remove all unused ClassTag context bounds in the Streaming Scala API Key: FLINK-3421 URL: https://issues.apache.org/jira/browse/FLINK-3421 Project: Flink

[jira] [Created] (FLINK-3422) Scramble HashPartitioner hashes

2016-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3422: Summary: Scramble HashPartitioner hashes Key: FLINK-3422 URL: https://issues.apache.org/jira/browse/FLINK-3422 Project: Flink Issue Type: Improvement

Adding TaskManagers

2016-02-16 Thread Deepak Jha
Hi All, I have a question on scaling a Flink cluster up or down. As per the documentation, in order to scale up the cluster, I can add a new TaskManager on the fly and the JobManager can assign work to it. Assuming I have Flink HA, in the event of a master JobManager failure, how is this taskmana