Re: Flink Streaming ContinuousTimeTriggers

2016-02-24 Thread Gavin Lin
Hi, how about this video? https://www.youtube.com/watch?v=T7hiwcwCXGI Gavin.Lin 2016-02-25 3:55 GMT+08:00 Ankur Sharma : > Hey, > > Can you guide me to some example of ContinuousProcessingTimeTrigger? > I want to partition input stream into TimeWindow that should fire at > continuous time inte

Re: How to use all available task managers

2016-02-24 Thread Matthias J. Sax
Could it be that you need to edit the client-local flink-conf.yaml instead of the TaskManager config files? (In case you do not want to specify the parallelism via env.setParallelism(int).) -Matthias On 02/24/2016 04:19 PM, Saiph Kappa wrote: > Thanks! It worked now :-) > > On Wed, Feb 24, 2016
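For readers who prefer the programmatic route Matthias mentions, here is a minimal sketch of setting the parallelism on the execution environment (the socket source is only a placeholder, not part of the original thread):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ParallelismExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Overrides parallelism.default from flink-conf.yaml for this job only
            env.setParallelism(6);

            DataStream<String> lines = env.socketTextStream("localhost", 9999); // placeholder source
            lines.print();

            env.execute("Parallelism example");
        }
    }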

Re: downloading dependency in apache flink

2016-02-24 Thread Pankaj Kumar
I was using the wrong method to create the fat jar. After using mvn clean install -Pbuild-jar I was able to execute the code through the Flink command line. On Wed, Feb 24, 2016 at 7:12 PM, Till Rohrmann wrote: > What is the error message you receive? > > On Wed, Feb 24, 2016 at 1:49 PM, Pankaj Kumar > wrote

Flink Streaming ContinuousTimeTriggers

2016-02-24 Thread Ankur Sharma
Hey, Can you guide me to some example of ContinuousProcessingTimeTrigger? I want to partition the input stream into a TimeWindow that should fire at a continuous time interval on its own without waiting for a new element to enter the stream. Could you guide me to it? Thanks Best, Ankur Sharma Informati
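For reference, a minimal sketch of attaching such a trigger to a keyed time window; the (word, count) stream is a placeholder and not taken from the original mail, and whether the window actually fires without new elements arriving depends on the Flink version:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.triggers.ContinuousProcessingTimeTrigger;

    public class ContinuousTriggerExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Tuple2<String, Integer>> counts = env
                    .socketTextStream("localhost", 9999) // placeholder source
                    .map(new MapFunction<String, Tuple2<String, Integer>>() {
                        @Override
                        public Tuple2<String, Integer> map(String line) {
                            return new Tuple2<>(line, 1);
                        }
                    });

            counts.keyBy(0)
                    .timeWindow(Time.minutes(1))
                    // fire the window every 5 seconds of processing time
                    // instead of only once when the window ends
                    .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(5)))
                    .sum(1)
                    .print();

            env.execute("ContinuousProcessingTimeTrigger example");
        }
    }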

Flink Streaming - WriteAsText

2016-02-24 Thread Ankur Sharma
Hey, I am trying to use ContinuousProcessingTimeTrigger, which fires the TimeWindow every 5 seconds. But even though I explicitly state that the output of the apply() method should be dumped to a file every 10 seconds, I don’t see the file getting appended. When I cancel the job, I see all the dumps which

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-24 Thread Till Rohrmann
I just tested building a Flink job using the latest SNAPSHOT version and the flink-connector-kafka-0.8/flink-connector-kafka-0.9 Kafka connectors. The compilation succeeded with SBT. Could you maybe share your build.sbt with me? This would help me to figure out the problem you’re experiencing. Che

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-24 Thread Cory Monty
What Dan posted on 2/22 is the current error we're seeing. As he stated, using the 1.0.0-rc0 version works, but switching back to SNAPSHOT does not compile. We can try clearing the ivy cache, but that has had no effect in the past. On Wed, Feb 24, 2016 at 11:34 AM, Till Rohrmann wrote: > What is

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-24 Thread Till Rohrmann
What is currently the error you observe? It might help to clear org.apache.flink in the ivy cache once in a while. Cheers, Till On Wed, Feb 24, 2016 at 6:09 PM, Cory Monty wrote: > We're still seeing this issue in the latest SNAPSHOT version. Do you have > any suggestions to resolve the error?

Re: Compile issues with Flink 1.0-SNAPSHOT and Scala 2.11

2016-02-24 Thread Cory Monty
We're still seeing this issue in the latest SNAPSHOT version. Do you have any suggestions to resolve the error? On Mon, Feb 22, 2016 at 3:41 PM, Dan Kee wrote: > Hello, > > I'm not sure if this is related, but we recently started seeing this when > using `1.0-SNAPSHOT` in the `snapshots` repository

Re: Best way to process data in many files? (FLINK-BATCH)

2016-02-24 Thread Tim Conrad
Dear Till and others. I solved the issue by using the strategy suggested by Till like this: List fileListOfSpectra = ... SplittableList fileListOfSpectraSplitable = new SplittableList( fileListOfSpectra ); DataSource fileListOfSpectraDataSource = env.fromParallelCollect
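The generic types in Tim's snippet were stripped by the archive; below is a rough sketch of what the approach might look like, assuming SplittableList is a user-defined SplittableIterator over the file names. The element type and file names are placeholders, not Tim's actual code:

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DataSource;
    import org.apache.flink.util.SplittableIterator;

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;

    // Hypothetical helper: a SplittableIterator over an in-memory list, so that
    // env.fromParallelCollection can hand different chunks to different subtasks.
    public class SplittableList<T> extends SplittableIterator<T> {

        private final ArrayList<T> elements; // elements must be serializable
        private int position = 0;

        public SplittableList(List<T> elements) {
            this.elements = new ArrayList<>(elements);
        }

        @Override
        public Iterator<T>[] split(int numPartitions) {
            @SuppressWarnings("unchecked")
            Iterator<T>[] splits = new Iterator[numPartitions];
            for (int i = 0; i < numPartitions; i++) {
                List<T> chunk = new ArrayList<>();
                for (int j = i; j < elements.size(); j += numPartitions) {
                    chunk.add(elements.get(j)); // round-robin assignment of elements
                }
                splits[i] = chunk.iterator();
            }
            return splits;
        }

        @Override
        public int getMaximumNumberOfSplits() {
            return elements.size();
        }

        @Override
        public boolean hasNext() {
            return position < elements.size();
        }

        @Override
        public T next() {
            return elements.get(position++);
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            List<String> fileNames = Arrays.asList("spectrum-1.dat", "spectrum-2.dat", "spectrum-3.dat");
            DataSource<String> files =
                    env.fromParallelCollection(new SplittableList<>(fileNames), String.class);

            files.print();
        }
    }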

Re: Best way to process data in many files? (FLINK-BATCH)

2016-02-24 Thread Till Rohrmann
If I’m not mistaken, then this shouldn’t solve the scheduling peculiarity of Flink. Flink will still deploy the tasks of the flat map operation to the machine where the source task is running. Only after this machine has no more slots left will other machines be used as well. I think that you don

Re: How to use all available task managers

2016-02-24 Thread Saiph Kappa
Thanks! It worked now :-) On Wed, Feb 24, 2016 at 2:48 PM, Ufuk Celebi wrote: > You can use the environment to set it the job parallelism to 6 e.g. > env.setParallelism(6). > > Setting this will override the default behaviour. Maybe that's why the > default parallelism is not working... you migh

Re: Best way to process data in many files? (FLINK-BATCH)

2016-02-24 Thread Gábor Gévay
Hello, > // For each "filename" in list do... > DataSet featureList = fileList > .flatMap(new ReadDataSetFromFile()) // flatMap because there > might be multiple DataSets in a file What happens if you just insert .rebalance() before the flatMap? > This kind of DataSource will only b
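A self-contained sketch of the suggestion; the file list and the flatMap body are stand-ins for Tim's ReadDataSetFromFile and not taken from the thread:

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.util.Collector;

    public class RebalanceExample {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Placeholder for the list of file names produced by a non-parallel source
            DataSet<String> fileList = env.fromElements("file-1", "file-2", "file-3", "file-4");

            DataSet<String> featureList = fileList
                    // redistribute the source output round-robin across all subtasks
                    // before the expensive per-file work
                    .rebalance()
                    .flatMap(new FlatMapFunction<String, String>() {
                        @Override
                        public void flatMap(String fileName, Collector<String> out) {
                            // stand-in for ReadDataSetFromFile: emit one record per file
                            out.collect("features of " + fileName);
                        }
                    });

            featureList.print();
        }
    }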

Re: How to use all available task managers

2016-02-24 Thread Ufuk Celebi
You can use the environment to set the job parallelism to 6, e.g. env.setParallelism(6). Setting this will override the default behaviour. Maybe that's why the default parallelism is not working... you might have it set to 1 already? On Wed, Feb 24, 2016 at 3:41 PM, Saiph Kappa wrote: > I set

Re: How to use all available task managers

2016-02-24 Thread Saiph Kappa
I set "parallelism.default: 6" in the flink-conf.yaml of all 6 machines, and still, my job only uses 1 task manager. Why? On Wed, Feb 24, 2016 at 8:31 AM, Till Rohrmann wrote: > Hi Saiph, > > I think the configuration value should be parallelism.default: 6. That > will execute jobs which have not pa

Re: downloading dependency in apache flink

2016-02-24 Thread Till Rohrmann
What is the error message you receive? On Wed, Feb 24, 2016 at 1:49 PM, Pankaj Kumar wrote: > Hi Till , > > I was able to make fat jar, but i am not able to execute this jar through > flink command line. > > On Wed, Feb 24, 2016 at 4:31 PM, Till Rohrmann > wrote: > >> Hi Pankaj, >> >> are you c

Re: downloading dependency in apache flink

2016-02-24 Thread Pankaj Kumar
Hi Till, I was able to make the fat jar, but I am not able to execute this jar through the Flink command line. On Wed, Feb 24, 2016 at 4:31 PM, Till Rohrmann wrote: > Hi Pankaj, > > are you creating a fat jar when you create your user code jar? This can be > done using Maven's shade plugin or the assem

Re: Best way to process data in many files? (FLINK-BATCH)

2016-02-24 Thread Till Rohrmann
Hi Tim, unfortunately, this is not documented explicitly as far as I know. For the InputFormats there is a marker interface called NonParallelInput. The input formats which implement this interface will be executed with a parallelism of 1. At the moment this holds true for the CollectionInputForma
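To make the marker interface concrete, a hypothetical input format (not one that ships with Flink) could opt out of parallel execution like this:

    import org.apache.flink.api.common.io.GenericInputFormat;
    import org.apache.flink.api.common.io.NonParallelInput;

    import java.io.IOException;

    // Implementing NonParallelInput tells Flink to run this format with a
    // parallelism of 1, just like the CollectionInputFormat mentioned above.
    public class SingleInstanceFormat extends GenericInputFormat<String> implements NonParallelInput {

        private final String[] values = {"a", "b", "c"};
        private int next = 0;

        @Override
        public boolean reachedEnd() throws IOException {
            return next >= values.length;
        }

        @Override
        public String nextRecord(String reuse) throws IOException {
            return values[next++];
        }
    }

Such a format would then be consumed via env.createInput(new SingleInstanceFormat()), and a rebalance() after it spreads the records across the other machines.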

Re: Underlying TaskManager's Actor System

2016-02-24 Thread Till Rohrmann
Hi Andrea, no, there isn't. But you can always start your own ActorSystem in a stateful operator. Cheers, Till On Wed, Feb 24, 2016 at 11:57 AM, Andrea Sella wrote: > Hi, > Is there a way to access the underlying TaskManager's ActorSystem? > > Thank you in advance, > Andrea >
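A rough sketch of what "start your own ActorSystem in a stateful operator" could look like; the operator, the actor system name and how the actors would be used are all placeholders, not part of Till's answer:

    import akka.actor.ActorSystem;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    // Hypothetical operator that brings up its own ActorSystem per parallel subtask.
    public class ActorBackedMapper extends RichMapFunction<String, String> {

        private transient ActorSystem actorSystem;

        @Override
        public void open(Configuration parameters) {
            // started once per subtask when the task comes up
            actorSystem = ActorSystem.create("operator-actor-system");
        }

        @Override
        public String map(String value) {
            // interact with actors here, e.g. via actorSystem.actorOf(...)
            return value;
        }

        @Override
        public void close() {
            if (actorSystem != null) {
                actorSystem.shutdown(); // terminate() in newer Akka versions
            }
        }
    }

It would then be attached like any other map function, e.g. stream.map(new ActorBackedMapper()).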

Re: downloading dependency in apache flink

2016-02-24 Thread Till Rohrmann
Hi Pankaj, are you creating a fat jar when you create your user code jar? This can be done using Maven's shade plugin or the assembly plugin. We provide a Maven archetype to set up a pom file which will make sure that a fat jar is built [1]. [1] https://ci.apache.org/projects/flink/flink-docs-mast

Underlying TaskManager's Actor System

2016-02-24 Thread Andrea Sella
Hi, Is there a way to access the underlying TaskManager's ActorSystem? Thank you in advance, Andrea

Re: Error when executing job

2016-02-24 Thread Till Rohrmann
I assume that you included the flink-connector-twitter dependency in your job jar, right? Alternatively, you might also put the jar in the lib folder on each of your machines. Cheers, Till On Wed, Feb 24, 2016 at 10:38 AM, ram kumar wrote: > Hi, > > > getting below error when executing twitte

downloading dependency in apache flink

2016-02-24 Thread Pankaj Kumar
I am trying to write a job using a Maven project. The job works fine in my editor, but when I run it through the Flink command line it throws a ClassNotFoundException; it is not able to find the dependency. If I create a jar, will Flink download all its dependencies before executing

Re: Optimal Configuration for Cluster

2016-02-24 Thread Welly Tambunan
Hi Ufuk, Thanks for this. Really appreciated. Cheers On Tue, Feb 23, 2016 at 8:04 PM, Ufuk Celebi wrote: > I would go with one task manager with 48 slots per machine. This > reduces the communication overheads between task managers. > > Regarding memory configuration: Given that the machines h

Error when executing job

2016-02-24 Thread ram kumar
Hi, I am getting the below error when executing the Twitter Flink job: org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed. at org.apache.flink.client.program.Client.runBlocking(Client.java:370) at org.apache.flink.streaming.api.environment.S

Re: Dataset filter improvement

2016-02-24 Thread Till Rohrmann
Hi Flavio, it works the following way: Your data type will be serialized by the PojoSerializer iff it is a POJO. Iff it is a generic type which cannot be serialized by any of the other serializers, then Kryo is used. If it is a POJO type and you have a DataStream which can also contain subtypes o
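A sketch of the registration calls being discussed, using a made-up POJO hierarchy; the class names are placeholders, not from the thread:

    import org.apache.flink.api.java.ExecutionEnvironment;

    public class TypeRegistrationExample {

        // Hypothetical POJO hierarchy
        public static class Animal {
            public String name;
            public Animal() {}
        }

        public static class Dog extends Animal {
            public int barkVolume;
            public Dog() {}
        }

        public static void main(String[] args) {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Register the subtype: if a DataSet<Animal> actually contains Dogs,
            // this lets Flink write them efficiently instead of treating the
            // unknown subclass as a generic (Kryo) type.
            env.registerType(Dog.class);

            // For a type that is not a POJO, a custom Kryo serializer can be registered:
            // env.getConfig().registerTypeWithKryoSerializer(SomeClass.class, SomeKryoSerializer.class);
        }
    }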

Re: Dataset filter improvement

2016-02-24 Thread Flavio Pompermaier
Thanks Max and Till for the answers. However, I still haven't fully understood the difference... Here are my doubts: - If I don't register any of my POJO classes, they will be serialized with Kryo (black box for Flink) - If I register all of my POJOs using env.registerType they will be ser

Re: How to use all available task managers

2016-02-24 Thread Till Rohrmann
Hi Saiph, I think the configuration value should be parallelism.default: 6. That will execute jobs which have no parallelism defined with a DOP of 6. Cheers, Till On Wed, Feb 24, 2016 at 1:43 AM, Saiph Kappa wrote: > Hi, > > I am running a flink stream application on a cluster with 6 slave