Re: JobTimeoutException: Lost connection to JobManager

2015-04-14 Thread Stephan Ewen
I pushed a fix to the master. The problem should now be gone. Please let us know if you experience other issues! Greetings, Stephan On Tue, Apr 14, 2015 at 9:57 PM, Mohamed Nadjib MAMI wrote: > Hello, > > Once I got the message, a few seconds later I received your email. Well, this > is just to express the

Re: JobTimeoutException: Lost connection to JobManager

2015-04-14 Thread Mohamed Nadjib MAMI
Hello, Once I got the message, a few seconds later I received your email. Well, this is just to express the need for a fix. Happy to see the dynamism of the work. Great work. On 14.04.2015 21:50, Stephan Ewen wrote: You are on the latest snapshot version? I think there is an inconsistency in there. Will

Re: JobTimeoutException: Lost connection to JobManager

2015-04-14 Thread Stephan Ewen
You are on the latest snapshot version? I think there is an inconsistency in there. Will try to fix that tonight. Can you actually use the milestone1 version? That one should be good. Greetings, Stephan On 14.04.2015 at 20:31, "Fotis P" wrote: > Hello everyone, > > I am getting this weird excepti

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
You can control the logging behavior from the ExecutionConfig (env.getExecutionConfig()). There is a method (disableSysoutLogging()) that you can use. (In 0.9-SNAPSHOT only). Sorry again that you ran into this issue. On Tue, Apr 14, 2015 at 8:45 PM, Robert Metzger wrote: > Ah, I see. > > The is
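
A minimal sketch of what that might look like in a 0.9-SNAPSHOT job. The ExecutionConfig accessor is written as env.getConfig() below and may be named getExecutionConfig() as in the message above, depending on the snapshot; check against your version.

    import org.apache.flink.api.java.ExecutionEnvironment;

    public class QuietJob {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Suppress the client's sysout progress reporting for this job.
            env.getConfig().disableSysoutLogging();

            env.fromElements("a", "b", "c").writeAsText("/tmp/quiet-job-output");
            env.execute("quiet job");
        }
    }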

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
Ah, I see. The issue is this line in JobClient.scala: https://github.com/apache/flink/blob/release-0.9.0-milestone-1-rc1/flink-runtime/src/main/scala/org/apache/flink/runtime/client/JobClient.scala#L97 As you can see, it's doing sysout logging. In the current master, this has been reworke

JobTimeoutException: Lost connection to JobManager

2015-04-14 Thread Fotis P
Hello everyone, I am getting this weird exception while running some simple counting jobs in Flink.

    Exception in thread "main" org.apache.flink.runtime.client.JobTimeoutException: Lost connection to JobManager
        at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:164)

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Stefan Bunk
Hi, I am also using IntelliJ, and I am starting directly from the IDE. Local execution. This is what my logging output looks like: [1]. I am getting my logger via: val log = org.apache.log4j.Logger.getLogger(getClass) [1] https://gist.github.com/knub/1c11683601b4eeb5d51b On 14 April 2015 at 18:

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
Hi, how are you starting Flink? Out of the IDE? Using the scripts? I just created a new flink project with the milestone version. Just putting your log4j.xml into the resources folder enabled the logging (I've set it to INFO for flink and it worked). I've used IntelliJ and started the WordCount.j
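
For reference, a minimal log4j 1.x XML configuration of the kind described above; the logger names and levels are only illustrative. In a standard Maven layout it goes under src/main/resources/log4j.xml so it ends up on the classpath.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
    <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
        <appender name="console" class="org.apache.log4j.ConsoleAppender">
            <layout class="org.apache.log4j.PatternLayout">
                <param name="ConversionPattern" value="%d{HH:mm:ss,SSS} %-5p %c - %m%n"/>
            </layout>
        </appender>
        <!-- Illustrative: raise this level (e.g. to WARN) to hide Flink's job progress updates. -->
        <logger name="org.apache.flink">
            <level value="INFO"/>
        </logger>
        <root>
            <priority value="INFO"/>
            <appender-ref ref="console"/>
        </root>
    </log4j:configuration>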

Re: CoGroup Operator Data Sink

2015-04-14 Thread Kostas Tzoumas
Each operator has only one output (which can be consumed by multiple downstream operators), so you cannot branch out in two different directions from inside the user code with multiple collectors. The reasoning is that you can achieve the same effect with what Robert suggested. But perhaps your use case
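
A sketch of that "same effect" approach, assuming result is the DataSet<Tuple2<String, Integer>> produced by the coGroup and the usual DataSet API imports; the field positions, predicates and paths are made up.

    // Branch 1: one subset of the coGroup result goes to the first path.
    result.filter(new FilterFunction<Tuple2<String, Integer>>() {
        @Override
        public boolean filter(Tuple2<String, Integer> value) {
            return value.f1 > 0;
        }
    }).writeAsCsv("/path/to/output-a");

    // Branch 2: the complementary subset goes to the second path.
    result.filter(new FilterFunction<Tuple2<String, Integer>>() {
        @Override
        public boolean filter(Tuple2<String, Integer> value) {
            return value.f1 <= 0;
        }
    }).writeAsCsv("/path/to/output-b");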

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Stefan Bunk
Hi Robert, thanks for the info. Adding the parameter didn't help. My logging file is found and my logging configuration for my own logging is working (even without the parameter), it's just that the file in the jar seems to be preferred over my file. Best, Stefan On 14 April 2015 at 17:16, Rober

Re: Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Robert Metzger
Hi Stefan, we made a stupid mistake in the 0.9.0-milestone-1 release by including our log4j.properties in the flink-runtime jar. It's also in the fat jar in flink-dist. Maybe you can pass the name of your log4j file to your application with -Dlog4j.configuration=log4j.xml? The issue is already
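
For example, as a VM option of the IDE run configuration (the path below is made up); with log4j 1.x, a plain name such as log4j.xml is resolved against the classpath and a file: URL against the file system:

    -Dlog4j.configuration=file:///home/stefan/conf/log4j.xml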

Re: Sorting in a WindowedDataStream

2015-04-14 Thread Márton Balassi
Dear Niklas, To do that you can use WindowedDataStream.mapWindow(). This gives you an iterator over all the records in the window, and you can do whatever you wish with them. One thing to note is that sorting the windows of the stream might add considerable latency to your job. Best, Marton On Tue, Apr 14
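
A rough sketch of the body of such a window function: buffer the window, sort it, emit in order. The mapWindow(Iterable, Collector) signature is assumed from the 0.9-era streaming API and should be checked against the version in use; the tuple type and sort key are made up (assumes java.util.*, Tuple2 and Collector imports).

    public void mapWindow(Iterable<Tuple2<String, Long>> values,
                          Collector<Tuple2<String, Long>> out) {
        // Copy the window's records into a list so they can be sorted.
        List<Tuple2<String, Long>> buffer = new ArrayList<Tuple2<String, Long>>();
        for (Tuple2<String, Long> value : values) {
            buffer.add(value);
        }
        // Sort by the second field; pick whatever ordering the job needs.
        Collections.sort(buffer, new Comparator<Tuple2<String, Long>>() {
            @Override
            public int compare(Tuple2<String, Long> a, Tuple2<String, Long> b) {
                return a.f1.compareTo(b.f1);
            }
        });
        // Emit the records of this window in sorted order.
        for (Tuple2<String, Long> value : buffer) {
            out.collect(value);
        }
    }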

Logging in Flink 0.9.0-milestone-1

2015-04-14 Thread Stefan Bunk
Hi Flinkers, I just switched to 0.9.0-milestone-1, and now I get Flink's logging output again in my console (local execution). I have a log4j.xml under src/main/resources, which says not to log any Flink job progress updates, and which worked fine so far: [...]

Re: Nested Iterations supported in Flink?

2015-04-14 Thread Benoît Hanotte
Thanks for your quick answers! The algorithm is the following: I've got a spatial set of data and I want to find dense regions. The space is beforehand discretized into "cells" of a fixed size. Then, for each dense cell (1st iteration), starting with the most dense, the algorithm tries to ex

Re: Nested Iterations supported in Flink?

2015-04-14 Thread Stephan Ewen
Hi Benoît! You are right, nested iterations are currently not supported. The test you found actually checks that the Optimizer gives a good error message when encountering nested iterations. Can you write your program as one iteration (the inner) and start the program multiple times to simu
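
A sketch of that pattern: keep only the inner iteration in the Flink program and drive the outer loop from the client, submitting one job per outer pass. The paths, the constants and InnerStepFunction are hypothetical.

    for (int outer = 0; outer < MAX_OUTER_PASSES; outer++) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // State written by the previous outer pass (hypothetical path scheme).
        DataSet<String> state = env.readTextFile("/data/state-" + outer);

        // The inner loop runs as a regular Flink bulk iteration.
        IterativeDataSet<String> inner = state.iterate(INNER_ITERATIONS);
        DataSet<String> nextState = inner.closeWith(inner.map(new InnerStepFunction()));

        // Leave the result where the next outer pass picks it up, then run the job.
        nextState.writeAsText("/data/state-" + (outer + 1));
        env.execute("outer pass " + outer);
    }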

Re: Nested Iterations supported in Flink?

2015-04-14 Thread Till Rohrmann
If your inner iteration happens to work only on the data of a single partition, then you can also implement this iteration as part of a mapPartition operator. The only problem there would be that you have to keep all of the partition's data on the heap if you need access to it. Cheers, Till On Tu
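
A sketch of that mapPartition variant; the element type Point, the iteration count and the refine() step are placeholders.

    input.mapPartition(new MapPartitionFunction<Point, Point>() {
        @Override
        public void mapPartition(Iterable<Point> values, Collector<Point> out) {
            // Materialize the partition so it can be traversed repeatedly
            // (this is the heap cost mentioned above).
            List<Point> partition = new ArrayList<Point>();
            for (Point p : values) {
                partition.add(p);
            }
            // Hypothetical inner iteration over the partition's data.
            for (int i = 0; i < MAX_INNER_ITERATIONS; i++) {
                refine(partition);
            }
            // Emit the partition once the inner iteration has converged.
            for (Point p : partition) {
                out.collect(p);
            }
        }
    });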

Re: Parallelism question

2015-04-14 Thread Maximilian Michels
Hi Giacomo, If you use a FileOutputFormat as a DataSink (e.g. as in dataSet.writeAsText("/path")), then the output directory will contain 5 files named 1, 2, 3, 4, and 5, each containing the output of the corresponding task. The order of the data in the files follows the order of the distributed DataSe
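
For illustration, with a parallelism of 5 the sink path becomes a directory holding one file per parallel task:

    $ ls /path
    1  2  3  4  5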

Re: Parallelism question

2015-04-14 Thread Giacomo Licari
Hi Max, thank you for your reply. Does the DataSink contain the data in order, I mean, does it contain output1, output2 ... output5 in that order? Or are they mixed? Thanks a lot, Giacomo On Tue, Apr 14, 2015 at 11:58 AM, Maximilian Michels wrote: > Hi Giacomo, > > If I understand you correctly, you want your Flink

Sorting in a WindowedDataStream

2015-04-14 Thread Niklas Semmler
Hello there, What functions should be used to aggregate the (unordered) tuples of every window in a WindowedDataStream into an (ordered) list? Neither foldWindow nor reduceWindow seems to be applicable, and aggregate does not, to my understanding, take user-defined functions. To get started with

Re: Parallelism question

2015-04-14 Thread Maximilian Michels
Hi Giacomo, If I understand you correctly, you want your Flink job to execute with a parallelism of 5. Just call setDegreeOfParallelism(5) on your ExecutionEnvironment. That way, all operations, when possible, will be performed using 5 parallel instances. This is also true for the DataSink which w
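
A minimal sketch of that setup (0.9-era API; the input and output paths are made up, and newer versions name the method setParallelism):

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class ParallelismExample {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            env.setDegreeOfParallelism(5);

            DataSet<String> data = env.readTextFile("/path/to/input");
            // The sink also runs with 5 parallel instances and writes files 1..5.
            data.writeAsText("/path/to/output");

            env.execute("parallelism example");
        }
    }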

Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

2015-04-14 Thread Maximilian Michels
Hi Filip, I think your issue is best dealt with on the user mailing list. Unfortunately, you can't use attachments on the mailing lists. So if you want to post a screenshot you'll have to upload it somewhere else (e.g. http://imgur.com/). I can confirm your error. Would you mind using the 0.9.0-m

Re: CoGroup Operator Data Sink

2015-04-14 Thread Mustafa Elbehery
Thanks for the prompt reply. Maybe the expression "Sink" is not suitable for what I need. What if I want to *Collect* two data sets directly from the coGroup operator. Is there any way to do so? As far as I know, the operator has only one Collector object, but I wonder if there is another feature in Flin

Re: CoGroup Operator Data Sink

2015-04-14 Thread Robert Metzger
Hi, you can write the output of a coGroup operator to two sinks:

    --\               /--> Sink1
       \             /
        (CoGroup)
       /             \
    --/               \--> Sink2

You can actually write to as many sinks as you want. Note that the data written to Sink1 and Sink2 will be identica
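
A sketch of that two-sink layout in the Java DataSet API; the inputs, key positions, the CoGroup function and the paths are made up.

    DataSet<Tuple2<String, Integer>> result =
            left.coGroup(right)
                .where(0)
                .equalTo(0)
                .with(new MyCoGroupFunction());   // hypothetical CoGroupFunction

    // Both sinks receive the identical coGroup result.
    result.writeAsCsv("/path/to/sink1");
    result.writeAsText("/path/to/sink2");

    env.execute("coGroup with two sinks");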

Parallelism question

2015-04-14 Thread Giacomo Licari
Hi guys, I have a question about how parallelism works. If I have a large dataset and I want to divide it into 5 blocks, can I pass each block of data to a fixed parallel process (for example, if I set up 5 processes)? And if the result data from each process arrives at the output not in an ordered way,

CoGroup Operator Data Sink

2015-04-14 Thread Mustafa Elbehery
Hi all, I wonder if the coGroup operator has the ability to sink two outputs simultaneously. I am trying to mock it by calling a function inside the operator, in which I sink the first output and get the second output myself. I am not sure if this is the best way, and I would like to hear your s