Re: write data set to a single file

2015-05-13 Thread Mihail Vieru
Awesome, it works. Thanks! :) On 13.05.2015 20:05, Stephan Ewen wrote: If you want to write a single file, you need to write it with one task. So, you can run a program with parallelism 100 and just set the sink operator to parallelism 1. You can set the parallelism of each individual operato

Re: write data set to a single file

2015-05-13 Thread Stephan Ewen
If you want to write a single file, you need to write it with one task. So, you can run a program with parallelism 100 and just set the sink operator to parallelism 1. You can set the parallelism of each individual operator by calling "setParallelism()" after the operation, for example "result.wri

write data set to a single file

2015-05-13 Thread Mihail Vieru
Hi, I need to write a data set to a single file without setting the parallelism to 1. How can I achieve this? Cheers, Mihail P.S.: it's for persisting intermediate results in loops and reading those in the next iteration. Which btw work for higher iteration counts with explicit persistence.

Re: Flink hanging between job executions / All Pairs Shortest Paths

2015-05-13 Thread Mihail Vieru
Thanks for the quick answer! I will try the explicit persistence workaround. Can you give me a more precise estimate for the internal caching support? Will it take more than 2 weeks? It isn't an actual hang. It waits a lot before starting the next iteration after the 20th or so iteration. C

Re: Spark and Flink

2015-05-13 Thread Ted Yu
You can run the following command: mvn dependency:tree And see what jetty versions are brought in. Cheers > On May 13, 2015, at 6:07 AM, Pa Rö wrote: > > hi, > > i use spark and flink in the same maven project, > > now i get a exception on working with spark, flink work well > > the prob

Spark and Flink

2015-05-13 Thread Pa Rö
hi, i use spark and flink in the same maven project, now i get a exception on working with spark, flink work well the problem are transitiv dependencies. maybe somebody know a solution, or versions, which work together. best regards paul ps: a cloudera maven repo flink would be desirable my

Re: Flink hanging between job executions / All Pairs Shortest Paths

2015-05-13 Thread Stephan Ewen
BTW, you should be able to see that when, instead of executing the program, you print the execution plan. I am not sure where the hang comes from. Is it an actual hang, or does it just take long? If it is a hang, does it occur in the optimizer, or in the distributed runtime? On Wed, May 13, 2015

Re: Flink hanging between job executions / All Pairs Shortest Paths

2015-05-13 Thread Stephan Ewen
I think this is a good case where loops in the program can cause issues right now. The next graph always depends on the previous graph. This is a bit like a recursive definition. In the 10th iteration, in order to execute the print() command, you need to compute the 9th graph, which requires the 8

Re: Channel received an event before completing the current partial record

2015-05-13 Thread Stephan Ewen
Ah, that is good to hear. I think we should improve the error message there. On Wed, May 13, 2015 at 2:41 PM, Pa Rö wrote: > hi stephan, > > i have found the problem, something was wrong at the read and write > function from my data object (implements Writable), > now it's work. > > best regard

Re: Channel received an event before completing the current partial record

2015-05-13 Thread Pa Rö
hi stephan, i have found the problem, something was wrong at the read and write function from my data object (implements Writable), now it's work. best regards paul 2015-05-13 13:32 GMT+02:00 Stephan Ewen : > Hi Paul! > > Thank you for reporting this. This really seems like it should not happ

Re: Channel received an event before completing the current partial record

2015-05-13 Thread Ufuk Celebi
Hey Paul! Thanks for reporting the issue. I'm trying to reproduce the problem. I'll post the updates here. Which version of Flink are you using? You probably meant that you were using Flink 0.8.1 not Maven 8.1, right? ;-) On 13 May 2015, at 13:16, Pa Rö wrote: > my function code: > private s

Re: Channel received an event before completing the current partial record

2015-05-13 Thread Stephan Ewen
Hi Paul! Thank you for reporting this. This really seems like it should not happen ;-) Is this error reproducable? If yes, we can probably fix it well... Greetings, Stephan On Wed, May 13, 2015 at 1:16 PM, Pa Rö wrote: > my function code: > private static DataSet > getPointDataSet(ExecutionE

Re: Channel received an event before completing the current partial record

2015-05-13 Thread Pa Rö
my function code: private static DataSet getPointDataSet(ExecutionEnvironment env) { // load properties Properties pro = new Properties(); try { pro.load(new FileInputStream("./resources/config.properties")); } catch (Exception e) { e.printSta

Re: flink ml - k-means

2015-05-13 Thread Pa Rö
okay :) now i use the following exsample code from here: https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java 2015-05-11 21:56 GMT+02:00 Stephan Ewen : > Paul! > > Can you use the KMeans example? The co

Channel received an event before completing the current partial record

2015-05-13 Thread Pa Rö
hi, i read a csv file from disk with flink (java, maven version 8.1) and get the following exception: ERROR operators.DataSinkTask: Error in user code: Channel received an event before completing the current partial record.: DataSink(Print to System.out) (4/4) java.lang.IllegalStateException: Ch