Re: RichMapPartitionFunction - problems with collect

2016-04-29 Thread Till Rohrmann
You have to persist it and read from it for the subsequent operations. Take a look at the FlinkMLTools.persist methods. Cheers, Till ​ On Thu, Apr 28, 2016 at 6:14 PM, Sergio Ramírez wrote: > Hello, > > OK, now I understand everything. So if I want to re-use my DataSet in > several different op

Re: RichMapPartitionFunction - problems with collect

2016-04-28 Thread Sergio Ramírez
Hello, OK, now I understand everything. So if I want to re-use my DataSet in several different operations, what should I do? Is there any way to maintain the save the data from re-computation? I am not only talking about iteration. I mean re-use of data. For example, imagine I want filter so

Re: RichMapPartitionFunction - problems with collect

2016-04-26 Thread Till Rohrmann
Hi Sergio, sorry for the late reply. I figured out your problem. The reason why you see apparently inconsistent results is that you execute your job multiple times. Each collect call triggers an eager execution of your Flink job. Since the intermediate results are not stored the whole topology has

Re: RichMapPartitionFunction - problems with collect

2016-04-13 Thread Sergio Ramírez
Hello again: Any news about this problem with enriched MapPartition function? Thank you On 06/04/16 17:01, Sergio Ramírez wrote: Hello, Ok, please find enclosed the test code and the input data. Cheers On 31/03/16 10:07, Till Rohrmann wrote: Hi Sergio, could you please provide a complete

Re: RichMapPartitionFunction - problems with collect

2016-03-31 Thread Till Rohrmann
Hi Sergio, could you please provide a complete example (including input data) to reproduce your problem. It is hard to tell what's going wrong when one only sees a fraction of the program. Cheers, Till On Tue, Mar 29, 2016 at 5:58 PM, Sergio Ramírez wrote: > Hi again, > > I've not been able to

Re: RichMapPartitionFunction - problems with collect

2016-03-29 Thread Sergio Ramírez
Hi again, I've not been able to solve the problem with the instruction you gave me. I've tried with static variables (matrices) also unsuccessfully. I've also tried this simpler code: def mapPartition(it: java.lang.Iterable[LabeledVector], out: Collector[((Int, Int), Int)]): Unit = {

Re: RichMapPartitionFunction - problems with collect

2016-03-24 Thread Chesnay Schepler
Haven't looked to deeply into this, but this sounds like object reuse is enabled, at which point buffering values effectively causes you to store the same value multiple times. can you try disabling objectReuse using env.getConfig().disableObjectReuse() ? On 22.03.2016 16:53, Sergio Ramírez

RichMapPartitionFunction - problems with collect

2016-03-22 Thread Sergio Ramírez
Hi all, I've been having some problems with RichMapPartitionFunction. Firstly, I tried to convert the iterable into an array unsuccessfully. Then, I have used some buffers to store the values per column. I am trying to transpose the local matrix of LabeledVectors that I have in each partition