You have to persist it and read from it for the subsequent operations. Take
a look at the FlinkMLTools.persist methods.
Cheers,
Till
On Thu, Apr 28, 2016 at 6:14 PM, Sergio Ramírez
wrote:
> Hello,
>
> OK, now I understand everything. So if I want to re-use my DataSet in
> several different op
Hello,
OK, now I understand everything. So if I want to re-use my DataSet in
several different operations, what should I do? Is there any way to
maintain the save the data from re-computation? I am not only talking
about iteration. I mean re-use of data.
For example, imagine I want filter so
Hi Sergio,
sorry for the late reply. I figured out your problem. The reason why you
see apparently inconsistent results is that you execute your job multiple
times. Each collect call triggers an eager execution of your Flink job.
Since the intermediate results are not stored the whole topology has
Hello again:
Any news about this problem with enriched MapPartition function?
Thank you
On 06/04/16 17:01, Sergio Ramírez wrote:
Hello,
Ok, please find enclosed the test code and the input data.
Cheers
On 31/03/16 10:07, Till Rohrmann wrote:
Hi Sergio,
could you please provide a complete
Hi Sergio,
could you please provide a complete example (including input data) to
reproduce your problem. It is hard to tell what's going wrong when one only
sees a fraction of the program.
Cheers,
Till
On Tue, Mar 29, 2016 at 5:58 PM, Sergio Ramírez
wrote:
> Hi again,
>
> I've not been able to
Hi again,
I've not been able to solve the problem with the instruction you gave
me. I've tried with static variables (matrices) also unsuccessfully.
I've also tried this simpler code:
def mapPartition(it: java.lang.Iterable[LabeledVector], out:
Collector[((Int, Int), Int)]): Unit = {
Haven't looked to deeply into this, but this sounds like object reuse is
enabled, at which point buffering values effectively causes you to store
the same value multiple times.
can you try disabling objectReuse using
env.getConfig().disableObjectReuse() ?
On 22.03.2016 16:53, Sergio Ramírez
Hi all,
I've been having some problems with RichMapPartitionFunction. Firstly, I
tried to convert the iterable into an array unsuccessfully. Then, I have
used some buffers to store the values per column. I am trying to
transpose the local matrix of LabeledVectors that I have in each partition