Re: in-memory optimization

Ufuk Celebi Mon, 24 Apr 2017 07:30:01 -0700

Loop-invariant data should be kept in Flink's managed memory in
serialized form (in a custom hash table). That means the points are not
read back from the CSV file in each iteration, but because they are kept
in serialized form they need to be deserialized again on access.
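
To illustrate the trade-off, here is a rough plain-Java sketch. It is not
Flink's actual code: the class and method names are made up, and a simple
list of byte arrays stands in for the managed-memory hash table. It only
shows the cost model described above: parse and serialize once before the
loop, then deserialize on every access inside the loop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch only (hypothetical names, not Flink classes).
class LoopInvariantCacheSketch {

    // Turn one point into bytes: length prefix followed by the coordinates.
    static byte[] serialize(double[] point) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(point.length);
            for (double v : point) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild a point from its serialized form.
    static double[] deserialize(byte[] record) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(record))) {
            double[] point = new double[in.readInt()];
            for (int i = 0; i < point.length; i++) {
                point[i] = in.readDouble();
            }
            return point;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sums all coordinates over all iterations; a stand-in for real work
    // such as assigning each point to its nearest centroid.
    static double run(List<double[]> parsedPoints, int iterations) {
        // Paid once, before the loop: the input is already parsed and is
        // stored in serialized form (assumed stand-in for managed memory).
        List<byte[]> cached = new ArrayList<>();
        for (double[] p : parsedPoints) {
            cached.add(serialize(p));
        }

        double total = 0.0;
        for (int it = 0; it < iterations; it++) {
            for (byte[] record : cached) {
                // Paid in every iteration: deserialization on access.
                double[] p = deserialize(record);
                for (double v : p) {
                    total += v;
                }
            }
        }
        return total;
    }
}
```

So the per-iteration cost is the deserialization, not re-reading or
re-parsing the source file.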


CC'ing Fabian to double check...

On Mon, Apr 24, 2017 at 4:20 PM, Robert Schwarzenberg
<schwarzenb...@campus.tu-berlin.de> wrote:
> Hello,
>
> I have a question regarding the loop-awareness of Flink wrt invariant
> datasets.
>
> Does Flink serialize the DataSet 'points' in line 85
>
> https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/clustering/KMeans.scala
>
> each iteration or are there in-memory optimization procedures in place?
>
> Thanks for your help!
>
> Regards,
> Robert
