This is going to have profound performance implications if this is the only path for iteration.

On Fri, Feb 27, 2015 at 10:58 PM, Stephan Ewen <se...@apache.org> wrote:

> I vote to have the key extractor return a new value each time. That means
> that objects are not reused everywhere where it is possible, but still in
> most places, which still helps.
>
> What still puzzles me: I thought that the collection execution stores
> copies of the returned records by default (reuse safe mode).
> On 27.02.2015 at 15:36, "Aljoscha Krettek" <aljos...@apache.org> wrote:
>
> > Hello Nation of Flink,
> > while figuring out this bug:
> > https://issues.apache.org/jira/browse/FLINK-1569
> > I came upon some difficulties. The problem is that the
> > KeyExtractorMappers always
> > return the same tuple. This is problematic, since Collection Execution
> > does simply store the returned values in a list. These elements are
> > not copied before they are stored when object reuse is enabled.
> > Therefore, the whole list will contain only that one reused element.
> >
> > I see two options for solving this:
> > 1. Change KeyExtractorMappers to always return a new tuple, thereby
> > making object-reuse mode in cluster execution useless for key
> > extractors.
> >
> > 2. Change collection execution mapper to always make copies of the
> > returned elements. This would make object-reuse in collection
> > execution pretty much obsolete, IMHO.
> >
> > How should we proceed with this?
> >
> > Cheers,
> > Aljoscha
> >
>
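
For anyone following along, here is a minimal sketch of the aliasing problem described in FLINK-1569 and of the two proposed options. It is plain Java and does not use any Flink classes: `Pair`, `ReusingKeyExtractor`, `AllocatingKeyExtractor`, and `collect` are hypothetical stand-ins for the tuple type, the key extractor mappers, and the collection execution loop, so treat it as an illustration of the idea rather than the actual code paths.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ReuseDemo {

    // Hypothetical mutable (key, value) record; a stand-in, not Flink's tuple type.
    public static final class Pair {
        public int key;
        public String value;

        public Pair copy() {
            Pair p = new Pair();
            p.key = key;
            p.value = value;
            return p;
        }
    }

    // Mimics the problematic behaviour: the same instance is returned on every call.
    public static final class ReusingKeyExtractor implements Function<String, Pair> {
        private final Pair reused = new Pair();

        @Override
        public Pair apply(String s) {
            reused.key = s.length();
            reused.value = s;
            return reused;
        }
    }

    // Option 1: return a fresh object on every call.
    public static final class AllocatingKeyExtractor implements Function<String, Pair> {
        @Override
        public Pair apply(String s) {
            Pair p = new Pair();
            p.key = s.length();
            p.value = s;
            return p;
        }
    }

    // Stand-in for the collection execution loop. With copyResults == true it
    // behaves like option 2 and copies every returned element before storing it.
    public static List<Pair> collect(Function<String, Pair> mapper,
                                     List<String> input,
                                     boolean copyResults) {
        List<Pair> out = new ArrayList<>();
        for (String s : input) {
            Pair p = mapper.apply(s);
            out.add(copyResults ? p.copy() : p);
        }
        return out;
    }

    // Helper that extracts the value field of each stored element.
    public static List<String> values(List<Pair> pairs) {
        List<String> vs = new ArrayList<>();
        for (Pair p : pairs) {
            vs.add(p.value);
        }
        return vs;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "bb", "ccc");

        // Bug: every list slot points at the same reused object.
        System.out.println(values(collect(new ReusingKeyExtractor(), input, false)));    // [ccc, ccc, ccc]

        // Option 1: the extractor allocates, so no copy is needed.
        System.out.println(values(collect(new AllocatingKeyExtractor(), input, false))); // [a, bb, ccc]

        // Option 2: the extractor reuses, and the collecting loop copies.
        System.out.println(values(collect(new ReusingKeyExtractor(), input, true)));     // [a, bb, ccc]
    }
}
```

Both options allocate one object per record in this sketch. The difference is where the allocation happens: option 1 pays it in the extractor on every execution path, while option 2 pays it only in the collecting loop.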