Thanks a lot Jan, Im clear now.
Best, Eleanore On Mon, Apr 12, 2021 at 11:22 AM Jan Lukavský <je...@seznam.cz> wrote: > Hi Eleanore, > > that's a good question. :) > > There has been discussion about this in the PR of the mentioned Jira [1]. > Generally, objectReuse enables Flink to hypothetically reuse instances of > deserialized objects instead of creating new ones. But due to how Beam > defines Coders it creates new instances nevertheless. See the discussion > for details. > > Jan > > [1] https://github.com/apache/beam/pull/13240#issuecomment-721635620 > On 4/12/21 7:29 PM, Eleanore Jin wrote: > > Hi Jan, > > Thanks a lot for the reply! This helps, I wonder if you have any idea > whats the difference between fasterCopy vs objectReuse option? > > Eleanore > > On Fri, Apr 9, 2021 at 11:53 AM Jan Lukavský <je...@seznam.cz> wrote: > >> Hi Eleanore, >> >> the --fasterCopy option disables clone between operators (see [1]). It >> should be safe to use it, unless your pipeline outputs an object and >> later modifies the same instance. This is generally not supported by the >> Beam model and is considered to be an user error. FlinkRunner >> historically chose a way of "better-safe-than-sorry" approach and >> explicitly cloned every received object between (non-shuffle) operators. >> Enabling this option should increase performance, you can verify your >> Pipeline is not doing any disallowed mutations using DirectRunner, which >> checks this by default (without --enforceImmutability=false). >> >> Jan >> >> [1] https://issues.apache.org/jira/browse/BEAM-11146 >> >> On 4/9/21 7:57 AM, Eleanore Jin wrote: >> > Hi community, >> > >> > I am upgrading from Beam 2.23.0 -> 2.28.0, and a new >> > FlinkPipelineOption is introduced: fasterCopy. >> > >> > Can you please help me understand what is the difference between the >> > option objectReuse vs fasterCopy? >> > >> > Thanks a lot! >> > Eleanore >> >