Hi Eleanore,
the --fasterCopy option disables clone between operators (see [1]). It
should be safe to use it, unless your pipeline outputs an object and
later modifies the same instance. This is generally not supported by the
Beam model and is considered to be an user error. FlinkRunner
historically chose a way of "better-safe-than-sorry" approach and
explicitly cloned every received object between (non-shuffle) operators.
Enabling this option should increase performance, you can verify your
Pipeline is not doing any disallowed mutations using DirectRunner, which
checks this by default (without --enforceImmutability=false).
Jan
[1] https://issues.apache.org/jira/browse/BEAM-11146
On 4/9/21 7:57 AM, Eleanore Jin wrote:
Hi community,
I am upgrading from Beam 2.23.0 -> 2.28.0, and a new
FlinkPipelineOption is introduced: fasterCopy.
Can you please help me understand what is the difference between the
option objectReuse vs fasterCopy?
Thanks a lot!
Eleanore