Thanks a lot Jan,

Im clear now.

Best,
Eleanore

On Mon, Apr 12, 2021 at 11:22 AM Jan Lukavský <je...@seznam.cz> wrote:

> Hi Eleanore,
>
> that's a good question. :)
>
> There has been discussion about this in the PR of the mentioned Jira [1].
> Generally, objectReuse enables Flink to hypothetically reuse instances of
> deserialized objects instead of creating new ones. But due to how Beam
> defines Coders it creates new instances nevertheless. See the discussion
> for details.
>
>  Jan
>
> [1] https://github.com/apache/beam/pull/13240#issuecomment-721635620
> On 4/12/21 7:29 PM, Eleanore Jin wrote:
>
> Hi Jan,
>
> Thanks a lot for the reply! This helps, I wonder if you have any idea
> whats the difference between fasterCopy vs objectReuse option?
>
> Eleanore
>
> On Fri, Apr 9, 2021 at 11:53 AM Jan Lukavský <je...@seznam.cz> wrote:
>
>> Hi Eleanore,
>>
>> the --fasterCopy option disables clone between operators (see [1]). It
>> should be safe to use it, unless your pipeline outputs an object and
>> later modifies the same instance. This is generally not supported by the
>> Beam model and is considered to be an user error. FlinkRunner
>> historically chose a way of "better-safe-than-sorry" approach and
>> explicitly cloned every received object between (non-shuffle) operators.
>> Enabling this option should increase performance, you can verify your
>> Pipeline is not doing any disallowed mutations using DirectRunner, which
>> checks this by default (without --enforceImmutability=false).
>>
>>   Jan
>>
>> [1] https://issues.apache.org/jira/browse/BEAM-11146
>>
>> On 4/9/21 7:57 AM, Eleanore Jin wrote:
>> > Hi community,
>> >
>> > I am upgrading from Beam 2.23.0 -> 2.28.0, and a new
>> > FlinkPipelineOption is introduced: fasterCopy.
>> >
>> > Can you please help me understand what is the difference between the
>> > option objectReuse vs fasterCopy?
>> >
>> > Thanks a lot!
>> > Eleanore
>>
>

Reply via email to