Hi,
Flink serialises user functions when distributing work across the
cluster. Therefore your shared object will no longer be shared once your
program executes: each parallel instance of each operation gets its own
deserialised copy. You will still get object sharing within a parallel
instance, because only one instance of your function is used to process
data on one parallel instance of an operation.
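
A minimal sketch of the effect, using plain Java serialisation as a
stand-in for shipping functions to the cluster. This is not Flink code:
CachingLookup, LookupFunction and roundTrip are hypothetical names made up
for illustration, and the "external lookup" is a placeholder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class SharedObjectDemo {

    // Hypothetical stand-in for the "cacheloader" from the question.
    static class CachingLookup implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Map<String, String> cache = new HashMap<>();

        String lookup(String key) {
            // Placeholder for the real call to the external resource.
            return cache.computeIfAbsent(key, k -> "value-for-" + k);
        }

        int cachedEntries() {
            return cache.size();
        }
    }

    // Hypothetical stand-in for a user function holding the cache as a field.
    static class LookupFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        final CachingLookup lookup;

        LookupFunction(CachingLookup lookup) {
            this.lookup = lookup;
        }

        String map(String value) {
            return lookup.lookup(value);
        }
    }

    // Serialise an object to bytes and read it back, yielding an independent copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialisation round trip failed", e);
        }
    }

    public static void main(String[] args) {
        CachingLookup shared = new CachingLookup();
        LookupFunction first = new LookupFunction(shared);
        LookupFunction second = new LookupFunction(shared);

        // Each function is serialised and deserialised on its own.
        LookupFunction firstCopy = roundTrip(first);
        LookupFunction secondCopy = roundTrip(second);

        firstCopy.map("a"); // fills only the first copy's cache

        System.out.println(first.lookup == second.lookup);         // true
        System.out.println(firstCopy.lookup == secondCopy.lookup); // false
        System.out.println(secondCopy.lookup.cachedEntries());     // 0
    }
}
```

Before the round trip both functions point at the same cache; afterwards
each copy has its own, so a lookup cached by one is not visible to the
other. If the cache itself is not serialisable, one common option is to
mark the field transient and create it in the open() method of the
RichMapFunction, which still gives one cache per parallel instance.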

Cheers,
Aljoscha

On Wed, 4 Jan 2017 at 21:05 Duck <k...@protonmail.com> wrote:

> Hi there,
>
> I was wondering how my caching object would behave in the scenario
> below.
>
> 1) I create an instance of an object that performs lookups to an external
> resource, and caches results.
> 2) I have a DataStream that I perform a map function on (with a custom
> RichMapFunction)
> 3) I have a second DataStream that I perform a map function on (with a
> custom RichMapFunction)
> 4) I set the Job parallelism to 2.
>
> Will the multiple usage, along with the parallelism, duplicate my object
> in any way, or will it still behave as a "shared object instance"? I am
> asking because this "cacheloader" will talk to external resources, and I
> do not want it duplicated, for performance reasons on the external
> resource.
