Hi,

Flink will serialise user functions when distributing work across the cluster. As a result, objects that are shared in your program are no longer shared once the job executes. You will still get sharing within each parallel instance of an operation, though, because only one instance of your function is used to process the data there.
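To illustrate, here is a rough sketch in plain Java with no Flink involved. It assumes that shipping a function to a worker behaves like a Java serialisation round trip. CacheLoader, LookupFunction and ship are hypothetical names made up for this example; they are not Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class SharedObjectDemo {

    /** Hypothetical stand-in for the caching lookup object from the question. */
    static class CacheLoader implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<String, String> cache = new HashMap<>();

        String lookup(String key) {
            // A real loader would query the external resource on a cache miss.
            return cache.computeIfAbsent(key, k -> "value-for-" + k);
        }
    }

    /** Hypothetical stand-in for a user function that holds the loader. */
    static class LookupFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        final CacheLoader loader;

        LookupFunction(CacheLoader loader) {
            this.loader = loader;
        }

        String map(String key) {
            return loader.lookup(key);
        }
    }

    /** Serialises and deserialises an object, simulating shipping it to a worker. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T ship(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CacheLoader shared = new CacheLoader();
        LookupFunction first = new LookupFunction(shared);
        LookupFunction second = new LookupFunction(shared);

        // In the client program both functions point at the same loader.
        System.out.println(first.loader == second.loader); // true

        // After shipping, each copy carries its own deserialised loader.
        LookupFunction firstCopy = ship(first);
        LookupFunction secondCopy = ship(second);
        System.out.println(firstCopy.loader == secondCopy.loader); // false

        // A lookup cached by one copy is not visible to the other.
        firstCopy.map("a");
        System.out.println(secondCopy.loader.cache.containsKey("a")); // false
    }
}
```

If that assumption holds, the setup in the question (two map functions, parallelism 2) would probably end up with one loader copy per function per parallel instance. A common way to make this explicit is to create the loader in the open() method of the RichMapFunction instead of passing it in from the main program.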
Cheers,
Aljoscha

On Wed, 4 Jan 2017 at 21:05 Duck <k...@protonmail.com> wrote:

> Hi there,
>
> I was wondering on how my caching object, would behave in the given
> scenario below.
>
> 1) I create an instance of an object that performs lookups to an external
> resource, and caches results.
> 2) I have a DataStream that i perform a map function on (with a custom
> RichMapFunction)
> 3) I have a second DataStream that i perform a map function on (with a
> custom RichMapFunction)
> 4) I set the Job parallelism to 2.
>
> Will the multiple usage, along with parallelism duplicate my object in any
> way, or will it still behave as a "shared object instance". Wondering,
> since this "cacheloader" will talk to external resources, i do not want it
> to be say duplicated due to performance reasons on the external resource.
>
> Sent from ProtonMail <https://protonmail.com>, Swiss-based encrypted
> email.