Hi there 👋

I have a question about the Dataflow runner, more precisely about how it instantiates IntrinsicMapTaskExecutor.
I've noticed that running the same job with different SDK versions and analyzing the heap dump shows a different number of instances:

- Beam Java SDK 2.29.0: *11* instances
- Beam Java SDK 2.35.0: *45* instances
- Beam Java SDK 2.35.0 with Runner v2: *47* instances

I run these test jobs with a single worker of type n1-standard-1. In all cases, only 4 task executors are started and assigned to a thread.

Creating an IntrinsicMapTaskExecutor is not 'free': it involves duplicating the whole coder stack. We see a significant increase in memory consumption as well as a longer startup time, due to the CloudObjects.coderFromCloudObject operation.

Is there a reason why Dataflow creates so many unused executors?

Cheers

--
Michel Davit
Data Engineer
Spotify France | 54 Rue de Londres | 75008 Paris, France
