Thanks Robert. I found the following explanation for the number of threads for 4 cores: You have *4* CPU sockets, each CPU *can* have, up to, 12 *cores* and each *core can* have two *threads*. Your *max thread* count is, *4* CPU x 12 *cores* x 2 *threads* per *core*, so 12 x *4* x 2 is 96 Can I limit the threads using the pipeline options in some way? 10-20 elements per worker will work for me.
My current practice to work around that issue is to limit the number of elements in each dataflow pipeline (providing ~10 elements for each pipeline) Once I have completed around 200 elements processing = 20 pipelines (google does not allow more than 25 dataflow pipelines per region) with 10 elements each, I am launching the next 20 pipelines. This is ofcourse missing the benefit of serverless. Any idea, how to work around this? Best, Eila On Mon, May 17, 2021 at 1:27 PM Robert Bradshaw <rober...@google.com> wrote: > Note that workers generally process one element per thread at a time. The > number of threads defaults to the number of cores of the VM that you're > using. > > On Mon, May 17, 2021 at 10:18 AM Brian Hulette <bhule...@google.com> > wrote: > >> What type of files are you reading? If they can be split and read by >> multiple workers this might be a good candidate for a Splittable DoFn (SDF). >> >> Brian >> >> On Wed, May 12, 2021 at 6:18 AM Eila Oriel Research < >> e...@orielresearch.org> wrote: >> >>> Hi, >>> I am running out of resources on the workers machines. >>> The reasons are: >>> 1. Every pcollection is a reference to a LARGE file that is copied into >>> the worker >>> 2. The worker makes calculations on the copied file using a software >>> library that consumes memory / storage / compute resources >>> >>> I have changed the workers' CPUs and memory size. At some point, I am >>> running out of resources with this method as well >>> I am looking to limit the number of pCollection / elements that are >>> being processed in parallel on each worker at a time. >>> >>> Many thank for any advice, >>> Best wishes, >>> -- >>> Eila >>> <http://www.orielresearch.com> >>> Meetup <https://www.meetup.com/Deep-Learning-In-Production/> >>> >> -- Eila <http://www.orielresearch.com> Meetup <https://www.meetup.com/Deep-Learning-In-Production/>