What type of files are you reading? If they can be split and read by multiple workers, this might be a good candidate for a Splittable DoFn (SDF).
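To make "split" a bit more concrete, here is a rough sketch in plain Python. It is not Beam code and does not use the Beam API. The names `split_into_ranges` and `process_range` are hypothetical, and I am assuming the file format allows each byte range to be processed independently, which may not hold for your files. In an actual SDF, each range would play the role of the restriction, and the runner could hand sub-ranges to different workers.

```python
def split_into_ranges(size, chunk_size):
    """Return half-open (start, stop) byte ranges covering `size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, size))
        for start in range(0, size, chunk_size)
    ]


def process_range(path, start, stop):
    """Read only bytes [start, stop), so memory use is bounded by the chunk size.

    Returning the byte count is a placeholder for the real computation.
    """
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(stop - start)
    return len(data)
```

If the files can be handled this way, each worker only needs to hold one chunk at a time instead of a full copy of the file.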
Brian

On Wed, May 12, 2021 at 6:18 AM Eila Oriel Research <e...@orielresearch.org> wrote:

> Hi,
> I am running out of resources on the workers machines.
> The reasons are:
> 1. Every pcollection is a reference to a LARGE file that is copied into
> the worker
> 2. The worker makes calculations on the copied file using a software
> library that consumes memory / storage / compute resources
>
> I have changed the workers' CPUs and memory size. At some point, I am
> running out of resources with this method as well
> I am looking to limit the number of pCollection / elements that are being
> processed in parallel on each worker at a time.
>
> Many thank for any advice,
> Best wishes,
> --
> Eila
> <http://www.orielresearch.com>
> Meetup <https://www.meetup.com/Deep-Learning-In-Production/>