Re: Is there a way (seetings) to limit the number of element per worker machine

2021-06-02 Thread OrielResearch Eila Arich-Landkof
Hi Robert, Thank you. I usually work with the custom worker configuration options. I will customize it to a low number of cores with large memory and see if that solves my problem. Thanks so much, Eila www.orielresearch.com https://www.meetup.com/Deep-Learning-In-Production > On

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-06-02 Thread Vincent Marquez
On Wed, Jun 2, 2021 at 11:27 AM Robert Bradshaw wrote:
> On Wed, Jun 2, 2021 at 11:18 AM Vincent Marquez wrote:
> >
> > On Wed, Jun 2, 2021 at 11:11 AM Robert Bradshaw wrote:
> >>
> >> If you want to control the total number of elements being processed
> >> across all workers at a time, you

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-06-02 Thread Robert Bradshaw
On Wed, Jun 2, 2021 at 11:18 AM Vincent Marquez wrote:
>
> On Wed, Jun 2, 2021 at 11:11 AM Robert Bradshaw wrote:
>>
>> If you want to control the total number of elements being processed
>> across all workers at a time, you can do this by assigning random keys
>> of the form RandomInteger() % To

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-06-02 Thread Vincent Marquez
On Wed, Jun 2, 2021 at 11:11 AM Robert Bradshaw wrote:
> If you want to control the total number of elements being processed
> across all workers at a time, you can do this by assigning random keys
> of the form RandomInteger() % TotalDesiredConcurrency followed by a
> GroupByKey.
>
> If you want

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-06-02 Thread Robert Bradshaw
If you want to control the total number of elements being processed across all workers at a time, you can do this by assigning random keys of the form RandomInteger() % TotalDesiredConcurrency followed by a GroupByKey. If you want to control the number of elements being processed in parallel per V
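The random-key approach described above can be sketched in plain Python, with no Beam dependency. This is only an illustration under stated assumptions: `TOTAL_DESIRED_CONCURRENCY`, `assign_random_key`, and `group_by_key` are hypothetical names, and the in-memory grouping merely stands in for what a GroupByKey step would do in a real pipeline.

```python
import random
from collections import defaultdict

# Hypothetical value: the maximum number of groups (and so of
# elements being processed at once) that we want downstream.
TOTAL_DESIRED_CONCURRENCY = 4


def assign_random_key(element, total=TOTAL_DESIRED_CONCURRENCY):
    """Pair an element with a key of the form RandomInteger() % total."""
    key = random.randrange(1 << 30) % total
    return key, element


def group_by_key(keyed_elements):
    """In-memory stand-in for a GroupByKey step: collect values per key."""
    groups = defaultdict(list)
    for key, value in keyed_elements:
        groups[key].append(value)
    return dict(groups)


# Hypothetical input: 20 file names to be processed.
elements = [f"file_{i}" for i in range(20)]
keyed = [assign_random_key(e) for e in elements]
groups = group_by_key(keyed)

# There are at most TOTAL_DESIRED_CONCURRENCY distinct keys, so at most
# that many groups exist to be handed out for processing.
for key in sorted(groups):
    print(key, groups[key])
```

In the Beam Python SDK the same shape would presumably be a `beam.Map` that emits `(key, element)` pairs followed by `beam.GroupByKey()`, with the per-group work done in the next step; how groups are then scheduled onto workers depends on the runner.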

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-05-28 Thread Eila Oriel Research
Thanks Robert. I found the following explanation for the number of threads for 4 cores: You have 4 CPU sockets, each CPU can have up to 12 cores, and each core can have two threads. Your max thread count is 4 CPUs x 12 cores x 2 threads per core, so 12 x 4 x 2 is 96. Can I lim

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-05-28 Thread Eila Oriel Research
Hi Brian, Thanks for your response. I am reading genomic data. I am using a research tool (software) that was built to process the files; it is not built to work on multiple machines. I don't usually work with Splittable DoFn, so I hope that I understand the concept properly. Please let me know i

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-05-17 Thread Robert Bradshaw
Note that workers generally process one element per thread at a time. The number of threads defaults to the number of cores of the VM that you're using.

On Mon, May 17, 2021 at 10:18 AM Brian Hulette wrote:
> What type of files are you reading? If they can be split and read by
> multiple workers

Re: Is there a way (seetings) to limit the number of element per worker machine

2021-05-17 Thread Brian Hulette
What type of files are you reading? If they can be split and read by multiple workers, this might be a good candidate for a Splittable DoFn (SDF).

Brian

On Wed, May 12, 2021 at 6:18 AM Eila Oriel Research wrote:
> Hi,
> I am running out of resources on the worker machines.
> The reasons are:
>