Awesome, thanks!
On Sat, Nov 7, 2020 at 6:43 AM Till Rohrmann wrote:
Hi Rex,
You should configure the number of slots per TaskManager to be the number
of cores of a machine/node. In total you will then have a cluster with
#slots = #cores per machine x #machines.
If you have a cluster with 4 nodes and 8 slots each, then you have a total
of 32 slots. Now if you have
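To make the arithmetic above concrete, here is a minimal Python sketch of the rule of thumb. The function name `total_slots` is made up for illustration and is not part of Flink. On the Flink side, the per-TaskManager value is set with `taskmanager.numberOfTaskSlots`. The sketch assumes one TaskManager per machine.

```python
def total_slots(cores_per_machine: int, machines: int) -> int:
    """Cluster-wide slot count under the rule of thumb:
    slots per TaskManager = cores per machine (assumes one TaskManager per machine)."""
    if cores_per_machine < 1 or machines < 1:
        raise ValueError("cores_per_machine and machines must be positive")
    return cores_per_machine * machines


# 4 nodes with 8 slots each, as in the example above.
print(total_slots(cores_per_machine=8, machines=4))  # 32
```

The same function gives 32 for 8 nodes with 4 cores each, since only the product matters for the cluster total.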
Great, thanks!
So just to confirm, configure # of task slots to # of core nodes x # of
vCPUs?
I'm not sure what you mean by "distribute them across both jobs (so that
the total adds up to 32)". Is it configurable how many task slots a job can
receive, so in this case I'd provide ~30/36 * 32 task
Hi Rex,
as a rule of thumb I recommend configuring your TMs with as many slots as
they have cores. So in your case your cluster would have 32 slots. Then
depending on the workload of your jobs you should distribute them across
both jobs (so that the total adds up to 32). A high number of operators
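One way to picture "distribute them across both jobs" is the sketch below. It is only an illustration under stated assumptions: the job names and workload weights are hypothetical, and the weights are meant to come from the observed load of each job, not from any Flink API. With default slot sharing, a job needs as many slots as its highest operator parallelism, so the resulting numbers would be used as each job's parallelism.

```python
from math import floor


def split_slots(total: int, weights: dict[str, float]) -> dict[str, int]:
    """Split `total` slots across jobs in proportion to hypothetical workload weights.

    Every job gets at least one slot; the rest is shared out by the
    largest-remainder method so the shares always add up to `total`.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")
    if total < len(weights):
        raise ValueError("need at least one slot per job")

    remaining = total - len(weights)  # one slot per job is reserved up front
    weight_sum = sum(weights.values())
    exact = {job: remaining * w / weight_sum for job, w in weights.items()}
    shares = {job: 1 + floor(x) for job, x in exact.items()}

    # Hand the slots lost to rounding to the jobs with the largest fractional parts.
    leftover = total - sum(shares.values())
    by_fraction = sorted(exact, key=lambda job: exact[job] - floor(exact[job]), reverse=True)
    for job in by_fraction[:leftover]:
        shares[job] += 1
    return shares


# Hypothetical weights: one job assumed to carry three times the load of the other.
parallelism = split_slots(32, {"heavy_job": 3, "light_job": 1})
print(parallelism, sum(parallelism.values()))
```

Each job would then be submitted with its share as the parallelism, for example via the `-p` option of `flink run`.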
Hello,
I'm running a job on AWS EMR with the Table API that does a long series of
joins, group-bys, and aggregations, and I'd like to know how best to tune
parallelism.
In my case, I have 8 EMR core nodes set up, each with 4 vCores and 8 GiB of
memory. There's a job we have to run that has ~30 table opera