hi, all
I also noticed this problem. The reason is that YARN accounts each executor
as only 1 vcore, no matter how many cores you configured, because YARN only
uses memory as the primary metric for resource allocation. That means YARN
will pack as many executors onto each node as the node's memory allows.
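To make that concrete, here is a rough back-of-the-envelope sketch (the node
and executor sizes below are made-up numbers, not yours):

  // Hypothetical sizes -- YARN packs containers by memory only here,
  // so the cores per executor don't limit how many executors land on a node.
  val nodeMemoryGb     = 64  // memory the NodeManager offers to YARN (assumed)
  val executorMemoryGb = 10  // spark.executor.memory (assumed)
  val overheadGb       = 1   // spark.executor.memoryOverhead (assumed)
  val executorCores    = 4   // spark.executor.cores -- not considered in the packing

  val executorsPerNode = nodeMemoryGb / (executorMemoryGb + overheadGb)  // = 5
  println(s"~$executorsPerNode executors per node, no matter the $executorCores cores each")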
My jobs ran faster. I was in the 10+ TB job territory with TPC data. ☺ The
links I provided have a few use cases and trials.
Hope that helps,
-Pat
From: Selvam Raman
Date: Monday, February 26, 2018 at 1:52 PM
To: Vadim Semenov
Cc: user
Subject: Re: Spark EMR executor-core vs Vcores
Thanks
Yeah, for some reason (unknown to me, but you can find it on the AWS forums)
they double the actual number of cores for the NodeManagers.
I assume that's done to maximize utilization, but it doesn't really matter to
me, at least, since I only run Spark, so I, personally, set `total number
of cores - 1/2`, saving
Thanks. That makes sense.
I want to know one more thing: available vcores per machine is 16, but
threads per node is 8. Am I missing something in how these relate?
What I'm thinking now is that number of vcores = number of threads.
On Mon, 26 Feb 2018 at 18:45, Vadim Semenov wrote:
> All used cores aren't getting re
Putting in all the cores alone won't serve the purpose; you'll have to
specify the number of executors, as well as the executor memory, accordingly.
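For example, something along these lines (the values are only illustrative,
not a recommendation for this particular cluster):

  import org.apache.spark.sql.SparkSession

  // Set executors, cores and memory together -- illustrative values only.
  val spark = SparkSession.builder()
    .appName("executor-sizing-example")       // hypothetical app name
    .config("spark.executor.instances", "4")  // number of executors
    .config("spark.executor.cores", "5")      // cores per executor
    .config("spark.executor.memory", "20g")   // heap per executor
    .getOrCreate()

The same three settings can also be passed to spark-submit as
--num-executors, --executor-cores and --executor-memory.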
On Tue 27 Feb, 2018, 12:15 AM Vadim Semenov wrote:
> All used cores aren't getting reported correctly in EMR, and YARN itself
> has no control over it, so whatever you p
All used cores aren't getting reported correctly in EMR, and YARN itself
has no control over it, so whatever you put in `spark.executor.cores` will
be used,
but in the ResourceManager you will only see 1 vcore used per nodemanager.
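If you want to double-check what Spark itself is using, regardless of the
1 vcore the ResourceManager UI shows, you can read it back at runtime, for
example:

  // Assuming `spark` is an existing SparkSession running on YARN.
  println(spark.conf.get("spark.executor.cores", "1"))  // what Spark actually uses ("1" is only the fallback)
  println(spark.sparkContext.defaultParallelism)        // on YARN this is roughly executors * cores per executor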
On Mon, Feb 26, 2018 at 5:20 AM, Selvam Raman wrote:
> Hi,
>
> s
Hi Fawze,
Yes, it is true that I am running in YARN mode; the 5 containers represent
4 executors and 1 master.
But I am not expecting these details, as I am already aware of this. What I
want to know is the relationship between vcores (EMR YARN) and
executor-cores (Spark).
From my slave configuration I understan
It's recommended to use executor-cores of 5.
Each executor here will utilize 20 GB, which means the Spark job will utilize
50 CPU cores and 100 GB of memory.
You cannot run more than 4 executors because your cluster doesn't have
enough memory.
You see 5 executors because 4 are for the job and one is for the master.
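Rough math, with an assumed total YARN memory just for illustration:

  // Hypothetical cluster memory, just to show why memory caps the executor count.
  val executorMemoryGb    = 20   // per-executor memory, as above
  val clusterYarnMemoryGb = 95   // assumed memory available to YARN across the nodes
  val maxExecutors        = clusterYarnMemoryGb / executorMemoryGb  // = 4 full executors fit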
Master Node details:
lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    4
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID: