It kind of depends on the application. Certain compression libraries, in
particular, are fairly lax with their use of off-heap buffers, so if
you configure executors with many cores you can end up with much higher
off-heap usage than the default overhead accounts for. Then there are
also things like PARQUET-118.
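
To make the "more cores, more off-heap" point concrete, here's a
back-of-the-envelope sketch; the per-task buffer size and core count
below are made-up numbers, not measurements:

    // YARN kills the container once heap plus off-heap usage exceeds
    // executor memory plus the configured overhead. Native buffers
    // (compression codecs, Parquet writers, netty) are typically held
    // per running task, so they scale with executor cores.
    val executorMemoryMb = 25 * 1024
    val overheadMb       = 5 * 1024
    val coresPerExecutor = 8      // assumption: a "fat" executor
    val offHeapPerTaskMb = 800    // assumption: native buffers per task

    val containerLimitMb   = executorMemoryMb + overheadMb
    val estimatedOffHeapMb = coresPerExecutor * offHeapPerTaskMb

    println(s"container limit: $containerLimitMb MB, " +
      s"estimated off-heap on top of the heap: $estimatedOffHeapMb MB")

So with fewer cores per executor (or a bigger overhead), the same job
can fit under the container limit.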
Hi,
I'm running a Spark job on YARN, using 6 executors each with 25 GB of
memory and spark.yarn.executor.memoryOverhead set to 5 GB. Despite this, I
still see YARN killing my executors for exceeding the memory limit.
Reading the docs, it looks like the overhead defaults to around 10% of the
executor memory size.
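
For reference, this is roughly how the job is configured (the app name
and master setting below are placeholders, and I could equally pass the
same properties as spark-submit flags; property names are from the
Spark 1.x docs):

    import org.apache.spark.{SparkConf, SparkContext}

    // 6 executors, 25 GB heap each, plus 5 GB of YARN overhead per executor.
    val conf = new SparkConf()
      .setAppName("my-yarn-job")                         // placeholder
      .setMaster("yarn-client")
      .set("spark.executor.instances", "6")
      .set("spark.executor.memory", "25g")
      .set("spark.yarn.executor.memoryOverhead", "5120") // MB, i.e. 5 GB
    val sc = new SparkContext(conf)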