Configuration:
Driver memory we tried: 2 GB / 4 GB / 5 GB
Executor memory we tried: 4 GB / 5 GB
We even reduced spark.memory.fraction to 0.2 (we are not using cache)
VM: 32 GB memory and 8 cores
SPARK_WORKER_MEMORY we tried: 30 GB / 24 GB
SPARK_WORKER_CORES: 32 (because the jobs are not CPU bound)
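For reference, this is roughly how we apply those settings (just a sketch: the conf/spark-env.sh location, the jar path, and the exact values shown are illustrative):

# conf/spark-env.sh on each worker machine
export SPARK_WORKER_MEMORY=24g   # memory the Worker may hand out to executors
export SPARK_WORKER_CORES=32     # cores the Worker may hand out

# Per-application memory is passed at submit time; spark.memory.fraction is
# lowered because the job does no caching.
spark-submit \
  --driver-memory 4g \
  --executor-memory 5g \
  --conf spark.memory.fraction=0.2 \
  --class com.hrishikesh.mishra.Main \
  /path/to/app.jar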
The error is in the Spark Standalone Worker. It's hitting an OOM while
launching/running an executor process. Specifically, it's running out of
memory while parsing the Hadoop configuration to figure out the
env/command line to run:
https://github.com/apache/spark/blob/branch-2.4/core/src/main
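If it really is the Worker daemon's own JVM that is overflowing (an assumption based on the description above, not something confirmed by the logs here), note that the daemon heap is sized separately from executor memory; a minimal sketch:

# conf/spark-env.sh -- heap for the standalone Master/Worker daemons themselves
# (default is 1g; SPARK_WORKER_MEMORY only caps what the Worker can hand to
# executors, it does not grow the Worker's own JVM)
export SPARK_DAEMON_MEMORY=4g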
We submit the Spark job through the spark-submit command, like the one below.
sudo /var/lib/pf-spark/bin/spark-submit \
--total-executor-cores 30 \
--driver-cores 2 \
--class com.hrishikesh.mishra.Main \
--master spark://XX.XX.XXX.19:6066 \
--deploy-mode cluster \
--supervise http://XX.XX.XXX.19:90/jar/fk-r
Hi,
It's been a while since I worked with Spark Standalone, but I'd check the
logs of the workers. How do you spark-submit the app?
Did you check the /grid/1/spark/work/driver-20200508153502-1291 directory?
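For example (purely illustrative, reusing the directory name above; the layout may differ on your setup), the driver's launch output under the worker's work dir is usually the first thing worth reading:

# on the worker machine that ran this driver
ls /grid/1/spark/work/driver-20200508153502-1291/
tail -n 100 /grid/1/spark/work/driver-20200508153502-1291/stderr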
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Bo
Thanks, Jacek, for the quick response.
Due to our system constraints, we can't move to Structured Streaming now.
But YARN can definitely be tried out.
My problem is that I'm not able to figure out where the issue is: the driver,
the executor, or the worker. Even the exceptions give no clue. Please see the
exception below:
Hi,
Sorry for being perhaps too harsh, but when you asked "Am I missing
something." and I noticed this "Kafka Direct Stream" and "Spark Standalone
Cluster", I immediately thought "Yeah... please upgrade your Spark env to
use Spark Structured Streaming at the very least and/or use YARN as the
clus
These errors give no clue at all; there is no indication of why the OOM
exception occurs.
20/05/08 15:36:55 INFO Worker: Asked to kill driver
driver-20200508153502-1291
20/05/08 15:36:55 INFO DriverRunner: Killing driver process!
20/05/08 15:36:55 INFO CommandUtils: Redirection to
/grid/1/spark/work/drive
It only happens around the Hadoop config. The exception traces are different
each time it dies, and the jobs run for a couple of hours before the worker dies.
Another reason:
20/05/02 02:26:34 ERROR SparkUncaughtExceptionHandler: Uncaught exception
in thread Thread[ExecutorRunner for app-20200501213234-9
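One way to get more signal than these truncated traces (a sketch; the dump path is a placeholder, and SPARK_DAEMON_JAVA_OPTS is the standard hook for extra daemon JVM flags) is to have the Worker daemon write a heap dump the next time it hits the OOM:

# conf/spark-env.sh -- extra JVM options for the standalone Master/Worker daemons,
# so the next OOM leaves a heap dump that can be inspected with jhat/MAT
export SPARK_DAEMON_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/grid/1/spark/dumps"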
You might want to double check your Hadoop config files. From the stack
trace it looks like this is happening when simply trying to load
configuration (XML files). Make sure they're well formed.
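A quick way to verify that (just a sketch; it assumes HADOOP_CONF_DIR points at the config directory the job actually reads) is to run every XML file through a parser such as xmllint:

# fail fast on any malformed Hadoop config file
for f in "$HADOOP_CONF_DIR"/*.xml; do
  xmllint --noout "$f" && echo "OK: $f"
done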
On Thu, May 7, 2020 at 6:12 AM Hrishikesh Mishra wrote:
> Hi
>
> I am getting out of memory error i
Hi
I am getting an out of memory error in the worker log of my streaming jobs
every couple of hours, after which the worker dies. There is no shuffle, no
aggregation, and no caching in the job; it's just a transformation.
I'm not able to identify where the problem is, the driver or the executor, and
why the worker is getting dead a