Additional information: the batch duration in my app is 1 minute. In the
Spark UI, for each batch, the gap between Output Op Duration and Job
Duration is large, e.g. Output Op Duration is 1 min while Job Duration is 19s.
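
One way to quantify this gap per batch is a StreamingListener that logs the
scheduling and processing delays. A minimal sketch (the listener API is
standard Spark Streaming; the log format is just illustrative):

import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

class BatchDelayLogger extends StreamingListener {
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    val info = batchCompleted.batchInfo
    // Delays are in milliseconds; they are Options because a batch may not
    // have started or finished yet when the event is emitted.
    println(s"batch=${info.batchTime} " +
      s"schedulingDelay=${info.schedulingDelay.getOrElse(-1L)}ms " +
      s"processingDelay=${info.processingDelay.getOrElse(-1L)}ms " +
      s"totalDelay=${info.totalDelay.getOrElse(-1L)}ms")
  }
}

// Register on the StreamingContext before ssc.start():
ssc.addStreamingListener(new BatchDelayLogger)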

2016-07-14 10:49 GMT-07:00 Renxia Wang <renxia.w...@gmail.com>:

> Hi all,
>
> I am running a Spark Streaming application with Kinesis on EMR 4.7.1. The
> application runs on YARN in client mode. There are 17 worker nodes
> (c3.8xlarge) with 100 executors and 100 receivers. This setup works fine.
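>
> The receivers are created with the usual createStream-and-union pattern
> for the ASL connector, roughly like this (app name, stream name, and
> endpoint below are placeholders, not the real values):
>
> import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
> import org.apache.spark.storage.StorageLevel
> import org.apache.spark.streaming.Seconds
> import org.apache.spark.streaming.kinesis.KinesisUtils
>
> val numReceivers = 100  // one receiver, i.e. one long-running task, per stream
> val streams = (1 to numReceivers).map { _ =>
>   KinesisUtils.createStream(
>     ssc, "my-app", "my-stream",
>     "https://kinesis.us-east-1.amazonaws.com", "us-east-1",
>     InitialPositionInStream.LATEST, Seconds(60),
>     StorageLevel.MEMORY_AND_DISK_2)
> }
> val unified = ssc.union(streams)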
>
> But when I increase the number of worker nodes to 50 and the number of
> executors to 250, with 250 receivers, the processing time per batch
> increases from ~50s to 2.3min, and the scheduler delay for tasks increases
> from ~0.2s max to 20s max (while the 75th percentile is about 2-3s).
>
> I also tried increasing only the number of executors while keeping the
> number of receivers unchanged, but I still see performance degrade from
> ~50s to 1.1min, and the scheduler delay for tasks increases from ~0.2s max
> to 4s max (while the 75th percentile is about 1s).
>
> The spark-submit command is as follows. The only parameter I changed here
> is num-executors.
>
> spark-submit
> --deploy-mode client
> --verbose
> --master yarn
> --jars /usr/lib/spark/extras/lib/spark-streaming-kinesis-asl.jar
> --driver-memory 20g --driver-cores 20
> --num-executors 250
> --executor-cores 5
> --executor-memory 8g
> --conf spark.yarn.executor.memoryOverhead=1600
> --conf spark.driver.maxResultSize=0
> --conf spark.dynamicAllocation.enabled=false
> --conf spark.rdd.compress=true
> --conf spark.streaming.stopGracefullyOnShutdown=true
> --conf spark.streaming.backpressure.enabled=true
> --conf spark.speculation=true
> --conf spark.task.maxFailures=15
> --conf spark.ui.retainedJobs=100
> --conf spark.ui.retainedStages=100
> --conf spark.executor.logs.rolling.maxRetainedFiles=1
> --conf spark.executor.logs.rolling.strategy=time
> --conf spark.executor.logs.rolling.time.interval=hourly
> --conf spark.scheduler.mode=FAIR
> --conf spark.scheduler.allocation.file=/home/hadoop/fairscheduler.xml
> --conf spark.metrics.conf=/home/hadoop/spark-metrics.properties
> --class Main /home/hadoop/Main-1.0.jar
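>
> The fairscheduler.xml referenced above just defines the FAIR pools; a
> minimal allocation file looks like the sketch below (pool name
> illustrative), and a job opts into a pool with
> sc.setLocalProperty("spark.scheduler.pool", "streaming"):
>
> <?xml version="1.0"?>
> <allocations>
>   <pool name="streaming">
>     <schedulingMode>FAIR</schedulingMode>
>     <weight>1</weight>
>     <minShare>0</minShare>
>   </pool>
> </allocations>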
>
> I found this issue, which seems relevant:
> https://issues.apache.org/jira/browse/SPARK-14327
>
> Any suggestions on how to troubleshoot this issue?
>
> Thanks,
>
> Renxia
>
>
