If I launch more executors, GC gets worse.
2015-02-06 10:47 GMT+01:00 Guillermo Ortiz:
This is an execution with 80 executors:

Metric     Min       25th percentile   Median    75th percentile   Max
Duration   31s       44s               50s       1.1 min           2.6 min
GC Time    70ms      0.1s              0.3s      4s                53s
Input      128.0 MB  128.0 MB          128.0 MB  128.0 MB          128.0 MB
I executed it with 40 executors as well:

Metric     Min       25th percentile   Median    75th percentile   Max
Duration   2
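As a rough sanity check on these numbers (assuming the 128 MB per-task input above corresponds to one HDFS block per task): the 80 GB file splits into 80 * 1024 / 128 = 640 tasks, so with 80 executors each executor processes about 8 blocks, and with 40 executors about 16.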
Yes, having many more cores than disks and all writing at the same time can
definitely cause performance issues. Though that wouldn't explain the high
GC. What percent of task time does the web UI report that tasks are
spending in GC?
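If the UI column isn't conclusive, one option (a sketch, using the standard spark.executor.extraJavaOptions property and the usual HotSpot GC-logging flags) is to turn on GC logging on the executors and read the pause times directly from each executor's stdout in the YARN container logs:

spark-submit ... \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"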
On Fri, Feb 6, 2015 at 12:56 AM, Guillermo Ortiz wrote:
Yes, it's surprising to me as well.
I tried to execute it with different configurations, for example:
sudo -u hdfs spark-submit --master yarn-client \
  --class com.mycompany.app.App \
  --num-executors 40 --executor-memory 4g \
  Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin parameters
This is
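Note that --executor-cores is not set here; assuming the usual YARN-mode default (spark.executor.cores = 1), this runs 40 single-core executors with 4 GB each. An explicit variant would look like:

sudo -u hdfs spark-submit --master yarn-client \
  --class com.mycompany.app.App \
  --num-executors 40 --executor-cores 1 --executor-memory 4g \
  Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin parameters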
That's definitely surprising to me that you would be hitting a lot of GC
for this scenario. Are you setting --executor-cores and
--executor-memory? What are you setting them to?
-Sandy
On Thu, Feb 5, 2015 at 10:17 AM, Guillermo Ortiz wrote:
Any idea why, if I use more containers, I get a lot of stops because of GC?
2015-02-05 8:59 GMT+01:00 Guillermo Ortiz:
I'm not caching the data. With "each iteration" I mean each 128 MB block that an executor has to process.
The code is pretty simple.
final Conversor c = new Conversor(null, null, null, longFields, typeFields);
SparkConf conf = new SparkConf().setAppName("Simple Application");
JavaSparkContext sc = new JavaSparkContext(conf);
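The snippet is cut off after the JavaSparkContext line. A minimal sketch of the block-by-block processing being described might look like the following; everything below is hypothetical (the textFile input, the convert method, and the output path are illustrative, not the original code):

// requires: import org.apache.spark.api.java.JavaRDD;
//           import org.apache.spark.api.java.function.Function;
JavaRDD<String> lines = sc.textFile("hdfs://ip:8020/tmp/sparkTest/file22.bin");
JavaRDD<String> converted = lines.map(new Function<String, String>() {
  public String call(String line) {
    return c.convert(line); // hypothetical Conversor method applied per record
  }
});
// Conversor would need to be Serializable to ship with the closure.
converted.saveAsTextFile("hdfs://ip:8020/tmp/sparkTest/out");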
Hi Guillermo,
What exactly do you mean by "each iteration"? Are you caching data in
memory?
-Sandy
On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz wrote:
I execute a job in Spark where I'm processing an 80 GB file in HDFS.
I have 5 slaves:
(32 cores / 256 GB / 7 physical disks) x 5
I have been trying many different configurations with YARN:
yarn.nodemanager.resource.memory-mb 196 GB
yarn.nodemanager.resource.cpu-vcores 24
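One unit detail worth noting (about the property itself, not this cluster's actual files): yarn.nodemanager.resource.memory-mb takes its value in megabytes, so a 196 GB allowance would be configured as:

yarn.nodemanager.resource.memory-mb  200704    (196 * 1024 MB)
yarn.nodemanager.resource.cpu-vcores 24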
I have tried to execute the j