'user@spark.apache.org'
Subject: RE: General configurations on CDH5 to achieve maximum Spark Performance
Essentially, to change the performance yield of a software cluster infrastructure platform like Spark, you play with different permutations of (see the config sketch below):
- Number of CPU cores used by Spark
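(Not from the original thread, just for illustration: knobs like these are typically set through SparkConf, or equivalently via spark-submit flags. The values below are placeholders, not recommendations for any particular cluster.)

import org.apache.spark.{SparkConf, SparkContext}

object TuningKnobsSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder values only; the right numbers depend on the cluster and the job.
    val conf = new SparkConf()
      .setAppName("tuning-knobs-sketch")
      .set("spark.executor.cores", "4")      // CPU cores per executor
      .set("spark.executor.memory", "8g")    // heap size per executor
      .set("spark.executor.instances", "10") // number of executors when running on YARN
    val sc = new SparkContext(conf)
    // ... job logic would go here ...
    sc.stop()
  }
}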
I don't think there's anything specific to CDH that you need to know,
other than it ought to set things up sanely for you.
Sandy did a couple posts about tuning:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-
-on-yarn.html
From: Manish Gupta 8 [mailto:mgupt...@sapient.com]
Sent: Thursday, April 16, 2015 6:21 PM
To: Evo Eftimov; user@spark.apache.org
Subject: RE: General configurations on CDH5 to achieve maximum Spark Performance
Thanks Evo. Yes, my concern is only regarding the infrastructure.
Thanks,
Manish
From: Evo Eftimov [mailto:evo.efti...@isecc.com]
Sent: Thursday, April 16, 2015 10:38 PM
To: Manish Gupta 8; user@spark.apache.org
Subject: RE: General configurations on CDH5 to achieve maximum Spark Performance
Well, there are a number of performance tuning guidelines in dedicated sections of the Spark documentation - have you read and applied them?
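(Purely illustrative, not part of Evo's message: a few of the settings those documentation sections cover, with placeholder values.)

import org.apache.spark.SparkConf

object TuningGuideSketch {
  // Placeholder values; the Spark tuning guide covers serialization, memory and parallelism.
  val conf = new SparkConf()
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") // Kryo: faster, more compact serialization
    .set("spark.rdd.compress", "true")             // compress serialized RDD partitions in memory
    .set("spark.default.parallelism", "200")       // default number of partitions for shuffles
    .set("spark.shuffle.consolidateFiles", "true") // consolidate shuffle output files (Spark 1.x era setting)
}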
Secondly, any performance problem within a distributed cluster environment has two aspects:
1. Infrastructure
2. App Algorithms
You s