Re: Performance issue when running Spark-1.6.1 in yarn-client mode with Hadoop 2.6.0

2017-06-08 Thread Satish John Bosco

I have tried the configuration calculator sheet provided by Cloudera as well, but saw no improvement. However, setting aside the 17 million record operation to begin with, let us consider a simple sort on YARN and Spark, which shows a tremendous difference. The operation is simple: selected numeric col to be sorted ascendi

Re: Performance issue when running Spark-1.6.1 in yarn-client mode with Hadoop 2.6.0

2017-06-06 Thread Jörn Franke

What does your Spark job do? Have you tried the standard configuration and changing it gradually? Have you checked the log files/UI to see which tasks take long? 17 million records does not sound like much, but it depends on what you do with them. I do not think that for such a small "cluster" it makes sense to hav