This PR adds support for multiple executors per worker:
https://github.com/apache/spark/pull/731
It should be available in 1.4.
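As a rough sketch of what that enables (hypothetical sizes; assuming the
spark.executor.cores-based mechanism from that PR), a 16-core standalone
worker could then host several smaller executors instead of one large one:

    # hedged sketch, Spark standalone 1.4+; master URL and sizes are
    # made-up values for illustration
    ./bin/spark-submit \
      --master spark://master:7077 \
      --conf spark.executor.cores=4 \
      --conf spark.executor.memory=8g \
      app.jar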
Thanks,
Nishkam
On Wed, Jun 10, 2015 at 1:35 PM, Evo Eftimov wrote:
> We were discussing STANDALONE mode; besides, maxdml had already
> summarized what is available
> ...it's since fixed? I'm on 1.0.1 and using 'yarn-cluster' as the
> master. 'yarn-client' seems to pick up the values and works fine.
>
> Greg
>
> From: Nishkam Ravi
> Date: Monday, September 22, 2014 3:30 PM
> To: Greg
> Cc: Andrew Or
Greg, if you look carefully, the code is enforcing that the memoryOverhead
be lower (and not higher) than spark.driver.memory.
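For instance (a hedged sketch with made-up sizes; the overhead is given in
MB), a yarn-cluster submission that keeps the overhead well below the driver
memory passes that check:

    # hedged example; values are illustrative only
    spark-submit --master yarn-cluster \
      --driver-memory 4g \
      --conf spark.yarn.driver.memoryOverhead=512 \
      app.jar
    # 512 MB of overhead < 4g of driver memory, so the check passes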
Thanks,
Nishkam
On Mon, Sep 22, 2014 at 1:26 PM, Greg Hill wrote:
> I thought I had this all figured out, but I'm getting some weird errors
> now that I'm attempting t
Can you share more details about your job, cluster properties and
configuration parameters?
Thanks,
Nishkam
On Fri, Aug 29, 2014 at 11:33 AM, Chirag Aggarwal <
chirag.aggar...@guavus.com> wrote:
> When I run SparkSQL over YARN, it runs 2-4 times slower as compared to
> when it's run in local mode
See if this helps:
https://github.com/nishkamravi2/SparkAutoConfig/
It's a very simple tool for auto-configuring default parameters in Spark.
Takes as input high-level parameters (like number of nodes, cores per node,
memory per node, etc.) and spits out default configuration, user advice and
comm
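To give a feel for the kind of arithmetic such a tool automates (the numbers
below are made up and are not the tool's actual output):

    # illustrative sizing arithmetic only, not SparkAutoConfig's real output
    # assume 4 nodes with 16 cores / 64g RAM each; reserve 1 core and 1g per
    # node for the OS and daemons
    #   executors per node  = 15 cores / 5 cores per executor = 3
    #   memory per executor = 63g / 3 = 21g, minus ~7% YARN overhead => ~19g
    --conf spark.executor.cores=5
    --conf spark.executor.memory=19g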
I think two small JVMs would often beat a large one due to lower GC
overhead.
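Concretely (made-up sizes, and assuming the multiple-executors-per-worker
support discussed above), the trade-off on a 32g, 16-core node looks like:

    # two smaller executors per node: smaller heaps, shorter GC pauses
    --conf spark.executor.memory=15g --conf spark.executor.cores=8
    # versus one large executor per node: one big heap, longer full GCs
    --conf spark.executor.memory=30g --conf spark.executor.cores=16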
Spark-on-YARN takes 10-30 seconds of setup time for workloads like
WordCount and PageRank on a small-sized cluster and thereafter performs as
well as Spark standalone, as has been noted by Tom and Patrick. However, a
certain amount of configuration/tuning effort is required to match peak
performance.