Sent: Friday, January 16, 2015 9:44 AM
To: Wang, Ningjun (LNG-NPV)
Cc: Sean Owen; user@spark.apache.org
Subject: Re: How to force parallel processing of RDD using multiple thread
Spark will use the number of cores available in the cluster. If your cluster is
1 node with 4 cores, Spark will execute up to 4 tasks in parallel.
> > Consulting Software Engineer
> > LexisNexis
> > 121 Chanlon Road
> > New Providence, NJ 07974-1541
> >
> >
> > -----Original Message-----
> > From: Sean Owen [mailto:so...@cloudera.com]
> > Sent: Thursday, January 15, 2015 4:29 PM
> > To: Wang, Ningjun (LNG-NPV)
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: How to force parallel processing of RDD using multiple thread
Check the number of partitions in your input. It may be much less than the
available parallelism of your small cluster. For example, input that lives in
just 1 partition will be processed by a single task, no matter how many cores
the cluster has.
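As a concrete illustration of the point about partitions, here is a minimal
spark-shell (Scala) sketch, assuming a live SparkContext `sc`; the input path
is illustrative:

```scala
// A file read into a single partition can only be processed by one
// task (one core), no matter how many cores the cluster has.
val rdd = sc.textFile("input.txt")   // illustrative path
println(rdd.partitions.length)       // may be as low as 1 for a small file

// Redistribute the data across 4 partitions so that up to 4 tasks
// (one per core on a 4-core node) can run in parallel.
val rdd4 = rdd.repartition(4)
println(rdd4.partitions.length)      // 4
```

Many input methods also take a minimum-partitions argument directly, e.g.
`sc.textFile("input.txt", 4)`.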
-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Thursday, January 15, 2015 4:29 PM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: How to force parallel processing of RDD using multiple thread
What is your cluster manager? For example on YARN you would specify
--executor-cores. Read:
http://spark.apache.org/docs/latest/running-on-yarn.html
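Concretely, the flag differs by cluster manager; a sketch of the two
invocations (host name, core counts, and jar name are illustrative):

```
# YARN: cores per executor
spark-submit --master yarn --executor-cores 4 myapp.jar

# Standalone: cap on the total cores the application may use
spark-submit --master spark://master-host:7077 --total-executor-cores 4 myapp.jar
```

On a one-node standalone cluster the worker offers all of its cores by
default, so parallelism is usually limited by the partition count rather than
by these flags.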
On Thu, Jan 15, 2015 at 8:54 PM, Wang, Ningjun (LNG-NPV) wrote:
> I have a standalone spark cluster with only one node with 4 CPU cores. How
> can I