RE: How to force parallel processing of RDD using multiple thread

2015-01-16 Thread Wang, Ningjun (LNG-NPV)
… Sent: Friday, January 16, 2015 9:44 AM To: Wang, Ningjun (LNG-NPV) Cc: Sean Owen; user@spark.apache.org Subject: Re: How to force parallel processing of RDD using multiple thread / Spark will use the number of cores available in the cluster. If your cluster is 1 node with 4 cores, Spark will execute up to …
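The point quoted above can be checked from the driver: Spark reports the total cores it sees as its default parallelism. A minimal sketch (the app name and master URL are placeholders, not from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical minimal driver for a standalone cluster with one 4-core
// worker: Spark can run up to 4 tasks concurrently, one per core.
val conf = new SparkConf()
  .setAppName("parallelism-check")
  .setMaster("spark://master-host:7077") // placeholder master URL

val sc = new SparkContext(conf)

// defaultParallelism reflects the total cores available to the application
println(s"default parallelism = ${sc.defaultParallelism}")
```

With a single 4-core worker this would typically print 4, which is the upper bound on concurrently running tasks regardless of how the RDD is partitioned.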

Re: How to force parallel processing of RDD using multiple thread

2015-01-16 Thread Gerard Maas
> Consulting Software Engineer > > LexisNexis > > 121 Chanlon Road > > New Providence, NJ 07974-1541 > > -----Original Message----- > > From: Sean Owen [mailto:so...@cloudera.com] > > Sent: Thursday, January 15, 2015 4:29 PM > > To: Wang, …

RE: How to force parallel processing of RDD using multiple thread

2015-01-16 Thread Wang, Ningjun (LNG-NPV)
… To: Wang, Ningjun (LNG-NPV) Cc: user@spark.apache.org Subject: Re: How to force parallel processing of RDD using multiple thread / Check the number of partitions in your input. It may be much less than the available parallelism of your small cluster. For example, input that lives in just 1 partition …
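The advice above (a 1-partition input runs as a single task no matter how many cores exist) can be sketched as follows, assuming an existing `SparkContext` named `sc`; the input path is hypothetical:

```scala
// Sketch: inspect partitioning, then spread the data across more tasks.
val rdd = sc.textFile("/data/input.txt") // hypothetical path; a small
                                         // file may load as 1 partition
println(rdd.getNumPartitions)

// Repartition to match the 4 cores of the single-node cluster so up to
// 4 tasks can run in parallel on subsequent stages.
val spread = rdd.repartition(4)
println(spread.getNumPartitions) // 4
```

Alternatively, sources such as `sc.textFile(path, minPartitions)` accept a minimum partition count up front, which avoids the shuffle that `repartition` incurs.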

Re: How to force parallel processing of RDD using multiple thread

2015-01-15 Thread Sean Owen
… > To: Wang, Ningjun (LNG-NPV) > Cc: user@spark.apache.org > Subject: Re: How to force parallel processing of RDD using multiple thread > > What is your cluster manager? For example on YARN you would specify > --executor-cores. Read: > http://spark.apache.org/docs/latest/running-on-yarn.html

RE: How to force parallel processing of RDD using multiple thread

2015-01-15 Thread Wang, Ningjun (LNG-NPV)
… New Providence, NJ 07974-1541 -----Original Message----- From: Sean Owen [mailto:so...@cloudera.com] Sent: Thursday, January 15, 2015 4:29 PM To: Wang, Ningjun (LNG-NPV) Cc: user@spark.apache.org Subject: Re: How to force parallel processing of RDD using multiple thread / What is your cluster manager? …

Re: How to force parallel processing of RDD using multiple thread

2015-01-15 Thread Sean Owen
What is your cluster manager? For example on YARN you would specify --executor-cores. Read: http://spark.apache.org/docs/latest/running-on-yarn.html On Thu, Jan 15, 2015 at 8:54 PM, Wang, Ningjun (LNG-NPV) wrote: > I have a standalone spark cluster with only one node with 4 CPU cores. How can I …
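For the YARN case mentioned above, cores per executor are set at submit time. A sketch of the invocation (the class name, jar, and counts are placeholders, not from the thread):

```shell
# On YARN, request cores per executor explicitly at submit time.
# --executor-cores caps the concurrent tasks each executor can run.
spark-submit \
  --master yarn \
  --executor-cores 4 \
  --num-executors 2 \
  --class com.example.MyApp \
  my-app.jar
```

On a standalone cluster, by contrast, an application by default grabs all available cores on its workers, so the single-node 4-core setup described in the thread is already limited more by the input's partition count than by core allocation.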