Hi myasuka,

Have you checked the JVM GC time of each executor?
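If not, one quick way to check is the "GC Time" column in the task table on each stage page of the web UI. You can also turn on GC logging for the executors, for example by passing the standard GC-logging JVM flags through spark.executor.extraJavaOptions on spark-submit (just an illustration, not something from this thread):

    --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"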
I think you should increase SPARK_EXECUTOR_CORES or SPARK_EXECUTOR_INSTANCES until you get enough concurrency. Here is my recommended config:

SPARK_EXECUTOR_CORES=8
SPARK_EXECUTOR_INSTANCES=4
SPARK_WORKER_MEMORY=8G

Note: make sure you have enough memory on each node, more than SPARK_EXECUTOR_INSTANCES * SPARK_WORKER_MEMORY.

Best Regards,

Yi Tian
tianyi.asiai...@gmail.com

On Sep 29, 2014, at 21:06, myasuka <myas...@live.com> wrote:

> Our cluster is a standalone cluster with 16 computing nodes, and each node has
> 16 cores. When I set SPARK_WORKER_INSTANCES to 1 and SPARK_WORKER_CORES to 32,
> with 512 tasks in total, this setup helps increase the concurrency. But if I set
> SPARK_WORKER_INSTANCES to 2 and SPARK_WORKER_CORES to 16, it doesn't work well.
>
> Thank you for your reply.
>
>
> Yi Tian wrote
>> For yarn-client mode:
>>
>> SPARK_EXECUTOR_CORES * SPARK_EXECUTOR_INSTANCES = 2 (or 3) * TotalCoresOnYourCluster
>>
>> For standalone mode:
>>
>> SPARK_WORKER_INSTANCES * SPARK_WORKER_CORES = 2 (or 3) * TotalCoresOnYourCluster
>>
>>
>> Best Regards,
>>
>> Yi Tian
>> tianyi.asiainfo@
>>
>> On Sep 28, 2014, at 17:59, myasuka <myasuka@> wrote:
>>
>>> Hi everyone,
>>>
>>> I have come across a problem with increasing the concurrency. In a program,
>>> after the shuffle write, each node should fetch 16 pairs of matrices to do
>>> matrix multiplication, such as:
>>>
>>> import breeze.linalg.{DenseMatrix => BDM}
>>>
>>> pairs.map(t => {
>>>   val b1 = t._2._1.asInstanceOf[BDM[Double]]
>>>   val b2 = t._2._2.asInstanceOf[BDM[Double]]
>>>
>>>   val c = (b1 * b2).asInstanceOf[BDM[Double]]
>>>
>>>   (new BlockID(t._1.row, t._1.column), c)
>>> })
>>>
>>> Each node has 16 cores. However, no matter whether I set 16 tasks or more on
>>> each node, the concurrency cannot get higher than 60%, which means not every
>>> core on the node is computing. When I check the running log on the web UI,
>>> judging by the amount of shuffle read and write in every task, I see that some
>>> tasks do one matrix multiplication, some do two, while some do none.
>>>
>>> So I thought of using Java multi-threading to increase the concurrency. I
>>> wrote a program in Scala that calls Java multi-threading without Spark on a
>>> single node; watching the 'top' monitor, I found this program can push CPU
>>> usage up to 1500% (meaning nearly every core is computing). But I have no idea
>>> how to use Java multi-threading in an RDD transformation.
>>>
>>> Can anyone provide some example code that uses Java multi-threading in an
>>> RDD transformation, or give any idea to increase the concurrency?
>>>
>>> Thanks to all.
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-use-multi-thread-in-RDD-map-function-tp8583.html
>>> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-use-multi-thread-in-RDD-map-function-tp8583p8594.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
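On the original question about using Java multi-threading inside an RDD transformation: one common pattern is to do the threading per partition with mapPartitions, so each task multiplies several matrix pairs concurrently on a local thread pool. Below is a minimal sketch of that pattern, not code from this thread; it assumes the same pairs RDD and BlockID key type as in the quoted code, and the pool size of 4 is an arbitrary example value.

    import java.util.concurrent.Executors
    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration.Duration
    import breeze.linalg.{DenseMatrix => BDM}

    val result = pairs.mapPartitions { iter =>
      // One local thread pool per task; 4 threads is an arbitrary example value.
      val pool = Executors.newFixedThreadPool(4)
      implicit val ec = ExecutionContext.fromExecutorService(pool)

      // Submit one Future per matrix pair; toList forces all of them to start.
      val futures = iter.map { t =>
        Future {
          val b1 = t._2._1.asInstanceOf[BDM[Double]]
          val b2 = t._2._2.asInstanceOf[BDM[Double]]
          (new BlockID(t._1.row, t._1.column), b1 * b2)
        }
      }.toList

      // Wait for every multiplication to finish, then release the pool.
      val results = futures.map(f => Await.result(f, Duration.Inf))
      pool.shutdown()
      results.iterator
    }

That said, the simpler first step is the one suggested above: give Spark more tasks and cores rather than adding threads inside each task. With 16 nodes of 16 cores (256 cores total), the 2 (or 3) * TotalCores rule works out to roughly 512 to 768 partitions, which matches the 512 tasks already mentioned in the thread; repartitioning pairs to that count and letting Spark's own scheduling fill the cores is usually enough.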