Our cluster is a standalone cluster with 16 compute nodes, each with 16 cores. When I set SPARK_WORKER_INSTANCES to 1 and SPARK_WORKER_CORES to 32, with 512 tasks in total, this setup does increase concurrency. But if I set SPARK_WORKER_INSTANCES to 2 and SPARK_WORKER_CORES to 16, it doesn't work as well.
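For reference, the layout that worked for us can be written in conf/spark-env.sh like this (a sketch only; the 2x oversubscription follows the rule of thumb below, and the right factor is cluster-specific):

```shell
# conf/spark-env.sh on each of the 16 worker nodes (16 physical cores each).
# One worker per node, advertising 2x the physical core count so the
# scheduler can overlap compute with shuffle I/O:
SPARK_WORKER_INSTANCES=1
SPARK_WORKER_CORES=32
```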
Thank you for your reply.

Yi Tian wrote:
> For yarn-client mode:
>
>     SPARK_EXECUTOR_CORES * SPARK_EXECUTOR_INSTANCES = 2 (or 3) * TotalCoresOnYourCluster
>
> For standalone mode:
>
>     SPARK_WORKER_INSTANCES * SPARK_WORKER_CORES = 2 (or 3) * TotalCoresOnYourCluster
>
> Best Regards,
> Yi Tian
> tianyi.asiainfo@
>
> On Sep 28, 2014, at 17:59, myasuka <myasuka@> wrote:
>
>> Hi, everyone,
>>
>> I have come across a problem with increasing concurrency. In a program,
>> after the shuffle write, each node should fetch 16 pairs of matrices to
>> do matrix multiplication, such as:
>>
>>     import breeze.linalg.{DenseMatrix => BDM}
>>
>>     pairs.map(t => {
>>       val b1 = t._2._1.asInstanceOf[BDM[Double]]
>>       val b2 = t._2._2.asInstanceOf[BDM[Double]]
>>       val c = (b1 * b2).asInstanceOf[BDM[Double]]
>>       (new BlockID(t._1.row, t._1.column), c)
>>     })
>>
>> Each node has 16 cores. However, no matter whether I set 16 tasks or
>> more per node, CPU utilization cannot get above 60%, which means not
>> every core on the node is computing. When I check the running log on the
>> WebUI, judging by the amount of shuffle read and write in each task, I
>> see some tasks do one matrix multiplication, some do two, while some do
>> none.
>>
>> So I thought of using Java multithreading to increase concurrency. I
>> wrote a Scala program that calls Java multithreading without Spark on a
>> single node; watching the 'top' monitor, I found this program can drive
>> CPU usage up to 1500% (meaning nearly every core is computing). But I
>> have no idea how to use Java multithreading inside an RDD
>> transformation.
>>
>> Can anyone provide some example code that uses Java multithreading in
>> an RDD transformation, or suggest any other way to increase concurrency?
>>
>> Thanks for all
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-use-multi-thread-in-RDD-map-function-tp8583.html
>> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-use-multi-thread-in-RDD-map-function-tp8583p8594.html
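To the original question about threading inside a transformation: one pattern is to switch from map to mapPartitions and run the per-element work on a local thread pool, so a single Spark task can keep several cores busy. This is only a sketch against the snippet above (BlockID and the (key, (matrix, matrix)) layout are taken from it; the pool size and blocking semantics are my assumptions, not a tested implementation):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import breeze.linalg.{DenseMatrix => BDM}

pairs.mapPartitions { iter =>
  // Local pool inside the task; size it to the cores you want one task to use.
  val pool = Executors.newFixedThreadPool(16)
  implicit val ec = ExecutionContext.fromExecutorService(pool)

  // Submit every multiplication in this partition to the pool.
  val futures = iter.map { t =>
    Future {
      val b1 = t._2._1.asInstanceOf[BDM[Double]]
      val b2 = t._2._2.asInstanceOf[BDM[Double]]
      (new BlockID(t._1.row, t._1.column), b1 * b2)
    }
  }.toList // force submission; iter is lazy otherwise

  // Block until the whole partition is multiplied, then release the pool.
  val results = Await.result(Future.sequence(futures), Duration.Inf)
  pool.shutdown()
  results.iterator
}
```

That said, before hand-rolling threads it is usually simpler to give the scheduler more tasks than cores (the 2-3x rule quoted above), e.g. by repartitioning the RDD, and let Spark itself interleave compute with shuffle fetch.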