The weight vector is usually dense, so if you have many partitions, the driver may slow down. You can also check the driver's memory usage in the Executors tab of the web UI. Another thing to check is the HDFS block size and whether the input data is evenly distributed across the executors. Are the hardware specs the same for the two clusters?

-Xiangrui
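
To put rough numbers on the driver concern, here is a minimal back-of-the-envelope sketch in Python. It assumes each partition returns one dense gradient of 8-byte doubles to the driver per iteration, which is an assumption about how the aggregation works and is not confirmed in this thread. The partition counts, the record counts, and the helper names are all hypothetical; only the 1M feature dimension comes from the original message.

```python
# Back-of-the-envelope check, not a measurement.
# ASSUMPTION: every partition ships one dense gradient vector of `dim`
# doubles to the driver on each iteration.

BYTES_PER_DOUBLE = 8


def driver_bytes_per_iteration(dim, num_partitions):
    """Bytes the driver receives per iteration under the assumption above."""
    return dim * BYTES_PER_DOUBLE * num_partitions


def partition_skew(sizes):
    """Largest partition size divided by the mean size (1.0 means even)."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


if __name__ == "__main__":
    dim = 1_000_000  # feature dimension mentioned in the thread (1M+)
    for parts in (100, 1000):  # hypothetical partition counts
        mb = driver_bytes_per_iteration(dim, parts) / 1e6
        print(f"{parts} partitions: about {mb:.0f} MB reach the driver per iteration")

    # Hypothetical per-partition record counts, e.g. read off the stage page.
    print("skew:", partition_skew([30, 10, 10, 10]))
```

If the estimate is large relative to the driver's memory or network bandwidth, using fewer partitions is one thing to try. A skew value well above 1.0 would suggest the input is not evenly distributed.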
On Tue, Jul 29, 2014 at 10:46 PM, Tan Tim <unname...@gmail.com> wrote:
> The application is Logistic Regression (OWLQN); we developed a sparse vector
> version. The feature dimension is 1M+, but it is very sparse. This application
> runs fine on another Spark cluster, where every stage takes about 50 seconds
> and every executor has high CPU usage. The only difference is the OS (the
> faster one is Ubuntu, and the slower one is CentOS).