Re: why a machine learning application run slowly on the spark cluster

Xiangrui Meng Tue, 29 Jul 2014 22:53:31 -0700

The weight vector is usually dense, so if you have many partitions,
the driver may slow down. You can also check the driver's memory
usage in the Executors tab of the web UI. Another thing to check is
the HDFS block size and whether the input data is evenly distributed
across the executors. Are the hardware specs the same for the two
clusters? -Xiangrui
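
To put rough numbers on the two checks above, here is a minimal
back-of-the-envelope sketch in plain Python (no Spark needed). It
assumes each partition sends one dense partial gradient of 8-byte
doubles straight to the driver on every iteration. The function names,
the 500-partition figure, and the record counts are hypothetical
illustrations, not values from the original job.

```python
BYTES_PER_DOUBLE = 8  # a dense vector stores one 8-byte double per feature


def dense_vector_bytes(num_features):
    """Size in bytes of one dense weight/gradient vector."""
    return num_features * BYTES_PER_DOUBLE


def driver_bytes_per_iteration(num_features, num_partitions):
    """Bytes the driver receives per iteration.

    Assumption: every partition returns one dense partial gradient
    directly to the driver (no tree-style pre-aggregation).
    """
    return dense_vector_bytes(num_features) * num_partitions


def skew_ratio(partition_sizes):
    """Largest partition divided by the mean partition size.

    A value near 1.0 means the input is evenly distributed; a large
    value means a few tasks do most of the work.
    """
    mean = sum(partition_sizes) / len(partition_sizes)
    return max(partition_sizes) / mean


if __name__ == "__main__":
    # 1M features as in the thread; 500 partitions is a made-up example.
    total = driver_bytes_per_iteration(1_000_000, 500)
    print("driver receives %.1f GB per iteration" % (total / 1e9))

    # Hypothetical per-partition record counts.
    print("skew ratio: %.2f" % skew_ratio([100, 100, 100, 300]))
```

Under these assumptions a single dense vector is only 8 MB, but the
driver's share grows linearly with the number of partitions, so using
fewer, larger partitions reduces the load on the driver.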


On Tue, Jul 29, 2014 at 10:46 PM, Tan Tim <unname...@gmail.com> wrote:
> The application is Logistic Regression (OWLQN); we developed a sparse vector
> version. The feature dimension is 1M+, but it's very sparse. This application
> runs on another Spark cluster, where every stage takes about 50 seconds and
> every executor has high CPU usage. The only difference is the OS (the faster
> one is Ubuntu, and the slower one is CentOS).
