For standalone and YARN modes, you need to install the native libraries on all
nodes. The best solution is to install them as /usr/lib/libblas.so.3 and
/usr/lib/liblapack.so.3. If your matrix is sparse, the native libraries cannot
help, because they are for dense linear algebra. You can create an RDD of
sparse rows and try k-means directly; it supports sparse input. -Xiangrui
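
P.S. A minimal sketch of the sparse route, assuming Spark 1.0's MLlib API
(sc is a live SparkContext; the dimensions and vector contents are
placeholders):

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    // Sparse vectors: dimension, indices of non-zeros, values of non-zeros.
    // No dense conversion, so no native BLAS/LAPACK is needed here.
    val data = sc.parallelize(Seq(
      Vectors.sparse(2487, Array(0, 5, 100), Array(1.0, 2.0, 3.0)),
      Vectors.sparse(2487, Array(3, 42), Array(1.0, 1.0))
    )).cache()

    val model = KMeans.train(data, 10, 20) // k = 10, maxIterations = 20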


> On Jun 5, 2014, at 2:36 AM, yangliuyu <yangli...@163.com> wrote:
> 
> Hi,
> 
> We're using MLlib (the 1.0.0 release) on a k-means clustering problem.
> We want to reduce the matrix column count before sending the points to the
> k-means solver.
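> 
> For context, this is roughly how we do the reduction (a sketch against
> Spark 1.0's RowMatrix API; rows stands for our parsed RDD[Vector], and 50
> is a placeholder for the target column count):
> 
>     import org.apache.spark.mllib.linalg.distributed.RowMatrix
> 
>     val mat = new RowMatrix(rows)
>     // computePrincipalComponents calls into LAPACK, hence the warnings below.
>     val pc = mat.computePrincipalComponents(50)
>     // Project the points into the reduced space before running k-means.
>     val reduced = mat.multiply(pc).rows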
> 
> It works on my Mac in local mode: spark-test-run-assembly-1.0.jar
> contains my application code, the com.github.fommil netlib code, and the
> netlib-native*.so files (including the .jnilib and .dll files).
> 
> spark-submit --class test.TestMllibPCA --master local[4] --executor-memory
> 3g --driver-memory 3g --driver-class-path
> /data/user/dump/spark-test-run-assembly-1.0.jar
> /data/user/dump/spark-test-run-assembly-1.0.jar
> /data/user/dump/user_fav_2014_04_09.csv.head1w 
> 
> But if --driver-class-path is removed, these warnings appear:
> 14/06/05 16:36:20 WARN LAPACK: Failed to load implementation from:
> com.github.fommil.netlib.NativeSystemLAPACK
> 14/06/05 16:36:20 WARN LAPACK: Failed to load implementation from:
> com.github.fommil.netlib.NativeRefLAPACK
> 
> Alternatively, setting SPARK_CLASSPATH=/data/user/dump/spark-test-run-assembly-1.0.jar
> also solves the problem.
> 
> The matrix contains sparse data with 6778 rows and 2487 columns, and
> computing the PCA takes 10s with the native library and 47s without it,
> which suggests the native library works well.
> 
> Then I wanted to test it on a Spark standalone cluster (on CentOS), but it
> failed again.
> After changing the JDK logging level to FINEST, I got these messages:
> 14/06/05 16:19:15 INFO JniLoader: JNI LIB =
> netlib-native_system-linux-x86_64.so
> 14/06/05 16:19:15 INFO JniLoader: extracting
> jar:file:/data/user/dump/spark-test-run-assembly-1.0.jar!/netlib-native_system-linux-x86_64.so
> to /tmp/jniloader6648403281987654682netlib-native_system-linux-x86_64.so
> 14/06/05 16:19:15 WARN LAPACK: Failed to load implementation from:
> com.github.fommil.netlib.NativeSystemLAPACK
> 14/06/05 16:19:15 INFO JniLoader: JNI LIB =
> netlib-native_ref-linux-x86_64.so
> 14/06/05 16:19:15 INFO JniLoader: extracting
> jar:file:/data/user/dump/spark-test-run-assembly-1.0.jar!/netlib-native_ref-linux-x86_64.so
> to /tmp/jniloader2298588627398263902netlib-native_ref-linux-x86_64.so
> 14/06/05 16:19:16 WARN LAPACK: Failed to load implementation from:
> com.github.fommil.netlib.NativeRefLAPACK
> 14/06/05 16:19:16 INFO LAPACK: Implementation provided by class
> com.github.fommil.netlib.F2jLAPACK
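> 
> One check that can be run from the spark-shell to see which implementation
> each side resolved (a sketch using the netlib-java API that the warnings
> above come from):
> 
>     import com.github.fommil.netlib.LAPACK
> 
>     // Driver side.
>     println(LAPACK.getInstance().getClass.getName)
>     // Executor side: collect the distinct implementation names.
>     sc.parallelize(1 to 100, 10)
>       .map(_ => LAPACK.getInstance().getClass.getName)
>       .distinct()
>       .collect()
>       .foreach(println)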
> 
> libgfortran, ATLAS, BLAS, LAPACK, and ARPACK are all installed, and all of
> the .so files are located under /usr/lib64. spark.executor.extraLibraryPath
> is set to /usr/lib64 in conf/spark-defaults.conf, but none of this works. I
> also tried adding --jars /data/user/dump/spark-test-run-assembly-1.0.jar,
> with no luck.
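> 
> For reference, the relevant lines of my conf/spark-defaults.conf (the
> executor property is what I actually set; the driver-side property is a
> guess on my part that it may also be needed):
> 
>     spark.executor.extraLibraryPath  /usr/lib64
>     spark.driver.extraLibraryPath    /usr/lib64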
> 
> What should I try next?
> 
> Does the native library need to be visible to both the driver and the
> executors? In local mode this looks like a classpath problem, but for
> standalone and YARN modes it gets more complex. A detailed document would
> really help.
> 
> Thanks.
> 
> 
> 
