See Sean Owen's post:
http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
Best Regards,
Shixiong Zhu
2014-12-14 16:35 GMT+08:00 Yanbo:
In #1, class HTable is not serializable.
You also need to check that your self-defined function getUserActions is serializable.
The scenario is using an HTable instance to scan multiple rowkey ranges in Spark
tasks, which looks like below:

Option 1:

val users = input
  .map { case (deviceId, uid) => uid }
  .distinct()
  .sortBy(x => x)
  .mapPartitions { iterator =>
    val conf = HBaseConfiguration.create()
    val table = new HTable(conf, "user_actions") // table name is illustrative; the original message is truncated here
    iterator.map(uid => getUserActions(table, uid))
  }
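For reference, here is a fuller sketch of the same per-partition pattern that also shows a rowkey-range Scan and resource cleanup. The table name, rowkey scheme, and returned fields are illustrative assumptions, not from the original thread:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Scan}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConverters._

val uids = input.map { case (_, uid) => uid }.distinct().sortBy(x => x)

val actions = uids.mapPartitions { iter =>
  // HTable is not serializable, so it must be created inside the task
  val conf = HBaseConfiguration.create()
  val table = new HTable(conf, "user_actions") // hypothetical table name
  // Materialize eagerly so the table can be closed before the task returns
  val out = iter.toList.flatMap { uid =>
    val scan = new Scan()
      .setStartRow(Bytes.toBytes(uid))
      .setStopRow(Bytes.toBytes(uid + "~")) // placeholder stop-key convention
    val scanner = table.getScanner(scan)
    val rows = scanner.asScala.map(r => Bytes.toString(r.getRow)).toList
    scanner.close()
    rows
  }
  table.close()
  out.iterator
}

Eager materialization trades memory for a simple connection lifetime; a lazy variant would have to close the table only once the iterator is exhausted.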
Finally, we solved this problem by building our own netlib-java native .so
files on CentOS. It works without any warning, but the performance is far
from that of running on the MacBook Pro.
The matrix size is 6778 rows by 2487 columns.
The MBP took 10 seconds to get the PCA result, but CentOS took 110 seconds, even ...
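A gap like that usually means netlib-java fell back to its pure-Java implementation on one machine. A quick way to confirm which BLAS backend was actually loaded, assuming netlib-java is on the classpath:

import com.github.fommil.netlib.BLAS

// Prints e.g. NativeSystemBLAS or NativeRefBLAS when natives loaded,
// or F2jBLAS when it fell back to the pure-Java implementation
println(BLAS.getInstance().getClass.getName)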
Thanks Xiangrui,
I switched to an Ubuntu 14.04 server, and it works after installing liblapack3gf
and libopenblas-base.
So it is an environment problem, not related to MLlib.
Hi,
We're using MLlib (1.0.0 release version) on a k-means clustering problem.
We want to reduce the matrix column size before sending the points to the
k-means solver; a sketch of that pipeline is below.
It works on my Mac in local mode: spark-test-run-assembly-1.0.jar
contains my application code, the com.github.fommil netlib code, and ...
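A minimal sketch of that reduce-columns-then-cluster pipeline against the MLlib 1.0 API. The input RDD name, the target dimensionality, and the k-means parameters are illustrative assumptions:

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// points: RDD[Vector] holding the 6778 x 2487 input matrix (name illustrative)
val mat = new RowMatrix(points)

// Project the rows onto the top principal components to shrink the column count
val pc = mat.computePrincipalComponents(50) // 50 target columns is a placeholder
val reduced = mat.multiply(pc)

// Cluster the reduced-dimension points
val model = KMeans.train(reduced.rows, 8, 20) // k = 8, 20 iterations: placeholders

Note that computePrincipalComponents gathers a column-by-column covariance matrix on the driver, so it is only suitable when the column count is modest, as it is here.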