Re: Serialization issue when using HBase with Spark

2014-12-15 Thread Aniket Bhatnagar
"The reason not using sc.newAPIHadoopRDD is it only support one scan each time." I am not sure is that's true. You can use multiple scans as following: val scanStrings = scans.map(scan => convertScanToString(scan)) conf.setStrings(MultiTableInputFormat.SCANS, scanStrings : _*) where convertScanT

Re: Serialization issue when using HBase with Spark

2014-12-15 Thread Shixiong Zhu
Just pointing out a bug in your code. You should not use `mapPartitions` like that. For details, I recommend the section "setup() and cleanup()" in Sean Owen's post: http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/ Best Regards, Shixiong Zhu
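
For anyone hitting the same bug, here is the setup/cleanup pattern from that post as a minimal sketch (the table name and the RDD element type are hypothetical; a production version would use the wrapper iterator from the post instead of materializing the partition):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{Get, HTable}
    import org.apache.hadoop.hbase.util.Bytes

    rdd.mapPartitions { keys =>
      // setup(): build the non-serializable HTable once per partition,
      // on the executor rather than the driver.
      val conf = HBaseConfiguration.create()
      val table = new HTable(conf, TableName.valueOf("users"))
      try {
        // Force evaluation here: keys.map alone is lazy and would only
        // run after table.close() below has already executed.
        keys.map(k => table.get(new Get(Bytes.toBytes(k)))).toList.iterator
      } finally {
        table.close() // cleanup(), once per partition
      }
    }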

Re: Serialization issue when using HBase with Spark

2014-12-14 Thread Yanbo
In #1, class HTable is not serializable. You also need to check your self-defined function getUserActions and make sure it is a member function of a class that implements the Serializable interface. Sent from my iPad > On 2014-12-12, at 4:35 PM, yangliuyu wrote: > > The scenario is using an HTable instance to scan
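
Concretely, the capture problem and one fix, as a sketch (the class and object names are hypothetical; getUserActions is the function named in the original post):

    import org.apache.hadoop.hbase.client.Result

    // Problem: calling a method of a non-serializable class from a closure
    // captures `this`, so Spark tries to serialize the whole instance.
    class UserAnalysis { // does not implement Serializable
      def getUserActions(r: Result): Seq[String] = Seq.empty // placeholder
      // rdd.map(getUserActions) // fails: drags UserAnalysis into the closure
    }

    // Fix: move the function onto a class or object that is Serializable.
    object UserActions extends Serializable {
      def getUserActions(r: Result): Seq[String] = Seq.empty // placeholder
    }
    // rdd.map(UserActions.getUserActions) // task serialization now succeeds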

Re: Serialization issue when using HBase with Spark

2014-12-12 Thread Akhil Das
Can you paste the complete code? It looks like at some point you are passing a Hadoop Configuration, which is not serializable. You can look at this thread for a similar discussion: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-into-HBase-td13378.html Thanks, Best Regards
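
When the Configuration genuinely has to reach the executors, one common workaround from that era, sketched here (SerializableWritable was a public class in Spark 1.x; the table name is hypothetical), is to broadcast a wrapped copy and rebuild the HTable per partition:

    import org.apache.spark.SerializableWritable
    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.HTable

    // Configuration is not java.io.Serializable, but it is a Hadoop
    // Writable, which SerializableWritable knows how to (de)serialize.
    val confBc = sc.broadcast(new SerializableWritable(HBaseConfiguration.create()))

    rdd.foreachPartition { rows =>
      val conf = confBc.value.value // unwrap on the executor
      val table = new HTable(conf, TableName.valueOf("events"))
      rows.foreach { row =>
        // ... build and submit Puts for `row`
      }
      table.close()
    }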