"The reason not using sc.newAPIHadoopRDD is it only support one scan each
time."
I am not sure that's true. You can use multiple scans, as follows:
val scanStrings = scans.map(scan => convertScanToString(scan))
conf.setStrings(MultiTableInputFormat.SCANS, scanStrings : _*)
where convertScanToString serializes each Scan into the String form that MultiTableInputFormat expects.
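For completeness, here is a sketch of what convertScanToString and the RDD creation
could look like. It mirrors the helper TableMapReduceUtil uses internally in HBase
0.98/1.x; the exact imports, and the assumption that each Scan already carries its
table name (Scan.SCAN_ATTRIBUTES_TABLE_NAME), are mine, not from the original mail.

import org.apache.hadoop.hbase.client.{Result, Scan}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.MultiTableInputFormat
import org.apache.hadoop.hbase.protobuf.ProtobufUtil
import org.apache.hadoop.hbase.util.Base64

// Serialize a Scan into the Base64 string that MultiTableInputFormat
// reads back from the SCANS property of the job configuration.
def convertScanToString(scan: Scan): String =
  Base64.encodeBytes(ProtobufUtil.toScan(scan).toByteArray)

// With SCANS set on conf, a single newAPIHadoopRDD call covers all scans.
val rdd = sc.newAPIHadoopRDD(conf, classOf[MultiTableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])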
Just to point out a bug in your code: you should not use `mapPartitions` like
that. For details, I recommend the section "setup() and cleanup()" in Sean
Owen's post:
http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
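For readers who don't follow the link: the usual pitfall is that iter.map inside
mapPartitions is lazy, so a connection opened before it and closed right after it
is already gone by the time the iterator is consumed. A minimal sketch of the
per-partition setup/cleanup idiom (createConnection, process and the RDD name are
placeholders, not from the original code):

rdd.mapPartitions { iter =>
  val conn = createConnection()          // "setup": once per partition
  try {
    // materialize the results so the work happens while conn is still open
    iter.map(record => process(conn, record)).toList.iterator
  } finally {
    conn.close()                         // "cleanup": once per partition
  }
}

Materializing with toList keeps the sketch simple but buffers the whole partition in
memory; the post above discusses the trade-offs in more detail.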
Best Regards,
Shixiong Zhu
2014-12-14 16:35 GMT+08
In #1, class HTable is not serializable.
You also need to check your self-defined function getUserActions and make sure
it is a member function of a class that implements the Serializable interface.
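As a sketch of that suggestion (the class name, method signature, Result parsing and
the hbaseRdd it is applied to are hypothetical stand-ins for the original code):

import org.apache.hadoop.hbase.client.Result

class UserActionParser extends Serializable {
  // keep the parsing logic here so the closure below only captures
  // a Serializable object, never an HTable or a Configuration
  def getUserActions(result: Result): Seq[String] = {
    // placeholder: pull whatever columns you need out of the Result
    Seq.empty
  }
}

val parser = new UserActionParser
val actions = hbaseRdd.flatMap { case (_, result) => parser.getUserActions(result) }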
Sent from my iPad
> On 2014-12-12, at 4:35 PM, yangliuyu wrote:
>
> The scenario is using HTable instance to scan
Can you paste the complete code? It looks like at some point you are
passing a Hadoop Configuration, which is not Serializable. You can look at
this thread for a similar discussion:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-into-HBase-td13378.html
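One common way around it (a sketch only; the table, column family and record shape are
assumptions, not taken from the thread) is to create the Configuration and HTable inside
foreachPartition, so they live on the executor and never need to be serialized:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes

rdd.foreachPartition { rows =>
  // the Configuration and HTable are built on the executor,
  // so nothing non-serializable is captured by the closure
  val conf = HBaseConfiguration.create()
  val table = new HTable(conf, "my_table")
  rows.foreach { case (rowKey, value) =>
    val put = new Put(Bytes.toBytes(rowKey))
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
    table.put(put)
  }
  table.close()
}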
Thanks
Best Regards
On Fr