Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Ted Yu
The backport would be done under HBASE-14160. FYI

On Sun, Mar 13, 2016 at 4:14 PM, Benjamin Kim wrote:
> Ted,
>
> Is there anything in the works or are there tasks already to do the
> back-porting?
>
> Just curious.
>
> Thanks,
> Ben
>
> On Mar 13, 2016, at 3:46 PM, Ted Yu wrote:
> > class HFi

Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Benjamin Kim
Ted,

Is there anything in the works or are there tasks already to do the back-porting?

Just curious.

Thanks,
Ben

> On Mar 13, 2016, at 3:46 PM, Ted Yu wrote:
>
> class HFileWriterImpl (in standalone file) is only present in master branch.
> It is not in branch-1.
>
> compressionByName() res

Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Ted Yu
class HFileWriterImpl (in standalone file) is only present in master branch. It is not in branch-1.

compressionByName() resides in class with @InterfaceAudience.Private which got moved in master branch.

So looks like there is some work to be done for backporting to branch-1 :-)

On Sun, Mar 13, 2

Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Benjamin Kim
Ted,

I did as you said, but it looks like HBaseContext relies on some differences in HBase itself.

[ERROR] /home/bkim/hbase-rel-1.0.2/hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/HBaseContext.scala:30: error: object HFileWriterImpl is not a member of package org.apache.hadoop

Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Benjamin Kim
Ted,

That’s great! I didn’t know. I will proceed with it as you said.

Thanks,
Ben

> On Mar 13, 2016, at 12:42 PM, Ted Yu wrote:
>
> Benjamin:
> Since hbase-spark is in its own module, you can pull the whole hbase-spark
> subtree into hbase 1.0 root dir and add the following to root pom.xml:
>

Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Ted Yu
Benjamin:

Since hbase-spark is in its own module, you can pull the whole hbase-spark subtree into the hbase 1.0 root dir and add the following to the root pom.xml:

hbase-spark

Then you would be able to build the module yourself. The hbase-spark module uses APIs which are compatible with hbase 1.0.

Cheers
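[A sketch of what that root pom.xml change would look like. The archive appears to have stripped the XML tags around "hbase-spark" above; the surrounding module entries are elided and the exact placement is an assumption.]

```xml
<!-- In the <modules> section of the hbase 1.0 root pom.xml,
     register the copied-in subtree as a buildable module. -->
<modules>
  <!-- ... existing hbase modules ... -->
  <module>hbase-spark</module>
</modules>
```

After this, `mvn package -pl hbase-spark -am` (or a full build) should pick the module up.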

Re: Spark Job on YARN accessing Hbase Table

2016-03-13 Thread Benjamin Kim
Hi Ted,

I see that you’re working on the hbase-spark module for hbase. I recently packaged the SparkOnHBase project and gave it a test run. It works like a charm on CDH 5.4 and 5.5.

All I had to do was add /opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar to the classpath.txt fil
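[A sketch of scripting the classpath.txt workaround above. The htrace jar path comes from the message; the classpath.txt location varies by CDH setup, so a local file is used here to keep the sketch self-contained.]

```shell
# Append the htrace jar to Spark's classpath.txt so SparkOnHBase can
# find it. On CDH the file normally lives under the Spark conf dir;
# a local path is used in this sketch.
CLASSPATH_TXT="classpath.txt"
HTRACE_JAR="/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar"

# Append only if the jar is not already listed, so re-running the
# script leaves classpath.txt unchanged (idempotent).
grep -qxF "${HTRACE_JAR}" "${CLASSPATH_TXT}" 2>/dev/null || echo "${HTRACE_JAR}" >> "${CLASSPATH_TXT}"
```

The `grep -qxF` guard matches the whole line literally, which avoids duplicate entries when the script is run more than once.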

Re: Spark Job on YARN accessing Hbase Table

2016-02-10 Thread Prabhu Joseph
Yes Ted, spark.executor.extraClassPath will work if the hbase client jars are present on all Spark Worker / NodeManager machines.

spark.yarn.dist.files is the easier way, as the hbase client jars can be copied from the driver machine or HDFS into the container / spark-executor classpath automatically. No need to m
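[A sketch of the two approaches compared above. All jar paths, the class name, and the application jar are assumptions; adjust for your cluster layout. The launch command is echoed rather than executed.]

```shell
# Comma-separated list of HBase client jars (assumed install location).
HBASE_JARS="/opt/hbase/lib/hbase-client.jar,/opt/hbase/lib/hbase-common.jar,/opt/hbase/lib/hbase-protocol.jar,/opt/hbase/lib/htrace-core-3.1.0-incubating.jar"

# Option 1: spark.yarn.dist.files -- YARN localizes the jars from the
# driver machine (or HDFS) into each executor's working directory, so
# nothing needs to be pre-installed on the NodeManagers.
DIST_FILES_CONF="--conf spark.yarn.dist.files=${HBASE_JARS}"

# Option 2: spark.executor.extraClassPath -- works only if the jars
# already exist at this path on every worker machine. Note this is a
# classpath (colon-separated / wildcard), not a comma-separated file list.
EXTRA_CP_CONF="--conf spark.executor.extraClassPath=/opt/hbase/lib/*"

# Assemble the launch command (echoed here as a sketch; class and jar
# names are placeholders).
echo spark-submit --master yarn ${DIST_FILES_CONF} --class com.example.MyHBaseJob myjob.jar
```

The key difference: Option 1 ships the jars with the job, Option 2 assumes they are already installed everywhere.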

Re: Spark Job on YARN accessing Hbase Table

2016-02-10 Thread Ted Yu
Have you tried adding the hbase client jars to spark.executor.extraClassPath ?

Cheers

On Wed, Feb 10, 2016 at 12:17 AM, Prabhu Joseph wrote:
> + Spark-Dev
>
> For a Spark job on YARN accessing hbase table, added all hbase client jars
> into spark.yarn.dist.files, NodeManager when launching containe

Re: Spark Job on YARN accessing Hbase Table

2016-02-10 Thread Prabhu Joseph
+ Spark-Dev

For a Spark job on YARN accessing an hbase table, I added all hbase client jars into spark.yarn.dist.files. The NodeManager, when launching a container (i.e. an executor), does localization and brings all hbase-client jars into the executor CWD, but still the executor tasks fail with ClassNotFoundException