Re: Python, Spark and HBase

2015-08-03 Thread ericbless
I wanted to confirm whether this is now supported, for example in Spark v1.3.0. I've read varying info online and just thought I'd verify. Thanks -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p24117.html Sent from the Apache Spark U

Re: Python, Spark and HBase

2014-05-29 Thread Nick Pentreath
Hi Tommer, I'm working on updating and improving the PR, and will work on getting an HBase example working with it. Will feed back as soon as I have had the chance to work on this a bit more. N On Thu, May 29, 2014 at 3:27 AM, twizansk wrote: > The code which causes the error is: > > The code

Re: Python, Spark and HBase

2014-05-28 Thread twizansk
The code which causes the error is: sc = SparkContext("local", "My App") rdd = sc.newAPIHadoopFile( name, 'org.apache.hadoop.hbase.mapreduce.TableInputFormat', 'org.apache.hadoop.hbase.io.ImmutableBytesWritable', 'org.apache.hadoop.hbase.client

Re: Python, Spark and HBase

2014-05-28 Thread twizansk
In my code I am not referencing PythonRDD or PythonRDDnewAPIHadoopFile at all. I am calling SparkContext.newAPIHadoopFile with: inputformat_class='org.apache.hadoop.hbase.mapreduce.TableInputFormat', key_class='org.apache.hadoop.hbase.io.ImmutableBytesWritable', value_class='org.apache.hadoop.hba

Re: Python, Spark and HBase

2014-05-28 Thread Matei Zaharia
It sounds like you made a typo in the code — perhaps you’re trying to call self._jvm.PythonRDDnewAPIHadoopFile instead of self._jvm.PythonRDD.newAPIHadoopFile? There should be a dot before the new. Matei On May 28, 2014, at 5:25 PM, twizansk wrote: > Hi Nick, > > I finally got around to do
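
To illustrate the kind of typo described above, here is a minimal sketch in plain Python. It does not use Spark or Py4J: `jvm` is a hypothetical stand-in built from `SimpleNamespace`, and `fake_new_api_hadoop_file` is an invented placeholder, not the real `PythonRDD.newAPIHadoopFile`. It only shows that dropping the dot turns a two-step lookup (`PythonRDD`, then `newAPIHadoopFile`) into a lookup of one fused name that does not exist. The real Py4J gateway may report the failure differently.

```python
from types import SimpleNamespace


def fake_new_api_hadoop_file(*args):
    """Hypothetical placeholder for the JVM-side method; just counts its arguments."""
    return "called with %d args" % len(args)


# Hypothetical stand-in for the gateway object (assumption: not the real Py4J JVM view).
jvm = SimpleNamespace(
    PythonRDD=SimpleNamespace(newAPIHadoopFile=fake_new_api_hadoop_file)
)

# With the dot: PythonRDD is resolved first, then the method on it.
result = jvm.PythonRDD.newAPIHadoopFile("path", "inputformat", "key", "value")

# Without the dot: Python looks for a single attribute with the fused name.
try:
    jvm.PythonRDDnewAPIHadoopFile
    typo_error = None
except AttributeError as exc:
    typo_error = str(exc)

print(result)
print(typo_error)
```

If the error message names `PythonRDDnewAPIHadoopFile` even though the calling code never mentions it, that suggests the fused name comes from the patched library code rather than from the user's script.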

Re: Python, Spark and HBase

2014-05-28 Thread twizansk
Hi Nick, I finally got around to downloading and building the patch. I pulled the code from https://github.com/MLnick/spark-1/tree/pyspark-inputformats I am running on a CDH5 node. While the code in the CDH branch is different from spark master, I do believe that I have resolved any inconsist

Re: Python, Spark and HBase

2014-05-21 Thread twizansk
Thanks Nick and Matei. I'll take a look at the patch and keep you updated. Tommer -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6176.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Python, Spark and HBase

2014-05-20 Thread Nick Pentreath
Yes, actually, if you could test the patch out and see how easy it is to load HBase RDDs, that would be great. That way I could make any amendments required to make HBase / Cassandra etc. easier. On Wed, May 21, 2014 at 4:41 AM, Matei Zaharia wrote: > Unfortunately

Re: Python, Spark and HBase

2014-05-20 Thread Matei Zaharia
Unfortunately this is not yet possible. There’s a patch in progress posted here though: https://github.com/apache/spark/pull/455 — it would be great to get your feedback on it. Matei On May 20, 2014, at 4:21 PM, twizansk wrote: > Hello, > > This seems like a basic question but I have been un