I wanted to confirm whether this is now supported, for example in Spark v1.3.0.
I've read conflicting information online and just thought I'd verify.
Thanks
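[For reference: since Spark 1.2, PySpark exposes SparkContext.newAPIHadoopRDD for reading custom InputFormats such as HBase's. A minimal sketch follows; the converter class names assume the pythonconverters bundled with the spark-examples jar, and the call itself is shown commented out since it needs a live cluster:]

```python
# Sketch: reading an HBase table from PySpark via newAPIHadoopRDD.
# Assumes Spark >= 1.2 with the spark-examples jar (for the converters)
# on the classpath; not run against a live cluster here.
conf = {
    "hbase.zookeeper.quorum": "localhost",    # ZooKeeper quorum for the cluster
    "hbase.mapreduce.inputtable": "mytable",  # table to scan
}

# In a live session (with sc = SparkContext(...)):
# rdd = sc.newAPIHadoopRDD(
#     "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
#     "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
#     "org.apache.hadoop.hbase.client.Result",
#     keyConverter="org.apache.spark.examples.pythonconverters."
#                  "ImmutableBytesWritableToStringConverter",
#     valueConverter="org.apache.spark.examples.pythonconverters."
#                    "HBaseResultToStringConverter",
#     conf=conf)
# rdd.first()

print(conf["hbase.mapreduce.inputtable"])
```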
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p24117.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi Tommer,
I'm working on updating and improving the PR, and will work on getting an
HBase example working with it. Will feed back as soon as I have had the
chance to work on this a bit more.
Nick
On Thu, May 29, 2014 at 3:27 AM, twizansk wrote:
The code which causes the error is:
sc = SparkContext("local", "My App")
rdd = sc.newAPIHadoopFile(
    name,
    'org.apache.hadoop.hbase.mapreduce.TableInputFormat',
    'org.apache.hadoop.hbase.io.ImmutableBytesWritable',
    'org.apache.hadoop.hbase.client
In my code I am not referencing PythonRDD or PythonRDDnewAPIHadoopFile at
all. I am calling SparkContext.newAPIHadoopFile with:
inputformat_class='org.apache.hadoop.hbase.mapreduce.TableInputFormat',
key_class='org.apache.hadoop.hbase.io.ImmutableBytesWritable',
value_class='org.apache.hadoop.hba
It sounds like you made a typo in the code: perhaps you're calling
self._jvm.PythonRDDnewAPIHadoopFile instead of
self._jvm.PythonRDD.newAPIHadoopFile? There should be a dot before
newAPIHadoopFile.
Matei
On May 28, 2014, at 5:25 PM, twizansk wrote:
Hi Nick,
I finally got around to downloading and building the patch.
I pulled the code from
https://github.com/MLnick/spark-1/tree/pyspark-inputformats
I am running on a CDH5 node. While the code in the CDH branch is different
from Spark master, I do believe that I have resolved any inconsistencies.
Thanks Nick and Matei. I'll take a look at the patch and keep you updated.
Tommer
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6176.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Yes, actually, if you could possibly test the patch out and see how easy it is
to load HBase RDDs, that would be great.
That way I could make any amendments required to make HBase, Cassandra, etc.
easier.
—
Sent from Mailbox
On Wed, May 21, 2014 at 4:41 AM, Matei Zaharia
wrote:
Unfortunately this is not yet possible. There’s a patch in progress posted here
though: https://github.com/apache/spark/pull/455 — it would be great to get
your feedback on it.
Matei
On May 20, 2014, at 4:21 PM, twizansk wrote:
> Hello,
>
> This seems like a basic question but I have been un