Hi

I'm new to Riak and followed the installation instructions to get it working on 
an AWS cluster (3 nodes).

So far I've been able to use Riak in pyspark (Zeppelin) to create, read, and 
write tables, but I would like to use DataFrames directly from Spark, using the 
Spark-Riak Connector.
I am following the example found here: 
http://docs.basho.com/riak/ts/1.4.0/add-ons/spark-riak-connector/quick-start/#python
but I run into trouble on this last part:

host= my_ip_adress_of_riak_node
pb_port = '8087'
hostAndPort = ":".join([host, pb_port])
client = riak.RiakClient(host=host, pb_port=pb_port)

df.write \
    .format('org.apache.spark.sql.riak') \
    .option('spark.riak.connection.host', hostAndPort) \
    .mode('Append') \
    .save('test')

It is important to note that I'm using a local download of the JAR file, which 
is loaded into the pyspark interpreter in Zeppelin through:
%dep
z.reset()
z.load("/home/hadoop/spark-riak-connector_2.10-1.6.0.jar")

Here is the error message I get back:
Py4JJavaError: An error occurred while calling o569.save.
: java.lang.NoClassDefFoundError: com/basho/riak/client/core/util/HostAndPort
    at com.basho.riak.spark.rdd.connector.RiakConnectorConf$.apply(RiakConnectorConf.scala:76)
    at com.basho.riak.spark.rdd.connector.RiakConnectorConf$.apply(RiakConnectorConf.scala:89)
    at org.apache.spark.sql.riak.RiakRelation$.apply(RiakRelation.scala:115)
    at org.apache.spark.sql.riak.DefaultSource.createRelation(DefaultSource.scala:51)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling o569.save.\n', JavaObject id=o570), <traceback object at 0x7f7021bb0200>)
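
In case it helps with diagnosis: a JAR is an ordinary zip archive, so a small 
sketch like the one below could check whether the JAR I loaded actually 
contains the class named in the error. The helper name jar_contains_class is 
my own invention for illustration and is not part of any Riak or Spark API.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the JAR at jar_path has an entry for class_name.

    class_name is a dotted Java class name; inside the archive it is
    stored as a slash-separated path ending in '.class'.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

It would be called with the JAR path from the %dep paragraph above and the 
class name com.basho.riak.client.core.util.HostAndPort. A False result would 
suggest the class has to come from some other JAR on the classpath.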

I hope somebody can help out.
Thanks, Joris