Hi,

I'm new to Riak and followed the installation instructions to get it working on an AWS cluster (3 nodes).
So far I've been able to use Riak in pyspark (Zeppelin) to create/read/write tables, but I would like to use the dataframes directly from Spark, using the Spark-Riak Connector. I am following the example found here:

http://docs.basho.com/riak/ts/1.4.0/add-ons/spark-riak-connector/quick-start/#python

But I run into trouble on this last part:

    host = my_ip_adress_of_riak_node
    pb_port = '8087'
    hostAndPort = ":".join([host, pb_port])
    client = riak.RiakClient(host=host, pb_port=pb_port)

    df.write \
        .format('org.apache.spark.sql.riak') \
        .option('spark.riak.connection.host', hostAndPort) \
        .mode('Append') \
        .save('test')

It is important to note that I'm using a local download of the jar file, which is loaded into the pyspark interpreter in Zeppelin through:

    %dep
    z.reset()
    z.load("/home/hadoop/spark-riak-connector_2.10-1.6.0.jar")

Here is the error message I get back:

    Py4JJavaError: An error occurred while calling o569.save.
    : java.lang.NoClassDefFoundError: com/basho/riak/client/core/util/HostAndPort
        at com.basho.riak.spark.rdd.connector.RiakConnectorConf$.apply(RiakConnectorConf.scala:76)
        at com.basho.riak.spark.rdd.connector.RiakConnectorConf$.apply(RiakConnectorConf.scala:89)
        at org.apache.spark.sql.riak.RiakRelation$.apply(RiakRelation.scala:115)
        at org.apache.spark.sql.riak.DefaultSource.createRelation(DefaultSource.scala:51)
        at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
        at py4j.Gateway.invoke(Gateway.java:259)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:209)
        at java.lang.Thread.run(Thread.java:745)

    (<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling o569.save.\n', JavaObject id=o570), <traceback object at 0x7f7021bb0200>)

Hope somebody can help out.

Thanks,
Joris
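
P.S. In case it helps whoever looks at this: below is a minimal sketch of how one could check whether the jar I loaded actually contains the class named in the error. It relies only on the fact that a jar is a zip archive. The helper name jar_contains_class is made up for illustration, and I have not verified the result against my cluster, so treat it as a diagnostic idea rather than a finding.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar at jar_path has a .class entry for class_name.

    class_name may use slashes (as in the NoClassDefFoundError message)
    or dots; both are normalised to the path form used inside a jar.
    """
    entry = class_name.replace('.', '/') + '.class'
    # A jar file is a zip archive, so the standard zipfile module can list it.
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    # Path and class name are taken from the message above.
    jar = "/home/hadoop/spark-riak-connector_2.10-1.6.0.jar"
    missing = "com/basho/riak/client/core/util/HostAndPort"
    print(jar_contains_class(jar, missing))
```

If this prints False, the class would have to come from some other jar on the interpreter's classpath.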
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com