Hi Joris,

I have looked at the tutorial you have been following but I confess I am
confused.  In the example you are following I do not see where the spark
and sql contexts are created.  I use PySpark through the Jupyter notebook
and I have to specify a path to the connector on invoking the jupyter
notebook. Is it possible for you to share all your code (and how you are
invoking zeppelin) with me so I can trace everything through?

regards
Stephen

On Mon, Sep 12, 2016 at 3:27 PM, Agtmaal, Joris van <
joris.vanagtm...@wartsila.com> wrote:

> Hi
>
>
>
> I’m new to Riak and followed the installation instructions to get it
> working on an AWS cluster (3 nodes).
>
>
>
> So far ive been able to use Riak in pyspark (zeppelin) to
> create/read/write tables, but i would like to use the dataframes directly
> from spark, using the Spark-Riak Connector.
>
> When following the example found here: http://docs.basho.com/riak/ts/
> 1.4.0/add-ons/spark-riak-connector/quick-start/#python
>
> But i run into trouble on this last part:
>
>
>
> host= my_ip_adress_of_riak_node
>
> pb_port = '8087'
>
> hostAndPort = ":".join([host, pb_port])
>
> client = riak.RiakClient(host=host, pb_port=pb_port)
>
>
>
> df.write \
>
>     .format('org.apache.spark.sql.riak') \
>
>     .option('spark.riak.connection.host', hostAndPort) \
>
>     .mode('Append') \
>
>     .save('test')
>
>
>
> Important to note that i’m using a local download of the Jar file that is
> loaded into the pyspark interpreter in zeppeling through:
>
> %dep
>
> z.reset()
>
> z.load("/home/hadoop/spark-riak-connector_2.10-1.6.0.jar")
>
>
>
> Here is the error message i get back:
>
> Py4JJavaError: An error occurred while calling o569.save. : 
> java.lang.NoClassDefFoundError:
> com/basho/riak/client/core/util/HostAndPort at com.basho.riak.spark.rdd.
> connector.RiakConnectorConf$.apply(RiakConnectorConf.scala:76) at
> com.basho.riak.spark.rdd.connector.RiakConnectorConf$.
> apply(RiakConnectorConf.scala:89) at org.apache.spark.sql.riak.
> RiakRelation$.apply(RiakRelation.scala:115) at org.apache.spark.sql.riak.
> DefaultSource.createRelation(DefaultSource.scala:51) at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43) at 
> java.lang.reflect.Method.invoke(Method.java:606)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) at
> py4j.Gateway.invoke(Gateway.java:259) at py4j.commands.AbstractCommand.
> invokeMethod(AbstractCommand.java:133) at 
> py4j.commands.CallCommand.execute(CallCommand.java:79)
> at py4j.GatewayConnection.run(GatewayConnection.java:209) at
> java.lang.Thread.run(Thread.java:745) (<class
> 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while
> calling o569.save.\n', JavaObject id=o570), <traceback object at
> 0x7f7021bb0200>)
>
>
>
> Hope somebody can help out.
>
> thanks, joris
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>


-- 
{ "name" : "Stephen Etheridge",
   "title" : "Solution Architect, EMEA",
   "Organisation" : "Basho Technologies, Inc",
   "Telephone" : "07814 406662",
   "email" : "mailto:setheri...@basho.com";,
   "github" : "http://github.com/datalemming";,
   "twitter" : "@datalemming"}
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Reply via email to