Hi!

I have a Spark cluster running on top of a Cassandra cluster, using
DataStax's new driver, and one of the fields of my RDDs is an
XML string. In a normal Scala Spark job, parsing that data is no
problem, but I would like to also make that information available
through Spark SQL. So, is there any way to write user-defined
functions for Spark SQL? I know that a HiveContext is available, but
my understanding is that it is for querying data from Hive, and I
don't have Hive in my stack (please correct me if I'm wrong).
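For context, the parsing itself is straightforward in a plain Scala
job, roughly like this (a sketch; "xmlfield", "sometag" and the use of
getString are placeholders for my actual schema):

import scala.xml.XML

// casRdd is the Cassandra RDD defined below; each row is a CassandraRow
val parsed = casRdd.map { row =>
  val doc = XML.loadString(row.getString("xmlfield"))
  (row.getString("id"), (doc \\ "sometag").text)
}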

I would love to be able to do something like the following:

val casRdd = sparkCtx.cassandraTable("ks", "cf")

// registerAsTable etc

val res = sql("SELECT id, xmlGetTag(xmlfield, 'sometag') FROM cf")
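The function itself would be trivial to write in Scala; what I'm
missing is a way to register something like this hypothetical sketch
as a UDF with Spark SQL:

// the function I'd like to expose to SQL queries as xmlGetTag
def xmlGetTag(xml: String, tag: String): String =
  (scala.xml.XML.loadString(xml) \\ tag).text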

-- 
Best regards,
Martin Gammelsæter
