Hi! I have a Spark cluster running on top of a Cassandra cluster, using DataStax's new driver, and one of the fields in my RDDs is an XML string. In a normal Scala Spark job, parsing that data is no problem, but I would like to make that information available through Spark SQL as well. So, is there any way to write user-defined functions for Spark SQL? I know that a HiveContext is available, but I understand that it is for querying data from Hive, and I don't have Hive in my stack (please correct me if I'm wrong).
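For context, this is roughly how I handle it in a plain Scala job today (a sketch; the keyspace/table names and the id/xmlfield columns are just placeholders for my actual schema):

import scala.xml.XML

// Pull the XML column out of each Cassandra row and extract one tag's text
val casRdd = sparkCtx.cassandraTable("ks", "cf")
val tags = casRdd.map { row =>
  val doc = XML.loadString(row.getString("xmlfield"))
  (row.getString("id"), (doc \\ "sometag").text)
}

This works fine as an RDD transformation; what I'm missing is a way to express the same extraction inside a SQL query.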
I would love to be able to do something like the following:

val casRdd = sparkCtx.cassandraTable("ks", "cf")
// registerAsTable etc
val res = sql("SELECT id, xmlGetTag(xmlfield, 'sometag') FROM cf")

--
Best regards,
Martin Gammelsæter