Takuya, thanks for your reply :) I am already doing that, and it is working well. My question is, can I define arbitrary functions to be used in these queries?
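For context, what I have today follows the pattern from Takuya's mail and the linked guide. A rough sketch is below; the keyspace/table names, the CfRow case class and its columns are placeholders for my actual schema, and it assumes the DataStax spark-cassandra-connector together with Spark 1.0's SQLContext:

import com.datastax.spark.connector._            // DataStax spark-cassandra-connector
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Placeholder case class matching the columns read from Cassandra.
case class CfRow(id: Int, xmlfield: String)

object CassandraSqlSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cassandra-sql-sketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext._ // brings the implicit RDD[Product] -> SchemaRDD conversion into scope

    // Load the Cassandra table straight into the case class.
    val casRdd = sc.cassandraTable[CfRow]("ks", "cf")

    // Register the (implicitly converted) SchemaRDD under a table name.
    casRdd.registerAsTable("cf")

    // Plain SQL over the registered table works fine...
    val res = sqlContext.sql("SELECT id, xmlfield FROM cf")
    res.collect().foreach(println)

    // ...but something like xmlGetTag(xmlfield, 'sometag') is exactly the kind of
    // user-defined function I don't know how to register, hence my question.
  }
}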
On Fri, Jul 4, 2014 at 11:12 AM, Takuya UESHIN <ues...@happy-camper.st> wrote:
> Hi,
>
> You can convert a standard RDD of a Product class (e.g. a case class) to a
> SchemaRDD via SQLContext.
> Load the data from Cassandra into an RDD of a case class, convert it to a
> SchemaRDD and register it; then you can use it in your SQL queries.
>
> http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-on-rdds
>
> Thanks.
>
>
> 2014-07-04 17:59 GMT+09:00 Martin Gammelsæter <martingammelsae...@gmail.com>:
>
>> Hi!
>>
>> I have a Spark cluster running on top of a Cassandra cluster, using
>> Datastax's new driver, and one of the fields of my RDDs is an
>> XML string. In a normal Scala Spark job, parsing that data is no
>> problem, but I would also like to make that information available
>> through Spark SQL. So, is there any way to write user-defined
>> functions for Spark SQL? I know that a HiveContext is available, but I
>> understand that it is for querying data from Hive, and I don't have
>> Hive in my stack (please correct me if I'm wrong).
>>
>> I would love to be able to do something like the following:
>>
>> val casRdd = sparkCtx.cassandraTable("ks", "cf")
>>
>> // registerAsTable etc
>>
>> val res = sql("SELECT id, xmlGetTag(xmlfield, 'sometag') FROM cf")
>>
>> --
>> Best regards,
>> Martin Gammelsæter
>
>
> --
> Takuya UESHIN
> Tokyo, Japan
>
> http://twitter.com/ueshin

--
Best regards,
Martin Gammelsæter
92209139