Takuya, thanks for your reply :) I am already doing that, and it is working well. My question is, can I define arbitrary functions to be used in these queries?
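For context, what I have today follows the pattern from Takuya's mail and the linked guide. A rough sketch is below; the keyspace/table names, the CfRow case class and its columns are placeholders for my actual schema, and it assumes the DataStax spark-cassandra-connector together with Spark 1.0's SQLContext:

import com.datastax.spark.connector._            // DataStax spark-cassandra-connector
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Placeholder case class matching the columns read from Cassandra.
case class CfRow(id: Int, xmlfield: String)

object CassandraSqlSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cassandra-sql-sketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext._ // brings the implicit RDD[Product] -> SchemaRDD conversion into scope

    // Load the Cassandra table straight into the case class.
    val casRdd = sc.cassandraTable[CfRow]("ks", "cf")

    // Register the (implicitly converted) SchemaRDD under a table name.
    casRdd.registerAsTable("cf")

    // Plain SQL over the registered table works fine...
    val res = sqlContext.sql("SELECT id, xmlfield FROM cf")
    res.collect().foreach(println)

    // ...but something like xmlGetTag(xmlfield, 'sometag') is exactly the kind of
    // user-defined function I don't know how to register, hence my question.
  }
}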
On Fri, Jul 4, 2014 at 11:12 AM, Takuya UESHIN <ues...@happy-camper.st> wrote:
> Hi,
>
> You can convert a standard RDD of a Product class (e.g. a case class) to a
> SchemaRDD via SQLContext.
> Load the data from Cassandra into an RDD of a case class, convert it to a
> SchemaRDD and register it; then you can use it in your SQL queries.
>
> http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-on-rdds
>
> Thanks.
>
>
> 2014-07-04 17:59 GMT+09:00 Martin Gammelsæter <martingammelsae...@gmail.com>:
>
>> Hi!
>>
>> I have a Spark cluster running on top of a Cassandra cluster, using
>> Datastax's new driver, and one of the fields of my RDDs is an
>> XML string. In a normal Scala Spark job, parsing that data is no
>> problem, but I would also like to make that information available
>> through Spark SQL. So, is there any way to write user-defined
>> functions for Spark SQL? I know that a HiveContext is available, but I
>> understand that it is for querying data from Hive, and I don't have
>> Hive in my stack (please correct me if I'm wrong).
>>
>> I would love to be able to do something like the following:
>>
>> val casRdd = sparkCtx.cassandraTable("ks", "cf")
>>
>> // registerAsTable etc
>>
>> val res = sql("SELECT id, xmlGetTag(xmlfield, 'sometag') FROM cf")
>>
>> --
>> Best regards,
>> Martin Gammelsæter
>
>
> --
> Takuya UESHIN
> Tokyo, Japan
>
> http://twitter.com/ueshin

--
Best regards,
Martin Gammelsæter
92209139