Spark SQL dataframes explode /lateral view help

Deenar Toraskar Tue, 05 Jan 2016 12:00:40 -0800

Hi All

I have the following spark sql query and would like to use convert this to
use the dataframes api (spark 1.6). The eee, eep and pfep are all maps of
(int -> float)



select e.counterparty, epe, mpfe, eepe, noOfMonthseep, teee as
effectiveExpectedExposure, teep as expectedExposure , tpfep as pfe
|from exposureMeasuresCpty e
  |  lateral view explode(eee) dummy1 as noOfMonthseee, teee
  |  lateral view explode(eep) dummy2 as noOfMonthseep, teep
  |  lateral view explode(pfep) dummy3 as noOfMonthspfep, tpfep
  |where e.counterparty = '$cpty' and noOfMonthseee = noOfMonthseep and
noOfMonthseee = noOfMonthspfep
  |order by noOfMonthseep""".stripMargin

Any guidance or samples would be appreciated. I have seen code snippets
that handle arrays, but havent come across how to handle maps

case class Book(title: String, words: String)
   val df: RDD[Book]

   case class Word(word: String)
   val allWords = df.explode('words) {
     case Row(words: String) => words.split(" ").map(Word(_))
   }

   val bookCountPerWord = allWords.groupBy("word").agg(countDistinct("title"))


Regards
Deenar

Spark SQL dataframes explode /lateral view help

Reply via email to