Hi All
I have the following Spark SQL query and would like to convert it to use
the DataFrame API (Spark 1.6). The eee, eep and pfep columns are all maps
of (int -> float).
s"""select e.counterparty, epe, mpfe, eepe, noOfMonthseep,
   |  teee as effectiveExpectedExposure, teep as expectedExposure, tpfep as pfe
   |from exposureMeasuresCpty e
   |  lateral view explode(eee) dummy1 as noOfMonthseee, teee
   |  lateral view explode(eep) dummy2 as noOfMonthseep, teep
   |  lateral view explode(pfep) dummy3 as noOfMonthspfep, tpfep
   |where e.counterparty = '$cpty'
   |  and noOfMonthseee = noOfMonthseep
   |  and noOfMonthseee = noOfMonthspfep
   |order by noOfMonthseep""".stripMargin
Any guidance or samples would be appreciated. I have seen code snippets
that explode a single column into rows, such as the one below, but haven't
come across how to handle maps.
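import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.countDistinct
// sqlContext.implicits._ must also be in scope for the 'words symbol syntax
case class Book(title: String, words: String)
case class Word(word: String)
// df is a DataFrame of Book rows, e.g. built from an RDD[Book] with toDF()
// split each book's words string into one output row per word
val allWords = df.explode('words) {
  case Row(words: String) => words.split(" ").map(Word(_))
}
// count the distinct titles in which each word appears
val bookCountPerWord = allWords.groupBy("word").agg(countDistinct("title"))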
Regards
Deenar