Hi All,
Please suggest the best approach to achieve the result below. [Please
comment on whether the existing logic is fine.]
Input Record :
ColA ColB ColC
1 2 56
1 2 46
1 3 45
1 5 34
1 5 90
2 1 89
2 5 45
Expected Result
ResA ResB
1 2:2|3:3|5:5
2 1:1|5:5
I followed the Spark steps below
(Spark version 1.5.0):
def valsplit(elem: scala.collection.mutable.WrappedArray[String]): String = {
  // Format each collected value as "value:value" and join the pairs with "|"
  elem.map(e => e + ":" + e).mkString("|")
}
sqlContext.udf.register("valudf", valsplit(_: scala.collection.mutable.WrappedArray[String]))
val x = sqlContext.sql("select site, valudf(collect_set(requests)) as test from sel group by site").first
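
One thing the logic above leaves open is ordering: collect_set makes no promise about the order of the collected values, while the expected result lists them in ascending order. Below is a minimal sketch of the same formatting with an explicit de-duplicate and sort step. It uses plain Scala collections in place of the Spark table so it can be run on its own. The names GroupedPairs, formatPairs, rows and result are hypothetical, and the numeric sort assumes the values always parse as integers.

```scala
// Sketch only: plain Scala collections stand in for the Spark table.
object GroupedPairs {
  // Hypothetical helper: de-duplicate, sort numerically (assumes integer-like
  // strings), then format each value as "v:v" and join the pairs with "|".
  def formatPairs(values: Seq[String]): String =
    values.distinct.sortBy(_.toInt).map(v => s"$v:$v").mkString("|")

  // Sample rows from the post as (ColA, ColB); ColC is not used in the result.
  val rows: Seq[(String, String)] = Seq(
    ("1", "2"), ("1", "2"), ("1", "3"), ("1", "5"), ("1", "5"),
    ("2", "1"), ("2", "5")
  )

  // Local equivalent of "group by ColA" followed by collecting ColB per group.
  val result: Map[String, String] =
    rows.groupBy(_._1).map { case (a, grouped) =>
      a -> formatPairs(grouped.map(_._2))
    }

  def main(args: Array[String]): Unit =
    result.toSeq.sortBy(_._1).foreach { case (a, b) => println(s"$a $b") }
}
```

In the Spark job, the same helper could presumably be registered in place of valsplit, for example sqlContext.udf.register("valudf", (xs: Seq[String]) => GroupedPairs.formatPairs(xs)), since a Scala UDF can take an array column as Seq[String]. Also note that, as far as I know, collect_set in Spark 1.5.0 is a Hive UDAF, so sqlContext would need to be a HiveContext for the query to resolve.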
--
Selvam Raman
"Shun bribery, hold your head high"