Hi there,
I have the following scenario:
Each record in my files has two attributes and one numeric value:
(attr1,attr2,val)
For each attr1, I want to compute its share of the total val within its
attr2 group, i.e. sum(val) per (attr1, attr2) divided by sum(val) per attr2.
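To make the intended result concrete, here is a minimal sketch using plain Scala collections rather than the actual job. The ShareSketch object, its shares method, and the scaling by 100 are made up for illustration only; they are not part of my code or of any library.

```scala
// Illustration only: plain Scala collections, no cluster framework involved.
object ShareSketch {
  // Input records are (attr1, attr2, value) triples.
  def shares(records: Seq[(String, String, Double)]): Map[(String, String), Double] = {
    // Sum of the value per (attr1, attr2) pair.
    val perPair: Map[(String, String), Double] =
      records.groupBy(r => (r._1, r._2)).map { case (k, rs) => k -> rs.map(_._3).sum }

    // Sum of the value per attr2.
    val perAttr2: Map[String, Double] =
      records.groupBy(_._2).map { case (k, rs) => k -> rs.map(_._3).sum }

    // Share of each attr1 within its attr2 group, as a percentage.
    perPair.map { case ((a1, a2), s) => (a1, a2) -> 100.0 * s / perAttr2(a2) }
  }
}
```

For the sample records (a,x,1), (b,x,3), (a,x,1), (a,y,2), this gives 40% for (a,x), 60% for (b,x), and 100% for (a,y).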
Currently I am doing it like this:
input.map(e => e._2.toString.split(","))
  .map(e => (e(0), Utils.getMonthFromDate(e(1).toLong), e(3).toDouble, e(3).toDouble))
  .groupBy(0, 1)
  .sum(2)
  .groupBy(1)
  .sum(3)
  .map(e => (e._1, e._2, scala.math.BigDecimal(e._3 * 1.0 / e._4 * 1.0).toString()))
Is there a more efficient way to calculate this?
Thanks a lot!