All right, thanks for the inputs. Is there any way Spark can process all the
combinations in parallel, in a single job?
Would it be OK to load the input CSV file into a DataFrame, use flatMap to
create key/value pairs, and then use reduceByKey to sum the double arrays? I
believe that would work the same as the agg function you suggested.
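A minimal sketch of that pipeline, using plain Scala collections to stand in for an RDD (the field names `attr1`/`attr2`, the row layout, and the sample data are assumptions; on a real RDD the `groupBy`/`map` pair below would be `reduceByKey(addArrays)`):

```scala
object ArraySumSketch {
  // One input row: two grouping attributes plus a double array to sum element-wise.
  // (Hypothetical schema for illustration only.)
  case class Row(attr1: String, attr2: String, values: Array[Double])

  // The reduce function you would pass to reduceByKey: element-wise array sum.
  def addArrays(a: Array[Double], b: Array[Double]): Array[Double] =
    a.zip(b).map { case (x, y) => x + y }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Row("a", "x", Array(1.0, 2.0)),
      Row("a", "x", Array(3.0, 4.0)),
      Row("b", "y", Array(5.0, 6.0))
    )
    // Map to ((attr1, attr2), values) pairs, then reduce per key -- the
    // collections analogue of:
    //   rdd.map(r => ((r.attr1, r.attr2), r.values)).reduceByKey(addArrays)
    val summed = rows
      .map(r => ((r.attr1, r.attr2), r.values))
      .groupBy(_._1)
      .map { case (k, vs) => k -> vs.map(_._2).reduce(addArrays) }
    summed.foreach { case (k, v) => println(s"$k -> ${v.mkString(",")}") }
  }
}
```

Note that assumes all arrays have the same length; `zip` silently truncates to the shorter one otherwise.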
You can explore grouping sets in SQL and write an aggregate function that does
the array-wise sum.
It will boil down to something like:

SELECT attr1, attr2, ..., yourAgg(val)
FROM t
GROUP BY attr1, attr2, ...
GROUPING SETS ((attr1, attr2), (attr1))
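To make the grouping-sets idea concrete, here is a plain-Scala sketch of what GROUPING SETS ((attr1, attr2), (attr1)) computes: subtotals at both grouping levels over the same data. The names and sample rows are illustrative only; in Spark the SQL query (or `Dataset.cube`/`rollup` for the full lattice) would do this for you in one job:

```scala
object GroupingSetsSketch {
  // Hypothetical row schema for illustration.
  case class Row(attr1: String, attr2: String, values: Array[Double])

  // Element-wise array sum, used as the aggregate.
  def addArrays(a: Array[Double], b: Array[Double]): Array[Double] =
    a.zip(b).map { case (x, y) => x + y }

  // Aggregate the double arrays under an arbitrary grouping key.
  def aggregate(rows: Seq[Row], key: Row => Seq[String]): Map[Seq[String], Array[Double]] =
    rows.groupBy(key).map { case (k, vs) => k -> vs.map(_.values).reduce(addArrays) }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Row("a", "x", Array(1.0, 2.0)),
      Row("a", "y", Array(3.0, 4.0)),
      Row("b", "x", Array(5.0, 6.0))
    )
    // Grouping set (attr1, attr2): the fine-grained totals.
    val byBoth  = aggregate(rows, r => Seq(r.attr1, r.attr2))
    // Grouping set (attr1): the coarser subtotal.
    val byAttr1 = aggregate(rows, r => Seq(r.attr1))
    (byBoth ++ byAttr1).foreach { case (k, v) =>
      println(s"${k.mkString(",")} -> ${v.mkString(",")}")
    }
  }
}
```

Each grouping set is just a different key function over the same rows, which is why one grouping-sets query can replace many separate group-bys.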
On 12 Nov 2016 04:57, "Anil Langote" wrote:
Hi All,
I have been working on one use case and couldn't think of a better solution.
I have seen that you are very active on the Spark user list; please share your
thoughts on the implementation. Below is the requirement.
I have tried using a Dataset by splitting the double array column, but it fails