Maybe I should consider something like Impala?
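
For reference, here is a minimal sketch of what I could try first in Spark itself: build every windowed aggregate as a Column and hand them all to one select, reusing the same window spec. This is only a sketch under assumptions: the helper name withWindowAggs, the output column names (col_max, col_min, col_avg) and the toy data are made up for illustration, and "date" is assumed to be a numeric (Long) column. As far as I understand, Spark should be able to evaluate window expressions that share the same partitioning and ordering in a single Window operator, but I have not verified this on the real data, so the explain() output is the thing to check.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, col, max, min}

object WindowAggSketch {
  // Same window as in the original message: per id, ordered by date,
  // covering the previous 8035200000 units of "date" up to the current row
  // (that would be 93 days if "date" holds epoch milliseconds; an assumption).
  val tw = Window.partitionBy("id").orderBy("date").rangeBetween(-8035200000L, 0L)

  // Hypothetical helper: one select carrying every windowed aggregate
  // for every requested column, instead of a chain of withColumn calls.
  def withWindowAggs(df: DataFrame, cols: Seq[String]): DataFrame = {
    val aggs: Seq[Column] = cols.flatMap { c =>
      Seq(
        max(col(c)).over(tw).as(s"${c}_max"),
        min(col(c)).over(tw).as(s"${c}_min"),
        avg(col(c)).over(tw).as(s"${c}_avg")
      )
    }
    df.select(col("*") +: aggs: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("window-agg-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy data standing in for the real table (made up for illustration).
    val x = Seq((1, 0L, 1.0), (1, 1000L, 3.0), (2, 0L, 5.0)).toDF("id", "date", "col")

    val result = withWindowAggs(x, Seq("col"))
    result.explain() // inspect how many Window operators appear in the physical plan
    result.show()
    spark.stop()
  }
}
```

For the 100-column case the call would be withWindowAggs(x, x.columns.filterNot(Set("id", "date")).toSeq), which also avoids building the plan through hundreds of withColumn calls.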
On Fri, Dec 15, 2017 at 11:32, Julien CHAMP <[email protected]> wrote:
> Hi Spark Community members!
>
> I want to compute several (from 1 to 10) aggregate functions using window
> functions on something like 100 columns.
>
> Instead of doing several passes over the data to compute each aggregate
> function, is there a way to do this efficiently?
>
>
>
> Currently it seems that doing
>
>
> val tw =
>   Window
>     .orderBy("date")
>     .partitionBy("id")
>     .rangeBetween(-8035200000L, 0)
>
> and then
>
> x
>   .withColumn("agg1", max("col").over(tw))
>   .withColumn("agg2", min("col").over(tw))
>   .withColumn("aggX", avg("col").over(tw))
>
>
> is not really efficient :/
> It seems to iterate over the whole column for each aggregation. Am I
> right?
>
> Is there a way to compute all the required operations on a column in a
> single pass?
> Even better, to compute all the required operations on ALL columns in a
> single pass?
>
> Thx for your Future[Answers]
>
> Julien
>
>
>
>
>
> --
>
>
> Julien CHAMP — Data Scientist
>
>
> *Web : **www.tellmeplus.com* <http://tellmeplus.com/> — *Email :
> **[email protected]
> <[email protected]>*
>
> *Phone ** : **06 89 35 01 89 <0689350189> * — *LinkedIn* : *here*
> <https://www.linkedin.com/in/julienchamp>
>
> TellMePlus S.A — Predictive Objects
>
> *Paris* : 7 rue des Pommerots, 78400 Chatou
> *Montpellier* : 51 impasse des églantiers, 34980 St Clément de Rivière
>