Re: [pyspark 2.3+] repartition followed by window function

2019-05-22 Thread Shraddha Shah
Any suggestions? On Wed, May 22, 2019 at 6:32 AM Rishi Shah wrote: > Hi All, > > If a dataframe is repartitioned in memory by the (date, id) columns and I then use multiple window functions that partition by the same (date, id) columns, I believe we can avoid another shuffle/sort. Ca

[pyspark 2.3+] repartition followed by window function

2019-05-22 Thread Rishi Shah
Hi All, If a dataframe is repartitioned in memory by the (date, id) columns and I then use multiple window functions that partition by the same (date, id) columns, I believe we can avoid another shuffle/sort. Can someone confirm this? However, what happens when dataframe repartition was do
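
The intuition behind the question can be sketched in plain Python. This is a toy model only, not Spark's implementation: the sample rows, NUM_PARTITIONS, hash_partition and row_number_per_key are all hypothetical names made up for illustration. It shows that once rows are hash-partitioned on (date, id), every (date, id) group sits in a single bucket, so a window keyed on the same columns needs only a local sort inside each bucket and no further data movement.

```python
from collections import defaultdict

# Toy rows: (date, id, value). Purely illustrative data.
rows = [
    ("2019-05-20", 1, 10),
    ("2019-05-20", 1, 7),
    ("2019-05-20", 2, 3),
    ("2019-05-21", 1, 5),
    ("2019-05-21", 2, 8),
    ("2019-05-21", 2, 1),
]

NUM_PARTITIONS = 4  # hypothetical partition count


def hash_partition(rows, num_partitions):
    """Assign each row to a bucket based only on its (date, id) key."""
    buckets = defaultdict(list)
    for date, id_, value in rows:
        buckets[hash((date, id_)) % num_partitions].append((date, id_, value))
    return buckets


def row_number_per_key(bucket):
    """Emulate row_number() over (partition by date, id order by value)
    using only the rows already present in one bucket."""
    out = []
    current_key, n = None, 0
    for date, id_, value in sorted(bucket):  # local sort, no data movement
        key = (date, id_)
        n = n + 1 if key == current_key else 1
        current_key = key
        out.append((date, id_, value, n))
    return out


buckets = hash_partition(rows, NUM_PARTITIONS)

# Record which buckets each (date, id) key ended up in.
key_to_buckets = defaultdict(set)
for bucket_id, bucket in buckets.items():
    for date, id_, _ in bucket:
        key_to_buckets[(date, id_)].add(bucket_id)

# Compute the window result bucket by bucket, then collect it.
result = sorted(r for b in buckets.values() for r in row_number_per_key(b))

for row in result:
    print(row)
```

Note that the local sort inside each bucket is still needed in this model; only the cross-bucket movement is avoided. In PySpark itself, one way to check what actually happens is to call explain() on the resulting DataFrame and look for Exchange operators in the physical plan.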