key, then sortWithinPartitions, and then groupBy. Since the data are
already hash-partitioned by key, Spark should not shuffle the data and
hence not change the sort order within each partition:
ds.repartition($"key").sortWithinPartitions($"code").groupBy($"key")
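
For context, a fuller sketch of that pipeline might look like the
following (the collect_list aggregation and the column names are taken
from the question quoted below and are assumptions, not part of the
original one-liner):

  import org.apache.spark.sql.functions.collect_list
  import spark.implicits._   // provides the $"..." column syntax

  val result = ds
    .repartition($"key")              // hash-partition the rows by key
    .sortWithinPartitions($"code")    // sort each partition locally, no shuffle
    .groupBy($"key")                  // data is already partitioned by key, so no further shuffle
    .agg(collect_list($"code_value")) // assumed aggregation: one list of code_value per key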
Enrico
On 26.03.20 at
Hi,
I have a dataframe which has data like:
key | code | code_value
1 | c1 | 11
1 | c2 | 12
1 | c2 | 9
1 | c3
Hi all,
I want to collect some rows into a list by using Spark's collect_list
function.
However, the number of rows going into the list is overflowing memory.
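
(For illustration, a call of the kind described here might look like
this minimal sketch; the dataframe name df and the exact column names
are assumptions based on the example data above:)

  import org.apache.spark.sql.functions.collect_list

  // assumed shape of the call: builds one in-memory list of code_value per key
  df.groupBy($"key").agg(collect_list($"code_value"))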
Is there any way to force the collected rows onto disk rather than
keeping them in memory, or else, instead of collecting them as a list,