Hi, this is on a Spark 3.1 environment.
For some reason, I can ONLY do this in Spark SQL, not in either the Scala or
PySpark environment.
I want to aggregate an array into a Map of element counts within that array,
but in Spark SQL.
I know that there is an aggregate function available, like aggregate().
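For reference, one way to do this with only the built-in higher-order
functions (a sketch, assuming a table t with an array column arr; both names
are placeholders):

-- Map each distinct element of arr to its occurrence count.
-- Assumes the array elements are non-null (map keys cannot be null).
SELECT map_from_entries(
         transform(
           array_distinct(arr),
           x -> struct(x, size(filter(arr, e -> e = x)))
         )
       ) AS element_counts
FROM t;

map_from_entries, transform, filter, and array_distinct are all available
from Spark 2.4 onward, so this runs in plain Spark SQL on 3.1.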
Hi Mich,
Thank you. Ah, I want to avoid bringing all the data to the driver node; that
is my understanding of what would happen in that case. Perhaps I'll trigger
a Lambda to rename/combine the files after PySpark writes them, roughly along
the lines of the sketch below.
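A minimal sketch of such a Lambda handler, assuming the job writes a single
part file under a known S3 prefix (the bucket, prefix, and target key below
are placeholders, not real names):

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Hypothetical names; adjust to the real bucket and output path.
    bucket = "my-bucket"
    prefix = "output/run1/"
    target_key = "output/run1.csv"

    # Find the part file(s) PySpark wrote under the prefix.
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    parts = [o["Key"] for o in resp.get("Contents", [])
             if o["Key"].rsplit("/", 1)[-1].startswith("part-")]

    # With a single part file, a server-side copy renames it without
    # pulling the data through the Lambda itself.
    if len(parts) == 1:
        s3.copy_object(Bucket=bucket,
                       CopySource={"Bucket": bucket, "Key": parts[0]},
                       Key=target_key)
        s3.delete_object(Bucket=bucket, Key=parts[0])

Combining multiple part files would take more than this (e.g. an S3
multipart copy), so the sketch only covers the single-file rename case.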
Cheers,
Marco.
On Thu, May 4, 2023 at 5:25 PM Mich Talebzadeh wrote: