Hi, this is on a Spark 3.1 environment.
For some reason, I can ONLY do this in Spark SQL, not in either the Scala or
PySpark environment.
I want to aggregate an array into a Map of element counts within that array,
but in Spark SQL.
I know that there is an aggregate function available, like aggregate().
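For reference, one way to do this with only the built-in higher-order
functions (a sketch, assuming a table t with an array column arr; both names
are placeholders):

-- Map each distinct element of arr to its occurrence count.
-- Assumes the array elements are non-null (map keys cannot be null).
SELECT map_from_entries(
         transform(
           array_distinct(arr),
           x -> struct(x, size(filter(arr, e -> e = x)))
         )
       ) AS element_counts
FROM t;

map_from_entries, transform, filter, and array_distinct are all available
from Spark 2.4 onward, so this runs in plain Spark SQL on 3.1.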
Hi Mich,
Thank you. Ah, I want to avoid bringing all the data to the driver node; that
is my understanding of what would happen in that case. Perhaps I'll trigger
a Lambda to rename/combine the files after PySpark writes them, roughly along
the lines of the sketch below.
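A minimal sketch of such a Lambda handler, assuming the job writes a single
part file under a known S3 prefix (the bucket, prefix, and target key below
are placeholders, not real names):

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Hypothetical names; adjust to the real bucket and output path.
    bucket = "my-bucket"
    prefix = "output/run1/"
    target_key = "output/run1.csv"

    # Find the part file(s) PySpark wrote under the prefix.
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    parts = [o["Key"] for o in resp.get("Contents", [])
             if o["Key"].rsplit("/", 1)[-1].startswith("part-")]

    # With a single part file, a server-side copy renames it without
    # pulling the data through the Lambda itself.
    if len(parts) == 1:
        s3.copy_object(Bucket=bucket,
                       CopySource={"Bucket": bucket, "Key": parts[0]},
                       Key=target_key)
        s3.delete_object(Bucket=bucket, Key=parts[0])

Combining multiple part files would take more than this (e.g. an S3
multipart copy), so the sketch only covers the single-file rename case.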
Cheers,
Marco.
On Thu, May 4, 2023 at 5:25 PM Mich Talebzadeh wrote: