pushing it forward
yourself 😊 Let me know if you need an extra pair of hands!
Thanks,
Ximo.
From: Cheng Su
Sent: Wednesday, September 9, 2020 8:57
To: XIMO GUANTER GONZALBEZ; Reynold Xin
CC: Spark Dev List
Subject: Re: Avoiding unnecessary sort in FileFormatWriter
performance in our
scenario.
Cheers,
Ximo.
From: Cheng Su
Sent: Friday, September 4, 2020 20:38
To: Reynold Xin; XIMO GUANTER GONZALBEZ
CC: Spark Dev List
Subject: Re: Avoiding unnecessary sort in FileFormatWriter/DynamicPartitionDataWriter
Hi,
Just for context - I created th
Hello,
I have observed that if a DataFrame is saved with partitioning columns in
Parquet, then a sort is performed in FileFormatWriter (see
https://github.com/apache/spark/blob/v3.0.0/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L152)
because Dynamic