Hello,
I read a 2.7 GB CSV file with 100 columns. When I convert it to Parquet with Spark 3.2 and Hadoop 2.7.6 it takes 28 seconds, but with Spark 3.5.2 and Hadoop 3.4.1 it takes 34 seconds. That is a regression.
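For reference, a minimal sketch of the kind of conversion job being benchmarked, assuming placeholder s3a:// paths rather than the actual dataset:

import org.apache.spark.sql.SparkSession

object CsvToParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-to-parquet")
      .getOrCreate()

    // Read the ~2.7 GB, 100-column CSV; supplying an explicit schema would
    // skip the inference pass and keep the timing focused on the write path.
    val df = spark.read
      .option("header", "true")
      .csv("s3a://my-bucket/input/data.csv")

    // Write the same data back out as Parquet.
    df.write
      .mode("overwrite")
      .parquet("s3a://my-bucket/output/data_parquet/")

    spark.stop()
  }
}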

On Mar 22, 2025, at 9:21 PM, Ángel Álvarez Pascua <angel.alvarez.pas...@gmail.com> wrote:


Sure. I love performance challenges and mysteries!

Please, could you provide an example project or the steps to build one?

Thanks.

On Sun, 23 Mar 2025, 2:17, Prem Sahoo <prem.re...@gmail.com> wrote:
Hello Team,
I was working with Spark 3.2 and Hadoop 2.7.6, writing to MinIO object storage. It was slower than writing to MapR FS with the same stack. I then upgraded to Spark 3.5.2 and Hadoop 3.4.1 and wrote to MinIO with the V2 FileOutputCommitter, and the performance was worse than with the old stack. I also tried the magic committer, which came out slower than V2, so with the latest stack the performance has degraded. Could someone please assist?
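To make the comparison concrete, here is a hypothetical SparkSession setup contrasting the two committers mentioned; the MinIO endpoint and credentials are placeholders, and the magic-committer binding classes assume the spark-hadoop-cloud module is on the classpath:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("minio-committer-comparison")
  // MinIO endpoint and path-style access (placeholder values)
  .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
  .config("spark.hadoop.fs.s3a.path.style.access", "true")
  // Option A: classic FileOutputCommitter, algorithm version 2
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  // Option B: S3A magic committer (comment out Option A and uncomment these to switch)
  // .config("spark.hadoop.fs.s3a.committer.name", "magic")
  // .config("spark.sql.sources.commitProtocolClass",
  //   "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
  // .config("spark.sql.parquet.output.committer.class",
  //   "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
  .getOrCreate()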