Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance

2025-03-25 Thread Prem Sahoo
One more variable: Spark 3.5.2 runs on Kubernetes, while Spark 3.2.0 runs on YARN. Kubernetes may be a cause of the slowness too. On Mar 24, 2025, at 7:10 PM, Prem Gmail wrote: Hello Spark Dev/users, does anyone have any clue why and how a newer version has a performance iss

Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance

2025-03-24 Thread Prem Gmail
Hello Spark Dev/users, does anyone have any clue why and how a newer version has a performance issue? I will be happy to raise a JIRA. On Mar 24, 2025, at 4:20 PM, Prem Sahoo wrote: The problem is on the writer's side. It takes longer to write to MinIO with Spark 3.5.2 and Hadoop 3.4.1

Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance

2025-03-24 Thread Prem Sahoo
The problem is on the writer's side. It takes longer to write to MinIO with Spark 3.5.2 and Hadoop 3.4.1, so it seems some technical changes between Hadoop 2.7.6 and 3.4.1 made the write process slower. On Sun, Mar 23, 2025 at 12:09 AM Ángel Álvarez Pascua < angel.alvarez.pas...@gmail.c

Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance

2025-03-22 Thread Ángel Álvarez Pascua
@Prem Sahoo, could you test both versions of Spark+Hadoop by replacing your "write to MinIO" statement with write.format("noop")? This would help us determine whether the issue lies on the reader side or the writer side. On Sun, Mar 23, 2025 at 4:53, Prem Gmail () wrote: > V2 writer in 3.
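
The noop comparison suggested above could be scripted roughly as follows. This is only a sketch: the helper names (time_action, compare_noop_and_real_write), the target_path argument, and the default "parquet" format are assumptions for illustration and do not come from the thread. It assumes df is an uncached PySpark DataFrame, so both actions re-run the full read and transform pipeline.

```python
import time


def time_action(action):
    """Run a zero-argument callable and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def compare_noop_and_real_write(df, target_path, fmt="parquet"):
    """Time the same DataFrame with the noop sink and with a real write.

    df          -- a PySpark DataFrame (assumed not cached or persisted)
    target_path -- hypothetical output location, e.g. an s3a:// URI on MinIO
    fmt         -- output format of the real write (assumption: parquet)
    """
    # The noop sink runs the whole read/transform plan but discards the rows,
    # so this measures everything except the actual write and commit.
    noop_seconds = time_action(
        lambda: df.write.format("noop").mode("overwrite").save()
    )
    # Same plan, but now including the write to storage.
    write_seconds = time_action(
        lambda: df.write.format(fmt).mode("overwrite").save(target_path)
    )
    return {
        "noop_seconds": noop_seconds,
        "write_seconds": write_seconds,
        "write_overhead_seconds": write_seconds - noop_seconds,
    }
```

Running this once per Spark+Hadoop combination and comparing the three numbers shows where the time goes: if noop_seconds is similar across versions but write_overhead_seconds grows, the regression is on the writer side.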

Re: Spark 3.5.2 and Hadoop 3.4.1 slow performance

2025-03-22 Thread Prem Gmail
The V2 writer in Spark 3.5.2 and Hadoop 3.4.1 should be much faster than in Spark 3.2.0 and Hadoop 2.7.6, but that's not the case. I tried the magic committer option, which is again slower. So something changed internally that made this slow. May I know what? On Mar 22, 2025, at 11:05 PM, Kristopher
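
For reference, the thread does not show which settings were used for the magic committer run. The sketch below lists the properties that Spark's cloud-integration documentation gives for enabling the S3A magic committer, as an assumption about what such a configuration typically looks like. The helper names (magic_committer_conf, to_spark_submit_args) are hypothetical, and the two org.apache.spark.internal.io.cloud classes require the spark-hadoop-cloud module on the classpath.

```python
def magic_committer_conf():
    """Spark properties commonly used to enable the S3A magic committer.

    Assumption: these are not taken from the thread; they follow the Spark
    cloud-integration documentation and need the spark-hadoop-cloud module.
    """
    return {
        "spark.hadoop.fs.s3a.committer.name": "magic",
        "spark.hadoop.fs.s3a.committer.magic.enabled": "true",
        "spark.sql.sources.commitProtocolClass":
            "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
        "spark.sql.parquet.output.committer.class":
            "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
    }


def to_spark_submit_args(conf):
    """Turn a property dict into a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

Printing to_spark_submit_args(magic_committer_conf()) for both environments makes it easy to confirm that the two runs being compared differ only in the Spark and Hadoop versions, not in committer settings.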