date:20240520

Re: EXT: Dual Write to HDFS and MinIO in faster way

2024-05-20 Thread Vibhor Gupta

Hi Prem,

You can try writing to HDFS first, then reading the result back from HDFS and writing it to MinIO. This avoids running the transformations twice. You can also try persisting the dataframe using the DISK_ONLY storage level.

Regards,
Vibhor

From: Prem Sahoo
Date: Tuesday, 21 May 2024 at 8:16 AM
To: Spark dev list
Subject: EXT
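The two suggestions in this reply can be sketched roughly as follows in PySpark. This is only an illustration, not code from the thread: the function names, the paths, and the "overwrite" save mode are assumptions, and the MinIO path is assumed to be reachable through the s3a connector with the endpoint already configured on the session.

```python
def write_via_hdfs(spark, df, hdfs_path, minio_path):
    """Option 1: write to HDFS once, then copy the written files to MinIO."""
    # The first write runs the transformations exactly once.
    # "overwrite" is an assumed save mode, not something stated in the thread.
    df.write.mode("overwrite").parquet(hdfs_path)
    # Reading the written Parquet back starts a fresh, trivial lineage,
    # so the second write does not re-run the original transformations.
    spark.read.parquet(hdfs_path).write.mode("overwrite").parquet(minio_path)


def write_with_persist(df, storage_level, hdfs_path, minio_path):
    """Option 2: persist the dataframe, write to both sinks, then release it."""
    # With StorageLevel.DISK_ONLY the computed partitions are spilled to
    # executor local disk during the first write and reused by the second.
    persisted = df.persist(storage_level)
    try:
        persisted.write.mode("overwrite").parquet(hdfs_path)
        persisted.write.mode("overwrite").parquet(minio_path)
    finally:
        # Free the persisted blocks even if one of the writes fails.
        persisted.unpersist()


# Hypothetical usage (paths and column name are placeholders):
#
#   from pyspark import StorageLevel
#   df = spark.read.parquet("hdfs:///input/events").filter("value > 0")
#   write_via_hdfs(spark, df, "hdfs:///out/events", "s3a://bucket/out/events")
#   # or
#   write_with_persist(df, StorageLevel.DISK_ONLY,
#                      "hdfs:///out/events", "s3a://bucket/out/events")
```

Option 1 pays for one extra read from HDFS but needs no executor storage; option 2 skips that read but needs enough local disk on the executors to hold the persisted partitions.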

Dual Write to HDFS and MinIO in faster way

2024-05-20 Thread Prem Sahoo

Hello Team,

I am planning to write to two data sources at the same time.

Scenario: writing the same dataframe to HDFS and MinIO without re-executing the transformations and without cache(). How can we make this faster?

Read the parquet file, do a few transformations, and write to HDFS and MinIO