Hi,

We are saving a DataFrame to Parquet (using DirectParquetOutputCommitter) as follows:

    dfWriter.format("parquet")
      .mode(SaveMode.Overwrite)
      .save(outputPath)

The problem is that if an executor task fails even once while writing a file (say, due to a transient HDFS issue), the re-spawned task fails again because the file already exists, eventually failing the entire job.

Is this a known issue? Are there any workarounds?

Thanks,
Vinoth
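For reference, the setup above can be sketched roughly like this (a sketch only: the config key and committer class name are from Spark 1.x and may differ in your version, and `df` / `outputPath` stand in for our actual DataFrame and destination):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}

val conf = new SparkConf()
  .setAppName("parquet-writer")
  // Route Parquet writes through the direct committer, which writes files
  // straight to the final output path instead of staging them under a
  // _temporary directory and renaming on commit.
  .set("spark.sql.parquet.output.committer.class",
       "org.apache.spark.sql.parquet.DirectParquetOutputCommitter")

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// df is assumed to be an already-constructed DataFrame.
// Because output files land directly at outputPath, a retried task attempt
// can collide with the partial file left behind by the failed attempt.
df.write
  .format("parquet")
  .mode(SaveMode.Overwrite)
  .save(outputPath)
```

The trade-off is that the direct committer avoids the expensive commit-time rename (helpful on object stores), but gives up the staging step that normally makes task retries safe.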