Hi Kshitij,

There are options to suppress the metadata files from being created. Set the properties below and try:
1) To disable Spark's transaction logs, set
"spark.sql.sources.commitProtocolClass = org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol".
This suppresses the "committed<TID>" and "started<TID>" files, but the _SUCCESS, _common_metadata and _metadata files will still be generated.

2) The _common_metadata and _metadata files can be disabled with "parquet.enable.summary-metadata=false".

3) The _SUCCESS file can be disabled with "mapreduce.fileoutputcommitter.marksuccessfuljobs=false".

On Sat, 22 Feb, 2020, 10:51 AM Kshitij, <kshtjkm...@gmail.com> wrote:

> Hi,
>
> There is no Spark DataFrame API that writes/creates a single file instead
> of a directory as the result of a write operation.
>
> Both of the options below will create a directory with a random file name:
>
> df.coalesce(1).write.csv(<path>)
>
> df.write.csv(<path>)
>
> Instead of creating a directory with standard files (_SUCCESS, _committed,
> _started), I want a single file with the file name specified.
>
> Thanks
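For the quoted question about getting one file with a chosen name, a common workaround (not a built-in DataFrame option) is to coalesce to a single partition, write to a scratch directory, and then move the lone part-* file to the desired path. A minimal sketch follows; the PySpark lines are shown only as comments (they assume a SparkSession named `spark` and are not executed here), and the helper name `promote_single_part_file` is made up for illustration:

```python
import glob
import os
import shutil

# In PySpark, the properties from the reply above would be set roughly
# like this before writing (assumption: a SparkSession named `spark`):
#
#   spark.conf.set(
#       "spark.sql.sources.commitProtocolClass",
#       "org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol")
#   spark.conf.set("parquet.enable.summary-metadata", "false")
#   spark.conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false")
#   df.coalesce(1).write.csv(output_dir)


def promote_single_part_file(output_dir: str, target_path: str) -> str:
    """Move the single part-* file Spark wrote into `output_dir` to
    `target_path`, then delete the now-redundant output directory."""
    parts = glob.glob(os.path.join(output_dir, "part-*"))
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")
    shutil.move(parts[0], target_path)
    shutil.rmtree(output_dir)
    return target_path
```

Note that this runs on the driver's local filesystem; for HDFS or object stores the same idea applies, but the rename would go through the corresponding filesystem API instead of `shutil`.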