Hi all,
I have searched a bit before posting this query.

Using Spark 1.6.1
Dataframe.write().format("parquet").mode(SaveMode.Append).save("location")

Note: The data in that folder can be deleted, and most of the time the
folder doesn't even exist.

Which SaveMode is the best here, if one is necessary at all?

I am using SaveMode.Append, which seems to cause a huge amount of shuffle, as
only one executor appears to be doing the actual write. (I may be wrong.)

Would using SaveMode.Overwrite cause all the executors to write to that folder
at once, or would it also send the data to a single executor before writing?

Or should I not use any of the modes at all and just do a write?


Thank You,
Anu
