By default Spark writes one file per partition, and Spark SQL defaults to
200 shuffle partitions. If you want fewer output files, repartition the
DataFrame down to the desired number of partitions before writing:
originalDF.repartition(10).write.avro("masterNew.avro")
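As a fuller sketch (assuming the Databricks spark-avro package, which adds the `.avro` writer via an implicit import; `originalDF` and the output path are placeholders), `coalesce` is a cheaper alternative when you are only shrinking the partition count:

```scala
import com.databricks.spark.avro._

// repartition(10) performs a full shuffle into exactly 10 partitions,
// so the write produces 10 part- files.
originalDF.repartition(10).write.avro("masterNew.avro")

// coalesce(10) merges existing partitions without a full shuffle;
// usually cheaper when only reducing the partition count.
originalDF.coalesce(10).write.avro("masterNew.avro")
```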
How many reducers did you have when those Avro files were created?
Each reducer very likely writes its own Avro part- file.
We normally use Parquet, but it should be the same for Avro, so this might
be relevant:
http://stackoverflow.com/questions/34026764/how-to-limit-parquet-file-dimension-for-a-parquet-ta