Hi, we are trying to load around 10k Avro files (each file holds only one record) using spark-avro, but the load takes over 15 minutes. Most of the work seems to be done on the driver, which creates a broadcast variable for each file.
Any idea why it is behaving that way? Thank you. Daniel