Hi, I am getting a java.lang.OutOfMemoryError: Java heap space error whenever I run my Spark SQL job.
I have concluded that the issue is the sheer number of files Spark is reading: 37 partitions, each with around 2,000 files of more than 128 MB, so roughly 37 * 2,000 = 74,000 files in total. Can anyone suggest how to merge these files while reading them in Spark so the job runs efficiently? Increasing executor memory did not resolve the issue; I went from 16 GB to 64 GB and still had no luck. A minimal sketch of the job is below.
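For reference, this is roughly what the job does; the input path, the file format (I'm assuming Parquet here), and the query itself are placeholders for the real ones:

```scala
import org.apache.spark.sql.SparkSession

object SparkSqlJob {
  def main(args: Array[String]): Unit = {
    // Submitted with --executor-memory 64g (raised from 16g), which did not help.
    val spark = SparkSession.builder()
      .appName("spark-sql-job")
      .getOrCreate()

    // 37 partition directories, each holding ~2,000 files of 128 MB+,
    // i.e. roughly 74,000 input files in total.
    // "/data/mytable" is a placeholder for the real input path.
    val df = spark.read.parquet("/data/mytable")
    df.createOrReplaceTempView("mytable")

    // Placeholder query; the real Spark SQL statement is what hits the
    // java.lang.OutOfMemoryError: Java heap space.
    spark.sql("SELECT col1, COUNT(*) FROM mytable GROUP BY col1").show()

    spark.stop()
  }
}
```

Thanks,
Asmath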