Hi, I'm trying to read a 60 GB HDFS file using Spark's textFile("hdfs_file_path", minPartitions).
How can I control the number of tasks by increasing the split size? With the default split size of 250 MB, a large number of tasks get created, but I'd like a specific number of tasks to be created while reading from HDFS itself, rather than calling repartition() afterwards. Any suggestions are appreciated! Thanks
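For context, here is a minimal sketch of what I'm doing now and the repartition() workaround I'd like to avoid. The HDFS path, app name, and the target of 100 partitions are placeholders, not my actual values:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadLargeFile {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("read-60gb-file"))

    // Current read: with the default ~250 MB split size, one task is created
    // per split, which is far more tasks than I want.
    val lines = sc.textFile("hdfs:///path/to/60gb_file")
    println(s"partitions created at read time = ${lines.getNumPartitions}")

    // The workaround I'd like to avoid: shuffling the data after the read
    // just to reach a target task count (100 is only an example target).
    val fewer = lines.repartition(100)

    fewer.count()
    sc.stop()
  }
}
```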