Hi, I'm trying to read a 60 GB HDFS file using Spark's textFile("hdfs_file_path", minPartitions).
How can I control the number of tasks by increasing the split size? With the default split size of 250 MB, a large number of tasks get created, but I'd like a specific number of tasks to be created while reading from HDFS itself, rather than calling repartition() afterwards. Any suggestions are appreciated! Thanks
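For context, here is a minimal sketch of what I'm doing now and the repartition() workaround I'd like to avoid. The HDFS path, app name, and the target of 100 partitions are placeholders, not my actual values:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadLargeFile {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("read-60gb-file"))

    // Current read: with the default ~250 MB split size, one task is created
    // per split, which is far more tasks than I want.
    val lines = sc.textFile("hdfs:///path/to/60gb_file")
    println(s"partitions created at read time = ${lines.getNumPartitions}")

    // The workaround I'd like to avoid: shuffling the data after the read
    // just to reach a target task count (100 is only an example target).
    val fewer = lines.repartition(100)

    fewer.count()
    sc.stop()
  }
}
```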