Write your own input format/datasource, or split the file yourself beforehand (not recommended).
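For the first option, a minimal sketch in Scala (the class name, the 2 GB split size, and the app name are made up for illustration): it overrides computeSplitSize() on the old-API TextInputFormat that textFile() uses under the hood, so every split, and hence every task, covers a fixed larger byte range, and sc.hadoopFile() plugs it in where textFile() would have used the stock format.

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapred.TextInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical format: force ~2 GB splits regardless of HDFS block size,
    // so a 60 GB file becomes ~30 tasks instead of one task per 250 MB split.
    class LargeSplitTextInputFormat extends TextInputFormat {
      override protected def computeSplitSize(goalSize: Long, minSize: Long,
                                              blockSize: Long): Long =
        2L * 1024 * 1024 * 1024
    }

    val sc = new SparkContext(new SparkConf().setAppName("large-splits"))

    // hadoopFile() takes the custom format where textFile() would have used
    // the stock TextInputFormat; keys are byte offsets, values are the lines.
    val lines = sc
      .hadoopFile("hdfs_file_path", classOf[LargeSplitTextInputFormat],
                  classOf[LongWritable], classOf[Text])
      .map(_._2.toString)

    println(s"partitions: ${lines.getNumPartitions}")

Depending on your Hadoop version you may get a similar effect without a custom class by raising mapreduce.input.fileinputformat.split.minsize on sc.hadoopConfiguration before calling textFile(); worth trying first.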
> On 10. Oct 2017, at 09:14, Kanagha Kumar <kpra...@salesforce.com> wrote:
>
> Hi,
>
> I'm trying to read a 60 GB HDFS file using spark textFile("hdfs_file_path", minPartitions).
>
> How can I control the number of tasks by increasing the split size? With the default split size of 250 MB, several tasks are created. But I would like a specific number of tasks to be created while reading from HDFS itself, instead of using repartition() etc.
>
> Any suggestions are helpful!
>
> Thanks