Re: mapreduce.input.fileinputformat.split.maxsize not working for spark 2.4.0

2019-03-07 Thread Akshay Mendole
Hi,

No. It's a Java application that uses the RDD APIs.

Thanks,
Akshay

On Mon, Feb 25, 2019 at 7:54 AM Manu Zhang wrote:
> Is your application using the Spark SQL / DataFrame API? If so, please try setting
> spark.sql.files.maxPartitionBytes
> to a larger value; it is 128MB by default.
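Since the job uses the RDD API, spark.sql.files.maxPartitionBytes will not apply. A minimal sketch of the usual alternative, assuming the files are read through Hadoop's new-API (org.apache.hadoop.mapreduce) input formats, which honour the split.maxsize property — note that sc.textFile goes through the old mapred API instead, which may be why the setting appeared to have no effect. The path here is a placeholder:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SplitSizeExample {
    public static void main(String[] args) {
        JavaSparkContext sc =
            new JavaSparkContext(new SparkConf().setAppName("split-size-demo"));

        // Cap each input split at 128MB; the new-API FileInputFormat reads
        // this property, while the old mapred path used by sc.textFile may not.
        Configuration hadoopConf = new Configuration(sc.hadoopConfiguration());
        hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize",
                       String.valueOf(128L * 1024 * 1024));

        JavaPairRDD<LongWritable, Text> lines = sc.newAPIHadoopFile(
            "hdfs:///path/to/large/files",   // placeholder path
            TextInputFormat.class, LongWritable.class, Text.class, hadoopConf);

        System.out.println("partitions = " + lines.getNumPartitions());
        sc.stop();
    }
}
```

This is a configuration sketch rather than a self-contained program: it needs a Spark cluster (or local master) and real input files to run.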

Re: mapreduce.input.fileinputformat.split.maxsize not working for spark 2.4.0

2019-02-24 Thread Manu Zhang
Is your application using the Spark SQL / DataFrame API? If so, please try setting spark.sql.files.maxPartitionBytes to a larger value; it is 128MB by default.

Thanks,
Manu Zhang

On Feb 25, 2019, 2:58 AM +0800, Akshay Mendole wrote:
> Hi,
> We have dfs.blocksize configured to be 512MB and w
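For a Spark SQL / DataFrame job, the setting above can be applied when building the session — a minimal sketch (the 256MB value and the path are placeholders of my own, not from the thread):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class MaxPartitionBytesExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("max-partition-bytes-demo")
            // Default is 128MB (134217728 bytes). A larger value gives fewer,
            // bigger read partitions; a smaller value gives more, smaller ones.
            .config("spark.sql.files.maxPartitionBytes",
                    String.valueOf(256L * 1024 * 1024))
            .getOrCreate();

        Dataset<Row> df = spark.read().text("hdfs:///path/to/large/files"); // placeholder
        System.out.println("partitions = " + df.rdd().getNumPartitions());
        spark.stop();
    }
}
```

As with any Spark configuration sketch, this needs a running Spark environment and real input data to execute.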

mapreduce.input.fileinputformat.split.maxsize not working for spark 2.4.0

2019-02-24 Thread Akshay Mendole
Hi,

We have dfs.blocksize configured to be 512MB, and we have some large files in HDFS that we want to process with a Spark application. We want to split the files further to get more splits and optimise for memory, but the above-mentioned parameters are not working. The max and min size params as below are con
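For reference, Hadoop's FileInputFormat derives the effective split size by clamping the block size between the configured min and max: max(minSize, min(maxSize, blockSize)). A small Java sketch of that arithmetic (the class and method names are mine, not Hadoop's), showing what a 128MB maxsize should yield against a 512MB block:

```java
public class FileSplitMath {
    // Mirrors FileInputFormat's split-size computation: start from the block
    // size, cap it at maxSize, then raise it to at least minSize.
    public static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 512L * 1024 * 1024; // dfs.blocksize = 512MB
        long maxSize = 128L * 1024 * 1024;   // mapreduce.input.fileinputformat.split.maxsize
        long minSize = 1L;                   // mapreduce.input.fileinputformat.split.minsize
        // If the maxsize property is honoured, each split should be 128MB,
        // i.e. four splits per 512MB block.
        System.out.println("effective split size = "
            + computeSplitSize(minSize, maxSize, blockSize));
    }
}
```

So if the property were being picked up, a 512MB block would be read as four 128MB splits; seeing one split per block suggests the reading path is ignoring the property.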