Hi,

We have LZO-compressed JSON files in our HDFS locations. I am creating an external table on top of this data for analytics.
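For context, the table DDL looks roughly like this (a simplified sketch: the table name, columns, and location are placeholders, and the classes shown are the stock hadoop-lzo input format and HCatalog JSON SerDe):

CREATE EXTERNAL TABLE json_events_lzo (   -- placeholder name and schema
  event_id STRING,
  payload  STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS
  INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/data/json_events';             -- placeholder HDFS path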
There are 3 LZO-compressed part files of sizes 229.16 MB, 705.79 MB, and 157.61 MB, along with their .index files. When I run a count(*) query on the table, I see only 10 mappers, which is a performance bottleneck: the input totals roughly 1.09 GB, so with 30 MB splits I would expect closer to 37 mappers.

I even tried the following (aiming for 30 MB splits):

1) set mapreduce.input.fileinputformat.split.maxsize=31457280;
2) set dfs.blocksize=31457280;

But I still get 10 mappers. Can you please guide me in fixing this?
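For completeness, this is essentially the session I run (the table name is a placeholder, and the expected mapper count is my own back-of-the-envelope estimate):

-- aim for ~30 MB splits (31457280 bytes = 30 * 1024 * 1024)
set mapreduce.input.fileinputformat.split.maxsize=31457280;
set dfs.blocksize=31457280;

-- 'set <property>;' with no value echoes the current setting,
-- confirming the override is active in this session
set mapreduce.input.fileinputformat.split.maxsize;

-- ~1092 MB of input / 30 MB per split gives ~37 expected mappers,
-- yet the job still launches only 10
SELECT count(*) FROM json_events_lzo;

Thanks,
Sree Harsha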