Hi, Bejoy. I found mapred.min.split.size in mapred-default.xml, but there is no mapred.max.split.size property there. I'm using Hadoop 0.20.205.0.
Maybe only newer versions support mapred.max.split.size?

On Fri, Nov 16, 2012 at 8:08 PM, Cheng Su <scarcer...@gmail.com> wrote:
> Thank you so much :)
>
> On Fri, Nov 16, 2012 at 5:49 PM, Bejoy KS <bejoy...@yahoo.com> wrote:
>> Hi Cheng
>>
>> The computation of the number of input splits/map tasks is totally
>> determined by the InputFormat used as well as the split size.
>>
>> Hive uses CombineHiveInputFormat, so you may not be having one mapper per
>> file if your files are small. You can control the number of maps by
>> controlling the split sizes.
>> mapred.min.split.size
>> mapred.max.split.size
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>> -----Original Message-----
>> From: Cheng Su <scarcer...@gmail.com>
>> Date: Fri, 16 Nov 2012 14:39:57
>> To: <user@hive.apache.org>
>> Reply-To: user@hive.apache.org
>> Subject: How does hive decide to launch how many map tasks?
>>
>> Hi, all
>>
>> How does hive decide to launch how many map tasks?
>> I know there are some configs to help hive to decide how many reduce
>> task to launch?
>> But how about map tasks?
>>
>> I thought that number of map tasks equals to the number of the store files.
>> I have a table now with 2 partitions, and one has 4 files in it, the
>> other has 2,
>> when I execute "select count(*) from table", only one map is launched.
>>
>> How can I increase the number of map tasks to improve the performance?
>>
>> Thanks.
>>
>> --
>>
>> Regards,
>> Cheng Su
>
>
>
> --
>
> Regards,
> Cheng Su

--

Regards,
Cheng Su
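
For anyone following this thread, here is a rough sketch of why a single map task can result from six small files, and how a maximum split size would change that. It assumes a much simplified model of a combining input format: all files are pooled and cut into splits of at most the maximum split size, ignoring node and rack locality, which a real implementation would take into account. The function name, the 10 MB file size, and the treatment of a non-positive maximum as "unbounded" are all assumptions made for illustration; the thread does not give the actual file sizes.

```python
MB = 1024 * 1024


def estimate_combined_splits(file_sizes, max_split_size):
    """Estimate the number of splits (map tasks) under a simplified model.

    Assumption: small files are pooled together and the pool is cut into
    chunks of at most max_split_size bytes. Locality is ignored, so this
    is an approximation and not Hadoop's actual algorithm.
    """
    total = sum(file_sizes)
    if total == 0:
        return 0
    if max_split_size <= 0:
        # Assumption: no upper bound means everything fits in one split.
        return 1
    # Integer ceiling division: number of chunks needed to cover the total.
    return -(-total // max_split_size)


# Hypothetical table: 2 partitions holding 4 + 2 files of 10 MB each.
files = [10 * MB] * 6

print(estimate_combined_splits(files, 256 * MB))  # large cap: one split
print(estimate_combined_splits(files, 16 * MB))   # smaller cap: more splits
```

Under this model, lowering the maximum split size below the total input size is what raises the estimated number of map tasks; the number of files by itself does not.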