Hi, Bejoy.
I found mapred.min.split.size in mapred-default.xml, but there is
no mapred.max.split.size property there.
I'm using Hadoop 0.20.205.0.

Maybe only newer versions support mapred.max.split.size?
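
For reference, here is a minimal sketch of how the two settings are usually
understood to bound the split size. It assumes the rule
max(minSize, min(maxSize, blockSize)) used by the newer
org.apache.hadoop.mapreduce FileInputFormat. The class name, method names,
block size and file size below are made up for illustration, and
CombineHiveInputFormat may group small files differently.

```java
// Sketch only: illustrates the max(min, min(max, block)) rule.
// This is NOT Hive's CombineHiveInputFormat; all names are hypothetical.
class SplitSizeSketch {

    // Effective split size: the block size clamped between the
    // configured minimum and maximum split sizes.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Rough number of splits for a single file (ceiling division).
    // Real input formats allow some slack for a short final split,
    // which this estimate ignores.
    static long estimateSplits(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb;    // assumed HDFS block size
        long fileLength = 256 * mb;  // hypothetical input file

        // No effective maximum: the split size falls back to the block size.
        long defaultSplit = computeSplitSize(blockSize, 1, Long.MAX_VALUE);
        // Maximum capped at 32 MB: smaller splits, so more map tasks.
        long cappedSplit = computeSplitSize(blockSize, 1, 32 * mb);

        System.out.println(estimateSplits(fileLength, defaultSplit)); // 4
        System.out.println(estimateSplits(fileLength, cappedSplit));  // 8
    }
}
```

Under these assumptions, lowering the maximum split size increases the
number of map tasks, and raising the minimum above the block size
decreases it.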


On Fri, Nov 16, 2012 at 8:08 PM, Cheng Su <scarcer...@gmail.com> wrote:
> Thank you so much :)
>
> On Fri, Nov 16, 2012 at 5:49 PM, Bejoy KS <bejoy...@yahoo.com> wrote:
>> Hi Cheng
>>
>> The number of input splits (and therefore map tasks) is determined by
>> the InputFormat in use and by the split size.
>>
>> Hive uses CombineHiveInputFormat, so you may not get one mapper per
>> file if your files are small. You can control the number of maps by
>> controlling the split sizes:
>> mapred.min.split.size
>> mapred.max.split.size
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>> -----Original Message-----
>> From: Cheng Su <scarcer...@gmail.com>
>> Date: Fri, 16 Nov 2012 14:39:57
>> To: <user@hive.apache.org>
>> Reply-To: user@hive.apache.org
>> Subject: How does hive decide to launch how many map tasks?
>>
>> Hi, all
>>
>> How does Hive decide how many map tasks to launch?
>> I know there are some configs that help Hive decide how many reduce
>> tasks to launch,
>> but what about map tasks?
>>
>> I thought the number of map tasks equals the number of store files.
>> I now have a table with 2 partitions; one has 4 files in it and the
>> other has 2.
>> When I execute "select count(*) from table", only one map is launched.
>>
>> How can I increase the number of map tasks to improve performance?
>>
>> Thanks.
>>
>> --
>>
>> Regards,
>> Cheng Su
>
>
>
> --
>
> Regards,
> Cheng Su



-- 

Regards,
Cheng Su
