Three options: 1. lower your mapper count; 2. Chen Song's suggestion also works; 3. use the shell to cat your small files into a bigger one before loading (a sketch follows).
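A minimal sketch of option 3, assuming the inputs are uncompressed text files on HDFS (the /input/small and /input/merged paths are hypothetical; plain concatenation does not work for RCFiles or LZO-compressed output):

    # stream all small parts together and write the result back to HDFS as one file
    hadoop fs -cat /input/small/part-* | hadoop fs -put - /input/merged/all.txt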
2012/9/27 Chen Song <chen.song...@gmail.com>

> You can force a reduce phase by adding a distribute by or order by clause
> after your select query.
>
> On Thu, Sep 27, 2012 at 2:03 PM, 王锋 <wfeng1...@163.com> wrote:
>
>> but it's a map-only job
>>
>> At 2012-09-27 05:39:39, "Chen Song" <chen.song...@gmail.com> wrote:
>>
>> As far as I know, the number of files emitted is determined by the number
>> of mappers for a map-only job and by the number of reducers for a
>> map-reduce job.
>>
>> So it depends entirely on how your query translates into an MR job.
>>
>> You can enforce it by setting the property
>>
>> mapred.reduce.tasks=1
>>
>> Chen
>>
>> On Wed, Sep 19, 2012 at 11:25 PM, 王锋 <wfeng1...@163.com> wrote:
>>
>>> Hi,
>>> I tried to convert and merge many small text files into RCFiles
>>> using HiveQL, but Hive produced some small RCFiles.
>>>
>>> set hive.exec.compress.output=true;
>>> set mapred.output.compress=true;
>>> set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
>>> set io.compression.codecs=com.hadoop.compression.lzo.LzoCodec;
>>> hive.merge.mapfiles=true
>>> hive.merge.mapredfiles=true
>>> hive.merge.size.per.task=640000000
>>> hive.merge.size.smallfiles.avgsize=80000000
>>> insert overwrite table rctable select .....
>>>
>>> The settings:
>>> hive.merge.mapfiles=true
>>> hive.merge.mapredfiles=true
>>> hive.merge.size.per.task=640000000
>>> hive.merge.size.smallfiles.avgsize=80000000
>>> didn't work.
>>>
>>> Who could tell me how to solve it?
>>
>> --
>> Chen Song

--
Chen Song
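For completeness, a hedged sketch that combines the two suggestions from the thread: force a reduce phase with distribute by, and cap the reducer count so the job emits a single file. The constant distribute key and source_table are illustrative assumptions, not from the thread:

    set hive.exec.compress.output=true;
    set mapred.output.compress=true;
    set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
    set mapred.reduce.tasks=1;              -- one reducer => one output file
    insert overwrite table rctable
    select * from source_table              -- source_table is a placeholder
    distribute by 1;                        -- constant key forces a reduce phase

With a reduce phase in place, hive.merge.mapredfiles=true can also merge the reducer output, though with mapred.reduce.tasks=1 there is nothing left to merge.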