As far as I know, the number of files emitted is determined by the number of mappers for a map-only job and by the number of reducers for a MapReduce job.
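As a rough sketch of that rule (the source table `small_text_table` and the column `some_col` are made-up names, not taken from your query): a plain insert ... select normally compiles to a map-only job, so you get one file per mapper. Adding DISTRIBUTE BY should force a reduce stage, and then the reducer count decides the file count. I have not run this against your data, so treat it as an assumption to verify:

```sql
-- Sketch only: small_text_table and some_col are hypothetical names.
-- Ask for a single reducer, so the reduce stage writes a single output file.
set mapred.reduce.tasks=1;

-- DISTRIBUTE BY forces a reduce stage; without it the query is likely
-- map-only and the reducer setting above would have no effect.
insert overwrite table rctable
select *
from small_text_table
distribute by some_col;
```

With a single reducer all rows pass through one task, so this trades parallelism for fewer files. A larger reducer count gives more files but finishes faster.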
So it depends entirely on how your query translates into an MR job. You can force a single output file by setting the property *mapred.reduce.tasks=1*, provided the query actually runs a reduce stage.

Chen

On Wed, Sep 19, 2012 at 11:25 PM, 王锋 <wfeng1...@163.com> wrote:

> Hi
> I tried to convert and merge many small text files into RCFiles using
> Hive SQL, but Hive produced some small RCFiles.
> set hive.exec.compress.output=true;
> set mapred.output.compress=true;
> set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
> set io.compression.codecs=com.hadoop.compression.lzo.LzoCodec;
> hive.merge.mapfiles=true
> hive.merge.mapredfiles=true
> hive.merge.size.per.task=640000000
> hive.merge.size.smallfiles.avgsize=80000000
> insert overwrite table rctable select .....
>
>
> The settings:
> hive.merge.mapfiles=true
> hive.merge.mapredfiles=true
> hive.merge.size.per.task=640000000
> hive.merge.size.smallfiles.avgsize=80000000
> didn't work.
>
>
> Who could tell me how to solve it?

--
Chen Song