Re: Re: size of RCFile in hive

2012-09-28 Thread gemini alex
1. Lower the number of mappers.
2. Chen Song's suggestion also works.
3. Use the shell command cat to concatenate your small files into bigger ones.

2012/9/27 Chen Song
> You can force a reduce phase by adding a distribute by or order by clause
> after your select query.
>
> On Thu, Sep 27, 2012 at 2:03 PM, 王锋 wrote:
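The first suggestion does not say how to lower the mapper count. One possible way, sketched here under assumptions, is to let Hive combine many small input files into fewer, larger splits. The 256 MB target is an illustrative value, not something taken from the thread, and the exact behaviour depends on the Hive and Hadoop versions in use.

```sql
-- Sketch only: the thread does not specify these settings.
-- Assumption: the small text inputs can be read through CombineHiveInputFormat.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Assumed target of about 256 MB per combined split (268435456 bytes),
-- chosen for illustration; fewer, larger splits mean fewer mappers and
-- therefore fewer output files for a map-only job.
SET mapred.max.split.size=268435456;
SET mapred.min.split.size.per.node=268435456;
SET mapred.min.split.size.per.rack=268435456;
```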

Re: Re: size of RCFile in hive

2012-09-27 Thread Chen Song
You can force a reduce phase by adding a distribute by or order by clause after your select query.

On Thu, Sep 27, 2012 at 2:03 PM, 王锋 wrote:
> but it's map only job
>
> At 2012-09-27 05:39:39,"Chen Song" wrote:
>
> As far as I know, the number of files emitted would be determined by the
> number
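The distribute by suggestion in this reply could look roughly like the following. The table names `logs_text` and `logs_rc` and the column `id` are hypothetical placeholders, since the original query is not shown in the thread.

```sql
-- Sketch only: logs_text, logs_rc and id are hypothetical names.
-- logs_rc is assumed to be a table declared with STORED AS RCFILE.
INSERT OVERWRITE TABLE logs_rc
SELECT *
FROM logs_text
-- DISTRIBUTE BY adds a shuffle, so the job is no longer map-only
-- and the rows are written by reducers instead of mappers.
DISTRIBUTE BY id;
```

With a reduce phase in place, the number of output files follows the number of reducers and no longer the number of mappers.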

Re:Re: size of RCFile in hive

2012-09-26 Thread 王锋
But it's a map-only job.

At 2012-09-27 05:39:39,"Chen Song" wrote:
As far as I know, the number of files emitted would be determined by the number of mappers for a map only job and the number of reducers for a map reduce job. So it totally depends on how your query translates into an MR job. You ca

Re: size of RCFile in hive

2012-09-26 Thread Chen Song
As far as I know, the number of files emitted is determined by the number of mappers for a map-only job and by the number of reducers for a map-reduce job. So it depends entirely on how your query translates into an MR job. You can enforce it by setting the property *mapred.reduce.tasks=1* Chen
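As a config fragment, the property named above would be set in the Hive session before running the conversion query. This is only a sketch: a single reducer is the value quoted in the message, and it may be too slow for large inputs.

```sql
-- Value taken from the message above: one reducer, hence one output file
-- per insert, provided the query actually has a reduce phase.
SET mapred.reduce.tasks=1;

-- Note: for a map-only job this setting has no effect on the file count,
-- because the mappers write the output directly.
```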

size of RCFile in hive

2012-09-19 Thread 王锋
Hi,

I tried to convert and merge many small text files into RCFiles using Hive SQL, but Hive produced some small RCFiles.

set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
set io.compression.codecs=com.ha