1. Lower your mapper count.
2. Chen Song's suggestion also works.
3. Use the shell command cat to concatenate your small files into a bigger
one before loading.
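A minimal sketch of suggestion 1, assuming the job can read its input through
CombineHiveInputFormat so that several small files share one mapper. The
split sizes are illustrative values, not tuned recommendations:

```sql
-- Assumption: the combining input format is usable with your input files.
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
-- Larger splits mean fewer mappers, and so fewer output files in a
-- map-only job. 268435456 bytes = 256 MB (illustrative).
set mapred.max.split.size=268435456;
set mapred.min.split.size.per.node=268435456;
set mapred.min.split.size.per.rack=268435456;
```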
2012/9/27 Chen Song
You can force a reduce phase by adding a DISTRIBUTE BY or ORDER BY clause
after your select query.
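For example, something along these lines should add a shuffle so that the
number of output files follows the number of reducers. The table and column
names (merged_rc, small_text_logs, id, payload) are hypothetical placeholders:

```sql
-- Hypothetical names; substitute your own tables and columns.
-- DISTRIBUTE BY routes rows to reducers by the given column, which turns
-- an otherwise map-only insert into a map-reduce job.
INSERT OVERWRITE TABLE merged_rc
SELECT id, payload
FROM small_text_logs
DISTRIBUTE BY id;
```

ORDER BY would also force a reduce phase, but it imposes a total order on
the rows, so DISTRIBUTE BY is probably the cheaper choice when sort order
does not matter.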
On Thu, Sep 27, 2012 at 2:03 PM, 王锋 wrote:
But it's a map-only job.
At 2012-09-27 05:39:39,"Chen Song" wrote:
As far as I know, the number of files emitted is determined by the number
of mappers for a map-only job and by the number of reducers for a
map-reduce job.
So it depends entirely on how your query translates into an MR job.
You can enforce it by setting the property
*mapred.reduce.tasks=1*
Chen
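As a sketch, the property would be set in the same Hive session before the
INSERT runs. This assumes the query actually has a reduce phase; a map-only
job has no reducers for the setting to act on:

```sql
-- Only matters when the query has a reduce phase.
-- A single reducer writes a single output file.
set mapred.reduce.tasks=1;
```

With one reducer all rows pass through a single task, so this is likely to
be slow for large inputs.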
Hi,
I tried to convert and merge many small text files into RCFiles using
Hive SQL, but Hive produced some small RCFiles.
set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
set io.compression.codecs=com.ha