Hi Jeff,
Thanks for your quick reply. It seems my thinking was stuck on the style of
job I'm running. It's much clearer to me now.
Best Regards,
Carp
2010/6/23 Jeff Zhang
> Hi 李钰
>
> The size of map output depends on your Mapper class. The Mapper class
> will do proces
(because of insufficient io.sort.mb) is
> much better than risking swapping (by setting io.sort.mb and heap too
> large), in terms of relative performance penalty you will pay.
>
> Cheers,
> Sriguru
>
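The trade-off described above (spilling versus swapping) can be checked with some rough arithmetic. This is only a minimal sketch: the slot counts, heap sizes, daemon overhead and the 0.5 buffer fraction are all hypothetical illustration values, not figures from this thread or from Hadoop itself.

```python
def fits_in_memory(physical_mb, map_slots, reduce_slots, child_heap_mb,
                   daemon_mb=2000):
    """Return True if every task JVM plus the node daemons fit in RAM.

    daemon_mb is an assumed allowance for the tasktracker, datanode and OS.
    """
    total_mb = (map_slots + reduce_slots) * child_heap_mb + daemon_mb
    return total_mb <= physical_mb


def sort_buffer_ok(io_sort_mb, child_heap_mb, max_fraction=0.5):
    """Return True if the sort buffer leaves headroom inside the child heap.

    max_fraction is an assumed rule of thumb, not a Hadoop-defined limit.
    """
    return io_sort_mb <= child_heap_mb * max_fraction


if __name__ == "__main__":
    # Hypothetical node: 16 GB RAM, 8 map slots, 4 reduce slots, 1 GB heaps.
    print(fits_in_memory(16000, 8, 4, 1024))
    # io.sort.mb = 200 inside a 512 MB child heap.
    print(sort_buffer_ok(200, 512))
```

If the first check fails, the node risks swapping, and lowering the heap or io.sort.mb (and accepting extra spills) is the cheaper penalty according to the advice above.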
> >-Original Message-
> >From: 李钰 [mailto:car...@gmail.com]
Dear all,
I have a question about the "io.sort.mb" parameter. Material from Yahoo!
and Cloudera recommends setting this value to 200 if the job scale is
large, but I'm confused about this. As far as I know, the tasktracker
will launch a child JVM for each task, and "io.sort.mb"
p
e ones on Google Code are broken,
> please use the ones on github:
>
> http://github.com/toddlipcon/hadoop-lzo
>
> Thanks
> -Todd
>
>
> On Wed, Jun 9, 2010 at 2:59 AM, 李钰 wrote:
>
> > Hi,
> >
> > While using LZO compression to try to improve perform
Hi,
While using LZO compression to try to improve performance of my cluster, I
found that compression didn't work. The job I run is
"org.apache.hadoop.examples.Sort", with the input data generated by
"org.apache.hadoop.examples.RandomWriter".
I've made sure that I configured the LZO native library/jar