Re: Questions about recommendation value of the "io.sort.mb" parameter

2010-06-23 Thread
Hi Jeff, Thanks for your quick reply. It seems my thinking was stuck on the style of job I'm running. It's much clearer to me now. Best Regards, Carp 2010/6/23 Jeff Zhang > Hi 李钰 > > The size of map output depends on your Mapper class. The Mapper class > will do proces

Re: Questions about recommendation value of the "io.sort.mb" parameter

2010-06-23 Thread
(because of insufficient io.sort.mb) is > much better than risking swapping (by setting io.sort.mb and heap too > large), in terms of relative performance penalty you will pay. > > Cheers, > Sriguru > > >-----Original Message----- > >From: 李钰 [mailto:car...@gmail.com]

Questions about recommendation value of the "io.sort.mb" parameter

2010-06-22 Thread
Dear all, I have a question about the "io.sort.mb" parameter. Material from Yahoo! or Cloudera recommends setting this value to 200 for large-scale jobs, but I'm confused about this. As far as I know, the tasktracker will launch a child JVM for each task, and "io.sort.mb" p

Re: Problem found while using LZO compression in Hadoop 0.20.1

2010-06-09 Thread
e ones on Google Code are broken, > please use the ones on github: > > http://github.com/toddlipcon/hadoop-lzo > > Thanks > -Todd > > > On Wed, Jun 9, 2010 at 2:59 AM, 李钰 wrote: > > > Hi, > > > > While using LZO compression to try to improve perform

Problem found while using LZO compression in Hadoop 0.20.1

2010-06-09 Thread
Hi, While using LZO compression to try to improve performance of my cluster, I found that compression didn't work. The job I run is "org.apache.hadoop.examples.Sort", with the input data generated by "org.apache.hadoop.examples.RandomWriter". I've made sure that I configured lzo native library/jar