Owen, one problem with Arun's slide deck is that while it lists the
parameters that matter, it doesn't suggest values for them. Do you
have any guidance on that? The only places I know of that discuss
how to set these parameters are
http://www.cloudera.com/blog/2009/03/30/configuration-parameters-what-can-you-just-ignore/
and http://wiki.apache.org/hadoop/FAQ#3.

On Wed, Jun 10, 2009 at 12:14 PM, Owen O'Malley <[email protected]> wrote:

> Take a look at Arun's slide deck on Hadoop performance:
>
> http://bit.ly/EDCg3
>
> It is important to get io.sort.mb large enough, and io.sort.factor should
> be closer to 100 instead of 10. I'd also use large block sizes to reduce the
> number of maps. Please see the deck for other important factors.
>
> -- Owen
>
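
For concreteness, here is roughly how the three settings Owen mentions
would look in the site configuration (hadoop-site.xml, or
mapred-site.xml and hdfs-site.xml on 0.20). This is only a sketch: the
100 for io.sort.factor is his figure, but the io.sort.mb and
dfs.block.size values are placeholders I picked for illustration, not
numbers from this thread or from the deck.

```xml
<!-- Sketch only. io.sort.factor follows Owen's suggestion; the other
     two values are illustrative assumptions, not recommendations. -->
<configuration>
  <property>
    <name>io.sort.mb</name>
    <!-- assumed value; "large enough" depends on map output size -->
    <value>200</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <!-- "closer to 100 instead of 10" -->
    <value>100</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <!-- assumed value: 128 MB in bytes; larger blocks mean fewer maps -->
    <value>134217728</value>
  </property>
</configuration>
```

Note that the io.sort.mb buffer comes out of the task's heap, so it has
to fit within whatever mapred.child.java.opts allows, and
dfs.block.size only applies to files written after the change.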
