Hi Stephen,

That will set the maximum heap allowable, but it doesn't necessarily tell Hadoop's internal systems to take advantage of it. There are a number of other settings that adjust performance.

At Cloudera we have a config tool that generates Hadoop configurations with reasonable first-approximation values for your cluster -- check out http://my.cloudera.com and look at the hadoop-site.xml it generates. If you start from there you might find a better parameter space to explore.

Please share back your findings -- we'd love to tweak the tool even more with some external feedback :)
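To give a flavour of the kind of knobs involved (the values below are illustrative first guesses for your 16GB / 8-core nodes, not tuned recommendations): io.sort.mb controls how much of that child heap each map task's sort buffer will actually use, and io.sort.factor controls how many streams get merged at once during sorts. Something like:

  <property>
    <name>io.sort.mb</name>
    <value>256</value>   <!-- map-side sort buffer in MB; default is 100 -->
  </property>

  <property>
    <name>io.sort.factor</name>
    <value>64</value>    <!-- number of streams merged at once; default is 10 -->
  </property>

With a 1GB child heap and 8 map slots per node you have headroom to raise io.sort.mb well past the default, which should pull more of that idle 16GB into actual use rather than OS cache.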
- Aaron

On Wed, Jun 10, 2009 at 7:39 AM, stephen mulcahy <[email protected]> wrote:

> Hi,
>
> I'm currently doing some testing of different configurations using the
> Hadoop Sort as follows,
>
> bin/hadoop jar hadoop-*-examples.jar randomwriter
> -Dtest.randomwrite.total_bytes=107374182400 /benchmark100
>
> bin/hadoop jar hadoop-*-examples.jar sort /benchmark100 rand-sort
>
> The only changes I've made from the standard config are the following in
> conf/mapred-site.xml
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx1024M</value>
> </property>
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>8</value>
> </property>
>
> <property>
>   <name>mapred.tasktracker.reduce.tasks.maximum</name>
>   <value>4</value>
> </property>
>
> I'm running this on 4 systems, each with 8 processor cores and 4 separate
> disks.
>
> Is there anything else I should change to stress memory more? The systems
> in question have 16GB of memory, but the most that's getting used during a
> run of this benchmark is about 2GB (and most of that seems to be OS
> caching).
>
> Thanks,
>
> -stephen
>
> --
> Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
> NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
> http://di2.deri.ie http://webstar.deri.ie http://sindice.com
