Hi. By default, all Hadoop components are started with -Xmx1000M. I am planning to add some data/task nodes here and there in my architecture. However, most machines have only 4 GB of physical RAM, so allocating 2 GB plus overhead (~2.5 GB) to Hadoop is a little risky: the machines could very well become inaccessible if Hadoop has to compete with other processes for RAM. I have experienced this many times with Java processes going haywire on boxes where I run other services in parallel. Anyway, I would like to understand the reasoning behind allocating 1 GB per process. I figure the DataNode could survive with a little less, as could the TaskTracker if the jobs running in it do not consume much memory. Of course each process would like even more than 1 GB, but if I need to cut down I would like to know which to cut and what I lose by doing so.
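In case it helps frame the question, this is roughly what I had in mind in conf/hadoop-env.sh. The specific -Xmx values below are just guesses on my part, not tested recommendations:

```shell
# conf/hadoop-env.sh
# HADOOP_HEAPSIZE sets the default heap (in MB) for all daemons; 1000 is the shipped default.
export HADOOP_HEAPSIZE=1000

# Per-daemon overrides are appended after the default -Xmx, so the last -Xmx wins.
# These values are my guesses at what a 4 GB box might tolerate, not measured numbers.
export HADOOP_DATANODE_OPTS="-Xmx512m $HADOOP_DATANODE_OPTS"
export HADOOP_TASKTRACKER_OPTS="-Xmx512m $HADOOP_TASKTRACKER_OPTS"
```

Note that the child JVMs that the TaskTracker forks for map/reduce tasks are sized separately via mapred.child.java.opts in the job configuration, so those would come on top of the daemon heaps above.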
Any thoughts? Trial and error is of course an option, but I would like to hear the basic reasoning about how memory should be allocated to get the most out of the boxes.

Kindly
//Marcus

--
Marcus Herou
CTO and co-founder, Tailsweep AB
+46702561312
[email protected]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
