Are the files you mentioned the files produced by jobs run on our system? They shouldn't be that large.

Why would the namenode be the cause? What is the hiveserver doing when it uses so much memory? How do you use Hive yourselves, and is our way of using the hiveserver correct? Thanks.

On 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:

Not sure if this is because of the number of files, since the namenode tracks each file, directory, and block. See this one:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/

Please correct me if I am wrong, because this seems to be more of an HDFS problem, which is actually irrelevant to Hive.

Thanks
Aaron

On 2011/12/11, 王锋 <wfeng1...@163.com> wrote:

I want to know why the hiveserver uses such a large amount of memory, and where the memory is being used.

On 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:

The namenode summary, the MR summary, and the hiveserver summary are in the attached screenshots (namenode.png, mr.png, hiveserver.png).

hiveserver JVM args:

export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

We are now running 3 hiveservers on the same machine.

On 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:

What does the data look like, and what is the size of the cluster?

On 2011/12/11, 王锋 <wfeng1...@163.com> wrote:

Hi,

I'm one of the engineers at sina.com. We have been using Hive and hiveserver for several months. We have our own task scheduling system, which schedules tasks to run against hiveserver over JDBC.

But the hiveserver uses a very large amount of memory, usually more than 10 GB. We have 5-minute tasks which run every 5 minutes, and we also have hourly tasks; the total number of tasks is 40. We start 3 hiveservers on one Linux server and connect to them in a round-robin cycle.

So why is the hiveserver's memory usage so large, and what can we do about it? Any suggestions?

Thanks and Best Regards!
Royce Wang
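For a rough sense of the namenode numbers discussed above: the Cloudera article Aaron linked estimates that the namenode holds every file, directory, and block in memory at roughly 150 bytes per object. Under that rule of thumb, a back-of-the-envelope estimate for 10 million single-block files is:

    10,000,000 files + 10,000,000 blocks = 20,000,000 namenode objects
    20,000,000 objects x ~150 bytes      = ~3 GB of namenode heap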
<<inline: namenode.png>>
<<inline: mr.png>>
<<inline: hiveserver.png>>
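For reference, the JDBC usage described in the original mail would look roughly like the minimal sketch below. It assumes the HiveServer1 JDBC driver (org.apache.hadoop.hive.jdbc.HiveDriver) on the default port 10000; the host, port, and table name are placeholder assumptions, not details from the thread.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcClient {
        public static void main(String[] args) throws Exception {
            // Load the HiveServer1 JDBC driver (HiveServer2 later uses
            // org.apache.hive.jdbc.HiveDriver and a jdbc:hive2:// URL).
            Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");

            // Port 10000 is the HiveServer default; with 3 hiveservers on
            // one machine, each would listen on its own port (assumption).
            Connection con = DriverManager.getConnection(
                "jdbc:hive://localhost:10000/default", "", "");

            Statement stmt = con.createStatement();
            // some_table is a hypothetical table name.
            ResultSet rs = stmt.executeQuery("SELECT COUNT(1) FROM some_table");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
            con.close();
        }
    }

A scheduler like the one described would open a connection to one of the hiveservers in turn, submit the query, and close the connection; each open connection holds session state in the hiveserver process, which is one place the heap can go.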