The hive log:
Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121840_767713480.txt
8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] 9102425K->7176256K(9867648K), 0.0765670 secs] [Times: user=0.36 sys=0.00, real=0.08 secs]
Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121841_451939518.txt
8219.455: [GC [PSYoungGen: 1823477K->608K(2106752K)] 8999046K->7176707K(9786752K), 0.0719450 secs] [Times: user=0.66 sys=0.01, real=0.07 secs]
Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121842_1930999319.txt

Now we have 3 hiveservers, and I set the number of concurrent jobs to 4, but the memory use is still very large. This is driving me mad. Does anyone have other suggestions?

At 2011-12-12 17:59:52, "alo alt" <wget.n...@googlemail.com> wrote:
>When you start a high-load Hive query, can you watch the stack traces?
>It's possible over the web interface:
>http://jobtracker:50030/stacks
>
>- Alex
>
>
>2011/12/12 王锋 <wfeng1...@163.com>
>>
>> The hiveserver will throw an OOM after several hours.
>>
>>
>> At 2011-12-12 17:39:21, "alo alt" <wget.n...@googlemail.com> wrote:
>>
>> What happens when you set xmx=2048m or similar? Did that have any negative
>> effects for running queries?
>>
>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>
>>> I have modified the Hive JVM args.
>>> The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m.
>>>
>>> But the memory used by hiveserver is still large.
>>>
>>>
>>> At 2011-12-12 16:20:54, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>
>>> Not from the running jobs. What I am saying is that the heap size of Hadoop
>>> really depends on the number of files and directories on HDFS. Removing old
>>> files periodically or merging small files would bring some performance
>>> boost.
>>>
>>> On the Hive end, the memory consumed also depends on the queries that are
>>> executed. Monitor the reducers of the Hadoop job; in my experience the
>>> reduce part could be the bottleneck here.
>>>
>>> It's totally okay to host multiple Hive servers on one machine.
>>>
>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>
>>>> Are the files you mentioned the files from jobs our system has run? They
>>>> can't be that large.
>>>>
>>>> Why would the namenode be the cause? What is hiveserver doing when it uses
>>>> so much memory?
>>>>
>>>> How do you use Hive? Is our way of using hiveserver correct?
>>>>
>>>> Thanks.
>>>>
>>>> At 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>
>>>> Not sure if this is because of the number of files, since the namenode
>>>> tracks each file, directory, and block.
>>>> See this one: http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>
>>>> Please correct me if I am wrong, because this seems to be more like an HDFS
>>>> problem which is actually irrelevant to Hive.
>>>>
>>>> Thanks
>>>> Aaron
>>>>
>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>
>>>>> I want to know why the hiveserver uses so much memory, and where the
>>>>> memory is being used.
>>>>>
>>>>> At 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>>>>
>>>>> The namenode summary:
>>>>>
>>>>> The MR summary:
>>>>>
>>>>> And hiveserver:
>>>>>
>>>>> hiveserver JVM args:
>>>>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>>>>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>>>>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC
>>>>> -XX:-UseGCOverheadLimit -verbose:gc -XX:+PrintGCDetails
>>>>> -XX:+PrintGCTimeStamps"
>>>>>
>>>>> We are now using 3 hiveservers on the same machine.
>>>>>
>>>>> At 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>
>>>>> What does the data look like? And what's the size of the cluster?
>>>>>
>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm one of the engineers at sina.com. We have used Hive and hiveserver
>>>>>> for several months. We have our own task scheduling system, which
>>>>>> schedules tasks to run against hiveserver over JDBC.
>>>>>>
>>>>>> But the hiveserver uses a lot of memory, usually more than 10 GB. We
>>>>>> have 5-minute tasks that run every 5 minutes, and hourly tasks. The
>>>>>> total number of tasks is 40. We start 3 hiveservers on one Linux
>>>>>> server and connect to them in rotation.
>>>>>>
>>>>>> So why does hiveserver use so much memory, and what can we do? Do you
>>>>>> have any suggestions?
>>>>>>
>>>>>> Thanks and Best Regards!
>>>>>>
>>>>>> Royce Wang
>>
>> --
>> Alexander Lorenz
>> http://mapredit.blogspot.com
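
For anyone who wants to follow the memory figures in the hive log at the top of this message, here is a minimal Python sketch that pulls the heap numbers out of the two `PSYoungGen` lines. It assumes the `-XX:+PrintGCDetails` line format exactly as it appears in that log. The names `GC_LINE`, `parse_gc_line`, `summarize`, and `SAMPLE` are made up for illustration and are not part of Hive or the JVM. Subtracting the young-generation occupancy from the whole-heap occupancy gives only a rough estimate of what is held outside the young generation after each collection.

```python
import re

# Pattern for a young-collection line as printed in the log above, e.g.
# "8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] 9102425K->7176256K(9867648K), 0.0765670 secs]"
GC_LINE = re.compile(
    r"(?P<ts>\d+\.\d+): \[GC \[PSYoungGen: "
    r"(?P<young_before>\d+)K->(?P<young_after>\d+)K\((?P<young_cap>\d+)K\)\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K\((?P<heap_cap>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def parse_gc_line(line):
    """Return heap figures (in KB) from one young-GC log line, or None if it is not one."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    young_after = int(m.group("young_after"))
    heap_after = int(m.group("heap_after"))
    return {
        "timestamp_s": float(m.group("ts")),
        "heap_after_kb": heap_after,
        "heap_capacity_kb": int(m.group("heap_cap")),
        # Whole-heap occupancy minus young-generation occupancy after the GC:
        # an estimate of what survives outside the young generation.
        "old_after_kb": heap_after - young_after,
        "pause_s": float(m.group("pause")),
    }


def summarize(lines):
    """Parse every line and keep only the ones that are young-GC records."""
    return [rec for rec in map(parse_gc_line, lines) if rec is not None]


# The two GC lines and one history line copied from the log in this thread.
SAMPLE = [
    "Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121840_767713480.txt",
    "8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] 9102425K->7176256K(9867648K), "
    "0.0765670 secs] [Times: user=0.36 sys=0.00, real=0.08 secs]",
    "8219.455: [GC [PSYoungGen: 1823477K->608K(2106752K)] 8999046K->7176707K(9786752K), "
    "0.0719450 secs] [Times: user=0.66 sys=0.01, real=0.07 secs]",
]

if __name__ == "__main__":
    for rec in summarize(SAMPLE):
        gib = rec["old_after_kb"] / (1024 * 1024)
        print(f"{rec['timestamp_s']:.3f}s: about {gib:.2f} GiB held outside the young generation")
```

In both sample lines the young generation is almost emptied by the collection while the whole-heap figure stays near 7,176,000 KB, so most of the retained memory sits outside the young generation. That would be consistent with long-lived objects accumulating in the server, but two log lines are not enough to say what those objects are.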