Argh, I meant increase! Sorry, typing too fast.

2011/12/12 alo alt <wget.n...@googlemail.com>:
> Did you update your JDK recently? A Java dev told me that could be an
> issue in JDK _26
> (https://forums.oracle.com/forums/thread.jspa?threadID=2309872); some
> devs report a memory decrease when they use GC flags. I'm not quite
> sure, it sounds too far-fetched to me.
>
> The stacks show a lot of waiting, but I see nothing special.
>
> - Alex
>
> 2011/12/12 王锋 <wfeng1...@163.com>:
>>
>> The hive log:
>>
>> Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121840_767713480.txt
>> 8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)]
>> 9102425K->7176256K(9867648K), 0.0765670 secs] [Times: user=0.36 sys=0.00,
>> real=0.08 secs]
>> Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121841_451939518.txt
>> 8219.455: [GC [PSYoungGen: 1823477K->608K(2106752K)]
>> 8999046K->7176707K(9786752K), 0.0719450 secs] [Times: user=0.66 sys=0.01,
>> real=0.07 secs]
>> Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121842_1930999319.txt
>>
>> Now we have 3 hiveservers and I set the concurrent job number to 4, but
>> the memory use is still so large. I'm going mad.
>>
>> Any other suggestions?
>>
>> On 2011-12-12 17:59:52, "alo alt" <wget.n...@googlemail.com> wrote:
>>> When you start a high-load Hive query, can you watch the stack traces?
>>> It's possible over the web interface:
>>> http://jobtracker:50030/stacks
>>>
>>> - Alex
>>>
>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>
>>>> hiveserver will throw an OOM after several hours.
>>>>
>>>> At 2011-12-12 17:39:21, "alo alt" <wget.n...@googlemail.com> wrote:
>>>>
>>>> What happens when you set xmx=2048m or similar? Does that have any
>>>> negative effects on running queries?
>>>>
>>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>>
>>>>> I have modified the hive JVM args.
>>>>> The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m.
>>>>>
>>>>> But the memory used by hiveserver is still large.
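[Editor's note: the GC lines quoted above are standard HotSpot ParallelGC minor-GC log lines. A minimal sketch of reading one, to show what they already tell us about the leak; the `parse_gc_line` helper is hypothetical and the pattern only covers lines shaped like the ones quoted:]

```python
import re

# A ParallelGC minor-GC line looks like:
#   8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)]
#   9102425K->7176256K(9867648K), 0.0765670 secs]
# The first before->after(capacity) triple is the young generation, the
# second is the whole heap. Heap-after minus young-after approximates
# what is still live in the old generation after the collection.
def parse_gc_line(line):
    m = re.search(
        r"\[PSYoungGen: (\d+)K->(\d+)K\((\d+)K\)\] (\d+)K->(\d+)K\((\d+)K\)",
        line)
    (young_before, young_after, young_cap,
     heap_before, heap_after, heap_cap) = map(int, m.groups())
    return {
        "young_after_mb": young_after // 1024,
        "heap_after_mb": heap_after // 1024,
        "old_gen_live_mb": (heap_after - young_after) // 1024,
    }

line = ("8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] "
        "9102425K->7176256K(9867648K), 0.0765670 secs]")
print(parse_gc_line(line))  # old_gen_live_mb is 7007, i.e. ~7 GB survives
```

[The young generation empties almost completely on every collection, but roughly 7 GB stays live in the old generation, so shrinking -Xmx alone would likely just make the OOM arrive sooner.]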
>>>>>
>>>>> At 2011-12-12 16:20:54, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>
>>>>> Not from the running jobs; what I am saying is that the heap size of
>>>>> the Hadoop namenode really depends on the number of files and
>>>>> directories on HDFS. Removing old files periodically or merging small
>>>>> files would bring some performance boost.
>>>>>
>>>>> On the Hive end, the memory consumed also depends on the queries that
>>>>> are executed. Monitor the reducers of the Hadoop job; my experience is
>>>>> that the reduce part could be the bottleneck here.
>>>>>
>>>>> It's totally okay to host multiple Hive servers on one machine.
>>>>>
>>>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>>>
>>>>>> Are the files you mentioned the files from finished jobs of our
>>>>>> system? They can't be that large.
>>>>>>
>>>>>> Why would the namenode be the cause? What is hiveserver doing when
>>>>>> it uses so much memory?
>>>>>>
>>>>>> How do you use Hive? Is our way of using hiveserver correct?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>>
>>>>>> Not sure if this is because of the number of files, since the
>>>>>> namenode tracks each file, directory, and block.
>>>>>> See this one:
>>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>>
>>>>>> Please correct me if I am wrong, because this seems to be more like
>>>>>> an HDFS problem, which is actually irrelevant to Hive.
>>>>>>
>>>>>> Thanks
>>>>>> Aaron
>>>>>>
>>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>>
>>>>>>> I want to know why hiveserver uses so much memory, and where the
>>>>>>> memory has been used.
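[Editor's note: the small-files post linked above estimates that each file, directory, and block is an in-memory namenode object of very roughly 150 bytes. A back-of-the-envelope sketch of that estimate; the 150-byte constant is the blog's rough figure, not a measured value:]

```python
# Rough namenode heap estimate per the Cloudera small-files post:
# every file, directory, and block costs ~150 bytes of namenode heap.
# Small files hurt because each file needs at least one block object.
BYTES_PER_OBJECT = 150  # rough estimate from the blog post

def namenode_heap_mb(num_files, num_dirs, blocks_per_file=1):
    objects = num_files + num_dirs + num_files * blocks_per_file
    return objects * BYTES_PER_OBJECT / (1024 * 1024)

# 10 million small files spread over 100k directories:
print(round(namenode_heap_mb(10_000_000, 100_000)))  # 2875 (MB)
```

[So tens of millions of small files translate into multiple GB of namenode heap, which is why merging or deleting small files helps, but, as Aaron says, this affects the namenode, not the hiveserver process that is OOMing here.]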
>>>>>>>
>>>>>>> On 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>>>>>>
>>>>>>> The namenode summary:
>>>>>>>
>>>>>>> The MR summary:
>>>>>>>
>>>>>>> And hiveserver:
>>>>>>>
>>>>>>> hiveserver JVM args:
>>>>>>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>>>>>>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>>>>>>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC
>>>>>>> -XX:-UseGCOverheadLimit -verbose:gc -XX:+PrintGCDetails
>>>>>>> -XX:+PrintGCTimeStamps"
>>>>>>>
>>>>>>> Now we are using 3 hiveservers on the same machine.
>>>>>>>
>>>>>>> On 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>>>
>>>>>>> How does the data look? And what's the size of the cluster?
>>>>>>>
>>>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm an engineer at sina.com. We have used Hive and hiveserver for
>>>>>>>> several months. We have our own task scheduling system, which can
>>>>>>>> schedule tasks running against hiveserver over JDBC.
>>>>>>>>
>>>>>>>> But hiveserver uses a lot of memory, usually more than 10 GB.
>>>>>>>> We have 5-minute tasks, which run every 5 minutes, and hourly
>>>>>>>> tasks; the total number of tasks is 40. And we start 3 hiveservers
>>>>>>>> on one Linux server, connected in a round-robin cycle.
>>>>>>>>
>>>>>>>> So why is the memory use of hiveserver so large, and what should
>>>>>>>> we do? Any suggestions from you?
>>>>>>>>
>>>>>>>> Thanks and Best Regards!
>>>>>>>>
>>>>>>>> Royce Wang
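[Editor's note: the `-XX:NewRatio=1` flag quoted above makes the old and young generations the same size, so a 15 GB heap leaves only about half for long-lived objects. A small sketch of that arithmetic, assuming standard HotSpot NewRatio semantics and ignoring survivor-space detail:]

```python
# -XX:NewRatio=N sets old:young = N:1, so the young generation gets
# heap / (N + 1). With -Xmx15000m and NewRatio=1, half of the heap goes
# to the young generation and only half remains for the old generation.
def young_gen_mb(xmx_mb, new_ratio):
    return xmx_mb // (new_ratio + 1)

xmx = 15000
print(young_gen_mb(xmx, 1))        # 7500 MB young generation
print(xmx - young_gen_mb(xmx, 1))  # 7500 MB left for the old generation
```

[Combined with the GC logs earlier in the thread, which show ~7 GB surviving every minor collection, a ~7.5 GB old generation would be nearly full, consistent with an eventual OOM regardless of -Xmx.]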
--
Alexander Lorenz
http://mapredit.blogspot.com

P Think of the environment: please don't print this email unless you
really need to.