You can identify threads with "top -H", then catch the process (pid) and use jstack: jstack PID
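The `top -H` plus jstack step can be sketched as below; the pid and thread id are hypothetical placeholders, and the key detail is that `top -H` reports decimal thread ids while jstack prints native ids ("nid=") in hex:

```shell
# Sketch with hypothetical ids: `top -H -p $PID` lists per-thread CPU usage;
# convert the busy decimal TID to hex before searching the jstack dump,
# because jstack reports native thread ids as "nid=0x...".
PID=15511    # hypothetical hiveserver pid
TID=15520    # hypothetical busy thread id taken from `top -H -p $PID`
HEX_TID=$(printf 'nid=0x%x' "$TID")
echo "$HEX_TID"    # -> nid=0x3ca0
# jstack "$PID" | grep -B 2 -A 15 "$HEX_TID"
```

The last line is left commented out since it needs a live JVM with that pid.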
I don't think it's possible to filter for a single task (please correct me if I'm wrong). For that you'd need a long-running task.

- Alex

2011/12/12 王锋 <wfeng1...@163.com>:
>
> how about watching one hive job's stacks? Can it be watched by jobId?
>
> using ps -Lf hiveserverPId | wc -l ,
> the thread num of one hiveserver is 132 threads:
> [root@d048049 logs]# ps -Lf 15511 | wc -l
> 132
> [root@d048049 logs]#
>
> if every stack size is 10m, the mem will be 1320M, about 1g.
>
> so hive's lowest mem is 1g?
>
> On 2011-12-12 17:59:52, "alo alt" <wget.n...@googlemail.com> wrote:
>> When you start a high-load hive query, can you watch the stack-traces?
>> It's possible over the webinterface:
>> http://jobtracker:50030/stacks
>>
>> - Alex
>>
>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>
>>> hiveserver will throw an OOM after several hours.
>>>
>>> At 2011-12-12 17:39:21, "alo alt" <wget.n...@googlemail.com> wrote:
>>>
>>> what happens when you set xmx=2048m or similar? Did that have any
>>> negative effects for running queries?
>>>
>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>
>>>> I have modified the hive jvm args.
>>>> The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m .
>>>>
>>>> but the memory used by hiveserver is still large.
>>>>
>>>> At 2011-12-12 16:20:54, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>
>>>> Not from the running jobs; what I am saying is that the heap size of
>>>> the Hadoop namenode really depends on the number of files and
>>>> directories on the HDFS. Removing old files periodically or merging
>>>> small files would bring in some performance boost.
>>>>
>>>> On the Hive end, the memory consumed also depends on the queries that
>>>> are executed. Monitor the reducers of the Hadoop job; my experience is
>>>> that the reduce part could be the bottleneck here.
>>>>
>>>> It's totally okay to host multiple Hive servers on one machine.
>>>>
>>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>>
>>>>> are the files you mentioned the files from jobs our system has run?
>>>>> they can't be that large.
>>>>> why is the namenode the cause? what is hiveserver doing when it uses
>>>>> such large memory?
>>>>>
>>>>> how do you use hive? is our method of using hiveserver correct?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>
>>>>> Not sure if this is because of the number of files, since the namenode
>>>>> would track each of the files, directories, and blocks.
>>>>> See this one:
>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>
>>>>> Please correct me if I am wrong, because this seems to be more like an
>>>>> hdfs problem which is actually irrelevant to Hive.
>>>>>
>>>>> Thanks
>>>>> Aaron
>>>>>
>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>
>>>>>> I want to know why the hiveserver uses such large memory, and where
>>>>>> the memory has been used.
>>>>>>
>>>>>> On 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>>>>>
>>>>>> The namenode summary:
>>>>>> [inline image not preserved]
>>>>>>
>>>>>> the mr summary:
>>>>>> [inline image not preserved]
>>>>>>
>>>>>> and hiveserver:
>>>>>> [inline image not preserved]
>>>>>>
>>>>>> hiveserver jvm args:
>>>>>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>>>>>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>>>>>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit
>>>>>> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
>>>>>>
>>>>>> now we are using 3 hiveservers on the same machine.
>>>>>>
>>>>>> On 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>>
>>>>>> what does the data look like? and what's the size of the cluster?
>>>>>>
>>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm an engineer at sina.com. We have used hive and hiveserver for
>>>>>>> several months. We have our own task schedule system. The system can
>>>>>>> schedule tasks to run against hiveserver over jdbc.
>>>>>>>
>>>>>>> But the hiveserver uses very large mem, usually larger than 10g.
>>>>>>> we have 5min tasks which run every 5 minutes, and hourly tasks;
>>>>>>> the total num of tasks is 40. And we start 3 hiveservers on one
>>>>>>> linux server, connected in a cycle.
>>>>>>>
>>>>>>> so why is the memory use of hiveserver so large, and what should we
>>>>>>> do? do you have any suggestions?
>>>>>>>
>>>>>>> Thanks and Best Regards!
>>>>>>>
>>>>>>> Royce Wang

--
Alexander Lorenz
http://mapredit.blogspot.com

Think of the environment: please don't print this email unless you really need to.
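The thread-count and stack-size numbers quoted in this thread can be sanity-checked with a short sketch; note that `ps -Lf PID | wc -l` also counts the ps header line, and the `-Xss` value at the end is purely illustrative, not a recommendation made in the thread:

```shell
# Rough worst-case estimate of thread-stack memory from the quoted numbers.
# `ps -Lf 15511 | wc -l` printed 132, but one of those lines is the ps
# header, so the actual thread count is 131.
REPORTED=132
THREADS=$((REPORTED - 1))
STACK_MB=10    # the "10m" per-stack size assumed in the thread
echo "~$((THREADS * STACK_MB)) MB of stack space"    # -> ~1310 MB of stack space

# Illustrative only: an explicit -Xss caps each thread stack (here 512k),
# so ~130 threads would need roughly 65 MB of stack instead of ~1.3 GB.
export HADOOP_OPTS="$HADOOP_OPTS -Xss512k"
```

Whether a value as small as 512k is safe depends on the deepest call chains in the server; too small a stack shows up as StackOverflowError rather than OOM.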