You can identify threads with "top -H", then pick one process (PID)
and use jstack:
jstack PID

I don't think it's possible to filter for a single task (please
correct me if I'm wrong). For this you need a long-running task.
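To connect the two tools: `top -H` prints thread IDs in decimal, while jstack labels each thread with a hex `nid=0x...`, so you have to convert before grepping. A minimal sketch, with made-up PID/TID values for illustration:

```shell
# Thread id taken from `top -H` output is decimal; jstack reports
# thread ids as hex "nid=0x..." values, so convert before matching.
TID=15511
NID=$(printf '0x%x' "$TID")
echo "$NID"   # -> 0x3c97; grep for this in the jstack output

# On a live system (requires a running JVM with that PID):
# jstack "$PID" | grep -A 20 "nid=$NID"
```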

- Alex

2011/12/12 王锋 <wfeng1...@163.com>:
>
> How about watching one Hive job's stacks? Can it be watched by job ID?
>
> Using  ps -Lf hiveserverPID | wc -l  ,
> one hiveserver has 132 threads:
> [root@d048049 logs]# ps -Lf 15511|wc -l
> 132
> [root@d048049 logs]#
>
> If every thread stack is 10 MB, the memory will be 1320 MB, over 1 GB.
>
> So is hiveserver's minimum memory about 1 GB?
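Two caveats about that arithmetic: `ps -Lf PID | wc -l` counts the header line too, so 132 lines means 131 threads, and the HotSpot default per-thread stack (`-Xss`) is typically 512 KB to 1 MB, not 10 MB (thread stacks also live outside the `-Xmx` heap). A sketch of the estimate under a 1 MB assumption:

```shell
# `ps -Lf PID | wc -l` output includes one header line.
LINES=132
THREADS=$((LINES - 1))

# Assume roughly 1 MB per thread stack (a common -Xss default),
# not 10 MB; adjust STACK_MB to the actual -Xss setting.
STACK_MB=1
echo "$((THREADS * STACK_MB)) MB of thread stacks"   # -> 131 MB of thread stacks
```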
>
> On 2011-12-12 17:59:52, "alo alt" <wget.n...@googlemail.com> wrote:
>>When you start a high-load Hive query, can you watch the stack traces?
>>It's possible via the web interface:
>>http://jobtracker:50030/stacks
>>
>>- Alex
>>
>>
>>2011/12/12 王锋 <wfeng1...@163.com>
>>>
>>> The hiveserver will throw an OOM after several hours.
>>>
>>>
>>> At 2011-12-12 17:39:21,"alo alt" <wget.n...@googlemail.com> wrote:
>>>
>>> What happens when you set -Xmx2048m or similar? Did that have any
>>> negative effects on running queries?
>>>
>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>
>>>> I have modified the Hive JVM args.
>>>> The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m.
>>>>
>>>> But the memory used by hiveserver is still large.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2011-12-12 16:20:54,"Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>
>>>> Not from the running jobs; what I am saying is that the heap size of the
>>>> Hadoop namenode really depends on the number of files and directories on
>>>> HDFS. Removing old files periodically or merging small files would bring
>>>> some performance boost.
>>>>
>>>> On the Hive end, the memory consumed also depends on the queries that are
>>>> executed. Monitor the reducers of the Hadoop job; in my experience the
>>>> reduce phase can be the bottleneck here.
>>>>
>>>> It's totally okay to host multiple Hive servers on one machine.
>>>>
>>>> 2011/12/12 王锋 <wfeng1...@163.com>
>>>>>
>>>>> Are the files you mentioned the files from jobs run by our system? They
>>>>> can't be that large.
>>>>>
>>>>> Why is the namenode the cause? What is hiveserver doing when it uses so
>>>>> much memory?
>>>>>
>>>>> How do you use Hive? Is our way of using hiveserver correct?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>
>>>>> Not sure if this is because of the number of files, since the namenode
>>>>> would track each file, directory, and block.
>>>>> See this one:
>>>>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>
>>>>> Please correct me if I am wrong, because this seems to be more of an
>>>>> HDFS problem, which is actually irrelevant to Hive.
>>>>>
>>>>> Thanks
>>>>> Aaron
>>>>>
>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>
>>>>>>
>>>>>> I want to know why hiveserver uses so much memory, and where the
>>>>>> memory is being used.
>>>>>>
>>>>>> On 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>>>>>
>>>>>>
>>>>>> The namenode summary:
>>>>>>
>>>>>>
>>>>>>
>>>>>> The MR summary:
>>>>>>
>>>>>>
>>>>>> and hiveserver:
>>>>>>
>>>>>>
>>>>>> hiveserver JVM args:
>>>>>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>>>>>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>>>>>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit
>>>>>> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
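For comparison, a hypothetical variant of an export like the one above that also bounds the heap with -Xmx and caps per-thread stacks with -Xss; the sizes here are illustrative assumptions, not recommendations from this thread:

```shell
# Illustrative only: bound the heap and shrink per-thread stacks.
export HADOOP_OPTS="$HADOOP_OPTS -Xms2g -Xmx4g -Xss512k \
  -XX:+UseParallelGC -XX:+UseParallelOldGC \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
```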
>>>>>>
>>>>>> Now we are running 3 hiveservers on the same machine.
>>>>>>
>>>>>>
>>>>>> On 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>>>>
>>>>>> What does the data look like? And what's the size of the cluster?
>>>>>>
>>>>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>     I'm an engineer at sina.com. We have used Hive and hiveserver for
>>>>>>> several months. We have our own task scheduling system, which can
>>>>>>> schedule tasks to run against hiveserver via JDBC.
>>>>>>>
>>>>>>>     But hiveserver uses a lot of memory, usually more than 10 GB. We
>>>>>>> have 5-minute tasks which run every 5 minutes, and hourly tasks; the
>>>>>>> total number of tasks is 40. And we start 3 hiveservers on one Linux
>>>>>>> server, connected in a cycle.
>>>>>>>
>>>>>>>     So why is hiveserver's memory usage so large, and what should we
>>>>>>> do? Any suggestions?
>>>>>>>
>>>>>>> Thanks and Best Regards!
>>>>>>>
>>>>>>> Royce Wang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Alexander Lorenz
>>> http://mapredit.blogspot.com
>>>
>>> P Think of the environment: please don't print this email unless you really 
>>> need to.
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>



