What happens when you set -Xmx2048m or similar? Does that have any negative
effects on running queries?
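
A lower heap cap of this kind could be set the same way as the options quoted below. This is a minimal sketch, assuming hiveserver picks up its JVM options from HADOOP_OPTS as in that configuration; the -Xms512m value is an illustrative assumption, not a tested setting:

```shell
# Sketch only: cap the hiveserver heap at 2 GB instead of 15 GB.
# -Xms (initial heap) is an assumed value; -Xmx (maximum heap) is the
# figure asked about in this thread. GC logging flags are kept so the
# effect on running queries can be observed.
export HADOOP_OPTS="$HADOOP_OPTS -Xms512m -Xmx2048m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
echo "$HADOOP_OPTS"
```

Since -Xms sets the initial heap and -Xmx the maximum, a large -Xms on its own makes the JVM start with a heap of that size regardless of what the queries need.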

2011/12/12 王锋 <wfeng1...@163.com>

> I have modified the Hive JVM args.
> The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m.
>
> But the memory used by hiveserver is still large.
>
>
>
>
>
> At 2011-12-12 16:20:54,"Aaron Sun" <aaron.su...@gmail.com> wrote:
>
> Not from the running jobs. What I am saying is that the heap size of the
> Hadoop namenode really depends on the number of files and directories on
> HDFS. Removing old files periodically or merging small files should bring
> some performance boost.
>
> On the Hive end, the memory consumed also depends on the queries that are
> executed. Monitor the reducers of the Hadoop job; in my experience, the
> reduce phase could be the bottleneck here.
>
> It's totally okay to host multiple Hive servers on one machine.
>
> 2011/12/12 王锋 <wfeng1...@163.com>
>
>> Are the files you mentioned the files from jobs run on our system? They
>> can't be that large.
>>
>> Why would the namenode be the cause? What is hiveserver doing when it uses
>> so much memory?
>>
>> How do you use Hive? Is our way of using hiveserver correct?
>>
>> Thanks.
>>
>> At 2011-12-12 14:27:09, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>
>> Not sure if this is because of the number of files, since the namenode
>> tracks every file, directory, and block.
>> See this one.
>> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>
>> Please correct me if I am wrong, because this seems to be more of an
>> HDFS problem, which is actually unrelated to Hive.
>>
>> Thanks
>> Aaron
>>
>> 2011/12/11 王锋 <wfeng1...@163.com>
>>
>>>
>>> I want to know why the hiveserver uses so much memory, and where the
>>> memory is being used.
>>>
>>> At 2011-12-12 10:02:44, "王锋" <wfeng1...@163.com> wrote:
>>>
>>>
>>> The namenode summary:
>>>
>>>
>>>
>>> the mr summary
>>>
>>>
>>> and hiveserver:
>>>
>>>
>>> hiveserver jvm args:
>>> export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m
>>> -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC
>>> -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit
>>> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
>>>
>>> We are now running 3 hiveservers on the same machine.
>>>
>>>
>>> At 2011-12-12 09:54:29, "Aaron Sun" <aaron.su...@gmail.com> wrote:
>>>
>>> What does the data look like, and what is the size of the cluster?
>>>
>>> 2011/12/11 王锋 <wfeng1...@163.com>
>>>
>>>> Hi,
>>>>
>>>>     I'm an engineer at sina.com. We have used Hive and hiveserver for
>>>> several months. We have our own task scheduling system, which schedules
>>>> tasks to run against hiveserver over JDBC.
>>>>
>>>>     But hiveserver uses a lot of memory, usually more than 10 GB. We
>>>> have 5-minute tasks that run every 5 minutes, as well as hourly
>>>> tasks; the total number of tasks is 40. We start 3 hiveservers on one
>>>> Linux server and connect to them in rotation.
>>>>
>>>>     So why does hiveserver use so much memory, and what should we do? Do
>>>> you have any suggestions?
>>>>
>>>> Thanks and Best Regards!
>>>>
>>>> Royce Wang
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
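
Aaron's point that namenode heap grows with the number of files, directories, and blocks can be made concrete with a rough estimate. This is a sketch under one stated assumption: about 150 bytes of namenode heap per object, the rule of thumb given in the Cloudera small-files post linked in the thread. The function name and the sample counts are hypothetical, not figures from this cluster:

```shell
# Rough namenode heap estimate from HDFS object counts.
# Assumption: ~150 bytes of heap per file, directory, or block
# (rule of thumb only; real usage varies).
estimate_nn_heap_mb() {
    files=$1
    dirs=$2
    blocks=$3
    objects=$((files + dirs + blocks))
    # bytes -> MB using integer arithmetic
    echo $((objects * 150 / 1024 / 1024))
}

# Hypothetical counts: 10 million single-block files in 100,000 directories.
estimate_nn_heap_mb 10000000 100000 10000000
```

The estimate covers the namenode only; it says nothing about hiveserver's own heap, which is the open question in this thread.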


-- 
Alexander Lorenz
http://mapredit.blogspot.com


<<hiveserver1.png>>

<<mr.png>>

<<namenode.png>>

<<hiveserver.png>>
