Re: Re:Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Thanks for the line, looks like a JRE issue. 2011/12/13 王锋 : > I got the question of hive large memory. > > before the jvm args: > export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms2000m > -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC > -XX:ParallelGCThreads=20 -XX:+UseParal

Re:Re:Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
I got the question of hive large memory. before the jvm args: export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms2000m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit -XX:MaxTenuringThreshold=8 -XX:Perm

Re:Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Yes, we are using JDK 1.6.0_26. [hdfs@d048049 conf]$ java -version java version "1.6.0_26" Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) I will read the document at that URL, thanks very much! On 2011-12-12 19:08:37, "alo alt" wrote:

Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Argh, increase! Sorry, typing too fast. 2011/12/12 alo alt : > Did you update your JDK recently? A Java dev told me there could be > an issue in JDK _26 > (https://forums.oracle.com/forums/thread.jspa?threadID=2309872); some > devs report a memory decrease when they use GC flags. I'm not quite >

Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Did you update your JDK recently? A Java dev told me there could be an issue in JDK _26 (https://forums.oracle.com/forums/thread.jspa?threadID=2309872); some devs report a memory decrease when they use GC flags. I'm not quite sure, it sounds far-fetched to me. The stacks show a lot of waits, bu

Re:Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
The hive log: Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121840_767713480.txt 8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] 9102425K->7176256K(9867648K), 0.0765670 secs] [Times: user=0.36 sys=0.00, real=0.08 secs] Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121841_45
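A small sketch for reading GC lines like the one above. In this ParallelGC log format, the figures after the PSYoungGen bracket are whole-heap occupancy before and after the collection, with the committed heap size in parentheses. The helper name heap_after_kb is made up for illustration, and it assumes exactly the -verbose:gc line layout quoted in this message.

```shell
# Hypothetical helper: read GC log lines on stdin and print the whole-heap
# occupancy (in KB) after each collection. Lines that do not match the
# ParallelGC layout (for example "Hive history file=...") are skipped.
heap_after_kb() {
  sed -n 's/.*\] \([0-9]*\)K->\([0-9]*\)K(\([0-9]*\)K),.*/\2/p'
}

line='8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] 9102425K->7176256K(9867648K), 0.0765670 secs] [Times: user=0.36 sys=0.00, real=0.08 secs]'
printf '%s\n' "$line" | heap_after_kb
```

For the quoted line this prints 7176256, so roughly 6.8 GB remains occupied after a young collection that empties almost all of PSYoungGen. That suggests most of the memory sits in the old generation.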

Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
You can identify threads with "top -H", then pick one process (PID) and use jstack: jstack PID. I don't think it's possible to filter for a single task (if I'm wrong, please correct me). For this you need a long-running task. - Alex 2011/12/12 王锋 : > > How about watching one Hive job's stacks? Can it be watche
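The top -H plus jstack approach can be sketched as follows. top -H shows thread IDs in decimal, while jstack labels each thread with nid= in hexadecimal, so the ID has to be converted before searching the dump. The function names tid_to_nid and thread_stack are invented for this sketch, and thread_stack assumes jstack is on the PATH and is run as the user that owns the JVM.

```shell
# Convert a decimal thread ID (as shown by "top -H") to the hexadecimal
# "nid=0x..." label that jstack prints for each thread.
tid_to_nid() {
  printf 'nid=0x%x\n' "$1"
}

# Hypothetical wrapper: dump the JVM with PID $1 and show the stack of the
# thread with decimal ID $2 (20 lines of context is an arbitrary choice).
thread_stack() {
  jstack "$1" | grep -A 20 "$(tid_to_nid "$2")"
}

tid_to_nid 15511   # prints nid=0x3c97
```

A call would look like thread_stack PID TID, with the PID of the hiveserver JVM and the ID of a busy thread taken from top -H.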

Re:Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
How about watching one Hive job's stacks? Can it be watched by job ID? Using ps -Lf hiveserverPID | wc -l, one hiveserver has 132 threads. [root@d048049 logs]# ps -Lf 15511|wc -l 132 [root@d048049 logs]# If every stack is 10 MB, that is 1320 MB, more than 1 GB. So Hive's lowest mem
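The estimate in this message can be reproduced with shell arithmetic. Two caveats that are assumptions here, not statements from the thread: ps prints a header line, so 132 lines from wc -l correspond to 131 threads, and the 10 MB per-thread stack is the poster's figure (the JVM's -Xss option controls it), which is reserved address space rather than necessarily resident memory. The helper name stack_reserve_mb is hypothetical.

```shell
# stack_reserve_mb THREADS STACK_MB: upper bound on the memory reserved for
# thread stacks, in MB. Both inputs come from the message above.
stack_reserve_mb() {
  echo $(( $1 * $2 ))
}

stack_reserve_mb 132 10   # prints 1320, the figure quoted in the message
stack_reserve_mb 131 10   # prints 1310, after dropping the ps header line
```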

Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
When you start a high-load Hive query, can you watch the stack traces? It's possible over the web interface: http://jobtracker:50030/stacks - Alex 2011/12/12 王锋 > > hiveserver will throw OOM after several hours. > > > At 2011-12-12 17:39:21,"alo alt" wrote: > > What happens when you set xmx=2048m

Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
What happens when you set xmx=2048m or similar? Does that have any negative effects on running queries? 2011/12/12 王锋 > I have modified the Hive JVM args. > The new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m . > > But the memory used by hiveserver is still large. > > > > > > At 2011-12-12 16:20:54,

Re:Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Has anyone tested the concurrent performance of hiveserver, and how did it go? Previously I used one hiveserver to run jobs concurrently from our scheduling system; after about 12 hours the hiveserver blocked and subsequent jobs were not executed. No error was thrown. At 2011-12-12 16:25:19,"al

Re:Re:Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Has anyone tested the concurrent performance of hiveserver, and how did it go? Previously I used one hiveserver to run jobs concurrently from our scheduling system; after about 12 hours the hiveserver blocked and subsequent jobs were not executed. No error was thrown. At 2011-12-12 16:32:17,"王锋

Re:Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Yes. Before we run jobs, our files are generated in HDFS by Scribe, which keeps the file size at 128 MB, so small files can be ruled out. I just changed Hive's Xms setting to 2g; Xmx is still 15g. I will watch it for a while again. Thanks. On 2011-12-12 16:20:54, "Aaron S

Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Ah, I see. Take a look at the NN: Hive uses HDFS, and if you have jobs with many small files in a table (log files, for example) and a large cluster, the NN could be a bottleneck. - Alex On Mon, Dec 12, 2011 at 9:20 AM, 王锋 wrote: > Before, I set -Xmx 2g, but hiveserver threw many OOM exceptions. So I

Re:Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Before, I set -Xmx 2g, but hiveserver threw many OOM exceptions. So I reset it and in the end set Xmx=15g, NewRatio=1, because I had watched hiveserver for a long time: it uses a lot of memory when running jobs, usually 8g, 10g, or 15g. So I set Xmx=15g and NewRatio=1, the young generation wil

Re:Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Are the files you mentioned the ones from jobs run by our system? They can't be that large. Why would the namenode be the cause? What is hiveserver doing when it uses so much memory? How do you use Hive? Is our way of using hiveserver correct? Thanks. On 2011-12-12 14:27:09, "Aaron Sun

Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Hi, do I see it right that you run Java with Xmx = 15000M, and that you also set the minimum heap size (Xms) to 15000M? That gives Java no chance to use less than 15 GB of memory, because both min and max say 15000M. I wonder why any Java process would need 15 GB of memory. It could be in large Tomcat or JBoss environments
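To check which heap bounds a HADOOP_OPTS string actually sets, a sketch like the following may help. The helper name heap_flag is hypothetical, and opts is a shortened example modeled on the settings discussed in this thread. Note that HotSpot writes these options without an equals sign (-Xms2000m, -Xmx15000m); when Xms equals Xmx, the heap has a fixed size and cannot shrink.

```shell
# Hypothetical helper: print the value of the last -Xms or -Xmx option in a
# space-separated options string. Usage: heap_flag Xms "$HADOOP_OPTS"
heap_flag() {
  printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^-$1//p" | tail -n 1
}

opts='-XX:NewRatio=1 -Xms2000m -Xmx15000m -XX:+UseParallelGC'
heap_flag Xms "$opts"   # prints 2000m
heap_flag Xmx "$opts"   # prints 15000m
```

If the helper prints nothing for Xmx, the string sets no explicit maximum and the JVM falls back to its default.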

Re:Re:Re: hiveserver usage

2011-12-11 Thread 王锋
I want to know why hiveserver uses so much memory, and where that memory is going. On 2011-12-12 10:02:44, "王锋" wrote: The namenode summary: the MR summary and hiveserver: hiveserver JVM args: export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m -XX:MaxHeapFreeRatio=40

Re:Re: hiveserver usage

2011-12-11 Thread 王锋
The namenode summary: the MR summary and hiveserver: hiveserver JVM args: export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms15000m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit -verbose:gc

Re: hiveserver usage

2011-12-11 Thread Aaron Sun
What does the data look like, and what's the size of the cluster? 2011/12/11 王锋 > Hi, > > I'm an engineer at sina.com. We have used Hive and hiveserver > for several months. We have our own task scheduling system. The system can > schedule tasks to run on hiveserver via JDBC. > > But The hives

hiveserver usage

2011-12-11 Thread 王锋
Hi, I'm an engineer at sina.com. We have used Hive and hiveserver for several months. We have our own task scheduling system, which can schedule tasks to run on hiveserver via JDBC. But hiveserver uses a lot of memory, usually more than 10g. We have 5-minute tasks which will be