Re: Re:Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Thanks for the line, looks like a jre-issue. 2011/12/13 王锋 : > I got the question of hive large memory. > > before the jvm args: > export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms2000m > -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC > -XX:ParallelGCThreads=20 -XX:+UseParal

Re:Re:Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
I got the question of hive large memory. before the jvm args: export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=1 -Xms2000m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParallelGC -XX:ParallelGCThreads=20 -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit -XX:MaxTenuringThreshold=8 -XX:Perm
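
The flag set quoted above is easier to audit written out as it would appear in hadoop-env.sh. A minimal sketch, assuming the values from this thread (they are the poster's settings, not a recommendation); the final command just counts the -XX flags as a sanity check that the string survived the shell:

```shell
# The GC flags quoted in this thread, as they would be set for the
# Hive/Hadoop client JVM (e.g. in hadoop-env.sh). Values are the poster's,
# not a recommendation.
export HADOOP_OPTS="${HADOOP_OPTS:-} -XX:NewRatio=1 -Xms2000m \
  -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 \
  -XX:+UseParallelGC -XX:ParallelGCThreads=20 \
  -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit"

# Sanity check: count the -XX flags that reached the environment.
echo "$HADOOP_OPTS" | tr ' ' '\n' | grep -c '^-XX'
```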

Re: Confusion about IN clause

2011-12-12 Thread Igor Tatarinov
I think the doc refers to an IN subquery, WHERE x IN (SELECT blah FROM ...); the simple WHERE col IN ('x', 'y', 'z') works fine. I imagine none of these work: http://www.dba-oracle.com/sql/t_subquery_not_in_exists.htm igor decide.com On Mon, Dec 12, 2011 at 10:09 PM, rohan monga wrote: > Hi,

Confusion about IN clause

2011-12-12 Thread rohan monga
Hi, I thought that the 'IN' clause was not supported by hive (version 0.7) according to the documentation https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-WHEREClause but a friend of mine showed me that queries like the following select * from table where r
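
Igor's distinction (next message up) can be demonstrated without a Hive installation: the value-list form of IN that does work in Hive 0.7 is just shorthand for chained OR-equality. A sketch with awk standing in for the WHERE clause; the toy data and column layout are invented for illustration:

```shell
# "WHERE col IN ('x','y','z')" is sugar for col='x' OR col='y' OR col='z'.
# awk mimics that filter on a tab-separated toy table (col is field 2).
printf 'a\tx\nb\tq\nc\tz\n' | awk -F'\t' '$2=="x" || $2=="y" || $2=="z"'
```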

Re: Hive Basic question

2011-12-12 Thread Abhishek Pratap Singh
Mapping an existing table requires the Hive EXTERNAL keyword, which is also used in other places to access data stored in *unmanaged* Hive tables, i.e., those that are not under Hive's control: hive> *CREATE EXTERNAL TABLE
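
The statement Abhishek starts to quote is truncated above; a generic sketch of the EXTERNAL form follows (table name, columns, and LOCATION are hypothetical). It is written to a script file so the syntax can be inspected; actually running it requires a live Hive installation:

```shell
# Hedged sketch: CREATE EXTERNAL TABLE attaches Hive metadata to data that
# Hive does not manage (dropping the table leaves the files in place).
# Table name, columns, and LOCATION below are hypothetical.
cat > external_table.hql <<'EOF'
CREATE EXTERNAL TABLE web_logs (host STRING, ts STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/web_logs';
EOF
grep -c 'EXTERNAL TABLE' external_table.hql
```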

Re: Hive Basic question

2011-12-12 Thread Aditya Kumar
no luck.. here is the Error I am getting: hive>     > CREATE TABLE hbase_table_1(key int, event string)     > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'     > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")     > TBLPROPERTIES ("hbase.table.name" = 'USER_DETAILS'
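
The DDL in the error report is cut off; reconstructed below with a standard closing line and, since Aditya's HBase table already exists, changed to CREATE EXTERNAL TABLE, which the HBase integration wiki requires for pre-existing HBase tables (a plausible cause of the error, though the actual error text is truncated above). Written to a file for inspection; executing it needs live Hive and HBase services:

```shell
# Reconstruction of the truncated DDL from the thread, switched to EXTERNAL
# because the HBase table USER_DETAILS already exists. The closing line is
# standard syntax, not visible in the truncated message.
cat > hbase_table.hql <<'EOF'
CREATE EXTERNAL TABLE hbase_table_1 (key INT, event STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = 'USER_DETAILS');
EOF
grep -c 'HBaseStorageHandler' hbase_table.hql
```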

Re: Hive Basic question

2011-12-12 Thread Abhishek Pratap Singh
it will work, as you will provide the Serde properties related to HBase. Just create the hive table, map it to the HBase table, and try to access it. ~Abhishek On Tue, Dec 13, 2011 at 12:41 AM, Aditya Kumar wrote: > > Thanks Abhishek for the response. > I did see that link but that is telling more

Re: Hive Basic question

2011-12-12 Thread Aditya Kumar
Thanks Abhishek for the response. I did see that link, but it is more about creating a HIVE table and then checking in HBase. My case is: I have the data in tables in HBase. I can view them in HBase, but I want to access them from HIVE. Btw, I am new to the Hadoop world. Can you please let me know if I a

Re: Hive Basic question

2011-12-12 Thread Abhishek Pratap Singh
Here you go, check this link https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration Regards, Abhishek On Tue, Dec 13, 2011 at 12:30 AM, Aditya Kumar wrote: > Hi, > I have a Hbase table that I am able to insert data and do scan. > I want to have use HIVE,. > > I was able to install HIV

Hive Basic question

2011-12-12 Thread Aditya Kumar
Hi, I have a Hbase table that I am able to insert data into and scan. I want to use HIVE. I was able to install HIVE and log in to the Hive console. Can you please tell me the way I can access the HBASE table from HIVE. I want to use HQL commands on the HBase table... please let me know how to do tha

Re: Hive Metadata URI error

2011-12-12 Thread Periya.Data
Thanks for all your suggestions. I terminated my instances and re-launched a set of new instances. And, installed hive via apt-get. I do not see any problem now. Earlier, I had installed hive by downloading the tarball. Interestingly, I noticed hive-site.xml file when installed through apt-get (an

Re: Hive Metadata URI error

2011-12-12 Thread Carl Steinbach
Hi Periya, You should only set the hive.metastore.uris property if you are running a standalone MetaStore server, in which case you need to set hive.metastore.local=false and set hive.metastore.uris to a Thrift URI. Please see this document for more details: https://cwiki.apache.org/confluence/di
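
Carl's two settings, sketched as the hive-site.xml fragment they correspond to. The host and port are placeholders (9083 is the conventional metastore Thrift port); the fragment is written to a file so it can be checked mechanically:

```shell
# Sketch of the hive-site.xml entries for a standalone metastore server.
# thrift://metastore-host:9083 is a placeholder endpoint.
cat > metastore-fragment.xml <<'EOF'
<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>
EOF
grep -c '<property>' metastore-fragment.xml
```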

Re: Getting a NULL from a MAP operation

2011-12-12 Thread Mark Grover
Hi Michael, Try returning "\\N" instead. Mark Grover, Business Intelligence Analyst OANDA Corporation www: oanda.com www: fxtrade.com e: mgro...@oanda.com "Best Trading Platform" - World Finance's Forex Awards 2009. "The One to Watch" - Treasury Today's Adam Smith Awards 2009. - Original M
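
The mechanism behind Mark's suggestion: Hive's default SerDe decodes the literal two-character sequence \N as SQL NULL, so a streaming (MAP ... USING) script should print exactly that for a missing field. A shell sketch with awk standing in for the python script; the field layout is invented:

```shell
# Emit \N (backslash, capital N) for an empty field so Hive reads it as
# NULL. awk stands in for the streaming parser; col 2 is hypothetical.
printf 'u1\t200\nu2\t\nu3\t404\n' | awk -F'\t' '{print $1 "\t" ($2=="" ? "\\N" : $2)}'
```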

Getting a NULL from a MAP operation

2011-12-12 Thread mdefoinplatel.ext
Dear all, I am trying to process some web log data, and one of the steps consists of parsing a string to extract the relevant fields. So I am using a FROM ... MAP ... USING ... statement calling a python script. My problem is that the returned fields may not exist in the original string and there

Re:Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Yes, we are using JDK 1.6.0_26: [hdfs@d048049 conf]$ java -version java version "1.6.0_26" Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) I will read the document at that URL, thanks very much! At 2011-12-12 19:08:37, "alo alt" wrote:

Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Argh, increase! Sorry, too-fast typing. 2011/12/12 alo alt : > Did you update your JDK recently? A java dev told me there could be > an issue in JDK _26 > (https://forums.oracle.com/forums/thread.jspa?threadID=2309872), some > devs report a memory decrease when they use GC flags. I'm quite not >

Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Did you update your JDK recently? A java dev told me there could be an issue in JDK _26 (https://forums.oracle.com/forums/thread.jspa?threadID=2309872); some devs report a memory decrease when they use GC flags. I'm not quite sure, it sounds too far-fetched to me. The stacks have a lot of waits, bu

Re:Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
The hive log: Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121840_767713480.txt 8159.581: [GC [PSYoungGen: 1927208K->688K(2187648K)] 9102425K->7176256K(9867648K), 0.0765670 secs] [Times: user=0.36 sys=0.00, real=0.08 secs] Hive history file=/tmp/hdfs/hive_job_log_hdfs_201112121841_45

Re: Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
You can identify threads with "top -H", then catch the one process (pid) and use jstack: jstack PID. I think it's not really possible to filter for a single task (if I'm wrong, please correct me). Here you need a long-running task. - Alex 2011/12/12 王锋 : > > how about watch one hive job's stacks .Can it be watche
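
Alex's recipe, made concrete: "top -H" (or ps -L) shows one row per thread of a process, and jstack then dumps every Java stack, where each entry carries the OS thread id as a hex nid field. That nid is how the hot thread from top is matched to a stack. Sketch run against the current shell so the commands work anywhere; a real hiveserver pid would replace $$, and jstack itself needs a JDK and a Java pid, so it is left as a comment:

```shell
# One row per thread (LWP column) of the target process; $$ is this shell,
# a stand-in for the hiveserver pid.
ps -Lf -p $$

# A busy thread's tid from top -H maps to a jstack entry via its hex nid.
# 15511 is the hiveserver pid quoted later in this thread.
printf 'look for nid=0x%x in the dump\n' 15511
# jstack <hiveserver-pid> > stacks.txt   # needs a JDK; run against hiveserver
```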

Re:Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
How about watching one hive job's stacks? Can it be watched by jobId? Using ps -Lf hiveserverPID | wc -l, one hiveserver has 132 threads. [root@d048049 logs]# ps -Lf 15511|wc -l 132 [root@d048049 logs]# If every stack is 10 MB, the memory will be 1320 MB, about 1.3 GB. So hive's lowest mem
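
The arithmetic in this message can be checked mechanically. Two caveats: ps -Lf prints a header line, so a raw wc -l over-counts by one, and the 10 MB per-thread stack size is the poster's figure (HotSpot's 64-bit default -Xss is 1 MB). A sketch, using the current shell as the target process:

```shell
# ps -Lf output includes one header line; subtract it for the thread count.
THREADS=$(( $(ps -Lf -p $$ | wc -l) - 1 ))
echo "this shell: $THREADS thread(s)"

# Stack memory for 132 threads at the poster's 10 MB per stack:
echo "$(( 132 * 10 )) MB"   # about 1.3 GB
```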

Re: Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
When you start a high-load hive query, can you watch the stack traces? It's possible via the web interface: http://jobtracker:50030/stacks - Alex 2011/12/12 王锋 > > hiveserver will throw oom after several hours . > > > At 2011-12-12 17:39:21,"alo alt" wrote: > > what happen when you set xmx=2048m

Re: Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
What happens when you set xmx=2048m or similar? Did that have any negative effects on running queries? 2011/12/12 王锋 > I have modified hive jvm args. > the new args are -Xmx15000m -XX:NewRatio=1 -Xms2000m . > > but the memory used by hiveserver is still large. > > > > > > At 2011-12-12 16:20:54,

Re:Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Who has tested the concurrent performance of hiveserver, and how did it go? Earlier I used one hiveserver to run concurrent jobs from our schedule system, and after about 12 hours the hiveserver was blocked and subsequent jobs were not executed; no error was thrown. At 2011-12-12 16:25:19,"al

Re:Re:Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Who has tested the concurrent performance of hiveserver, and how did it go? Earlier I used one hiveserver to run concurrent jobs from our schedule system, and after about 12 hours the hiveserver was blocked and subsequent jobs were not executed; no error was thrown. At 2011-12-12 16:32:17,"王锋

Re:Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Yes, before we run jobs, our files are generated in HDFS by scribe, and it can keep the file size at 128 MB, so small files are avoided. I just modified the hive xms config to 2g; the xmx is still 15g. I will watch it for a while again. Thanks. At 2011-12-12 16:20:54, "Aaron S

Re: Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Ah, I see. Take a look at the NN: hive uses hdfs, and if you have jobs with many small files in a table (logfiles, for example) and a large cluster, the NN could be a bottleneck. - Alex On Mon, Dec 12, 2011 at 9:20 AM, 王锋 wrote: > before I set -xmx 2g, but hiveserver throws many exception OOM. so I

Re:Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Before, I set -xmx 2g, but hiveserver threw many OOM exceptions, so I kept raising it, and in the end I set xmx=15g, newRatio=1. I watched hiveserver for a long time: it uses a lot of memory when running jobs, usually 8g, 10g, or 15g. So I set xmx=15g and newRatio=1; the young generation wil
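
The generation sizing implied by these flags, spelled out: -XX:NewRatio=N sets the old:young ratio to N:1, so the young generation gets heap/(N+1). With the poster's -Xmx15000m and NewRatio=1 that is roughly half of a 15 GB heap. A sketch of the arithmetic:

```shell
# Young-generation size under -XX:NewRatio: young = heap / (NewRatio + 1).
# Values are the poster's settings from this thread.
XMX_MB=15000
NEWRATIO=1
YOUNG_MB=$(( XMX_MB / (NEWRATIO + 1) ))
echo "young generation: about ${YOUNG_MB} MB of a ${XMX_MB} MB heap"
```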

Re:Re: Re:Re: hiveserver usage

2011-12-12 Thread 王锋
Are the files you mentioned the files from jobs run by our system? They can't be that large. Why would the namenode be the cause? What is hiveserver doing when it uses so much memory? How do you use hive? Is our method of using hiveserver correct? Thanks. At 2011-12-12 14:27:09, "Aaron Sun

Re: Re:Re: hiveserver usage

2011-12-12 Thread alo alt
Hi, do I see it right that you set java with -xmx=15000M? And you set minimum heap size (xms) = 15000M? Here you give java no chance to use less than 15 GB of memory, because min says 15000M, and max does too. I wonder why any java process would need 15 GB of memory. Could be in large tomcat or jboss environments