Re: Hive too slow?

2011-03-07 Thread abhishek pathak
Thank you all for the tips. I'll dig into all of these and let you know :) From: Igor Tatarinov To: user@hive.apache.org Sent: Tue, 8 March, 2011 11:47:20 AM Subject: Re: Hive too slow? Most likely, Hadoop's memory settings are too high and Linux starts swap

Re: Performance between Hive queries vs. Hive over HBase queries

2011-03-07 Thread Vaibhav Aggarwal
If you are querying for a particular key, you should see better performance, though. We have filter push-down for equality predicates on the HBase key column. On Mar 7, 2011 10:18 PM, "John Sichi" wrote:

Re: Hive too slow?

2011-03-07 Thread Igor Tatarinov
Most likely, Hadoop's memory settings are too high and Linux starts swapping. You should be able to detect that too using vmstat. Just a guess. On Mon, Mar 7, 2011 at 10:11 PM, Ajo Fod wrote: > hmm I don't know of such a place ... but if I had to debug, I'd try to > understand the following: > 1
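
A minimal sketch of the vmstat check suggested above, assuming the usual procps column layout in which `si` and `so` report memory swapped in from and out to disk per second. The function names and the sample text are hypothetical and made up for illustration; they are not output from the poster's cluster.

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs, one per sample row of vmstat output."""
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    # The column-name row is the one that lists both "si" and "so".
    header = next(cols for cols in rows if "si" in cols and "so" in cols)
    si_idx, so_idx = header.index("si"), header.index("so")
    samples = []
    for cols in rows:
        # Sample rows have one numeric value per named column.
        if len(cols) == len(header) and all(c.isdigit() for c in cols):
            samples.append((int(cols[si_idx]), int(cols[so_idx])))
    return samples


def is_swapping(vmstat_text):
    """True if any sample shows pages moving to or from swap."""
    return any(si > 0 or so > 0 for si, so in swap_activity(vmstat_text))


# Hypothetical sample, shaped like the output of `vmstat 5`.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  1 524288  10240   2048  40960  310  450  1200   900  800 1500 20 10 30 40
 1  0 524288  11264   2048  40960    0    0    10    20  300  600  5  2 93  0
"""

if __name__ == "__main__":
    print(swap_activity(SAMPLE))
    print("swapping" if is_swapping(SAMPLE) else "no swap activity")
```

Sustained non-zero `si`/`so` values while a job runs would be consistent with the guess that the configured Hadoop heap sizes exceed the machine's physical memory; a single non-zero sample on its own proves little.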

Re: Performance between Hive queries vs. Hive over HBase queries

2011-03-07 Thread John Sichi
For native tables, Hive reads rows directly from HDFS. For HBase tables, it has to go through the HBase region servers, which reconstruct rows from column families (combining cache + HDFS). HBase makes it possible to keep your table up to date in real time, but you have to pay an overhead cost

Re: Performance between Hive queries vs. Hive over HBase queries

2011-03-07 Thread Biju Kaimal
Hi, Could you please explain the reason for the behavior? Regards, Biju On Tue, Mar 8, 2011 at 11:35 AM, John Sichi wrote: > Yes. > > JVS > > On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote: > > > Hi, > > > > I loaded a data set which has 1 million rows into both Hive and HBase > tables. For the

Re: Hive too slow?

2011-03-07 Thread Ajo Fod
hmm I don't know of such a place ... but if I had to debug, I'd try to understand the following: 1) are the underlying files zipped/compressed ... that usually makes it slower. 2) are the files located on the hard drive or HDFS? 3) are all the cores being used? ... check number of reduce and map t

Re: Hive too slow?

2011-03-07 Thread Vijay
If you go to the jobtracker's web UI, it provides plenty of details about each job. Even with all the default settings of a typical hadoop/hive installation, 4 minutes for 2200 rows is extremely slow. It feels like there is some kind of problem but it is hard to guess what that could be. Digging th

Re: Performance between Hive queries vs. Hive over HBase queries

2011-03-07 Thread John Sichi
Yes. JVS On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote: > Hi, > > I loaded a data set which has 1 million rows into both Hive and HBase tables. > For the HBase table, I created a corresponding Hive table so that the data in > HBase can be queried from Hive QL. Both tables have a key column an

Performance between Hive queries vs. Hive over HBase queries

2011-03-07 Thread Biju Kaimal
Hi, I loaded a data set which has 1 million rows into both Hive and HBase tables. For the HBase table, I created a corresponding Hive table so that the data in HBase can be queried from Hive QL. Both tables have a key column and a value column. For the same query (select value, count(*) from table

Re: Hive too slow?

2011-03-07 Thread abhishek pathak
I suspected as much. My system is a Core2Duo, 1.86 GHz. I understand that map-reduce is not instantaneous; I just wanted to confirm that 2200 rows in 4 minutes is indeed not normal behaviour. Could you point me to some places where I can get some info on how to tune this? Regards, Abhishek ___

Re: hello everybody,i am fresher,i meet a problem,please help.

2011-03-07 Thread 徐厚道
Sorry I did not reply immediately. I have confirmed that commons-lang-2.4.jar is in installation/lib. My installation is Hive 0.6.0 and Hadoop 0.20.2 with Nutch 1.1. I looked at the bin/hive script and echoed CLASSPATH and HADOOP_CLASSPATH; they both contain commons-lang-2.4.jar. but throw the

Loading data into a Clustered/bucketed table

2011-03-07 Thread Jay Ramadorai
I am Sqooping data from an external source into a bucketed Hive table. Sqoop seems completely bucket-unaware; it simply uses LOAD DATA INPATH, which moves the single file containing the Sqooped data into the Hive warehouse location. My question: - is there any way to get data into an empty clustered/buck

Re: Pointers for talking directly to Hive server to execute queries

2011-03-07 Thread Ryan LeCompte
Never mind, looks like this has already been done using the Thrift APIs! https://github.com/forward/rbhive On Mon, Mar 7, 2011 at 1:24 PM, Ryan LeCompte wrote: > Hey guys, > > I'm thinking about writing a native Ruby client that can be used to connect > to a running Hive server and issue queri

Pointers for talking directly to Hive server to execute queries

2011-03-07 Thread Ryan LeCompte
Hey guys, I'm thinking about writing a native Ruby client that can be used to connect to a running Hive server and issue queries and get back results. I know that there's a native JDBC API. Could anyone please point me to any docs or source code that would cover the connection/protocol details? Wo

Re: Hive too slow?

2011-03-07 Thread Ajo Fod
In my experience, Hive is not instantaneous like other DBs, but 4 minutes to count 2200 rows seems unreasonable. For comparison, my query of 169k rows on one computer with 4 cores running at 1 GHz (approx.) took 20 seconds. Cheers, Ajo. On Mon, Mar 7, 2011 at 1:19 AM, abhishek pathak < forever_yours_
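
The comparison above can be made concrete with a little arithmetic. This is only a rough sketch: it treats "169k" as 169,000 rows (an assumption) and ignores the fixed MapReduce job start-up cost, which for tables this small probably accounts for most of the elapsed time.

```python
def rows_per_second(rows, seconds):
    """Crude throughput: rows processed divided by wall-clock seconds."""
    return rows / seconds


# Figures quoted in the thread; 169k is assumed to mean 169,000.
slow = rows_per_second(2200, 4 * 60)   # the two-server cluster
fast = rows_per_second(169000, 20)     # the single 4-core machine

print(f"slow cluster : {slow:.1f} rows/s")
print(f"single box   : {fast:.1f} rows/s")
print(f"ratio        : {fast / slow:.0f}x")
```

The gap of roughly three orders of magnitude is what suggests a configuration problem rather than ordinary MapReduce latency.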

RE: hello everybody,i am fresher,i meet a problem,please help.

2011-03-07 Thread Chinna
No, it won't be a conflict. If commons-lang-2.4.jar is present in your Hive installation's lib directory, it will be added to the classpath when Hive starts. I think you are using Hive version 0.5.0 or above. If the problem persists, send details such as how you are starting Hive and which ve

Re: hello everybody,i am fresher,i meet a problem,please help.

2011-03-07 Thread 徐厚道
Thank you for the reply! Yes, it is. The Hadoop lib dir has commons-lang-2.1.jar; do they conflict? 2011/3/7 Chinna > Check the lib path, > commons-lang-2.4.jar is in the lib or not. > -- > *From:* 徐厚道 [mailto:xuhou...@gmail.com] > *Sent:* Monday, March

RE: hello everybody,i am fresher,i meet a problem,please help.

2011-03-07 Thread Chinna
Check the lib path to see whether commons-lang-2.4.jar is in lib or not. From: 徐厚道 [mailto:xuhou...@gmail.com] Sent: Monday, March 07, 2011 11:54 AM To: user@hive.apache.org Subject: hello everybody,i am fresher,i meet a problem,please help. My English is very poor. i set up hive env use

Hive too slow?

2011-03-07 Thread abhishek pathak
Hi, I am a Hive newbie. I just finished setting up Hive on a cluster of two servers for my organisation. As a test drill, we ran some simple queries. It took the standard map-reduce algorithm around 4 minutes just to execute this query: select count(1) from tablename; The answer returned was aroun