Thank you all for the tips. I'll dig into all of these and let you know :)



________________________________
From: Igor Tatarinov <i...@decide.com>
To: user@hive.apache.org
Sent: Tue, 8 March, 2011 11:47:20 AM
Subject: Re: Hive too slow?

Most likely, Hadoop's memory settings are too high and Linux starts swapping. 
You should be able to detect that too using vmstat.
Just a guess.
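
In case it helps, here is a rough sketch of how the vmstat check could be scripted. It assumes the usual Linux vmstat layout with "si" and "so" (swap-in / swap-out) columns; the function names are my own, not part of any Hadoop or Hive tool. The first vmstat sample reports averages since boot, so the sketch ignores it by default.

```python
import subprocess


def parse_vmstat(text):
    """Return a list of (si, so) integer pairs, one per vmstat sample line."""
    samples = []
    si_idx = so_idx = None
    for line in text.splitlines():
        fields = line.split()
        # The column header line names the swap-in/swap-out columns.
        if "si" in fields and "so" in fields:
            si_idx, so_idx = fields.index("si"), fields.index("so")
            continue
        # Skip the banner line and anything too short to be a sample.
        if si_idx is None or len(fields) <= max(si_idx, so_idx):
            continue
        try:
            samples.append((int(fields[si_idx]), int(fields[so_idx])))
        except ValueError:
            continue
    return samples


def is_swapping(samples, skip_first=True):
    """True if any sample shows pages swapped in or out.

    The first vmstat sample is an average since boot, so it is
    skipped by default when more than one sample is available.
    """
    recent = samples[1:] if skip_first and len(samples) > 1 else samples
    return any(si > 0 or so > 0 for si, so in recent)


def run_vmstat(interval=1, count=5):
    """Run `vmstat <interval> <count>` and return its text output (Linux only)."""
    result = subprocess.run(
        ["vmstat", str(interval), str(count)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Running is_swapping(parse_vmstat(run_vmstat())) on each node while the query executes should show whether the box is swapping. Non-zero si/so values during the job would support the guess above; all zeros would point elsewhere.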


On Mon, Mar 7, 2011 at 10:11 PM, Ajo Fod <ajo....@gmail.com> wrote:

hmm I don't know of such a place ... but if I had to debug, I'd try to 
understand the following:
>1) are the underlying files zipped/compressed ... that usually makes it slower.
>2) are the files located on the hard drive or hdfs?
>3) are all the cores being used? ... check number of reduce and map tasks.
>
>-Ajo
>
>
>
>On Mon, Mar 7, 2011 at 9:24 PM, abhishek pathak 
><forever_yours_a...@yahoo.co.in> 
>wrote:
>
>I suspected as much. My system is a Core2Duo, 1.86 GHz. I understand that 
>map-reduce is not instantaneous; I just wanted to confirm that 2200 rows in 4 
>minutes is indeed not normal behaviour. Could you point me to some places 
>where I can get some info on how to tune this?
>>
>>
>>Regards,
>>Abhishek
>>
>>
>>
>>________________________________
>>From: Ajo Fod <ajo....@gmail.com>
>>To: user@hive.apache.org
>>Sent: Mon, 7 March, 2011 9:21:51 PM
>>Subject: Re: Hive too slow?
>>
>>
>>In my experience, hive is not instantaneous like other DBs, but 4 minutes to 
>>count 2200 rows seems unreasonable.
>>
>>For comparison, my query over 169k rows on one computer with 4 cores running 
>>at approximately 1 GHz took 20 seconds.
>>
>>Cheers,
>>Ajo.
>>
>>
>>On Mon, Mar 7, 2011 at 1:19 AM, abhishek pathak 
>><forever_yours_a...@yahoo.co.in> 
>>wrote:
>>
>>Hi,
>>>
>>>
>>>I am a Hive newbie. I just finished setting up Hive on a cluster of two 
>>>servers for my organisation. As a test drill, we ran some simple queries. It 
>>>took the standard map-reduce job around 4 minutes just to execute this query:
>>>
>>>
>>>select count(1) from tablename;
>>>
>>>
>>>The answer returned was around 2200. Clearly, this is not a big number by 
>>>Hadoop standards. My question is whether this is standard performance or 
>>>whether some configuration is not optimised. Will scaling the data up by, say, 
>>>50 times produce any drastic slowdown? I tried reading the documentation but 
>>>was not clear on these issues, and I would like to have an idea before this 
>>>setup starts working in a production environment.
>>>
>>>
>>>Thanks in advance,
>>>Regards,
>>>Abhishek Pathak
>>>
>>>
>>>
>>>
>>>
>>
>>
>

