Factor of 5 closely matches the results I got when I was testing.

JVS

On Mar 9, 2011, at 1:23 PM, Otis Gospodnetic wrote:

> Hi,
> 
> Biju's example shows a factor of 5 decrease in performance when Hive points
> to HBase tables.
> 
> Does anyone know how much this factor varies?  Is it often closer to 1, or is
> it more often close to 10?
> Just trying to get a better feel for this...
> 
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
> 
> 
> 
> ----- Original Message ----
>> From: John Sichi <jsi...@fb.com>
>> To: "<user@hive.apache.org>" <user@hive.apache.org>
>> Sent: Tue, March 8, 2011 1:05:34 AM
>> Subject: Re: Performance between Hive queries vs. Hive over HBase queries
>> 
>> Yes.
>> 
>> JVS
>> 
>> On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote:
>> 
>>> Hi,
>>> 
>>> I loaded a data set which has 1 million rows into both Hive and HBase
>>> tables. For the HBase table, I created a corresponding Hive table so that
>>> the data in HBase can be queried from Hive QL. Both tables have a key
>>> column and a value column.
>>> 
>>> For the same query (select value, count(*) from table group by value), the
>>> Hive-only query runs much faster (~30 seconds) than the Hive over HBase
>>> query (~150 seconds).
>>> 
>>> Is this expected?
>>> 
>>> Regards,
>>> Biju
>> 
>> 
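
For reference, the setup Biju describes could be sketched in HiveQL roughly as
follows. This is only an illustration, not the actual DDL used in the thread:
the table names (kv_native, kv_hbase, kv), the column family cf, and the
qualifier val are all assumptions. The storage handler class and the
hbase.columns.mapping property are the ones provided by Hive's HBase
integration.

```sql
-- Hypothetical native Hive table (names are assumptions, not from the thread).
CREATE TABLE kv_native (key STRING, value STRING);

-- Hypothetical Hive table mapped onto an existing HBase table named 'kv',
-- with an assumed column family 'cf' and qualifier 'val'.
CREATE EXTERNAL TABLE kv_hbase (key STRING, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val")
TBLPROPERTIES ("hbase.table.name" = "kv");

-- The same aggregation, run once against each table for comparison.
SELECT value, count(*) FROM kv_native GROUP BY value;
SELECT value, count(*) FROM kv_hbase GROUP BY value;
```

With the timings reported above (about 30 seconds versus about 150 seconds),
the ratio is 150 / 30 = 5, which is the factor of 5 discussed in this thread.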
