Re: hive to hbase mapping

Mario Casola Mon, 17 Jun 2013 06:01:19 -0700

Hi,

For the first question, the answer is yes: with a 500,000-row HBase table,
the job completes successfully.
Second, the jobs are in running state. Attached you can see a syslog of one
of the jobs.
Third, I've tried setting the "hbase.zookeeper.quorum" property, but nothing
changed.

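For reference, a minimal sketch of overriding the property for a single Hive
session, which may help rule out a hive-site.xml that is not being picked up.
The host names are hypothetical placeholders, and the client port line is only
an assumption about a typical setup:

```sql
-- Hypothetical ZooKeeper hosts; replace with the real quorum members.
SET hbase.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com;
-- Only relevant if ZooKeeper does not listen on the default port (2181).
SET hbase.zookeeper.property.clientPort=2181;
-- Print the value back to confirm what the session actually uses.
SET hbase.zookeeper.quorum;
```

If the last statement prints a different value than expected, the session is
probably reading a different configuration than the one that was edited.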

Let me know if I can check other configurations.

thanks
Mario



2013/6/15 <kulkarni.swar...@gmail.com>

> Since your jobs are at 0%, it might actually be a problem with your hadoop
> cluster rather than hive. Couple of things to check would be:
>
> 1. Does a simple M/R job complete successfully?
> 2. Do logs for the jobs say something? Are the jobs in running state or
> pending state?
> 3. It is possible that the job submitted from Hive is unable to find the
> zookeeper quorum. To fix that, you need to set the "hbase.zookeeper.quorum"
> property in your hive-site.xml to point to your zookeeper quorum.
>
> Hope this helps.
>
> On Jun 14, 2013, at 11:54 AM, Mario Casola <mario.cas...@gmail.com> wrote:
>
> Hi Sanjay,
>
> thanks for the response.
>
> I need HBase because it is perfect for aggregating data through counters,
> and its write performance is great.
> Now the problem is: what is the best way to periodically load (every
> hour, for example) HBase data into a Hive table?
>
> Mario
>
>
>
> 2013/6/14 Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
>
>>  Six months back I was tasked with building a data platform for logs, and I
>> benchmarked two options:
>> - HBase + Hive (queries were 8X slower)
>> - Hive only
>>
>>  So I decided on the Hive-only option and am deploying that solution to
>> production.
>>
>>  A couple of things you can think about while you design, if you really
>> want to go HBase+Hive (also look at this
>> http://hadoopstack.com/hive-on-hbase-part-1/):
>> - Query only today's data in a Hive+HBase architecture
>> - For data older than one day, query Hive only
>>
>>  Hope I am not diverting from your question and problem
>>
>>  sanjay
>>
>>   From: Mario Casola <mario.cas...@gmail.com>
>> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
>> Date: Friday, June 14, 2013 8:54 AM
>> To: "user@hive.apache.org" <user@hive.apache.org>
>> Subject: hive to hbase mapping
>>
>>    Hi,
>>
>>  I have a performance issue when I query HBase from Hive.
>> My idea is to build the scenario below:
>> 1. Collect data in HBase for aggregation purposes
>> 2. Create an external table that maps Hive to HBase
>> 3. Create a real Hive table
>> 4. Periodically transfer data from HBase to Hive through "INSERT INTO
>> <real hive table> SELECT * FROM <external table> WHERE time = 201305212909"
>>
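>>  Steps 2 to 4 could be sketched roughly as follows. This is only an
>> illustration under assumptions: the table names (hbase_events,
>> events_hive, events), the column family "d", and the columns are all
>> hypothetical, and the "#b" suffix assumes the counter is stored as a
>> binary long, as HBase increments normally are.
>>
>> ```sql
>> -- Step 2: external Hive table mapped onto an existing HBase table.
>> -- All names below are hypothetical placeholders.
>> CREATE EXTERNAL TABLE hbase_events (
>>   rowkey STRING,
>>   `time` BIGINT,
>>   hits   BIGINT
>> )
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> WITH SERDEPROPERTIES (
>>   -- :key is the HBase row key; d:hits#b reads the counter as binary.
>>   "hbase.columns.mapping" = ":key,d:time,d:hits#b"
>> )
>> TBLPROPERTIES ("hbase.table.name" = "events");
>>
>> -- Step 3: native ("real") Hive table with the same columns.
>> CREATE TABLE events_hive (
>>   rowkey STRING,
>>   `time` BIGINT,
>>   hits   BIGINT
>> );
>>
>> -- Step 4: periodic copy of one time slice from HBase into Hive.
>> INSERT INTO TABLE events_hive
>> SELECT * FROM hbase_events WHERE `time` = 201305212909;
>> ```
>>
>>  Note that a filter on a non-key column like this one is likely to make
>> Hive scan the whole HBase table, since only the row key is indexed.
>>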
>>  Currently I'm running a test on an HBase table that has 70,000,000 rows,
>> and I'm trying to query this table with a single column value filter, like
>> the query above.
>> If I run this type of query directly in HBase, the response time is around
>> 80 seconds.
>> If I run the query in the Hive shell, after 30 minutes all the tasks (9 in
>> my case) are still 0.00% complete.
>>
>>  What could be the problem?
>>
>>  thanks
>> Mario
>
>

syslog
Description: Binary data
