Thank you so much Serega.
Regards,
Krishna
On Sun, Sep 28, 2014 at 11:01 PM, Serega Sheypak
wrote:
>
> https://pig.apache.org/docs/r0.11.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html
> I'm not sure how Pig HBaseStorage works. I suppose it would read all
> data and then join i
https://pig.apache.org/docs/r0.11.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html
I'm not sure how Pig HBaseStorage works. I suppose it would read all
the data and then join it as a usual dataset. So you should get serious HBase
performance degradation during the read; you would get key-by-key
We actually have 2 data sets in HDFS, location (3-5 GB, approx 10 columns
in each record) and weblog (2-3 TB, approx 50 columns in each record). We
need to join the data sets using the locationId, which is in both the
data-sets.
We have 2 options:
1. Have both the data-sets in HDFS only and JOIN t
store location to hdfs
store weblog to hdfs
join them
use HBase bulk load tool to load join result to hbase.
What's the reason to keep the location dataset in HBase and the weblogs in HDFS?
You can expect a data load performance improvement. For me it takes a few
minutes to bulk load 500.000.000 records to 10
Thanks Serega,
Our usecase details:
We have a location table which will be stored in HBase with locationID as
the rowkey / Joinkey.
We intend to join this table with a transactional WebLog file in HDFS
(Expected size can be around 2TB).
Joining query will be passed from Pig.
Can we expect a perfor
It depends on the dataset sizes and the HBase workload. The best way is to do the
join in Pig, store the result, and then use the HBase bulk load tool.
It's a general recommendation; I have no idea about your task details.
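The join step on locationId can be tried locally before writing the Pig script. What follows is only a toy stand-in, assuming two tiny CSV files with made-up contents; it uses the Unix sort and join tools, not Pig, and the file names, columns, and values are all hypothetical.

```shell
#!/bin/sh
# Toy local stand-in for "join location and weblog on locationId".
# All file names and rows below are made up for illustration.
LC_ALL=C
export LC_ALL

WORK=$(mktemp -d)

# location: locationId,locationName (small data set)
printf '%s\n' '1,loc-A' '2,loc-B' > "$WORK/location.csv"
# weblog: locationId,url (large data set in the real use case)
printf '%s\n' '2,/home' '1,/cart' '2,/pay' > "$WORK/weblog.csv"

# join(1) requires both inputs sorted on the join field.
sort -t, -k1,1 "$WORK/location.csv" > "$WORK/location.sorted"
sort -t, -k1,1 "$WORK/weblog.csv" > "$WORK/weblog.sorted"

# Output: locationId, then the weblog columns, then the location columns.
JOINED=$(join -t, "$WORK/weblog.sorted" "$WORK/location.sorted")
echo "$JOINED"

rm -rf "$WORK"
```

In the real pipeline the joined output would be stored in HDFS and then loaded with the HBase bulk load tool, as recommended above.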
2014-09-27 7:32 GMT+04:00 Krishna Kalyan :
> Hi,
> We have a use case that involves ETL on data comin
Hi,
We have a use case that involves ETL on data coming from several different
sources using pig.
We plan to store the final output table in HBase.
What will be the performance impact if we do a join with an external CSV
table using Pig?
Regards,
Krishna
Hi Jean,
This issue has been solved by following Cheolsoo's suggestions:
*1) ClassNotFoundError
Even though you're "registering" jars in your script, they're not present
in classpath. So you're seeing that ClassNotFound error. Can you try this?
PIG_CLASSPATH=/hbase-0.94.1.jar:/
lib/zookeepe
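The exact paths in the suggestion above are cut off, so they are not reconstructed here. As a rough sketch of the same idea, assuming the needed jars sit in one directory, the classpath can be built as a colon-separated list and exported before running Pig. The directory and jar names below are placeholders, not the real install locations.

```shell
#!/bin/sh
# Sketch: build PIG_CLASSPATH from every jar in a directory.
# JAR_DIR and the jar names are hypothetical stand-ins for the real
# HBase and ZooKeeper jar locations on your machine.
JAR_DIR=$(mktemp -d)
touch "$JAR_DIR/hbase-example.jar" "$JAR_DIR/zookeeper-example.jar"

PIG_CLASSPATH=""
for jar in "$JAR_DIR"/*.jar; do
  # Append with ':' only when the variable is already non-empty.
  PIG_CLASSPATH="${PIG_CLASSPATH:+$PIG_CLASSPATH:}$jar"
done
export PIG_CLASSPATH

echo "$PIG_CLASSPATH"
# Then run the script as usual, e.g.: pig myscript.pig (hypothetical name)
```

REGISTER in the script makes the jars available to the job, while PIG_CLASSPATH puts them on the classpath of the Pig client itself, which is where the ClassNotFound error described above arises.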
On Thu, Oct 25, 2012 at 7:44 AM, Manu S wrote:
> Hi,
>
> I am using Pig-0.10.0 & hbase-0.94.2.
>
> I am trying to store the processed output to Hbase cluster using pig
> script.
>
> I registered the required .jar and set the mapreduce and zookeeper
> parameters within the script itself.
>
> *# cat
>> (via Tom White)
>>
>>
>> - Original Message -
>> > From: Mikael Sitruk
>> > To: user@hbase.apache.org; Andrew Purtell
>> > Cc:
>> > Sent: Wednesday, February 15, 2012 11:32 PM
>> > Subject: Re: LeaseException while extrac
(via Tom White)
>
>
> - Original Message -
> > From: Mikael Sitruk
> > To: user@hbase.apache.org; Andrew Purtell
> > Cc:
> > Sent: Wednesday, February 15, 2012 11:32 PM
> > Subject: Re: LeaseException while extracting data via pig/hbase
> integ
Hein (via
Tom White)
- Original Message -
> From: Mikael Sitruk
> To: user@hbase.apache.org; Andrew Purtell
> Cc:
> Sent: Wednesday, February 15, 2012 11:32 PM
> Subject: Re: LeaseException while extracting data via pig/hbase integration
>
> Andy hi
>
> Not sure what
ds,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>
>
> - Original Message -
> > From: Jean-Daniel Cryans
> > To: user@hbase.apache.org
> > Cc:
> > Sent: Wednesday, February 1
ginal Message -
> From: Jean-Daniel Cryans
> To: user@hbase.apache.org
> Cc:
> Sent: Wednesday, February 15, 2012 10:17 AM
> Subject: Re: LeaseException while extracting data via pig/hbase integration
>
> You would have to grep the lease's id, in your first email it w
Ok, I don't have this log anymore, but since the problem was reproduced in
another log (which I kept), here is the grep:
2012-02-08 14:13:02,970 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.regionserver.LeaseException: lease
'-6992210222685255354' does not exist
You would have to grep the lease's id, in your first email it was
"-7220618182832784549".
About the time it takes to process each row, I meant client (pig) side
not in the RS.
J-D
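One practical detail when grepping for a lease id: the ids in this thread start with a minus sign, so a plain grep would try to read the id as an option. A minimal sketch, assuming a sample log file created on the spot (the first line is the error quoted in the thread, joined onto one line; the second line is a made-up placeholder):

```shell
#!/bin/sh
# Sketch: search a region server log for a lease id that starts with '-'.
# LOG is a temporary sample file; point it at the real log in practice.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
2012-02-08 14:13:02,970 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-6992210222685255354' does not exist
unrelated placeholder line (made up for this sketch)
EOF

LEASE_ID='-6992210222685255354'

# -F treats the id as a fixed string; '--' ends option parsing so the
# leading '-' of the id is not mistaken for a flag.
MATCHES=$(grep -F -- "$LEASE_ID" "$LOG")
COUNT=$(grep -c -F -- "$LEASE_ID" "$LOG")

echo "$MATCHES"
echo "matching lines: $COUNT"

rm -f "$LOG"
```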
On Tue, Feb 14, 2012 at 1:33 PM, Mikael Sitruk wrote:
> Please see answer inline
> Thanks
> Mikael.S
>
> On Tue, Fe
Please see answer inline
Thanks
Mikael.S
On Tue, Feb 14, 2012 at 8:30 PM, Jean-Daniel Cryans wrote:
> On Tue, Feb 14, 2012 at 2:01 AM, Mikael Sitruk
> wrote:
> > hi,
> > Well no, i can't figure out what is the problem, but i saw that someone
> > else had the same problem (see email: "LeaseExcept
On Tue, Feb 14, 2012 at 2:01 AM, Mikael Sitruk wrote:
> hi,
> Well no, i can't figure out what is the problem, but i saw that someone
> else had the same problem (see email: "LeaseException despite high
> hbase.regionserver.lease.period")
> What can i tell is the following:
> Last week the problem
Hi,
Well no, I can't figure out what the problem is, but I saw that someone
else had the same problem (see email: "LeaseException despite high
hbase.regionserver.lease.period")
What I can tell is the following:
Last week the problem was consistent
1. I updated hbase.regionserver.lease.period=30
Late answer, did you figure it out?
This exception happens when you don't use your scanner lease for more
than the lease time (default one minute). AFAIK that didn't change, so
maybe something else got slow? Or maybe some special configurations
you had didn't make it during the upgrade?
J-D
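The condition described above can be checked with rough arithmetic. This sketch assumes the client fetches rows from the scanner in batches and only goes back to the region server after processing a whole batch; the batch size and per-row time are hypothetical numbers, and only the one-minute default comes from the message.

```shell
#!/bin/sh
# Sketch: does the gap between two scanner calls exceed the lease period?
LEASE_PERIOD_MS=60000   # default lease time of one minute, per the message
ROWS_PER_FETCH=1000     # hypothetical number of rows fetched per scanner call
MS_PER_ROW=80           # hypothetical client-side (Pig) time spent per row

# Time the client spends on one batch before it touches the scanner again.
GAP_MS=$((ROWS_PER_FETCH * MS_PER_ROW))

if [ "$GAP_MS" -gt "$LEASE_PERIOD_MS" ]; then
  VERDICT="lease likely expires"
else
  VERDICT="within lease"
fi

echo "gap=${GAP_MS}ms lease=${LEASE_PERIOD_MS}ms: $VERDICT"
```

With these made-up numbers the client stays away from the scanner for 80 seconds, longer than the lease, which is the situation that produces a LeaseException. Fetching fewer rows per call or speeding up the client-side work shrinks the gap.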
On M
Hi all
Recently I upgraded my cluster from HBase 0.90.1 to 0.90.4 (using
Cloudera, from cdh3u0 to cdh3u2).
Everything was OK until I ran a Pig extract on the new cluster; on the old
cluster everything worked well.
Now each time I run the extract in conjunction with other work performed on
the clus