Is your CheckAndPut involving a local or remote READ? Due to the nature of an
LSM-tree store, a read is much slower than a write...
Best Regards,
Wei
From: Prakash Kadel
To: "user@hbase.apache.org"
Date: 02/17/2013 07:49 PM
Subject: coprocessor enabled put very slow, help please~~~
We don't have any hook like postScan(). In your case you can try
postScannerClose(), which will be called once per region: when the scan on
a region is over, the scanner opened on that region is closed, and at
that point this hook gets executed.
-Anoop-
__
Thank you Amit, I will check that.
@Anoop: I want to run it just after scanning a region, or after scanning all the
regions that belong to one region server.
On Mon, Feb 18, 2013 at 7:45 AM, Anoop Sam John wrote:
> I want to use custom code after scanning a large table, and prefer to run
> the code after scanning each region
At exactly what point do you want to run your custom code? We have hooks at
points like opening a scanner on a region, closing a scanner on a region, calling
next (pre/post), etc.
-Anoop-
___
Hi Harsh J,
Thanks for your reply.
It looks good to me to place the client-side hbase-site.xml in the same
directory as hive-site.xml.
I already tried this approach before, but it didn't work. I will investigate this
way, wondering if there are some gaps here or possibly
It was just caused due
The main advantage of coprocessors is that they keep the logic local to the
region server. Putting data into other region servers is supported, but defeats
the performance purpose.
From: Prakash Kadel
To: "user@hbase.apache.org"
Sent: Sunday, February 17, 2013
Is there a way to do async writes with coprocessors triggered by Put operations?
thanks
Sincerely,
Prakash Kadel
On Feb 18, 2013, at 10:31 AM, Michael Segel wrote:
> Hmmm. Can you have async writes using a coprocessor?
>
>
> On Feb 17, 2013, at 7:17 PM, lars hofhansl wrote:
>
>> Index maintenance will always be slower.
one more question.
Even if the coprocessors are making insertions into a different region, since I use
"postCheckAndPut", shouldn't there be little performance slowdown?
thanks
Sincerely,
Prakash Kadel
On Feb 18, 2013, at 10:17 AM, lars hofhansl wrote:
> Index maintenance will always be slower.
Hmmm. Can you have async writes using a coprocessor?
On Feb 17, 2013, at 7:17 PM, lars hofhansl wrote:
> Index maintenance will always be slower. An interesting comparison would be
> to also update your indexes from the M/R and see whether that performs better.
>
>
>
>
So it's a *shameless* plug? :-)
Depending on the project id, it could be a good key, but it would have to be
something more meaningful than just a number.
To answer the question about time... timestamps are longs which hold the
number of ms since a set time. (I forget the date and time bu
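The "set time" is the Unix epoch, midnight on January 1, 1970 UTC. A minimal sketch of how such a long timestamp behaves, including the common HBase trick of subtracting from Long.MAX_VALUE so that newer rows sort first (class and method names here are illustrative, not from the thread):

```java
import java.time.Instant;

public class EpochDemo {
    // Milliseconds elapsed since the Unix epoch, 1970-01-01T00:00:00Z.
    static long millisSinceEpoch(Instant instant) {
        return instant.toEpochMilli();
    }

    // HBase sorts row keys lexicographically, so raw timestamps sort oldest
    // first; subtracting from Long.MAX_VALUE ("reverse timestamp") makes the
    // newest rows sort first instead.
    static long reverseTimestamp(long millis) {
        return Long.MAX_VALUE - millis;
    }

    public static void main(String[] args) {
        long t = millisSinceEpoch(Instant.parse("2013-02-17T00:00:00Z"));
        System.out.println(t); // ms between the epoch and 2013-02-17
        System.out.println(reverseTimestamp(t) > reverseTimestamp(t + 1)); // true: newer -> smaller
    }
}
```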
thanks again,
I did try making the indexes with MR. I don't have exact evaluation data, but
inserting indexes directly with MapReduce does seem to be much, much faster than
making the indexes with the coprocessors. I guess I am missing the point of
the coprocessors.
my reason for trying out the
Michael is right - Phoenix wouldn't automatically solve these issues for
you - it would just a) decrease the amount of code you need to write
while still giving you coprocessor-speed performance, and b) give you an
industry standard API to read/write your data.
However, since the date is not t
Index maintenance will always be slower. An interesting comparison would be to
also update your indexes from the M/R and see whether that performs better.
From: Prakash Kadel
To: "user@hbase.apache.org"
Sent: Sunday, February 17, 2013 5:13 PM
Subject: Re: co
thank you lars,
That is my guess too. I am confused; isn't that something that cannot be
controlled? Is this approach of creating some kind of index wrong?
Sincerely,
Prakash Kadel
On Feb 18, 2013, at 10:07 AM, lars hofhansl wrote:
> Presumably the coprocessor issues Puts to another region server in most cases,
Presumably the coprocessor issues Puts to another region server in most cases,
that could explain it being (much) slower.
From: Prakash Kadel
To: "user@hbase.apache.org"
Sent: Sunday, February 17, 2013 4:52 PM
Subject: Re: coprocessor enabled put very slow,
Forgot to mention. I am using 0.92.
Sincerely,
Prakash
On Feb 18, 2013, at 9:48 AM, Prakash Kadel wrote:
> hi,
> I am trying to insert a few million documents into HBase with MapReduce. To
> enable quick search of the docs I want to have some indexes, so I tried to use
> the coprocessors, but they
hi,
I am trying to insert a few million documents into HBase with MapReduce. To
enable quick search of the docs I want to have some indexes, so I tried to use the
coprocessors, but they are slowing down my inserts. Aren't coprocessors
supposed to avoid increasing the latency?
my settings:
3 regio
Yes, but implementing a constraint requires implementing a coprocessor,
which is not something I'd recommend to users as a general practice. HBase
doesn't offer referential integrity as commonly understood in the RDBMS
world (foreign keys, etc.) out of the box.
On Sat, Feb 16, 2013 at 6:52 PM, Te
> Is there a way to increment counters in HBase via bulk upload?
I thought about maybe doing this once, as
https://issues.apache.org/jira/browse/HBASE-3936, but we decided to resolve
it as maybe something to try later if there was ever a compelling need. I
wonder if doing the in memory merge of mo
Hello Mehmet,
If ProjectIds are sequential, then it is definitely not a feasible
approach. Division is just to make sure that all the regions are
evenly loaded. You can create pre-split tables to avoid
region hotspotting. Alternatively, hash your rowkeys so that all
the regionservers receiv
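The hashing/salting idea above can be sketched as follows (the bucket count and the key format are illustrative, not from the thread): prefix each row key with hash(key) mod N, so sequential keys fan out across N pre-split region ranges, at the cost of needing N scans to read back a contiguous key range.

```java
public class SaltedKeys {
    static final int BUCKETS = 8; // illustrative: one pre-split region range per bucket

    // Prefix the original key with a deterministic bucket id so that
    // monotonically increasing keys fan out across all region ranges.
    // Range reads must then fan out into BUCKETS parallel scans.
    static String salt(String rowKey) {
        int bucket = Math.floorMod(rowKey.hashCode(), BUCKETS);
        return bucket + "|" + rowKey;
    }

    public static void main(String[] args) {
        // Sequential ids land in different buckets instead of one hot region.
        for (long id = 100; id < 104; id++) {
            System.out.println(salt(Long.toString(id)));
        }
    }
}
```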
I'm not sure how a SQL interface above HBase will solve some of the issues with
regional hot spotting when using time as the key. Or the problem with always
adding data to the right of the last row.
The same would apply with the project id, assuming that it too is a number that
grows increment
Hello,
Have you considered using Phoenix
(https://github.com/forcedotcom/phoenix) for this use case? Phoenix is a
SQL layer on top of HBase. For this use case, you'd connect to your
cluster like this:
Class.forName("com.salesforce.phoenix.jdbc.PhoenixDriver"); // register driver
Connection c
It is a bad idea only because it will temporarily distort the perfect
locality of the regions hosted by each RegionServer. This gets
corrected only at the end of the next major compaction of all regions,
eventually, but both events would cause some small level of
performance dips and increase in
You can use a postScan() region observer as a trigger to run your code, or
use an endpoint that your code will have to call after the scan.
Try this link
https://blogs.apache.org/hbase/entry/coprocessor_introduction
Good luck.
On Feb 17, 2013 8:51 PM, "Farrokh Shahriari" wrote:
> Hi there
> I
Hi,
I want to store event log data in HBase but I couldn't decide on a row key. I must
store the project id and time; I will use the project id and time combination while
searching.
The row key could be as below:
ProjectId+timeInMs
In a similar application (the open-source OpenTSDB), time is divided by 1000 to round it; in this
projec
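A minimal sketch of the composite key described above, with the OpenTSDB-style rounding (dividing ms by 1000 to get one-second buckets). The field widths and class name are illustrative assumptions, not from the post:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class EventRowKey {
    // 4-byte project id followed by an 8-byte time bucket: rows for one
    // project stay contiguous and are ordered by time within that project.
    static byte[] rowKey(int projectId, long timeMs) {
        long bucket = timeMs / 1000; // OpenTSDB-style rounding to whole seconds
        return ByteBuffer.allocate(12).putInt(projectId).putLong(bucket).array();
    }

    public static void main(String[] args) {
        byte[] k1 = rowKey(42, 1361059200123L);
        byte[] k2 = rowKey(42, 1361059200999L);
        // Both timestamps fall in the same one-second bucket, so keys match.
        System.out.println(Arrays.equals(k1, k2)); // true
    }
}
```

Note the caveat raised elsewhere in the thread: a sequential ProjectId prefix still hotspots the last region unless the table is pre-split or the keys are salted/hashed.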
You could replicate the configs, or place the client-side
hbase-site.xml in the same directory as hive-site.xml for it to get
picked up via the classpath by the HBase-centric classes.
On Sun, Feb 17, 2013 at 2:41 PM, Zheng, Kai wrote:
> Hi all,
>
>
>
> In Hive over HBase case, how to configure and g