Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Wei Tan
Does your CheckAndPut involve a local or a remote READ? Due to the nature of LSM, a read is much slower than a write... Best Regards, Wei From: Prakash Kadel To: "user@hbase.apache.org", Date: 02/17/2013 07:49 PM Subject: coprocessor enabled put very slow, help please~~~

RE: Co-Processor in scanning the HBase's Table

2013-02-17 Thread Anoop Sam John
We don't have any hook like postScan(). In your case you can try postScannerClose(). This will be called once per region: when the scan on that region is over, the scanner opened on that region gets closed, and at that time this hook is executed. -Anoop-

Re: Co-Processor in scanning the HBase's Table

2013-02-17 Thread Farrokh Shahriari
Thank you Amit, I will check that. @Anoop: I want to run that just after scanning a region, or after scanning the regions that belong to one regionserver. On Mon, Feb 18, 2013 at 7:45 AM, Anoop Sam John wrote: > >I wanna use a custom code after scanning a large table and prefer to run > the code

RE: Co-Processor in scanning the HBase's Table

2013-02-17 Thread Anoop Sam John
>I wanna use a custom code after scanning a large table and prefer to run the code after scanning each region Exactly at what point do you want to run your custom code? We have hooks at points like opening a scanner at a region, closing a scanner at a region, calling next (pre/post), etc. -Anoop-

RE: Hive over HBase Integration Issue

2013-02-17 Thread Zheng, Kai
Hi Harsh J, Thanks for your reply. It looks good to me to place the client-sided hbase-site.xml in the same directory as hive-site.xml. I already tried this approach before but it didn't work. I will investigate this way wondering if there're some gaps here or possibly it was just caused due

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread lars hofhansl
The main advantage of coprocessors is that they keep the logic local to the region server. Putting data into other region servers is supported, but defeats the performance purpose. From: Prakash Kadel To: "user@hbase.apache.org" Sent: Sunday, February 17, 2

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Prakash Kadel
is there a way to do async writes with coprocessors triggered by Put operations? thanks Sincerely, Prakash Kadel On Feb 18, 2013, at 10:31 AM, Michael Segel wrote: > Hmmm. Can you have async writes using a coprocessor? > > > On Feb 17, 2013, at 7:17 PM, lars hofhansl wrote: > >> Index main

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Prakash Kadel
One more question. Even if the coprocessors are making insertions to a different region, since I use "postCheckAndPut", shouldn't there be not much performance slowdown? Thanks Sincerely, Prakash Kadel On Feb 18, 2013, at 10:17 AM, lars hofhansl wrote: > Index maintenance will always be slower.

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Michael Segel
Hmmm. Can you have async writes using a coprocessor? On Feb 17, 2013, at 7:17 PM, lars hofhansl wrote: > Index maintenance will always be slower. An interesting comparison would be > to also update your indexes from the M/R and see whether that performs better. > > > >

Re: Row Key Design in time based aplication

2013-02-17 Thread Michael Segel
So it's a *shameless* plug? :-) Depending on the project id, it could be a good key, but it would have to be something more meaningful than just a number. To answer the question about time... Time Stamps are Longs which hold the number of ms since a set time. (I forget the date and time bu
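
For readers following the point about timestamps being Longs: a minimal Java sketch, assuming the standard Java epoch of 1970-01-01T00:00:00Z (which is what System.currentTimeMillis() counts from). The class name EpochMillis and its helper names are hypothetical and do not come from the thread.

```java
import java.time.Instant;

// Hypothetical helper, for illustration only.
final class EpochMillis {
    // Interpret a long as milliseconds since 1970-01-01T00:00:00Z.
    static Instant toInstant(long millis) {
        return Instant.ofEpochMilli(millis);
    }

    // Drop the millisecond part, giving whole seconds since the epoch.
    static long toSeconds(long millis) {
        return millis / 1000L;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(now + " ms -> " + toInstant(now));
        System.out.println(toSeconds(now) + " s since the epoch");
    }
}
```

Because the value is a plain long that only grows, using it as the leading part of a row key means new rows always sort after existing ones.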

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Prakash Kadel
Thanks again, I did try making indexes with the MR. I don't have exact evaluation data, but inserting indexes directly with mapreduce does seem to be much, much faster than making the indexes with the coprocessors. I guess I am missing the point about the coprocessors. My reason for trying out the

Re: Row Key Design in time based aplication

2013-02-17 Thread James Taylor
Michael is right - Phoenix wouldn't automatically solve these issues for you - it would just a) decrease the amount of code you need to write while still giving you coprocessor-speed performance, and b) give you an industry standard API to read/write your data. However, since the date is not t

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread lars hofhansl
Index maintenance will always be slower. An interesting comparison would be to also update your indexes from the M/R and see whether that performs better. From: Prakash Kadel To: "user@hbase.apache.org" Sent: Sunday, February 17, 2013 5:13 PM Subject: Re: co

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Prakash Kadel
Thank you lars, That is my guess too. I am confused, isn't that something that cannot be controlled? Is this approach of creating some kind of index wrong? Sincerely, Prakash Kadel On Feb 18, 2013, at 10:07 AM, lars hofhansl wrote: > Presumably the coprocessor issues Puts to another region serv

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread lars hofhansl
Presumably the coprocessor issues Puts to another region server in most cases, that could explain it being (much) slower. From: Prakash Kadel To: "user@hbase.apache.org" Sent: Sunday, February 17, 2013 4:52 PM Subject: Re: coprocessor enabled put very slow,

Re: coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Prakash Kadel
Forgot to mention. I am using 0.92. Sincerely, Prakash On Feb 18, 2013, at 9:48 AM, Prakash Kadel wrote: > hi, > i am trying to insert few million documents to hbase with mapreduce. To > enable quick search of docs i want to have some indexes, so i tried to use > the coprocessors, but they

coprocessor enabled put very slow, help please~~~

2013-02-17 Thread Prakash Kadel
Hi, I am trying to insert a few million documents into hbase with mapreduce. To enable quick search of docs I want to have some indexes, so I tried to use the coprocessors, but they are slowing down my inserts. Aren't the coprocessors supposed to not increase the latency? My settings: 3 regio

Re: HBase and Data Integrity

2013-02-17 Thread Andrew Purtell
Yes, but implementing a constraint requires implementing a coprocessor, which is not something I'd recommend to users as a general practice. HBase doesn't offer referential integrity as commonly understood in the RDBMS world (foreign keys, etc.) out of the box. On Sat, Feb 16, 2013 at 6:52 PM, Te

Re: increment counters via bulk upload in HBase

2013-02-17 Thread Andrew Purtell
> Is there a way to increment counters in HBase via bulk upload? I thought about maybe doing this once, as https://issues.apache.org/jira/browse/HBASE-3936, but we decided to resolve it as maybe something to try later if there was ever a compelling need. I wonder if doing the in memory merge of mo

Re: Row Key Design in time based aplication

2013-02-17 Thread Mohammad Tariq
Hello Mehmet, If ProjectIds are sequential, then it is definitely not a feasible approach. Division is just to make sure that all the regions are evenly loaded. You can create pre-split tables to avoid region hotspotting. Alternatively, hash your rowkeys so that all the regionservers receiv
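
The hashing suggestion above can be sketched in plain Java, assuming a one-byte bucket prefix derived from a hash of the original key. The class SaltedKey, the method salt, and the choice of Arrays.hashCode as the hash are illustrative assumptions, not something the thread or HBase prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class SaltedKey {
    // Prepend one bucket byte so that consecutive keys land in different key ranges.
    static byte[] salt(byte[] rowKey, int buckets) {
        if (buckets < 1 || buckets > 256) {
            throw new IllegalArgumentException("buckets must be in 1..256");
        }
        // floorMod keeps the bucket non-negative even when the hash is negative.
        int bucket = Math.floorMod(Arrays.hashCode(rowKey), buckets);
        byte[] salted = new byte[rowKey.length + 1];
        salted[0] = (byte) bucket;
        System.arraycopy(rowKey, 0, salted, 1, rowKey.length);
        return salted;
    }

    public static void main(String[] args) {
        byte[] key = "project-42".getBytes(StandardCharsets.UTF_8);
        System.out.println("bucket = " + (salt(key, 16)[0] & 0xff));
    }
}
```

The trade-off is on the read side: a time-range scan has to be issued once per bucket and the results merged, because rows that were adjacent before salting are now spread across the key space.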

Re: Row Key Design in time based aplication

2013-02-17 Thread Michael Segel
I'm not sure how a SQL interface above HBase will solve some of the issues with regional hot spotting when using time as the key. Or the problem with always adding data to the right of the last row. The same would apply with the project id, assuming that it too is a number that grows increment

Re: Row Key Design in time based aplication

2013-02-17 Thread James Taylor
Hello, Have you considered using Phoenix (https://github.com/forcedotcom/phoenix) for this use case? Phoenix is a SQL layer on top of HBase. For this use case, you'd connect to your cluster like this: Class.forName("com.salesforce.phoenix.jdbc.PhoenixDriver"); // register driver Connection c

Re: Re-balancer on datanodes that run hbase regions servers

2013-02-17 Thread Harsh J
It is a bad idea only because it will temporarily distort the perfect locality of the regions hosted by each RegionServer. This gets corrected only at the end of the next major compaction of all regions, eventually, but both the events would cause some small level of performance dips and increase in

Re: Co-Processor in scanning the HBase's Table

2013-02-17 Thread Amit Sela
You can use a postScan() region observer as a trigger to run your code or use endpoint that you (your code) will have to call after the scan. Try this link https://blogs.apache.org/hbase/entry/coprocessor_introduction Good luck. On Feb 17, 2013 8:51 PM, "Farrokh Shahriari" wrote: > Hi there > I

Row Key Design in time based aplication

2013-02-17 Thread Mehmet Simsek
Hi, I want to hold event log data in hbase but I couldn't decide on a row key. I must hold project id and time; I will use the project id and time combination while searching. The row key can be as below: ProjectId+timeInMs. In a similar application (open source TSDB) time is divided by 1000 to round in this projec
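
A minimal Java sketch of the ProjectId+timeInMs key described above, assuming the project id fits in a non-negative int and the time is non-negative epoch milliseconds. The class EventRowKey and its methods are hypothetical; the compare method only mimics the unsigned, byte-by-byte ordering that HBase applies to row keys.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class EventRowKey {
    // Fixed-width, big-endian layout: 4 bytes of project id, then 8 bytes of time.
    static byte[] of(int projectId, long timeInMs) {
        return ByteBuffer.allocate(Integer.BYTES + Long.BYTES)
                .putInt(projectId)
                .putLong(timeInMs)
                .array();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] earlier = of(7, 1_000L);
        byte[] later = of(7, 2_000L);
        System.out.println(compare(earlier, later) < 0); // earlier event sorts first
    }
}
```

With this layout all rows of one project are contiguous and ordered by time, so a search by project id and time range becomes a single scan between two keys. It does not by itself address the hot-spotting concern raised elsewhere in the thread when project ids or times grow sequentially.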

Re: Hive over HBase Integration Issue

2013-02-17 Thread Harsh J
You could replicate the configs or place the client-sided hbase-site.xml in the same directory as hive-site.xml for it to get picked up via the classpath by HBase centric classes. On Sun, Feb 17, 2013 at 2:41 PM, Zheng, Kai wrote: > Hi all, > > > > In Hive over HBase case, how to configure and g