Working with Hlog

2013-03-28 Thread Rishabh Agrawal
Hello, I am pretty new to HBase and I am working with HLogs. I have the following questions: * When does HBase remove/delete log files from the log directory, and which parameters govern it? * Can we control the deletion of HLog files, say by a time factor or by something else? A

Re: Understanding scan behaviour

2013-03-28 Thread ramkrishna vasudevan
Mohit, It is always better to go with a start row and an end row if you know what they are. Just add one more byte to the actual end row (the inclusive row) to form the end key. This will narrow down the search. Remember that byte comparison is the way that HBase scans. Regards Ram On Fri, Mar 29
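
The "add one more byte" advice can be sketched in plain Java. This is a minimal model that assumes the stop key of a scan is exclusive and that keys are compared byte by byte as unsigned values; the class and method names are hypothetical and nothing here calls the HBase client API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Plain-Java model of row-key ranges; illustrative only, not HBase client code. */
final class InclusiveEndRowSketch {

    /** Smallest key sorting strictly after endRow: endRow plus one trailing 0x00 byte. */
    static byte[] stopKeyFor(byte[] endRow) {
        return Arrays.copyOf(endRow, endRow.length + 1); // the padded byte is 0x00
    }

    /** Unsigned, byte-by-byte lexicographic comparison of two keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length; // on a shared prefix, the shorter key sorts first
    }

    /** True when row lies in [startRow, stopKey): start inclusive, stop exclusive. */
    static boolean inRange(byte[] row, byte[] startRow, byte[] stopKey) {
        return compare(row, startRow) >= 0 && compare(row, stopKey) < 0;
    }

    public static void main(String[] args) {
        byte[] start = "abc".getBytes(StandardCharsets.US_ASCII);
        byte[] end = "abd".getBytes(StandardCharsets.US_ASCII);
        // With the bare end row as the stop key, "abd" itself falls outside the range.
        System.out.println(inRange(end, start, end));
        // With one extra byte appended, "abd" is covered.
        System.out.println(inRange(end, start, stopKeyFor(end)));
    }
}
```

Appending 0x00 yields the smallest key that sorts after the end row, so the range picks up that row and nothing beyond it.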

RE: Understanding scan behaviour

2013-03-28 Thread Li, Min
Hi, Mohit, Try using ENDROW. STARTROW & ENDROW is much faster than PrefixFilter. The ASCII code of "+" is 43 and the ASCII code of "," is 44. scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => '', ENDROW => '+++,'} Min -Original Message- From: Mohit Anchlia [mailto:mohitanch...@gmail.com] Sent: Friday,
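
A general way to turn a key prefix into a start/end pair, sketched here in plain Java, is to use the prefix as the start row and the prefix with its last byte incremented as the exclusive end row. This is an illustration under that assumption and is not derived from the literal end row in the command above; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Plain-Java model of a prefix range; illustrative only, not HBase client code. */
final class PrefixRangeSketch {

    /**
     * Exclusive upper bound for all keys that start with prefix: drop trailing
     * 0xFF bytes, then add one to the last remaining byte. An empty result means
     * "no upper bound" (the prefix was empty or consisted only of 0xFF bytes).
     */
    static byte[] stopKeyForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && (prefix[end - 1] & 0xff) == 0xff) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // safe: this byte is known not to be 0xFF
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = "+++".getBytes(StandardCharsets.US_ASCII);
        byte[] stop = stopKeyForPrefix(prefix);
        // '+' is 43, so the incremented last byte is 44, which is ','.
        System.out.println(new String(stop, StandardCharsets.US_ASCII));
    }
}
```

For the prefix "+++" the computed end row is "++,", so every key beginning with "+++" sorts inside the range.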

Re: coprocessor is timing out in 0.94

2013-03-28 Thread Ted Yu
Saurabh: I guess you are aware of the following for SingleColumnValueFilter, but just in case: "To prevent the entire row from being emitted if the column is not found on a row, use {@link #setFilterIfMissing}." On Thu, Mar 28, 2013 at 6:51 PM, Agarwal, Saurabh wrote: > Ted, > > Thanks for

Re: coprocessor is timing out in 0.94

2013-03-28 Thread Marcos Luis Ortiz Valmaseda
Regards, Saurabh. I see that you are using SingleColumnValueFilter. Take a look at these links: http://gbif.blogspot.com/2012/05/optimizing-hbase-mapreduce-scans-for.html http://mapredit.blogspot.com/2012/05/using-filters-in-hbase-to-match-two.html Take a look later to this link, about the working to im

RE: coprocessor is timing out in 0.94

2013-03-28 Thread Agarwal, Saurabh
Ted, Thanks for the response. Here is the filter we are using - SingleColumnValueFilter(Bytes.toBytes(columnFamily), Bytes.toBytes(columnQualifier), CompareFilter.CompareOp.EQUAL, new RegexStringComparator("(?i)"+"keyword")); The thread dumps at different points show that the coprocessor is getting c
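
What a case-insensitive regex filter on one column decides per row can be modelled in plain Java. This is a rough sketch, not the HBase implementation: it assumes the comparator accepts a cell when the pattern is found anywhere in the value, decodes values as UTF-8, and treats a row with no such column as emitted unless a filter-if-missing flag is set. The class, method, and column names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Plain-Java model of a single-column regex filter; not HBase client code. */
final class RegexColumnFilterSketch {

    /** Decides whether a row is kept; the row maps a column name to its cell value. */
    static boolean keepRow(Map<String, byte[]> row, String column,
                           Pattern pattern, boolean filterIfMissing) {
        byte[] value = row.get(column);
        if (value == null) {
            // Column absent: the row is kept unless filterIfMissing is set.
            return !filterIfMissing;
        }
        // Assumption: a match anywhere in the value counts (find, not full match).
        return pattern.matcher(new String(value, StandardCharsets.UTF_8)).find();
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(?i)" + "keyword");
        Map<String, byte[]> row = new HashMap<>();
        row.put("cf:q", "Some KEYWORD here".getBytes(StandardCharsets.UTF_8));
        System.out.println(keepRow(row, "cf:q", pattern, true));
        System.out.println(keepRow(new HashMap<>(), "cf:q", pattern, false));
    }
}
```

Because the search term is compiled as a regular expression, characters such as "." or "+" in it act as metacharacters; `Pattern.quote` makes a term literal.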

Re: Task info on Hadoop web console.

2013-03-28 Thread GuoWei
Dear, We solved the problem; it was caused by a version mismatch between HBase and Hadoop. We were using HBase 0.94.1 and Hadoop 0.20.2. After we changed Hadoop from 0.20.2 to 1.0.4, the problem was solved. Thanks a lot. Best Regards Weibo: http://weibo.com/guowee Web: http://www.wbkit.com --

Re: coprocessor is timing out in 0.94

2013-03-28 Thread Ted Yu
bq. I checked thread dump If there was no exception in region server logs, thread dump of region server when your coprocessor was running would reveal where it got stuck. From your description below, looks like you can utilize HBASE-5416 Improve performance of scans with some kind of filters. b

Re: coprocessor is timing out in 0.94

2013-03-28 Thread Ted Yu
bq. when I removed the filter, it ran fine in 0.94 Can you disclose more information about your filter ? BTW 0.94.6 was just released which is fully compatible with 0.94.2 Cheers On Thu, Mar 28, 2013 at 3:18 PM, Agarwal, Saurabh wrote: > Hi, > > We are in process of migrating from 0.92.1 to 0.

coprocessor is timing out in 0.94

2013-03-28 Thread Agarwal, Saurabh
Hi, We are in the process of migrating from 0.92.1 to 0.94.2. A coprocessor was running fine in 0.92. After migrating to 0.94, the client is timing out (java.net.SocketTimeoutException). We are using the coprocessor to apply the filter on one of the columns and return the columns that match with that

Re: FuzzyRowFilter in hbase shell

2013-03-28 Thread Stack
What Alex said or try the filter language that is described here: http://hbase.apache.org/book.html#thrift.filter-language You can use it from the shell too (Do "help 'scan'" in shell and see where it talks about filters: "The filter can be specified in two ways: 1. Using a filterString - mor

Re: FuzzyRowFilter in hbase shell

2013-03-28 Thread Alex Baranau
I guess you have to use the JRuby language to use it in the shell. You may have an idea how to do that if you are already using other filters in the shell. I haven't done that.. sorry. Java example: Scan scan = new Scan(); List<Pair<byte[], byte[]>> fuzzyKeysData = new ArrayList<Pair<byte[], byte[]>>(); // search for "?BC??FG" fuzzyKeysD
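
The idea behind a fuzzy key such as "?BC??FG" can be sketched in plain Java: each key position carries a mask byte, where 0 means the position must match and 1 means any byte is accepted. This is a simplified model with hypothetical names, not the FuzzyRowFilter implementation; in particular it only tests one row and does not model how the real filter skips ahead between candidate rows.

```java
import java.nio.charset.StandardCharsets;

/** Plain-Java model of fuzzy row-key matching; not HBase client code. */
final class FuzzyRowMatchSketch {

    /** mask[i] == 0: row[i] must equal pattern[i]; mask[i] == 1: any byte is accepted. */
    static boolean matches(byte[] row, byte[] pattern, byte[] mask) {
        if (pattern.length != mask.length) {
            throw new IllegalArgumentException("pattern and mask must have the same length");
        }
        if (row.length < pattern.length) {
            return false; // simplification: rows shorter than the pattern never match
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && row[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "?BC??FG": the bytes at the '?' positions are placeholders and are ignored.
        byte[] pattern = {0, 'B', 'C', 0, 0, 'F', 'G'};
        byte[] mask = {1, 0, 0, 1, 1, 0, 0};
        System.out.println(matches("ABCDEFG".getBytes(StandardCharsets.US_ASCII), pattern, mask));
        System.out.println(matches("ABXDEFG".getBytes(StandardCharsets.US_ASCII), pattern, mask));
    }
}
```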

Re: Unable to start regionserver in distributed mode

2013-03-28 Thread Mohammad Tariq
Also make sure that you have proper name resolution, as it is vital for a healthy HBase operation. Sometimes, even after configuring everything perfectly, HBase refuses to work as intended just because of improper name resolution. Add the actual host names along with their IPs to the /etc/hosts file.
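
A quick way to see how a machine resolves a host name is a small JDK-only check like the one below. It is a sketch with hypothetical class and method names; it only reports whether a name resolves and whether it resolves to a loopback address, which is the symptom discussed in this thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Reports how a host name resolves on this machine; JDK only. */
final class HostResolutionCheck {

    /** Returns "unresolvable", "loopback", or "ok" for the given host name or IP literal. */
    static String diagnose(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            return addr.isLoopbackAddress() ? "loopback" : "ok";
        } catch (UnknownHostException e) {
            return "unresolvable";
        }
    }

    public static void main(String[] args) {
        String host;
        try {
            // Default to this machine's own host name when no argument is given.
            host = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            System.out.println("local host name is unresolvable");
            return;
        }
        System.out.println(host + ": " + diagnose(host));
    }
}
```

Running it on each node with that node's host name as the argument shows whether the name maps to a routable address or only to 127.0.0.1.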

FuzzyRowFilter in hbase shell

2013-03-28 Thread Robert Hamilton
Hi all. Is it possible to test FuzzyRowFilter from the shell? If so, could somebody kindly point me to an example? TIA -- Bob -- This e-mail, including attachments, contains confidential and/or proprietary information, and may be used only by the person or entity to which it is addressed. The

Re: Unable to start regionserver in distributed mode

2013-03-28 Thread Leonid Fedotov
Make sure you use host names, not "localhost", as names in the Hadoop and HBase configurations. Thank you! Sincerely, Leonid Fedotov On Mar 28, 2013, at 8:30 AM, anand nalya wrote: > Hi, > > I'm trying to run HBase in distributed mode with 2 region servers, but when > starting it, I'm getting th

Re: Understanding scan behaviour

2013-03-28 Thread Ted Yu
See javadoc of TimestampsFilter which reveals how you can narrow the scan: * Note: Use of this filter overrides any time range/time stamp * options specified using {@link org.apache.hadoop.hbase.client.Get#setTimeRange(long, long)}, * {@link org.apache.hadoop.hbase.client.Scan#setTimeRange(lo

Re: Understanding scan behaviour

2013-03-28 Thread Mohit Anchlia
Could the prefix filter lead to a full table scan? In other words, is PrefixFilter applied after fetching the rows? Another question I have is: say I have row keys abc and abd and I search for row "abc", is it always guaranteed to be the first key returned in the scan results? If so I can always put

Don't forget! hbasecon2013 call for presentations closes April 1st

2013-03-28 Thread Stack
Don't miss the deadline. Get your abstract in before April 1st (There are still a few out there hiding in the bushes as best as I can tell). Thanks all, St.Ack

Re: Understanding scan behaviour

2013-03-28 Thread Ted Yu
Take a look at the following in hbase-server/src/main/ruby/shell/commands/scan.rb (trunk) hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"} Cheers On Thu, Mar 28, 2013 at 9:02 AM, Mohit Anchlia wrote: > I se

Re: Unable to start regionserver in distributed mode

2013-03-28 Thread Ted Yu
bq. 1 regionserver running on localhost:60020 But you were trying to run in distributed mode, right ? How did you deploy / start HBase ? Cheers On Thu, Mar 28, 2013 at 8:30 AM, anand nalya wrote: > Hi, > > I'm trying to run HBase in distributed mode with 2 region servers, but when > starting it,

Re: Understanding scan behaviour

2013-03-28 Thread Mohit Anchlia
I see, then I misunderstood the behaviour. My keys are id + timestamp so that I can do a range type search. So what I really want is to return a row where id matches the prefix. Is there a way to do this without having to scan large amounts of data? On Thu, Mar 28, 2013 at 8:26 AM, Jean-Marc Spag

Re: Unable to start regionserver in distributed mode

2013-03-28 Thread shashwat shriparv
On Thu, Mar 28, 2013 at 9:00 PM, anand nalya wrote: > gionserver running on localhost:60020 Check whether your Hadoop or HBase configuration uses localhost anywhere, and replace every localhost with the actual domain name. This issue is just because HBase is starting at

Unable to start regionserver in distributed mode

2013-03-28 Thread anand nalya
Hi, I'm trying to run HBase in distributed mode with 2 region servers, but when starting it, I'm getting the following error: starting master, logging to /home/centos/opt/hbase-0.94.5/bin/../logs/hbase-centos-master-IMPETUS-DSRV01.IMPETUS.CO.IN.out 192.168.145.191: starting regionserver, logging t

Re: Understanding scan behaviour

2013-03-28 Thread Jean-Marc Spaggiari
Hi Mohit, "+" ascii code is 43, "9" ascii code is 57. So "+9" sorts after "++". If you don't have any row with the exact key "+", HBase will look for the first one after this one. And in your case, it's +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF. JM 2013/3/28 Mohit Anchlia : > M
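
The seek behaviour described here can be modelled with a sorted set in plain Java: rows are ordered by unsigned byte comparison, and a scan that starts at a key that does not exist lands on the first stored key after it. The keys and names below are hypothetical stand-ins, `Arrays.compareUnsigned` needs Java 9 or later, and none of this calls the HBase client API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

/** Plain-Java model of seeking to a start row; not HBase client code. */
final class StartRowSeekSketch {

    /** Row keys sort by unsigned, byte-by-byte lexicographic order. */
    static final Comparator<byte[]> ROW_ORDER = Arrays::compareUnsigned;

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    /** Builds a sorted "table" that holds only row keys. */
    static TreeSet<byte[]> tableOf(String... keys) {
        TreeSet<byte[]> rows = new TreeSet<>(ROW_ORDER);
        for (String key : keys) {
            rows.add(ascii(key));
        }
        return rows;
    }

    /** First stored row at or after startRow, or null when no such row exists. */
    static byte[] firstRowAtOrAfter(TreeSet<byte[]> rows, byte[] startRow) {
        return rows.ceiling(startRow);
    }

    public static void main(String[] args) {
        // '9' (57) is greater than '+' (43), so "+9" sorts after "++".
        System.out.println(ROW_ORDER.compare(ascii("+9"), ascii("++")) > 0);

        TreeSet<byte[]> rows = tableOf("sdw0", "+9hC");
        byte[] first = firstRowAtOrAfter(rows, ascii("+++"));
        // No stored key equals "+++", so the seek lands on the next key in order.
        System.out.println(new String(first, StandardCharsets.US_ASCII));
    }
}
```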

Re: Understanding scan behaviour

2013-03-28 Thread Mohit Anchlia
My understanding is that the row key would start with + for instance. On Thu, Mar 28, 2013 at 7:53 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Hi Mohit, > > I see nothing wrong with the results below. What would I have expected? > > JM > > 2013/3/28 Mohit Anchlia : > > I am r

Re: Understanding scan behaviour

2013-03-28 Thread Jean-Marc Spaggiari
Hi Mohit, I see nothing wrong with the results below. What would you have expected? JM 2013/3/28 Mohit Anchlia : > I am running 92.1 version and this is what happens. > > > hbase(main):003:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => > 'sdw0'} > ROW

Re: Understanding scan behaviour

2013-03-28 Thread Mohit Anchlia
I am running 92.1 version and this is what happens. hbase(main):003:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => 'sdw0'} ROW COLUMN+CELL s\xC1\xEAR\xDF\xEA&\x89\x91\xFF\x1A^\xB6d\xF0\xEC\x column=SID_T_MTX:\x00\x00Rc, timestamp=136305626

Re: Getting less write throughput due to more number of columns

2013-03-28 Thread Ted Yu
Prefix compression would lower the cost of storing value in rowkey. It was inspired by long rowkey, short value schema design. PREFIX and FAST_DIFF encodings are most often used. Cheers On Thu, Mar 28, 2013 at 7:26 AM, Pankaj Gupta wrote: > Would prefix compression (https://issues.apache.org/j

Re: Getting less write throughput due to more number of columns

2013-03-28 Thread Pankaj Gupta
Would prefix compression (https://issues.apache.org/jira/browse/HBASE-4676) improve this? This is an important question in terms of schema design. Given the choice of storing a value in column vs rowkey, I would many times want to store a value in a rowkey if I foresee it being used for constr

Re: Load balancing by table

2013-03-28 Thread Nurettin Şimşek
Thanks Ted. On Thu, Mar 28, 2013 at 10:21 AM, Ted wrote: > This feature is absent in 0.92 > > Please upgrade to 0.94.6 > > Thanks > > On Mar 28, 2013, at 12:45 AM, Nurettin Şimşek > wrote: > > > Hi, I want to load balance regions by table but in 0.92 version, > balancing > > doing by all table

Re: Please help me how and when to catch java.net.ConnectException in case hbase is not running as it goes on connecting with hbase

2013-03-28 Thread Ted Yu
The second sentence below seems incomplete. Can you give us more information (such as zookeeper log) ? Thanks On Thu, Mar 28, 2013 at 3:26 AM, gaurhari dass wrote: > I am using hbase .94.5 version ,I havent mentioned in class path > > On Thu, Mar 28, 2013 at 5:01 AM, Ted Yu wrote: > > > bq. or

Re: Task info on Hadoop web console.

2013-03-28 Thread Agarwal, Saurabh
Are you sure that the MR job is running on the targeted cluster? Check the configuration dir you are passing to your HBase configuration. The MR job running on HBase should come up in the JobTracker web GUI. Regards, Saurabh. From: GuoWei [mailto:wei@wbkit.com] Sent: Wednesday, March 27, 2013 1

Re: Please help me how and when to catch java.net.ConnectException in case hbase is not running as it goes on connecting with hbase

2013-03-28 Thread gaurhari dass
I am using HBase version 0.94.5; I haven't mentioned it in the class path. On Thu, Mar 28, 2013 at 5:01 AM, Ted Yu wrote: > bq. org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists( > RecoverableZooKeeper.java:172) > > The exception showed problem connecting to zookeeper. > Is hbase-site.xml on t

Re: Load balancing by table

2013-03-28 Thread Ted
This feature is absent in 0.92 Please upgrade to 0.94.6 Thanks On Mar 28, 2013, at 12:45 AM, Nurettin Şimşek wrote: > Hi, I want to load balance regions by table but in 0.92 version, balancing > doing by all table in one group. > > Can I use hbase.master.loadbalance.bytable property in 0.92?

Load balancing by table

2013-03-28 Thread Nurettin Şimşek
Hi, I want to load balance regions by table, but in version 0.92 balancing is done across all tables as one group. Can I use the hbase.master.loadbalance.bytable property in 0.92? If I can't, how can I load balance regions by table? Thanks