Re: 0.92 Branch in Maven

2012-02-22 Thread Ulrich Staudinger
Hey Stack, if I recall correctly, all that is needed to push to Apache's central Maven is a jar, a pom and an md5 fingerprint. I'll contact you offlist. Regards, ulrich On Thu, Feb 23, 2012 at 6:48 AM, Stack wrote: > On Wed, Feb 22, 2012 at 6:03 PM, Stephen Boesch wrote: > > hear hear.

number of region servers is wrong

2012-02-22 Thread Lu, Wei
Hi, I met with a weird problem when using HBase. There are 3 machines: 1 master and 2 region servers (wlu-rs1/10.27.17.251 and wlu-rs2/10.27.16.11). But when I use "status 'detailed'" to see the region servers' status, it shows there are three servers, and one server appears twice (exactly the same). 3 l

Re: Seeking HBase schema design examples

2012-02-22 Thread Igor Lautar
I've found OpenTSDB interesting: http://opentsdb.net/schema.html On Wed, Feb 22, 2012 at 10:40 PM, Ian Varley wrote: > All: > > I’m doing a study on HBase schema design, with a goal of contributing back > a presentation or summary about how data modeling is practically done in > HBase. I'd lik

Re: 0.92 Branch in Maven

2012-02-22 Thread Stack
On Wed, Feb 22, 2012 at 6:03 PM, Stephen Boesch wrote: > hear hear.. been using a maven local repo in the interim > Sorry. This has been going on too long. I'll buy beer for all the frustrated and those who have had to run w/ it manually installed into local repo. I'll get it eventually. (Any

Re: HBase0.92: In Filter, ReturnCode.NEXT_ROW may lead to next columnFamily but not next row?

2012-02-22 Thread NNever
Thanks Ted, I didn't know the mailing list strips attachments. Here is the attachment: TestFilter.java: http://pastebin.com/zC6EF8pX and the log: http://pastebin.com/RsKJSHcn 2012/2/23 Ted Yu > N: > Can you publish your code on pastebin or somewhere ? > Mailing list strips attachment. > > T

Re: HBase0.92: In Filter, ReturnCode.NEXT_ROW may lead to next columnFamily but not next row?

2012-02-22 Thread Ted Yu
N: Can you publish your code on pastebin or somewhere ? Mailing list strips attachment. Thanks On Tue, Feb 21, 2012 at 5:47 PM, NNever wrote: > Attach is my test customFilter code --- TestFilter. > It just simply extends FilterBase and do some system.out... > You can just try any Table has more

Re: HBase0.92: In Filter, ReturnCode.NEXT_ROW may lead to next columnFamily but not next row?

2012-02-22 Thread NNever
Anyone got time to make a test for this? It really confuses me. 2012/2/22 NNever > Attach is my test customFilter code --- TestFilter. > It just simply extends FilterBase and do some system.out... > You can just try any Table has more than one columnFamily like below: > > *Scan scan = new Scan

Re: 0.92 Branch in Maven

2012-02-22 Thread Stephen Boesch
hear hear.. been using a maven local repo in the interim 2012/2/22 Chris Carter > Hi guys, > > We're currently using 0.90 from the maven repo, and would > love to upgrade to 0.92, but can only find a recent > 0.92.1 SNAPSHOT release from a few days ago. Are we > missing any repos or places to l

0.92 Branch in Maven

2012-02-22 Thread Chris Carter
Hi guys, We're currently using 0.90 from the maven repo, and would love to upgrade to 0.92, but can only find a recent 0.92.1 SNAPSHOT release from a few days ago. Are we missing any repos or places to look for the latest release in Maven? (been looking in Maven Central and the Apache Repos

Re: .logs directory keeps growing

2012-02-22 Thread Jean-Daniel Cryans
Many things going on here. 1- The logs are receiving edits from all the regions, so a log can't be cleared until all the regions whose edits it contains are flushed. 2- When you close regions (dropping a table implies doing that), it doesn't force roll logs (it would be very bad on performance) so they won't ge

.logs directory keeps growing

2012-02-22 Thread Alok Singh
Environment Hbase: 0.92 Hadoop: hadoop-0.20.2-cdh3u3 I am testing hbase 0.92 for a new storage system we are building. In the tests, I insert around 2-3 billion rows and then run some scans/queries against it to test the performance. Once the tests are complete, I drop all of the tables and recrea

Seeking HBase schema design examples

2012-02-22 Thread Ian Varley
All: I’m doing a study on HBase schema design, with a goal of contributing back a presentation or summary about how data modeling is practically done in HBase. I'd like to base it as much as possible on real world examples (i.e. things that are running in production today). I’ve got several exa

Re: Using HBase as ACL store for Spring Security

2012-02-22 Thread Alan Chaney
On 2/22/2012 12:14 PM, Enis Söztutar wrote: Interesting use case. For your product, do you also need to secure hbase as well? What do you mean "secure hbase"? We use hbase to store information which has different ownerships and permissions, but the store is only accessible by our software, acti

Re: Using HBase as ACL store for Spring Security

2012-02-22 Thread Enis Söztutar
Interesting use case. For your product, do you also need to secure hbase as well? Enis On Wed, Feb 22, 2012 at 10:03 AM, Alan Chaney wrote: > Hi > > We are using Spring Security and HBase in our product. We are adding ACL > support through Spring and are looking at implementing the ACL store in

[ANN] Hadoop Summit Call-for-Papers Deadline Extended

2012-02-22 Thread Joe McGonnell
At the request of many, the call-for-papers deadline for Hadoop Summit has been extended by two weeks to March 7th. Also, please note that you only need to submit an abstract at this time. You will have much more time to create your actual presentations. Additional details are here: www.hadoops

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-22 Thread Ian Varley
One minor clarification: HBase is primarily built for retrieving a single row at a time based on a predetermined and known location (the key). Substitute that with: "HBase is primarily built for retrieving sets of contiguous sorted rows based on a predetermined and known location (the start key
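The point about contiguous sorted rows can be illustrated without any HBase dependency. The following is a minimal sketch, not HBase code: it models a table as a `java.util.TreeMap` keyed by row key, and the class name `SortedRangeSketch` and method `scan` are hypothetical names chosen for illustration. It assumes an inclusive start key and an exclusive stop key.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class SortedRangeSketch {

    // Returns the contiguous run of rows in [startKey, stopKey).
    // A sorted map makes this a cheap range lookup rather than a full pass.
    static SortedMap<String, String> scan(TreeMap<String, String> rows,
                                          String startKey, String stopKey) {
        return rows.subMap(startKey, true, stopKey, false);
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("com.example/a", "page a");
        rows.put("com.example/b", "page b");
        rows.put("org.apache/x", "page x");

        // '0' sorts immediately after '/', so this range covers every
        // key that starts with "com.example/".
        SortedMap<String, String> hits = scan(rows, "com.example/", "com.example0");
        for (String key : hits.keySet()) {
            System.out.println(key);
        }
    }
}
```

Because rows are kept in key order, everything sharing a key prefix sits next to each other, which is why row key design matters so much for this access pattern.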

Re: Corresponding table in Hbase

2012-02-22 Thread Jacques
I'll be... didn't even know that existed in the HBase code base :) On Wed, Feb 22, 2012 at 9:25 AM, Stack wrote: > On Wed, Feb 22, 2012 at 8:40 AM, Jacques wrote: > > We have a crawl table and here are a couple quick thoughts: > > > > - I'd suggest that you use reverse url as your primary key.

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-22 Thread Jacques
>> Solr does not provide a complex enough support to rank. I believe Solr has a bunch of plug-ability to write your own custom ranking approach. If you think you can't do your desired ranking with Solr, you're probably wrong and need to ask for help from the Solr community. >> retrieving data by

Using HBase as ACL store for Spring Security

2012-02-22 Thread Alan Chaney
Hi We are using Spring Security and HBase in our product. We are adding ACL support through Spring and are looking at implementing the ACL store in HBase. I just wondered if anyone else has done this, and if so, maybe they could share code/experiences? Regards Alan Chaney

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-22 Thread Bing Li
Mr Gupta, Thanks so much for your reply! In my use cases, retrieving data by keyword is one of them. I think Solr is a proper choice. However, Solr does not provide a complex enough support to rank. And, frequent updating is also not suitable in Solr. So it is difficult to retrieve data randomly

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-22 Thread T Vinod Gupta
Bing, It's a classic battle on whether to use Solr or HBase or a combination of both. Both systems are very different but there is some overlap in the utility. They also differ vastly when it comes to computation power, storage needs, etc. So in the end, it all boils down to your use case. you ne

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-22 Thread Ted Yu
There is no secondary index support in HBase at the moment. It's on our road map. FYI On Wed, Feb 22, 2012 at 9:28 AM, Bing Li wrote: > Jacques, > > Yes. But I still have questions about that. > > In my system, when users search with a keyword arbitrarily, the query is > forwarded to Solr. No

Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-22 Thread Bing Li
Jacques, Yes. But I still have questions about that. In my system, when users search with an arbitrary keyword, the query is forwarded to Solr. No updating operations other than appending new indexes occur on the Solr-managed data. When I need to retrieve data based on ranking values, HBase is used. A

Re: Corresponding table in Hbase

2012-02-22 Thread Stack
On Wed, Feb 22, 2012 at 8:40 AM, Jacques wrote: > We have a crawl table and here are a couple quick thoughts: > > - I'd suggest that you use reverse url as your primary key.  Specifically, > reversed host name but normal path and query string. Maybe this utility in hbase helps do what Jacque sugg

Re: How is Data Indexed in HBase?

2012-02-22 Thread Brock Noland
Agreed, you could use Solr on top of HBase though. https://github.com/Photobucket/Solbase Brock On Wed, Feb 22, 2012 at 11:17 AM, Jacques wrote: > It is highly unlikely that you could replace Solr with HBase.  They're > really apples and oranges. > > > On Wed, Feb 22, 2012 at 1:09 AM, Bing Li

Re: How is Data Indexed in HBase?

2012-02-22 Thread Jacques
It is highly unlikely that you could replace Solr with HBase. They're really apples and oranges. On Wed, Feb 22, 2012 at 1:09 AM, Bing Li wrote: > Dear all, > > I wonder how data in HBase is indexed? Now Solr is used in my system > because data is managed in inverted index. Such an index is su

Re: Corresponding table in Hbase

2012-02-22 Thread Jacques
We have a crawl table and here are a couple quick thoughts: - I'd suggest that you use reverse url as your primary key. Specifically, reversed host name but normal path and query string. - Rather than maintaining separate rows for the same url using timestamp or similar, I'd recommend that you us
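One way the reversed-host row key suggested above might be built is sketched here. This is only an illustration under stated assumptions: the class `ReverseUrlKey` and method `toRowKey` are hypothetical names, not part of HBase, and the sketch drops the scheme and port, which the original message does not address.

```java
import java.net.URI;

public class ReverseUrlKey {

    // Builds a key from a URL: host labels reversed, path and query unchanged.
    // Assumption: scheme and port are dropped from the key.
    static String toRowKey(String url) {
        URI uri = URI.create(url);
        String host = uri.getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }

        // Reverse the dot-separated host labels: www.example.com -> com.example.www
        String[] labels = host.split("\\.");
        StringBuilder key = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            key.append(labels[i]);
            if (i > 0) {
                key.append('.');
            }
        }

        // Keep the path and query string in their normal order.
        String path = uri.getRawPath();
        if (path != null) {
            key.append(path);
        }
        String query = uri.getRawQuery();
        if (query != null) {
            key.append('?').append(query);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(toRowKey("http://www.example.com/a/b?x=1"));
    }
}
```

With keys of this shape, pages from the same domain and its subdomains sort next to each other, so they can be read back as one contiguous range.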

Re: Flushing to HDFS sooner

2012-02-22 Thread Manuel de Ferran
I also tried with hbase-0.92 and hadoop-1.0.0 (same configuration as before) and it works fine (meaning no data loss). With hbase-0.90.3/hadoop-0.20-append, I checked my append configuration, and ran the unit tests successfully. Maybe the master starts hlog processing then blocks on something (o

Re: Corresponding table in Hbase

2012-02-22 Thread Ian Varley
Adarsh, HBase doesn't have the concept of a globally unique auto-incrementing "ID" column; that would require that all PUTs to any region of a table first go through some central ID authority to get a unique ID, and that sort of goes against the general HBase approach (in which operations on re

Re: How is Data Indexed in HBase?

2012-02-22 Thread Doug Meil
You probably want to start with reading about the StoreFiles and how Hbase stores data internally. http://hbase.apache.org/book.html#regions.arch On 2/22/12 4:09 AM, "Bing Li" wrote: >Dear all, > >I wonder how data in HBase is indexed? Now Solr is used in my system >because data is managed

RE: hbase delete operation is very slow

2012-02-22 Thread Haijia Zhou
Thanks for the suggestion. I did use a List of size 1000, but the performance was not that different from deleting one row at a time. I investigated the HRegion.delete() method; my understanding is that when you call delete() to delete a row, it's actually going to delete all the column families

How is Data Indexed in HBase?

2012-02-22 Thread Bing Li
Dear all, I wonder how data in HBase is indexed? Solr is currently used in my system because data is managed in an inverted index. Such an index is suitable for retrieving huge amounts of unstructured data. How does HBase deal with the issue? Can I replace Solr with HBase? Thanks so much! Best regards,