hbase 0.90.1 upgrade issue - mapreduce job

2011-03-15 Thread Venkatesh
Hi, when I upgraded to 0.90.1, my MapReduce job started failing with an exception: system/job_201103151601_0121/libjars/hbase-0.90.1.jar does not exist. I have the jar file in the classpath (hadoop-env.sh). Any ideas? Thanks

Re: habse schema design and retrieving values through REST interface

2011-03-15 Thread tsuna
On Tue, Mar 15, 2011 at 10:19 AM, sreejith P. K. wrote: > I need to maintain a huge table for a 'web crawler' project in HBASE. > Basically it contains thousands of keywords and for each keyword i need to > maintain a list of urls (it again will count in thousands). Corresponding to > each url, i

Re: Coprocessor Endpoints

2011-03-15 Thread Jason Rutherglen
Stack, Solr currently skips obtaining the global term frequency (see http://wiki.apache.org/solr/DistributedSearch under 'No distributed idf'). This means there'll be one network trip per region. What is nice about coupling realtime search to HBase is the search results will always be consistent

Re: Coprocessor Endpoints

2011-03-15 Thread Stack
Jason: What about the double trip done in solr, es, nutch, etc. where first query is about term frequency in each index (region) and then second query is the actual search w/ the term distribution factored in? Will you need to do something equivalent? St.Ack On Tue, Mar 15, 2011 at 2:53 PM, Ja

Re: Coprocessor postWALRestore deletes

2011-03-15 Thread Ted Yu
I think there may be more than one delete per edit. On Tue, Mar 15, 2011 at 2:45 PM, Jason Rutherglen < jason.rutherg...@gmail.com> wrote: > Ted, thanks for the info. > > > Please note that WALEdit parameter for postWALRestore() contains the > > collection of edits (KeyValue objects) > > Is the

Re: Coprocessor Endpoints

2011-03-15 Thread Jason Rutherglen
Gary, Thanks for your descriptive definition of how the HBase RPC works. > They will issue RPC calls in parallel to > all of the regions in the range starting with the region containing "startRow" and > ending with the region containing "endRow" (again using the row keys for the > region lookups). I thi

Re: Coprocessor postWALRestore deletes

2011-03-15 Thread Jason Rutherglen
Ted, thanks for the info. > Please note that WALEdit parameter for postWALRestore() contains the > collection of edits (KeyValue objects) Is the collection of edits limited to a single row? Can there be multiple deletes per edit? On Tue, Mar 15, 2011 at 1:38 PM, Ted Yu wrote: > Jason: > There'r

Re: Long client pauses with compression

2011-03-15 Thread Andrew Purtell
Created https://issues.apache.org/jira/browse/HBASE-3649 --- On Tue, 3/15/11, Stack wrote: > From: Stack > Subject: Re: Long client pauses with compression > To: user@hbase.apache.org, apurt...@apache.org > Date: Tuesday, March 15, 2011, 9:06 AM > Sounds like a nice feature to have > and to sh

Re: Coprocessor postWALRestore deletes

2011-03-15 Thread Ted Yu
Jason: There are 3 types of deletes: Delete((byte)8), DeleteColumn((byte)12), and DeleteFamily((byte)14). You can choose the corresponding KeyValue method. Please note that the WALEdit parameter for postWALRestore() contains the collection of edits (KeyValue objects). On Tue, Mar 15, 2011 at 12:
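To make the three delete type codes above concrete, here is a minimal standalone sketch in plain Java. It does not use the HBase API: the class DeleteTypeSketch and its methods are hypothetical names invented for illustration, and the byte 4 in the sample data is only an assumed stand-in for a non-delete entry. It classifies a list of type codes the way one might walk the entries of a single edit; in a real coprocessor you would presumably call the KeyValue.isDelete* methods mentioned in this thread instead.

```java
public class DeleteTypeSketch {
    // Type codes exactly as quoted in the message above.
    public static final byte DELETE = 8;
    public static final byte DELETE_COLUMN = 12;
    public static final byte DELETE_FAMILY = 14;

    // Map a type code to a readable name (hypothetical helper, not an HBase method).
    public static String describe(byte typeCode) {
        switch (typeCode) {
            case DELETE:
                return "Delete";
            case DELETE_COLUMN:
                return "DeleteColumn";
            case DELETE_FAMILY:
                return "DeleteFamily";
            default:
                return "NotADelete";
        }
    }

    // True if the code is any of the three delete types.
    public static boolean isAnyDelete(byte typeCode) {
        return typeCode == DELETE || typeCode == DELETE_COLUMN || typeCode == DELETE_FAMILY;
    }

    // Count delete entries among the type codes of one edit; an edit may hold several.
    public static int countDeletes(byte[] typeCodes) {
        int count = 0;
        for (byte code : typeCodes) {
            if (isAnyDelete(code)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumption: 4 stands in for a non-delete entry in this sample edit.
        byte[] sampleEdit = {4, 8, 12, 14, 4};
        for (byte code : sampleEdit) {
            System.out.println(code + " -> " + describe(code));
        }
        System.out.println("deletes in edit: " + countDeletes(sampleEdit));
    }
}
```

The sketch counts per entry rather than per edit because, as discussed elsewhere in this thread, a single edit may carry more than one delete.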

Coprocessor postWALRestore deletes

2011-03-15 Thread Jason Rutherglen
How does one know if a postWALRestore is a delete? There's a set of methods KeyValue.isDelete*. Should one use these? For a single postWALRestore method call, is there only one delete (rather than a batch delete, add, etc.)?

Re: habse schema design and retrieving values through REST interface

2011-03-15 Thread Jean-Daniel Cryans
Can you tell why it's not able to get the bigger rows? Why would you try another schema if you don't even know what's going on right now? If you have the same issue with the new schema, you're back to square one right? Looking at the logs should give you some hints. J-D On Tue, Mar 15, 2011 at 1

Re: Retrieve values mechanism from an HBASE table+PHP+REST

2011-03-15 Thread Jean-Daniel Cryans
It's basically just a get right? So using REST isn't special to HBase, any tutorial will do, then it's just about using HBase with http://wiki.apache.org/hadoop/Hbase/Stargate#A3 J-D On Mon, Mar 14, 2011 at 10:52 PM, sreejith P. K. wrote: > Hello Experts, > I have a doubt regarding the value ret

Re: One of the regionserver aborted, then the master shut down itself

2011-03-15 Thread Jean-Daniel Cryans
Inline. J-D On Tue, Mar 15, 2011 at 8:32 AM, 茅旭峰 wrote: > Thanks J-D for your reply. > > It looks like HBASE-3617 will be included in 0.92, then when will 0.92 be > released? It should be included in the bug fix release 0.90.2, which isn't scheduled at the moment. Historically, HBase never had

habse schema design and retrieving values through REST interface

2011-03-15 Thread sreejith P. K.
Hello experts, I have the following scenario: I need to maintain a huge table for a 'web crawler' project in HBase. Basically it contains thousands of keywords, and for each keyword I need to maintain a list of URLs (which will again number in the thousands). For each URL, I need to store a num

Re: Coprocessor Endpoints

2011-03-15 Thread Gary Helmling
Hi Jason, That's basically correct. To export your own RPC methods from a coprocessor, you: 1) Define an interface containing the RPC methods. This interface must extend CoprocessorProtocol (which only requires you to implement getProtocolVersion()) 2) Implement the defined RPC interface in yo

Re: CopyTable MR job hangs

2011-03-15 Thread Jean-Daniel Cryans
Strangely enough, I did answer that question the day you sent it, but it doesn't show up on the mailing list aggregators even though Gmail marks it as sent. Anyway, here's what I said: It won't work because those versions aren't wire-compatible. What you can do instead is do an Export, distcp the

Re: Long client pauses with compression

2011-03-15 Thread Matt Corgan
I've run into this problem and mitigated it by setting the memstore flush size to 256 MB, but I'm curious why flushing an uncompressed file would help the situation. Also, would there be a downside to setting the default higher than 64 MB in general, especially since most people use compression? Is

Coprocessor Endpoints

2011-03-15 Thread Jason Rutherglen
I'm taking a look at TestCoprocessorEndpoint, for example, to figure out how the Coprocessor RPC works. I think HTable.coprocessorProxy should be used, which will return an interface that, when called, performs the network marshaling etc. The purpose of the row byte[] in coprocessorProxy

Re: Long client pauses with compression

2011-03-15 Thread Stack
Sounds like a nice feature to have and to ship as the default. St.Ack On Tue, Mar 15, 2011 at 1:53 AM, Andrew Purtell wrote: > We have a separate compression setting for major compaction vs store files > written during minor compaction (for background/archival apps). > > Why not a separate compr

Re: One of the regionserver aborted, then the master shut down itself

2011-03-15 Thread 茅旭峰
Thanks J-D for your reply. It looks like HBASE-3617 will be included in 0.92, then when will 0.92 be released? Yes, you're right, we launched tens of threads, endlessly putting values of 4MB on average. Is the region server meant to die because of OOM? I thought it's region servers' responsibilt

Re: CopyTable MR job hangs

2011-03-15 Thread Lars George
Hi Eran, We need more details. It sounds like an issue with the ZooKeeper quorum; in other words, it cannot connect to the ZK servers. Often this shows up in the task failure logs as attempts to connect to localhost. Could you grab more logs and upload them to pastebin or some such? Lars

Re: Enable Bloomfilter on HFile

2011-03-15 Thread Lars George
Hi Alex, This was added in https://issues.apache.org/jira/browse/HBASE-1200 affecting 0.90 and later only. But this is an interesting question: the bulk import using HFOF can handle compression, but not bloom filters. If you have bloom filters enabled then compactions will add them, but it may ma

Re: Long client pauses with compression

2011-03-15 Thread Andrew Purtell
We have a separate compression setting for major compaction vs store files written during minor compaction (for background/archival apps). Why not a separate compression setting for flushing? I.e. none? --- On Mon, 3/14/11, Jean-Daniel Cryans wrote: > From: Jean-Daniel Cryans > Subject: Re:

Re: Long client pauses with compression

2011-03-15 Thread Lars George
Hi, Whenever I am with clients and we design for HBase, the first thing I do is spend a few hours explaining exactly that scenario and the architecture behind it. As for the importing and HBase simply lacking a graceful degradation that works in all cases, I nowadays quickly point to the bulk import

Re: CopyTable MR job hangs

2011-03-15 Thread Eran Kutner
No idea anyone? -eran On Wed, Mar 2, 2011 at 16:40, Eran Kutner wrote: > Hi, > I'm trying to copy data from an older cluster using 0.89 (CDH3b3) to a new > one using 0.91 (CDH3b4) using the CopyTable MR job but it always hangs on > "map 0% reduce 0%" until eventually the job is killed by Hado

Enable Bloomfilter on HFile

2011-03-15 Thread Nanheng Wu
Hi, I am bulk loading data into HBase using an MR job with HFileOutputFormat; the data is read-only once it's loaded. Is it possible to still enable Bloom filters? I am guessing no, since they need to be written as part of the HFile, and at least for HBase 0.20.6 I don't see such an option. Is my assumpt