Re: Row Keys

2011-01-28 Thread Dani Rayan
In HBase the concept of "column qualifiers" is interesting: a qualifier can be created on the fly within a "column-family", so it is as good as tagging the data. Hence, you can get all rows belonging to a particular tag/qualifier using a row scan. I'm not sure if this answers your query. I know they are always sorted b

Re: Row Keys

2011-01-28 Thread Dani Rayan
Hey, can you explain your query with an example? I know they are always sorted but if they are how do you know which row key > belong to which data? Currently I use a row key of ID|Date > > I don't clearly understand "which data"; there are a few things like getFamilyMap etc. which allow you to get more

Re: Tables & rows disappear

2011-01-28 Thread Dani Rayan
Hey, This can happen in a couple of scenarios: 1. If the "writeBuffer" value is quite large and the writes are too small for "autoflush" to be called [default is 2 MB for writeBuffer] 2. You have set "autoFlush" to false and never call flushCommits If you haven't configured these properties i
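The two scenarios above can be illustrated with a toy model of a client-side write buffer. This is only a sketch in plain Java, not the HBase client: the class `BufferedWriterSketch` and its methods are hypothetical names, and the flush rule (flush on every put when auto-flush is on, otherwise only once the buffered bytes exceed the limit or on an explicit flush) is an assumption modeled on the description in the message.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a client-side write buffer (hypothetical, not the HBase client).
class BufferedWriterSketch {
    private final long bufferLimitBytes;
    private final boolean autoFlush;
    private final List<byte[]> pending = new ArrayList<>(); // held on the client
    private final List<byte[]> server = new ArrayList<>();  // what the "server" has seen
    private long pendingBytes = 0;

    BufferedWriterSketch(long bufferLimitBytes, boolean autoFlush) {
        this.bufferLimitBytes = bufferLimitBytes;
        this.autoFlush = autoFlush;
    }

    // Buffer the write; ship it only if auto-flush is on or the buffer is over its limit.
    void put(byte[] value) {
        pending.add(value);
        pendingBytes += value.length;
        if (autoFlush || pendingBytes > bufferLimitBytes) {
            flushCommits();
        }
    }

    // Explicit flush: everything buffered so far becomes visible on the server.
    void flushCommits() {
        server.addAll(pending);
        pending.clear();
        pendingBytes = 0;
    }

    int visibleOnServer() { return server.size(); }

    int stillBuffered() { return pending.size(); }

    public static void main(String[] args) {
        // 2 MB buffer, auto-flush off: two small writes never leave the client.
        BufferedWriterSketch writer = new BufferedWriterSketch(2L * 1024 * 1024, false);
        writer.put(new byte[100]);
        writer.put(new byte[100]);
        System.out.println("before flush: " + writer.visibleOnServer());
        writer.flushCommits();
        System.out.println("after flush: " + writer.visibleOnServer());
    }
}
```

In this model, a process that exits before `flushCommits()` runs loses whatever is still in `pending`, which is one way rows can appear to vanish.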

Re: How to improve the speed of HTable scan

2011-01-28 Thread 陈加俊
final Pair<byte[][], byte[][]> ranges = table.getStartEndKeys(); final byte[][] startKeys = ranges.getFirst(); final byte[][] endKeys = ranges.getSecond(); I scan the first range and delete the rows, and get the second range and delete the rows. But I find the ranges have not changed after about 5 minutes. 2011/1/26

Re: Delete reveals older version of a column even when VERSIONS=1

2011-01-28 Thread Mike Percy
Hmm... how does this relate to setting VERSIONS => '1'? By setting # of versions to 1 are we getting some space benefit over say VERSIONS => '10'? Thanks, Mike On Jan 28, 2011, at 5:47 PM, Ryan Rawson wrote: > I would call it 'a surprising, perhaps unexpected consequence of our > storage model'

Re: Delete reveals older version of a column even when VERSIONS=1

2011-01-28 Thread Ryan Rawson
I would call it 'a surprising, perhaps unexpected consequence of our storage model'. There are 2 types of deletes in hbase, you are doing type (a) "delete a single version", but you probably want type (b) "delete all versions in this column" On Fri, Jan 28, 2011 at 5:43 PM, Mike Percy wrote: >
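The difference between the two delete types can be sketched with a toy model of one column's stored versions. This is plain Java, not HBase code: `DeleteSketch` and its methods are hypothetical, real HBase uses delete markers rather than removing cells directly, and the model assumes an older cell is still physically stored (for example because no compaction has removed it yet). In the Java client of that era the two types appear to correspond to `Delete.deleteColumn` and `Delete.deleteColumns`; treat that mapping as an assumption and check the Javadoc for your version.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of the versions stored for a single column (hypothetical, not HBase).
class DeleteSketch {
    // timestamp -> value; an older cell may still be stored even when only
    // one version is supposed to be visible.
    private final TreeMap<Long, String> cells = new TreeMap<>();

    void put(long timestamp, String value) {
        cells.put(timestamp, value);
    }

    // Type (a): delete a single version (here, the newest one).
    void deleteLatestVersion() {
        if (!cells.isEmpty()) {
            cells.remove(cells.lastKey());
        }
    }

    // Type (b): delete all versions in this column.
    void deleteAllVersions() {
        cells.clear();
    }

    // A read returns the newest surviving version, or null if none is left.
    String get() {
        Map.Entry<Long, String> newest = cells.lastEntry();
        return newest == null ? null : newest.getValue();
    }

    public static void main(String[] args) {
        DeleteSketch column = new DeleteSketch();
        column.put(1L, "old");
        column.put(2L, "new");
        column.deleteLatestVersion();
        System.out.println(column.get()); // the older value is revealed
        column.deleteAllVersions();
        System.out.println(column.get()); // nothing left to read
    }
}
```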

Delete reveals older version of a column even when VERSIONS=1

2011-01-28 Thread Mike Percy
Hi folks, I am seeing some unexpected behavior with HBase 0.20.6 when deleting columns. Our cluster has been running for some time however we recently upgraded from Hbase 0.20.3. The family I am writing to is specified as VERSIONS => '1' when doing a describe, yet HBase appears to be maintaining

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Vidhyashankar Venkataraman
64 bit Java 1.6. Why is the master even trying to issue a split with an empty log/region in hand? ( private List splitLog(final FileStatus[] logfiles) ) V On 1/28/11 3:06 PM, "Todd Lipcon" wrote: The 16000 second sleep is really strange... never seen anything like it. What JVM are you runni

Re: multiple masters

2011-01-28 Thread Bill Graham
Thanks Stack, this is really helpful. On Fri, Jan 28, 2011 at 2:06 PM, Stack wrote: > On Fri, Jan 28, 2011 at 1:15 PM, Bill Graham wrote: >> I also don't have a solid understanding of the responsibilities of >> master, but it seems like its job is really about managing regions >> (i.e., coordin

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Todd Lipcon
The 16000 second sleep is really strange... never seen anything like it. What JVM are you running? -Todd On Fri, Jan 28, 2011 at 11:29 AM, Stack wrote: > On Fri, Jan 28, 2011 at 11:23 AM, Vidhyashankar Venkataraman > wrote: > > We are working on trying to fix this (cc'ed Adam as well). > > >

Re: multiple masters

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 1:15 PM, Bill Graham wrote: > I also don't have a solid understanding of the responsibilities of > master, but it seems like its job is really about managing regions > (i.e., coordinating splits and compactions, etc.) and updating ROOT > and META. Is that correct? > > Yes

Re: multiple masters

2011-01-28 Thread Bill Graham
I also don't have a solid understanding of the responsibilities of master, but it seems like its job is really about managing regions (i.e., coordinating splits and compactions, etc.) and updating ROOT and META. Is that correct? On Fri, Jan 28, 2011 at 9:31 AM, Weishung Chung wrote: > Great, th

Re: script to delete regions with no rows

2011-01-28 Thread Venkatesh
Thank you. -Original Message- From: Stack To: user@hbase.apache.org Sent: Fri, Jan 28, 2011 3:43 pm Subject: Re: script to delete regions with no rows The end key of one region must match the start key of the next so you can't just remove the region from .META. and its direct

RE: Row Keys

2011-01-28 Thread Peter Haidinyak
I know they are always sorted but if they are how do you know which row key belongs to which data? Currently I use a row key of ID|Date so I always know what the startrow and endrow should be. I know I'm missing something really fundamental here. :-( Thanks -Pete -Original Message- Fro
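The ID|Date scheme mentioned above can be sketched with a sorted map standing in for the table. This is a minimal illustration in plain Java, not HBase code: `CompositeKeyScanSketch`, the sample IDs, and the `yyyyMMdd` date format are all assumptions. The date format matters because the keys sort lexicographically, so only a fixed-width, most-significant-first date sorts in time order.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// A TreeMap keeps its keys sorted, much like row keys in a table (hypothetical model).
class CompositeKeyScanSketch {
    private final TreeMap<String, String> table = new TreeMap<>();

    // Row key of the form ID|Date, with the date written as yyyyMMdd.
    static String rowKey(String id, String date) {
        return id + "|" + date;
    }

    void put(String id, String date, String value) {
        table.put(rowKey(id, date), value);
    }

    // Start row inclusive, stop row exclusive: all rows for one ID in a date range.
    SortedMap<String, String> scan(String id, String fromDate, String toDate) {
        return table.subMap(rowKey(id, fromDate), rowKey(id, toDate));
    }

    public static void main(String[] args) {
        CompositeKeyScanSketch t = new CompositeKeyScanSketch();
        t.put("a17", "20110101", "x");
        t.put("a17", "20110115", "y");
        t.put("a17", "20110201", "z");
        t.put("a18", "20110110", "other id");
        // Rows for a17 during January 2011 are contiguous in the sorted key space.
        System.out.println(t.scan("a17", "20110101", "20110201").keySet());
    }
}
```

Because every row for one ID shares the same prefix, the start and stop rows can be computed from the ID and the date bounds alone, with no need to know which keys actually exist.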

Re: script to delete regions with no rows

2011-01-28 Thread Stack
The end key of one region must match the start key of the next so you can't just remove the region from .META. and its directory -- if one -- in HDFS. You'd need to adjust the start or end key on the region previous or after to include the scope of the just removed region. There is no script to do
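The constraint described above (each region's end key must equal the next region's start key) can be sketched as a small consistency check. This is plain Java for illustration only: `RegionChainSketch`, `Region`, `firstHole`, and `removeRegion` are hypothetical names, and the sketch says nothing about how .META. or the HDFS directories would actually be edited.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a table's regions as an ordered chain of [startKey, endKey) ranges.
class RegionChainSketch {
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Index of the first region whose end key differs from the next start key, or -1.
    static int firstHole(List<Region> regions) {
        for (int i = 0; i + 1 < regions.size(); i++) {
            if (!regions.get(i).endKey.equals(regions.get(i + 1).startKey)) {
                return i;
            }
        }
        return -1;
    }

    // Remove region i and stretch a neighbour so the key space stays contiguous.
    static List<Region> removeRegion(List<Region> regions, int i) {
        List<Region> out = new ArrayList<>(regions);
        Region removed = out.remove(i);
        if (i > 0) {
            Region previous = out.get(i - 1);
            out.set(i - 1, new Region(previous.startKey, removed.endKey));
        } else if (!out.isEmpty()) {
            Region next = out.get(0);
            out.set(0, new Region(removed.startKey, next.endKey));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("", "g"));
        regions.add(new Region("g", "p"));
        regions.add(new Region("p", ""));

        List<Region> naive = new ArrayList<>(regions);
        naive.remove(1); // just dropping the middle region leaves a hole
        System.out.println("naive removal, hole at: " + firstHole(naive));
        System.out.println("adjusted removal, hole at: " + firstHole(removeRegion(regions, 1)));
    }
}
```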

script to delete regions with no rows

2011-01-28 Thread Venkatesh
Is there a script? thanks

Re: Row Keys

2011-01-28 Thread tsuna
On Fri, Jan 28, 2011 at 12:09 PM, Peter Haidinyak wrote: > This is going to seem like a dumb question but it is recommended that > you use a random key to spread the insert/read load among your region > servers. My question is if I am using a scan with startrow and endrow how > does tha

Row Keys

2011-01-28 Thread Peter Haidinyak
Hi, This is going to seem like a dumb question but it is recommended that you use a random key to spread the insert/read load among your region servers. My question is if I am using a scan with startrow and endrow how does that work with random row keys? Thanks -Pete
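One common way to reconcile spread-out keys with range scans is to prefix each key with a small, deterministic bucket number ("salting") and to run one sub-scan per bucket. The sketch below is not taken from the replies in this thread; it is a plain-Java illustration in which `SaltedKeySketch`, the bucket count of 4, and the two-digit prefix format are all assumptions, and a sorted map stands in for the table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of salted row keys over a sorted key space (hypothetical, not HBase).
class SaltedKeySketch {
    static final int BUCKETS = 4; // assumed bucket count, chosen for illustration

    private final TreeMap<String, String> table = new TreeMap<>();

    // Deterministic prefix derived from the key, so reads can recompute it.
    static String salt(String key) {
        int bucket = Math.floorMod(key.hashCode(), BUCKETS);
        return String.format("%02d|%s", bucket, key);
    }

    void put(String key, String value) {
        table.put(salt(key), value);
    }

    // A logical range scan becomes one sub-scan per bucket, each with the
    // bucket prefix applied to both the start and the stop key.
    List<String> scan(String startKey, String stopKey) {
        List<String> results = new ArrayList<>();
        for (int bucket = 0; bucket < BUCKETS; bucket++) {
            String prefix = String.format("%02d|", bucket);
            results.addAll(table.subMap(prefix + startKey, prefix + stopKey).values());
        }
        return results;
    }

    public static void main(String[] args) {
        SaltedKeySketch t = new SaltedKeySketch();
        for (String date : new String[] {"20110101", "20110115", "20110120", "20110201"}) {
            t.put(date, date);
        }
        // Rows come back grouped by bucket, not in global key order.
        System.out.println(t.scan("20110101", "20110201"));
    }
}
```

The trade-off in this model is that writes spread across buckets while every range read costs one sub-scan per bucket, and the caller has to merge or re-sort the results if global order matters.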

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Stack
So, seems like in 0.20.6, we're not doing compression right. St.Ack On Fri, Jan 28, 2011 at 11:23 AM, Nanheng Wu wrote: > Ah, sorry I should've read the usage. I ran it just now and the meta > data dump threw the same error "Not in GZIP format" > > On Fri, Jan 28, 2011 at 10:51 AM, Stack wrote:

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 11:23 AM, Vidhyashankar Venkataraman wrote: > We are working on trying to fix this (cc'ed Adam as well). > >>> Hmm.. maybe before you restart remove the directory >>> hdfs://b3110120.yst.yahoo.net:4600/hbase/.logs/ completely so no files >>> to be processed on restart. > >

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Vidhyashankar Venkataraman
We are working on trying to fix this (cc'ed Adam as well). >> Hmm.. maybe before you restart remove the directory >> hdfs://b3110120.yst.yahoo.net:4600/hbase/.logs/ completely so no files >> to be processed on restart. This one, I had tried during one of the attempts: and it created new logs dir

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Nanheng Wu
Ah, sorry I should've read the usage. I ran it just now and the meta data dump threw the same error "Not in GZIP format" On Fri, Jan 28, 2011 at 10:51 AM, Stack wrote: > hfile metadata, the -m option? > St.Ack > > On Fri, Jan 28, 2011 at 10:41 AM, Nanheng Wu wrote: >> Sorry, by dumping the metad

Re: Inconsistent META data for a region.

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 10:56 AM, Chris Howe wrote: > The region had been deployed, but I dropped the table before I tried to re-add > it. > OK. This could have been the cause. Our disable/drop was flakey pre-0.90. Maybe it failed to close out all regions. > When I would stop a single regionse

Re: Inconsistent META data for a region.

2011-01-28 Thread Chris Howe
Stack writes: > > ERROR: Region test,,1296067171940.0200bfe58a9e9fadf8ebfa523c47332f. found on > > server 10.101.45.82:60020 but is listed in META to be on server ip-10-117- 86- > > 81.ec2.internal:60020. > > Could this region have been deployed on this server before you ran > add_table? Or 'te

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Stack
hfile metadata, the -m option? St.Ack On Fri, Jan 28, 2011 at 10:41 AM, Nanheng Wu wrote: > Sorry, by dumping the metadata did you mean running the same HFile > tool on ".region" file in each region? > > On Fri, Jan 28, 2011 at 10:25 AM, Stack wrote: >> If you dump the metadata, does it claim GZ

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 10:40 AM, Vidhyashankar Venkataraman wrote: >>> Is this new cluster start or master joining an already running cluster >>> (looks >>> like former). > > Either way, I get this problem. In particular, these logs were pulled out > after I had done a createTable with boundari

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Nanheng Wu
Sorry, by dumping the metadata did you mean running the same HFile tool on ".region" file in each region? On Fri, Jan 28, 2011 at 10:25 AM, Stack wrote: > If you dump the metadata, does it claim GZIP compressor?  If so, yeah, > seems to be mismatch between what data is and what metadata is. > St.

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Vidhyashankar Venkataraman
>> Is this new cluster start or master joining an already running cluster (looks >> like former). Either way, I get this problem. In particular, these logs were pulled out after I had done a createTable with boundaries (around 100 empty regions per node) and shut it down and then restarted. A si

Re: is there a pluggable conflict resolver in hbase

2011-01-28 Thread Jean-Daniel Cryans
> Cool, so the coprocessor will feed the value in the database to me and the > value that is coming in just before it is written? > > With bytes, I am using serialized json so the example still applies perfectly > where I could merge the results in the coprocessor and the coprocessor writes > th

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Stack
If you dump the metadata, does it claim GZIP compressor? If so, yeah, seems to be mismatch between what data is and what metadata is. St.Ack On Fri, Jan 28, 2011 at 9:58 AM, Nanheng Wu wrote: > Awesome. I ran it on one of the hfiles and got this: > 11/01/28 09:57:15 INFO compress.CodecPool: Got

RE: is there a pluggable conflict resolver in hbase

2011-01-28 Thread Hiller, Dean (Contractor)
Cool, so the coprocessor will feed the value in the database to me and the value that is coming in just before it is written? With bytes, I am using serialized json so the example still applies perfectly where I could merge the results in the coprocessor and the coprocessor writes the final res

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Nanheng Wu
Awesome. I ran it on one of the hfiles and got this: 11/01/28 09:57:15 INFO compress.CodecPool: Got brand-new decompressor java.io.IOException: Not in GZIP format at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:137) at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.

Re: .oldlogs Cleanup

2011-01-28 Thread Stack
There is http://hbase.apache.org/configuration.html#hbase.master.logcleaner.ttl and 'this.maxLogs = conf.getInt("hbase.regionserver.maxlogs", 32);' (The latter does not seem to be in hbase-default.xml which looks like a bit of an oversight. Check a few of the older ones. See if you can figure if the

Re: TableMapReduceUtil.initTableMapperJob takes only 1 scan object

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 9:36 AM, manobal wrote: > > 1 table will contain all the event types.. key is eventId (type of event) + > timestamp and value would be visitorId.. we want to find out all the > visitorId that has seen 4-5 specific event types in sequence.. > If your row key was timestamp an

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Stack
The section in 0.90 book on hfile tool should apply to 0.20.6: http://hbase.apache.org/ch08s02.html#hfile_tool It might help you w/ your explorations. St.Ack On Fri, Jan 28, 2011 at 9:38 AM, Nanheng Wu wrote: > Hi Stack, > >  Get doesn't work either. It was a fresh table created by > loadtable.

.oldlogs Cleanup

2011-01-28 Thread Wayne
How is the .oldlogs folder cleaned up? My cluster size kept going up and I looked closely and realized that 91% of the space was going to .oldlogs that do not appear to be archived. This adds up to 12.5TB with rf=3 in the 4 days we have been up with .90. How can this be configured to be cleaned out

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Nanheng Wu
Hi Stack, Get doesn't work either. It was a fresh table created by loadtable.rb. Finally, the uncompressed version had the same number of regions (8 total). I totally understand you guys shouldn't be patching the older version, upgrading for me is an option but will be pretty painful. I wonder i

Re: TableMapReduceUtil.initTableMapperJob takes only 1 scan object

2011-01-28 Thread manobal
1 table will contain all the event types.. key is eventId (type of event) + timestamp and value would be visitorId.. we want to find out all the visitorId that has seen 4-5 specific event types in sequence.. As far as I understand the scanner object takes starting key and ending key.. if that is

Re: multiple masters

2011-01-28 Thread Weishung Chung
Great, thank you :D I guess I need to read up more on zookeeper. On Fri, Jan 28, 2011 at 10:56 AM, Stack wrote: > On Fri, Jan 28, 2011 at 8:52 AM, Weishung Chung > wrote: > > Correct me if I am wrong :) > > In HConnectionManager, it seems to me that a zookeeper instance is used > to > > get to

Re: HBase access from C#.NET

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 8:58 AM, Stuart Scott wrote: > I will give it a go.. wishful thinking that someone may have already > done it. > Of course. No harm asking. Let us know how it goes. St.Ack

Re: Use loadtable.rb with compressed data?

2011-01-28 Thread Stack
On Thu, Jan 27, 2011 at 9:35 PM, Nanheng Wu wrote: > In the compressed case, there are 8 regions and the region start/end > keys do line up. Which actually is confusing to me, how can hbase read > the files if they are compressed? does each hfile have some metadata > in it that has compression inf

Re: HBase 0.90.0 cannot be put more data after running hours

2011-01-28 Thread Stack
On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang wrote: > 1. The .META. table seems ok >     I can read my data table (one thread for reading). >     I can use hbase shell to scan my data table. >     And I can use 1~4 threads to put more data into my data table. > Good. This would seem to say t

RE: HBase access from C#.NET

2011-01-28 Thread Stuart Scott
I will give it a go.. wishful thinking that someone may have already done it. (Thanks for all your assistance so far-it's been very helpful. My project is taking shape). Regards Stuart -Original Message- From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack Sent: 28

Re: HRegionLocation locateRegionInMeta

2011-01-28 Thread Stack
On Thu, Jan 27, 2011 at 11:43 PM, Weishung Chung wrote: > The fun thing about HBase is that I can browse the source code to understand > the internal workings of the system and get amazed by the awesome coding > done by talented engineers/coders :) Surely you must be joking! > Also, there is an

Re: multiple masters

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 8:52 AM, Weishung Chung wrote: > Correct me if I am wrong :) > In HConnectionManager, it seems to me that a zookeeper instance is used to > get to the HBase master for META and ROOT info. What would happen if HBase > master became unavailable? Would zookeeper be able to get

Re: Unresponsive master in Hbase 0.90.0

2011-01-28 Thread Stack
On Thu, Jan 27, 2011 at 11:56 PM, Vidhyashankar Venkataraman wrote: > 2011-01-28 07:35:49,866 INFO org.apache.hadoop.hbase.master.MasterFileSystem: > Log folder > hdfs://b3110120.yst.yahoo.net:4600/hbase/.logs/b3110270.yst.yahoo.net,60020,1296199618314 > belongs to an existing region server > 2

Re: multiple masters

2011-01-28 Thread Weishung Chung
Correct me if I am wrong :) In HConnectionManager, it seems to me that a zookeeper instance is used to get to the HBase master for META and ROOT info. What would happen if HBase master became unavailable? Would zookeeper be able to get the ROOT and META info from another backup/replicated master? S

Re: Inconsistent META data for a region.

2011-01-28 Thread Stack
On Thu, Jan 27, 2011 at 9:12 PM, Chris Howe wrote: > Howdy, > Howdy back. See in below. > ERROR: Region test,,1296067171940.0200bfe58a9e9fadf8ebfa523c47332f. found on > server 10.101.45.82:60020 but is listed in META to be on server ip-10-117-86- > 81.ec2.internal:60020. Could this region ha

Re: multiple masters

2011-01-28 Thread Stack
On Fri, Jan 28, 2011 at 8:10 AM, Weishung Chung wrote: > Is zookeeper responsible for the backup/replication of -ROOT- and .META. > files? No. These are kept in HDFS and rely on its replication. > It looks like I need multiple HBase masters setup to achieve high > availability. In the multiple

multiple masters

2011-01-28 Thread Weishung Chung
Is zookeeper responsible for the backup/replication of ROOT and META files? It looks like I need multiple HBase masters setup to achieve high availability. In the multiple masters setup, would there be any data loss in the switch over after the first master became unavailable. Thank you

Re: Inconsistent META data for a region.

2011-01-28 Thread Chris Howe
Chris Howe writes: > > I was trying to use "add_table.rb" to restore a table that I had copied the hdfs > files for, and I had some trouble. Now when I run "hbase hbck" I get the > following: > > ... > ERROR: Region test,,1296067171940.0200bfe58a9e9fadf8ebfa523c47332f. found on > server 10.

multiple masters

2011-01-28 Thread Weishung Chung
Is zookeeper responsible for the backup/replication of -ROOT- and .META. files? It looks like I need multiple HBase masters setup to achieve high availability. In the multiple masters setup, would there be any data loss in the switch over after the first master became unavailable.

Re: HBase access from C#.NET

2011-01-28 Thread Stack
Can you use thrift? St.Ack On Fri, Jan 28, 2011 at 7:55 AM, Stuart Scott wrote: > Hi, > > > > Has anyone tried to get a Windows C#.NET application to connect to > HBase? > > If so, how did you manage it? > > > > Regards > > > > Stuart Scott > > System Architect > emis intellectual technology > Fu

HBase access from C#.NET

2011-01-28 Thread Stuart Scott
Hi, Has anyone tried to get a Windows C#.NET application to connect to HBase? If so, how did you manage it? Regards Stuart Scott System Architect emis intellectual technology Fulford Grange, Micklefield Lane Rawdon Leeds LS19 6BA E-mail: stuart.sc...@e-mis.com