simple export --> bulk import

2013-07-01 Thread Michael Ellery
I'm currently struggling with export/import between two HBase clusters. I have managed to create incremental exports from the source cluster (using HBase Export). Now I would like to bulk load the export into the destination (presumably using HFiles). The reason for the bulk load requirement is t

Linux kernel recommendations for HBase 0.94.2

2013-04-02 Thread Michael Ellery
We are currently building a cluster based on CDH4.2 (HBase 0.94.2). We are trying to decide whether to use a 3.8 kernel or stick with our current 3.2 kernel. Does anyone have operational experience with either kernel version and HBase that indicates a version preference? Are there specific re

Re: column count guidelines

2013-02-07 Thread Michael Ellery
ave the two > features I cited below. > > On Thu, Feb 7, 2013 at 5:02 PM, Michael Ellery wrote: > >> There is only one CF in this schema. >> >> Yes, we are looking at upgrading to CDH4, but it is not trivial since we >> cannot have cluster downtime. Our curren

Re: column count guidelines

2013-02-07 Thread Michael Ellery
4:34 PM, Ted Yu wrote: > How many column families are involved ? > > Have you considered upgrading to 0.94.4 where you would be able to benefit > from lazy seek, Data Block Encoding, etc ? > > Thanks > > On Thu, Feb 7, 2013 at 3:47 PM, Michael Ellery wrote: > >

column count guidelines

2013-02-07 Thread Michael Ellery
I'm looking for some advice about per row CQ (column qualifier) count guidelines. Our current schema design means we have a HIGHLY variable CQ count per row -- some rows have one or two CQs and some rows have upwards of 1 million. Each CQ is on the order of 100 bytes (for round numbers) and the