Bryan,
I wasn’t saying St.Ack’s post wasn’t relevant, but that it’s not addressing the
easiest thing to fix: schema design.
IMHO, that’s shooting oneself in the foot.
You shouldn’t be using versioning to capture temporal data.
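For illustration, one common alternative is to fold the timestamp into the row key (or the column qualifier) so each point in time is its own row rather than a hidden cell version. A rough sketch against the 0.98-era client API; the table name "events", family "d", qualifier "payload" and the entity id are made-up placeholders, not anything from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class TemporalRowKeySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    long ts = System.currentTimeMillis();
    // Row key = entity id + reversed timestamp, so every measurement is its
    // own row and a prefix scan on the entity returns newest data first.
    byte[] rowKey = Bytes.add(
        Bytes.toBytes("entity42:"),
        Bytes.toBytes(Long.MAX_VALUE - ts));
    Put put = new Put(rowKey);
    put.add(Bytes.toBytes("d"), Bytes.toBytes("payload"),
        Bytes.toBytes("the 1-50MB blob goes here"));
    try (HTable table = new HTable(conf, "events")) {
      table.put(put);
    }
  }
}

With the history living in the key, the column family can stay at one version and a time range just becomes a start/stop row on the scan.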
On Nov 3, 2014, at 1:54 PM, Bryan Beaudreault wrote:
> There are many blog posts and articles about people tuning for > 16GB
> heaps since java7 and the G1 collector became mainstream. We run with 25GB
> heap ourselves with very short GC pauses using a mostly untuned G1
> collector. Just one example is the excellent blog post by Intel,
> https://software.in
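For what it's worth, a setup like the one Bryan describes mostly comes down to heap sizing plus enabling G1 in hbase-env.sh. A minimal sketch; the 25GB figure echoes his note, but the pause-time target is an illustrative assumption, not his actual configuration:

export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xms25g -Xmx25g \
    -XX:+UseG1GC -XX:MaxGCPauseMillis=100"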
St.Ack,
I think you're sidestepping the issue concerning schema design.
Since HBase isn't my core focus, I also have to ask: since when have heap sizes
over 16GB been the norm?
(Really, 8GB seems to be quite a large heap size...)
On Oct 31, 2014, at 11:15 AM, Stack wrote:
> On Thu, Oct 30, 2014 at 8:20 AM, Andrejs Dubovskis wrote:
>> Hi!
>>
>> We have a bunch of rows on HBase which store varying sizes of data
>> (1-50MB). We use HBase versioning and keep up to 1 column
>> versions. Typically each column has only a few versions. But in rare
>> cases it may have thousands
Here’s the simple answer.
Don’t do it.
The way you are abusing versioning is a bad design.
Redesign your schema.
On Oct 30, 2014, at 10:20 AM, Andrejs Dubovskis wrote:
> Hi!
>
> We have a bunch of rows on HBase which store varying sizes of data
> (1-50MB). We use HBase versioning and
Hi!
We have a bunch of rows on HBase which store varying sizes of data
(1-50MB). We use HBase versioning and keep up to 1 column
versions. Typically each column has only a few versions. But in rare
cases it may have thousands of versions.
The MapReduce algorithm uses a full scan and our algorithm req
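For reference, when a full scan has to cross rows with multi-megabyte cells and, occasionally, thousands of versions, the 0.98-era client can bound how much comes back per call. A rough sketch; the limits and the table name "events" are made-up illustrations, not settings from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class BoundedVersionScanSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Scan scan = new Scan();
    scan.setMaxVersions(5); // limit how many versions per column come back
    scan.setBatch(10);      // split very wide rows into chunks of 10 cells per Result
    scan.setCaching(1);     // with 1-50MB cells, fetch only one row per RPC
    try (HTable table = new HTable(conf, "events");
         ResultScanner scanner = table.getScanner(scan)) {
      for (Result chunk : scanner) {
        // process one chunk of one row at a time
      }
    }
  }
}

setBatch() matters here because a single Result holding thousands of multi-MB cells is exactly what blows up a scanner, and setCaching(1) keeps only one row's worth of data in flight per RPC.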