Hello Kevin.

With CQL3, there are some important terms to define:

 a. Row: a logical row in CQL3 semantics; a logical row is what is
displayed as a row in the cqlsh client.
 b. Partition: a physical row on disk in CQL3 semantics.
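
As a minimal sketch of the mapping (a hypothetical table, just to make it
concrete):

    CREATE TABLE events (
        bucket  text,
        ts      timestamp,
        payload blob,
        PRIMARY KEY (bucket, ts)
    );

Here 'bucket' is the partition key and 'ts' is a clustering column, so every
logical row written with the same bucket value lands in the same partition,
i.e. the same physical row on disk.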

Even if your logical rows are tiny, storing a lot of them under the same
partition (the same physical row on disk) adds up quickly.

Quick maths: 200 kB per logical row * 1000 logical rows = roughly 200 MB for
the partition.
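
One common way to keep that bounded (just a sketch, assuming the hypothetical
table above and a day granularity picked to match your query patterns) is to
fold a time component into the partition key:

    CREATE TABLE events (
        bucket  text,
        day     text,       -- e.g. '2014-06-30'; hypothetical per-day split
        ts      timestamp,
        payload blob,
        PRIMARY KEY ((bucket, day), ts)
    );

With the composite partition key (bucket, day) you can still range-scan by ts
within a given bucket and day, but no single partition grows without bound.
You can check how large your partitions actually get with nodetool cfstats /
cfhistograms on that table.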


On Mon, Jun 30, 2014 at 6:53 PM, Kevin Burton <bur...@spinn3r.com> wrote:

> I'm running a full compaction now and noticed this:
>
> Compacting large row … incrementally
>
> … and the values were in the 300-500MB range.
>
> I'm storing NOTHING anywhere near that large.  Max is about 200k...
>
> However, I'm storing my schema in a way so that I can do efficient
> time/range scans of the data and placing things into buckets.
>
> So my schema looks like:
>
> bucket,
> timestamp
>
> … and the partition key is bucket.  Since this is a clustering row, does
> that mean that EVERYTHING is in one "row" under 'bucket' ?
>
> So even though my INSERTs are like 200k, they're all pooling under the
> same 'bucket', which is the partition key, so Cassandra is going to have a
> hard time compacting them.
>
> Part of the problem here is the serious abuse of vocabulary.  The
> thrift/CQL impedance mismatch means that things have slightly different
> names and not-so-straightforward nomenclature.  So it makes it confusing as
> to what's actually happening under the hood.
>
> ….
>
> Then I saw:
>
>
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201106.mbox/%3cbanlktik0g+epq4ctw28ty+dpexprtis...@mail.gmail.com%3E
>
>
> look for in_memory_compaction_limit_in_mb in cassandra.yaml
>
>
> … so this seems like it will be a problem and slow me down moving forward.
>  Unless I figure out a workaround.
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
>
>
