Re: compression

Tamar Fraenkel Mon, 24 Sep 2012 03:32:11 -0700

Hi!
I ran the following in the CLI:
UPDATE COLUMN FAMILY cf_name WITH
compression_options={sstable_compression:SnappyCompressor,
chunk_length_kb:64};


I then ran the following on all three of my nodes:
sudo nodetool -h localhost scrub tok cf_name

I have a replication factor of 3. On the first node, the size of the data
on disk was cut in half, and in JMX I can see that the compression ratio is
indeed 0.46. But on nodes 2 and 3 nothing happened: in JMX the compression
ratio is 0, and the size of the files on disk stayed the same.

In the CLI, the column family is described as follows:

ColumnFamily: cf_name
      Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
      Row cache size / save period in seconds / keys to save : 0.0/0/all
      Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
      Key cache size / save period in seconds: 200000.0/14400
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Bloom Filter FP chance: default
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        chunk_length_kb: 64
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

Can anyone help?
Thanks

*Tamar Fraenkel *
Senior Software Engineer, TOK Media

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:

> Thanks all, that helps. I will start with one or two CFs and let you know
> the effect.
>
> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
>
>> Also, your unlimited column names may all share the same prefix, right?
>> For example "accounts".rowkey56, "accounts".rowkey78, etc., so the
>> "accounts" prefix gets a ton of compression.
>>
>> Later,
>> Dean
>>
>> From: Tyler Hobbs <ty...@datastax.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Sunday, September 23, 2012 11:46 AM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: compression
>>
>>  column metadata, you're still likely to get a reasonable amount of
>> compression.  This is especially true if there is some amount of repetition
>> in the column names, values, or TTLs in wide rows.  Compression will almost
>> always be beneficial unless you're already somehow CPU bound or are using
>> large column values that are high in entropy, such as pre-compressed or
>> encrypted data.
>>
>
>
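
The point quoted above, that repetition in column names compresses well
while high-entropy values do not, can be illustrated with a rough sketch.
This is not how Cassandra measures its ratio: it uses Python's zlib as a
stand-in for SnappyCompressor, the column names are hypothetical, and a real
SSTable holds more than names. Only the 64 KB chunk size is taken from the
schema in this thread.

```python
import os
import zlib

# Mirrors chunk_length_kb: 64 from the column family definition.
CHUNK_BYTES = 64 * 1024


def compression_ratio(data: bytes) -> float:
    """Return compressed size / original size, compressing chunk by chunk."""
    compressed = 0
    for start in range(0, len(data), CHUNK_BYTES):
        compressed += len(zlib.compress(data[start:start + CHUNK_BYTES]))
    return compressed / len(data)


# Hypothetical wide-row column names that all share the "accounts" prefix.
repetitive = "".join(f"accounts.rowkey{i}" for i in range(20000)).encode("utf-8")

# Stand-in for pre-compressed or encrypted values: random bytes.
high_entropy = os.urandom(len(repetitive))

ratio_repetitive = compression_ratio(repetitive)
ratio_high_entropy = compression_ratio(high_entropy)

print(f"repetitive names:    {ratio_repetitive:.2f}")
print(f"high-entropy values: {ratio_high_entropy:.2f}")
```

A lower number means better compression, so under these assumptions the
repetitive names should land well below 1 while the random bytes stay close
to 1.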
