Check your log for messages about rebuilding indices: that might grow your
dataset some.

One thing is for sure: the fresh data import left behind all the crap that
had built up in the 0.8.1 cluster (duplicates, tombstones, etc.). The
decrease is fairly dramatic, but not illogical at all.
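
To put a number on "dramatic", here is a quick sketch that sums the per-node
load from the two ring listings quoted below and compares the totals. The
variable and function names are mine, purely for illustration; the figures
are copied from Ravi's output as posted.

```python
# Per-node load in GB, copied from the two ring listings quoted below.
load_081 = [140.61, 139.92, 138.81, 139.78, 137.44, 138.48, 140.52, 145.24]
load_107 = [48.72, 51.23, 52.4, 49.64, 48.5, 53.38, 51.11, 53.36]


def total_gb(loads):
    """Total on-disk load across all nodes, in GB."""
    return sum(loads)


def shrink_factor(before, after):
    """How many times smaller the 'after' cluster is than the 'before' one."""
    return total_gb(before) / total_gb(after)


if __name__ == "__main__":
    print("0.8.1 total: %.2f GB" % total_gb(load_081))
    print("1.0.7 total: %.2f GB" % total_gb(load_107))
    print("shrink factor: %.2fx" % shrink_factor(load_081, load_107))
```

That works out to a bit under 3X, in line with what Ravi reported. It only
compares totals; it says nothing about which of the causes above accounts
for the difference.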

2012/3/16 Jeremiah Jordan <jeremiah.jor...@morningstar.com>

>  I would guess more aggressive compaction settings, did you update rows
> or insert some twice?
> If you run major compaction a couple times on the 0.8.1 cluster does the
> data size get smaller?
>
> You can use the "describe" command to check if compression got turned on.
>
> -Jeremiah
>
>  ------------------------------
> *From:* Ravikumar Govindarajan [ravikumar.govindara...@gmail.com]
> *Sent:* Thursday, March 15, 2012 4:41 AM
> *To:* user@cassandra.apache.org
> *Subject:* 0.8.1 Vs 1.0.7
>
>  Hi,
>
>  I ran some data import tests for cassandra 0.8.1 and 1.0.7. The results
> were a little bit surprising
>
>  0.8.1, SimpleStrategy, Rep_Factor=3,QUORUM Writes, RP, SimpleSnitch
>
> XXX.XXX.XXX.A  datacenter1 rack1  Up  Normal  140.61 GB  12.50%
> XXX.XXX.XXX.B  datacenter1 rack1  Up  Normal  139.92 GB  12.50%
> XXX.XXX.XXX.C  datacenter1 rack1  Up  Normal  138.81 GB  12.50%
> XXX.XXX.XXX.D  datacenter1 rack1  Up  Normal  139.78 GB  12.50%
> XXX.XXX.XXX.E  datacenter1 rack1  Up  Normal  137.44 GB  12.50%
> XXX.XXX.XXX.F  datacenter1 rack1  Up  Normal  138.48 GB  12.50%
> XXX.XXX.XXX.G  datacenter1 rack1  Up  Normal  140.52 GB  12.50%
> XXX.XXX.XXX.H  datacenter1 rack1  Up  Normal  145.24 GB  12.50%
>
>  1.0.7, NTS, Rep_Factor{DC1:3, DC2:2}, LOCAL_QUORUM writes, RP [DC2 m/c
> yet to join ring],
> PropertyFileSnitch
>
> XXX.XXX.XXX.A  DC1 RAC1  Up  Normal  48.72 GB  12.50%
> XXX.XXX.XXX.B  DC1 RAC1  Up  Normal  51.23 GB  12.50%
> XXX.XXX.XXX.C  DC1 RAC1  Up  Normal  52.4 GB   12.50%
> XXX.XXX.XXX.D  DC1 RAC1  Up  Normal  49.64 GB  12.50%
> XXX.XXX.XXX.E  DC1 RAC1  Up  Normal  48.5 GB   12.50%
> XXX.XXX.XXX.F  DC1 RAC1  Up  Normal  53.38 GB  12.50%
> XXX.XXX.XXX.G  DC1 RAC1  Up  Normal  51.11 GB  12.50%
> XXX.XXX.XXX.H  DC1 RAC1  Up  Normal  53.36 GB  12.50%
>
>  There seems to be 3X savings in size for the same dataset running 1.0.7.
> I have not enabled compression for any of the CFs. Will it be enabled by
> default when creating a new CF in 1.0.7? cassandra.yaml is also mostly
> identical.
>
>  Thanks and Regards,
> Ravi
>
