I don't think compression can be the cause of the difference, for two reasons:
1) The partition size I calculated myself (3 MB) is the uncompressed size, and so is the reported size (2.3 GB).
2) The difference is simply far too big to be explained by compression, even if the calculated size had been the compressed size. The compressed size would be about 0.125% of the original, which is not realistic. In the logs, I can see that the typical compression achieved for this table is around 80% of the original.

Tom

On Fri, Mar 4, 2016 at 9:48 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Mar 4, 2016 at 5:56 AM, Tom van den Berge <t...@drillster.com> wrote:
>
>> Compacting large partition
>> drillster/subscriberstats:rqtPewK-1chi0JSO595u-Q (1,470,058,292 bytes)
>>
>> This means that this single partition is about 1.4 GB. This is much
>> larger than it can possibly be, for two reasons:
>> 1) the partition has approx. 50K rows, each roughly 62 bytes = ~3 MB
>> 2) the entire table consumes approx. 500 MB of disk space on the node
>> containing the partition (including snapshots)
>>
>> Furthermore, nodetool cfstats tells me this:
>> Space used (live): 253,928,111
>> Space used (total): 253,928,111
>> Compacted partition maximum bytes: 2,395,318,855
>> The space used seems to match the actual size (excl. snapshots), but the
>> Compacted partition maximum bytes (2.3 GB) seems to be far higher than
>> possible. Does anyone know how it is possible that Cassandra reports such
>> unlikely sizes?
>
> Compression is enabled by default, and compaction reports the uncompressed
> size.
>
> =Rob

--
Tom van den Berge
Lead Software Engineer
Middenburcht 136
3452 MT Vleuten
Netherlands
+31 30 755 53 30
www.drillster.com
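
For reference, the size comparison discussed in this thread can be reproduced with a quick calculation. This is only a sketch: the variable names are made up for illustration, and the row count and row size are the rough estimates quoted above (50K rows of about 62 bytes each), not measured values. With those inputs the estimate comes out at roughly 0.13% of the cfstats maximum, in line with the 0.125% figure obtained from the rounded 3 MB.

```python
# Figures quoted in this thread; the row count and row size are rough estimates.
rows = 50_000
bytes_per_row = 62
estimated_bytes = rows * bytes_per_row  # about 3 MB, uncompressed

# "Compacted partition maximum bytes" as reported by nodetool cfstats
reported_max_bytes = 2_395_318_855
# Size shown in the "Compacting large partition" log line
logged_bytes = 1_470_058_292

# How large the estimate is relative to each reported size
ratio_vs_cfstats = estimated_bytes / reported_max_bytes
ratio_vs_log = estimated_bytes / logged_bytes

print(f"estimated partition size: {estimated_bytes / 1e6:.1f} MB")
print(f"estimate as share of cfstats maximum: {ratio_vs_cfstats:.3%}")
print(f"estimate as share of logged size: {ratio_vs_log:.3%}")
```

Either ratio is a fraction of a percent, so a compression ratio of around 80% cannot account for the gap.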