Edward/Sylvain, I also came across this post on DataStax's blog:
> When to use compression
> Compression is best suited for ColumnFamilies where there are many rows, with
> each row having the same columns, or at least many columns in common. For
> example, a ColumnFamily containing user data such as username, email, etc.,
> would be a good candidate for compression. The more similar the data across
> rows, the greater the compression ratio will be, and the larger the gain in
> read performance.
> Compression is not as good a fit for ColumnFamilies where each row has a
> different set of columns, or where there are just a few very wide rows.
> Dynamic column families such as this will not yield good compression ratios.

http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression

@Sylvain, does this still apply on more recent versions of C*?

-- Drew

On Mar 18, 2013, at 7:16 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> I feel this has come up before. I believe the compression is block based, so
> just because no two column names are the same does not mean the compression
> will not be effective. Possibly in their case the compression was not
> effective.
>
> On Mon, Mar 18, 2013 at 9:08 PM, Drew Kutcharian <d...@venarc.com> wrote:
> That's what I originally thought but the OOYALA presentation from C*2012 got
> me confused. Do you guys know what's going on here?
>
> The video:
> http://www.youtube.com/watch?v=r2nGBUuvVmc&feature=player_detailpage#t=790s
> The slides: Slide 22 @
> http://www.datastax.com/wp-content/uploads/2012/08/C2012-Hastur-NoahGibbs.pdf
>
> -- Drew
>
>
> On Mar 18, 2013, at 6:14 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
>>
>> Imho it is probably more efficient for wide. When you decompress 8k blocks
>> to get at a 200 byte row you create overhead, particularly young gen.
>> On Monday, March 18, 2013, Sylvain Lebresne <sylv...@datastax.com> wrote:
>> > The way compression is implemented, it is oblivious to the CF being
>> > wide-row or narrow-row. There is nothing intrinsically less efficient in
>> > the compression for wide-rows.
>> > --
>> > Sylvain
>> >
>> > On Fri, Mar 15, 2013 at 11:53 PM, Drew Kutcharian <d...@venarc.com> wrote:
>> >>
>> >> Hey Guys,
>> >>
>> >> I remember reading somewhere that C* compression is not very effective
>> >> when most of the CFs are in wide-row format and some folks turn the
>> >> compression off and use disk level compression as a workaround.
>> >> Considering that wide rows with composites are "first class citizens" in
>> >> CQL3, is this still the case? Has there been any improvements on this?
>> >>
>> >> Thanks,
>> >>
>> >> Drew
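
For anyone who wants to poke at the "block based" point from this thread, here is a rough sketch, not Cassandra's actual code path. It assumes zlib as a stand-in compressor, a hypothetical 8 KB chunk size (echoing the "8k blocks" mentioned above), and a made-up byte layout for a wide row in which no two column names are identical. The chunker only ever sees a byte stream, so it has no idea whether those bytes came from one wide row or many narrow ones.

```python
import zlib
from typing import List

# Hypothetical chunk size, echoing the "8k blocks" mentioned in the thread.
CHUNK_SIZE = 8 * 1024


def compress_in_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Compress each fixed-size chunk independently, ignoring row boundaries."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def ratio(data: bytes) -> float:
    """Total compressed size divided by raw size (lower is better)."""
    compressed = sum(len(chunk) for chunk in compress_in_chunks(data))
    return compressed / len(data)


# Made-up "wide row": every column name is distinct, but the names share
# structure, which is all a byte-level compressor needs to find redundancy.
wide_row = b"".join(
    b"sensor:%08d:temperature=%d;" % (i, 20 + i % 5) for i in range(20000)
)

print(f"wide row compression ratio: {ratio(wide_row):.2f}")
```

The real on-disk format carries timestamps and other per-column metadata, and the actual ratio depends heavily on the data, so treat this only as an illustration of why unique column names alone do not rule out good compression.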