Yes. The block size is specified as part of the compression options for the CF / Table.
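
In Cassandra releases of this era the block size is, as far as the 1.x documentation goes, the `chunk_length_kb` compression sub-option; check the docs for your version before relying on that name. To see why block-based compression depends on the data rather than the row layout, here is a minimal sketch in plain Python. It is not Cassandra code: `zlib` stands in for the real compressor, the 64 KB `CHUNK_BYTES` value and the `compressed_ratio` helper are assumptions made for illustration, and random bytes stand in for an already-compressed payload such as the png mentioned in the thread.

```python
import os
import zlib

# Hypothetical block size for this sketch; real deployments set it per CF / table.
CHUNK_BYTES = 64 * 1024


def compressed_ratio(data: bytes, chunk_bytes: int = CHUNK_BYTES) -> float:
    """Compress data in independent fixed-size blocks.

    Returns total compressed size divided by original size (lower is better).
    """
    if not data:
        raise ValueError("data must not be empty")
    compressed = 0
    for start in range(0, len(data), chunk_bytes):
        # Each block is compressed on its own, with no knowledge of rows or columns.
        compressed += len(zlib.compress(data[start:start + chunk_bytes]))
    return compressed / len(data)


# "Static" layout: many rows that share the same column names.
narrow = b"".join(
    b"username=user%06d;email=user%06d@example.com\n" % (i, i)
    for i in range(20000)
)

# "Wide" layout: every column name is unique, but the bytes still repeat patterns.
wide = b"".join(b"event:%010d=ok;" % i for i in range(50000))

# Stand-in for already-compressed values (e.g. images): no patterns to exploit.
opaque = os.urandom(len(narrow))

if __name__ == "__main__":
    for label, payload in (("narrow", narrow), ("wide", wide), ("opaque", opaque)):
        print(f"{label:>6}: ratio = {compressed_ratio(payload):.3f}")
```

Running it prints one ratio per payload. The point is only that repeated byte patterns inside a block drive the ratio, so a wide layout can compress well and a static one can compress badly. Actual numbers on a cluster will differ, so measuring on your own data is still the way to decide.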

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/03/2013, at 5:31 AM, Drew Kutcharian <d...@venarc.com> wrote:

> Thanks Sylvain. So C* compression is block based and has nothing to do with
> format of the rows.
>
> On Mar 19, 2013, at 1:31 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
>
>> That's just describing what compression is about. Compression (not in C*, in
>> general) is based on recognizing repeated pattern.
>>
>> So yes, in that sense, static column families are more likely to yield
>> better compression ratio because it is more likely to have repeated patterns
>> in the compressed blocks. But:
>> 1) it doesn't necessarily mean that wide column families won't have a good
>> compression ratio per se.
>> 2) you can absolutely have crappy compression ratio with a static column
>> family. Just create a column family where each row has 1 column 'image' that
>> contains a png.
>>
>> And to come back to your initial question, I highly doubt disk level
>> compression would be much of a workaround because again, that's more about
>> how compression is working than how Cassandra use it.
>>
>> At the end of the day, I really think the best choice is to try it and
>> decide for yourself if it does more good than harm or the converse.
>>
>> --
>> Sylvain
>>
>> On Tue, Mar 19, 2013 at 3:58 AM, Drew Kutcharian <d...@venarc.com> wrote:
>> Edward/Sylvain,
>>
>> I also came across this post on DataStax's blog:
>>
>>> When to use compression
>>> Compression is best suited for ColumnFamilies where there are many rows,
>>> with each row having the same columns, or at least many columns in common.
>>> For example, a ColumnFamily containing user data such as username, email,
>>> etc., would be a good candidate for compression. The more similar the data
>>> across rows, the greater the compression ratio will be, and the larger the
>>> gain in read performance.
>>>
>>> Compression is not as good a fit for ColumnFamilies where each row has a
>>> different set of columns, or where there are just a few very wide rows.
>>> Dynamic column families such as this will not yield good compression ratios.
>>
>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression
>>
>> @Sylvain, does this still apply on more recent versions of C*?
>>
>> -- Drew
>>
>> On Mar 18, 2013, at 7:16 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>
>>> I feel this has come up before. I believe the compression is block based,
>>> so just because no two column names are the same does not mean the
>>> compression will not be effective. Possibly in their case the compression
>>> was not effective.
>>>
>>> On Mon, Mar 18, 2013 at 9:08 PM, Drew Kutcharian <d...@venarc.com> wrote:
>>> That's what I originally thought but the OOYALA presentation from C*2012
>>> got me confused. Do you guys know what's going on here?
>>>
>>> The video:
>>> http://www.youtube.com/watch?v=r2nGBUuvVmc&feature=player_detailpage#t=790s
>>> The slides: Slide 22 @
>>> http://www.datastax.com/wp-content/uploads/2012/08/C2012-Hastur-NoahGibbs.pdf
>>>
>>> -- Drew
>>>
>>> On Mar 18, 2013, at 6:14 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>>
>>>> Imho it is probably more efficient for wide. When you decompress 8k blocks
>>>> to get at a 200 byte row you create overhead, particularly young gen.
>>>>
>>>> On Monday, March 18, 2013, Sylvain Lebresne <sylv...@datastax.com> wrote:
>>>> > The way compression is implemented, it is oblivious to the CF being
>>>> > wide-row or narrow-row. There is nothing intrinsically less efficient in
>>>> > the compression for wide-rows.
>>>> >
>>>> > --
>>>> > Sylvain
>>>> >
>>>> > On Fri, Mar 15, 2013 at 11:53 PM, Drew Kutcharian <d...@venarc.com>
>>>> > wrote:
>>>> >>
>>>> >> Hey Guys,
>>>> >>
>>>> >> I remember reading somewhere that C* compression is not very effective
>>>> >> when most of the CFs are in wide-row format and some folks turn the
>>>> >> compression off and use disk level compression as a workaround.
>>>> >> Considering that wide rows with composites are "first class citizens"
>>>> >> in CQL3, is this still the case? Has there been any improvements on
>>>> >> this?
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Drew