I just read through the DataStax compression post (
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression).

My question is around good use cases for enabling compression.  In my
scenario I have very wide rows with many thousands of columns; it's
essentially time-series data where the column names are all UTC
timestamps.  There is significant overlap between rows, but at times
there is wide variance in the number of columns per row.  Based on what
I've read, I would imagine that having a significant number of column
names in common across rows would make compression useful.  At the same
time, there is a lot of variance, and right now there are more columns
than rows (though that will change over time), which suggests
compression might not help much.

Does anyone have any thoughts on what would make the most sense?  It would
be awesome if we could cut our storage needs consistently.
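For reference, here is the knob I'd be toggling, per the 1.0 docs -- a
sketch via cassandra-cli, where the column family name "events" and the
chunk length are just placeholders for our setup:

```
update column family events
  with compression_options = {sstable_compression: SnappyCompressor,
                              chunk_length_kb: 64};
```

My understanding is that after enabling this, only newly written
SSTables are compressed until a scrub or compaction rewrites the old
ones, so any storage savings would show up gradually.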

-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*
