I just read through the DataStax compression post (http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression).
My question is about good use cases for enabling compression. In my scenario I have very wide rows with many thousands of columns; it's essentially time-series data, where the column names are all UTC timestamps. There is significant overlap in column names between rows, but the number of columns per row varies widely at times.

Based on what I've read, I would imagine that compression would be useful here, since the rows have a significant number of columns in common. At the same time, there is a lot of variance, and right now there are more columns than rows (though that will change over time), which suggests compression might not be useful.

Does anyone have any thoughts on what would make the most sense? It would be awesome if we could cut our storage needs consistently.
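For concreteness, here is roughly what I was planning to run in cassandra-cli to turn it on, adapted from the blog post (the column family name is just a placeholder for ours, and I haven't verified the exact option names against a 1.0 install yet):

  -- "EnergyReadings" is a hypothetical column family name; substitute your own.
  -- chunk_length_kb: 64 is the example value from the DataStax post.
  update column family EnergyReadings
    with compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64};

My understanding is also that existing sstables only get compressed as they are rewritten (e.g. by compaction or a nodetool scrub), so the storage savings would show up gradually rather than immediately. Happy to be corrected on any of that.

-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.*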