I need to store one trillion data points. The data is highly compressible:
simple custom compression combined with standard dictionary compression gets
it down to 1 byte per data point. What's the most space-efficient way to
store the data in Cassandra? How much per-row overhead is there if I store
one data point per row?
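
To put numbers on why per-row overhead worries me, here's a quick
back-of-envelope calculation in Python. The 30-byte overhead figure is
purely an assumed placeholder, not a measured Cassandra number:

# Back-of-envelope: per-row overhead vs. a 1-byte payload.
# OVERHEAD_BYTES is an assumed placeholder, not a measured Cassandra figure.
DATA_POINTS = 10**12      # one trillion points
PAYLOAD_BYTES = 1         # compressed size per point
OVERHEAD_BYTES = 30       # assumed per-row overhead (placeholder)

raw = DATA_POINTS * PAYLOAD_BYTES
per_row = DATA_POINTS * (PAYLOAD_BYTES + OVERHEAD_BYTES)

print(f"raw payload:        {raw / 1e12:.1f} TB")      # 1.0 TB
print(f"one row per point:  {per_row / 1e12:.1f} TB")  # 31.0 TB

Even a few tens of bytes per row would inflate ~1 TB of payload by more
than an order of magnitude, which is why the layout matters so much here.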

The data is particularly hard to group. It consists of a large number of
time series with highly variable density, which makes it hard to pack
subsets of the data into meaningful column families / wide rows. Is there a
table layout scheme that would let me approach 1 byte per data point
without forcing me to implement a complex abstraction layer at the
application level?
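
For concreteness, the kind of layout I mean is fixed-width time bucketing,
where each partition holds one bucket of one series so that row overhead is
amortized across the bucket. A minimal Python model of the partition key
(the names and the bucket width are illustrative, not a real schema):

from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # illustrative bucket width: one partition per series-hour

def partition_key(series_id: str, ts: datetime) -> tuple[str, int]:
    """Map a data point to (series_id, bucket) -- the would-be partition key.

    Cells within one bucket share a partition, so per-row overhead is
    amortized across however many points land in that bucket.
    """
    epoch = int(ts.replace(tzinfo=timezone.utc).timestamp())
    return series_id, epoch - epoch % BUCKET_SECONDS

# The problem: with highly variable density, any fixed bucket width is
# wrong for most series. A dense series puts millions of points in one
# partition (oversized, hot); a sparse one puts a single point there, so
# the overhead isn't amortized at all.
ts = datetime(2024, 1, 1, 12, 30)
print(partition_key("dense-series", ts))   # ('dense-series', 1704110400)
print(partition_key("sparse-series", ts))  # same bucket, maybe only 1 point

Picking per-series bucket widths is exactly the kind of application-level
bookkeeping I'd like to avoid.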
