Hi,

I've been experimenting quite a bit with Cassandra and think I'm getting to 
understand it, but I would like some advice on modeling my data in Cassandra 
for an application I'm developing.

The application will have a large number of records, each consisting of a 
fixed part and a number (n) of periodic parts.
* The fixed part is updated occasionally.
* The periodic parts are never updated, but a new one is added every 5 to 10 
minutes. Only the last n periodic parts need to be kept, so that the oldest one 
can be deleted after adding a new part.
* The records will always be read completely (meaning fixed part and all 
periodic parts). Reads are less frequent than writes.
The application will run continuously for at least a few weeks, so many stale 
periodic parts will accumulate, and I'm a bit worried about storage 
consumption and compactions.
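
To put a rough number on "many stale parts", here is a back-of-the-envelope 
sketch in plain Python. The 3-week run length and n = 10 are assumptions 
picked only for illustration, and stale_parts is a made-up helper name, not 
anything from Cassandra.

```python
MINUTES_PER_WEEK = 7 * 24 * 60


def stale_parts(weeks, interval_minutes, n):
    """Periodic parts written per record that are no longer among the last n."""
    written = weeks * MINUTES_PER_WEEK // interval_minutes
    return max(written - n, 0)


# Assumed values: a 3-week run, keeping the last n = 10 parts per record.
for interval in (5, 10):
    print("interval %2d min: %d stale parts per record"
          % (interval, stale_parts(3, interval, 10)))
```

So under those assumptions each record accumulates a few thousand obsolete 
parts over the run, which is what makes the cleanup strategy matter.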

With respect to modeling the above in Cassandra, I would appreciate insights 
into the following alternatives:

1) For every period, add a new column to each record and delete the oldest 
column with a batch_mutate. This obviously causes many tombstones.
2) For every period, overwrite the oldest column for each record with the new 
one (cyclic/modulo behaviour). AFAIK this does not cause any tombstones, but 
the overwritten versions will probably keep polluting the SSTables until they 
are compacted away.
3) (0.7 only) For every period, create a new CF and add columns to it with a 
batch_mutate and drop the oldest CF. The obsolete data can be cleaned up 
immediately, but I'm not sure if this is proper/recommended use of dynamic CFs.
4) Don't use Cassandra at all and investigate other storage solutions. 
Suggestions would be welcome if you favour this approach.
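
To make the difference between options 1 and 2 concrete, here is a minimal 
sketch in plain Python. It does not talk to Cassandra at all: the dicts only 
stand in for a row's columns, the tombstone counter is a simplification, and 
the function names are made up for illustration.

```python
def sliding_window_mutation(period, n):
    """Option 1: insert a column named after the period and delete the
    column that has just fallen out of the last-n window (if any)."""
    to_delete = period - n if period >= n else None
    return period, to_delete


def cyclic_slot(period, n):
    """Option 2: reuse a fixed set of n column names, overwriting in turn."""
    return period % n


n = 3
row_option1 = {}   # column name -> value
row_option2 = {}   # slot number -> (period, value)
deletes = 0        # every delete in option 1 would leave a tombstone

for period in range(7):
    value = "part-%d" % period

    insert_col, delete_col = sliding_window_mutation(period, n)
    row_option1[insert_col] = value
    if delete_col is not None:
        del row_option1[delete_col]
        deletes += 1

    # The period is stored alongside the value so a reader can restore order.
    row_option2[cyclic_slot(period, n)] = (period, value)

print(sorted(row_option1))                       # periods still present
print(sorted(p for p, _ in row_option2.values()))
print(deletes)
```

Both variants end up holding the same last n parts. Option 1 issues one 
delete per period once the window is full, while option 2 issues none but 
needs the period stored with the value to put the parts back in order on read.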

Also, I'm wondering whether I should put the fixed and periodic parts 
together in one Super CF, or whether it would be better to put the fixed 
part in one CF and the periodic parts in another. Since I'll be reading all 
data of a record at the same time, my preference would be a Super CF, but 
I'm open to anyone wanting to talk me out of this ;-)

Thanks, Steven.
                                          
