Howdy all,

Our use of Cassandra unfortunately involves lots of deletes.  Yes, I
know that C* is not well suited to this kind of workload, but that's where
we are, and before I go looking for an entirely new data layer I would
rather explore whether C* could be tuned to work well for us.

However, deletions are never driven by users in our app - they are always
performed by backend processes to "clean up" data after it has been processed,
and thus they do not need to be 100% available.  So this made me think,
what if I did the following?

   - gc_grace_seconds = 0, so that tombstones can be purged at the next
   compaction instead of lingering for the default 10 days
   - replication factor = 3
   - for writes that are inserts, consistency = QUORUM, which ensures that
   writes can proceed even if 1 replica is slow/down
   - for deletes, consistency = ALL, which ensures that when we delete a
   record it disappears entirely (no need for tombstones)
   - for reads, consistency = QUORUM
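
To make the arithmetic behind that list concrete, here is a rough sketch in
plain Python.  It only models replica counts and does not talk to a cluster;
the helper names (required_acks, tolerated_failures, overlaps) are made up
for illustration and are not part of any Cassandra driver.

```python
# Hypothetical helpers: they model replica-count arithmetic only.

def required_acks(level: str, rf: int) -> int:
    """Number of replicas that must respond at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def tolerated_failures(level: str, rf: int) -> int:
    """Replicas that can be down while the request can still succeed."""
    return rf - required_acks(level, rf)


def overlaps(read_level: str, write_level: str, rf: int) -> bool:
    """True if every read replica set shares a replica with every write set."""
    return required_acks(read_level, rf) + required_acks(write_level, rf) > rf


RF = 3
plan = {"insert": "QUORUM", "delete": "ALL", "read": "QUORUM"}

for op, level in plan.items():
    print(f"{op:6s} {level:6s} needs {required_acks(level, RF)} of {RF} replicas, "
          f"tolerates {tolerated_failures(level, RF)} down")

print("QUORUM reads see QUORUM inserts:", overlaps("QUORUM", "QUORUM", RF))
```

With RF = 3, inserts and reads at QUORUM need 2 replicas each and keep working
with 1 replica down, while deletes at ALL need all 3 and will fail whenever
any single replica is unavailable.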

Also, I should clarify that our data is essentially append-only, so I don't
need to worry about inconsistencies created by partial updates (e.g. value
gets changed on one machine but not another).  Sometimes there will be
duplicate writes, but I think that should be fine since the value is always
identical.

Any red flags with this approach?  Has anyone tried it, and if so, what was
your experience?  Also, I *think* that this means that I don't need to run
repairs, which from an ops perspective is great.

Thanks, as always,
- Ian
