On 06/04/2010 21:40, Benjamin Black wrote:

> I suggest the reasons you list (which are certainly great reasons!)
> are also the reasons there is no referential integrity or transaction
> support.

Quite. I'm not trying to make recommendations for how Cassandra should be changed to be more like a traditional RDBMS... I just have a requirement, at the logical level, that would be trivial with traditional technology - so the analogy seemed an ideal way to illustrate the issue.

> It seems the common practice of using a system like
> Zookeeper for the synchronization parts alongside Cassandra would be
> applicable here. Have you investigated that?

I started looking at Zookeeper when it was mentioned in an earlier reply. I've discovered it supports something called "Ledgers" - but I'm still unclear whether they'd be useful to me - I've only uncovered a very high-level overview so far. I'm concerned that Zookeeper looks as if it might become a problematic bottleneck if all the updates must be routed through it.

I don't see Zookeeper mutexes as being especially helpful... because my problem isn't really about two incompatible requests in quick succession - but, rather, about needing to ensure that "referential integrity" is eventually established between two otherwise independent keysets. I need to eliminate the possibility that I end up with 'dangling' inaccessible data should a hash value become recorded in the range of the first map but not the domain of the second (or vice versa).
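To make the invariant concrete, here's a minimal sketch using plain dicts in place of the two Cassandra keysets (the names `map_a`/`map_b` and the helper are hypothetical, just for illustration): `map_a` takes object keys to hash values, `map_b` takes hash values back to keys, and integrity means every hash in the range of `map_a` appears in the domain of `map_b`, and vice versa.

```python
def dangling_hashes(map_a, map_b):
    """Return hash values that would leave 'dangling' inaccessible data.

    A hash is dangling if it appears in the range of map_a but not the
    domain of map_b, or vice versa.
    """
    range_a = set(map_a.values())
    domain_b = set(map_b.keys())
    return (range_a - domain_b) | (domain_b - range_a)

map_a = {"obj1": "h1", "obj2": "h2"}
map_b = {"h1": "obj1"}              # "h2" never made it into map_b

print(dangling_hashes(map_a, map_b))  # -> {'h2'}
```

An empty result means the two keysets are mutually consistent; anything else is exactly the inaccessible data I'm trying to rule out.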
Should I assume that it's common practice not to write updates atomically in real time, but instead to batch-process them 'off-line' to increase the atomic granularity? It seems an obvious strategy... possibly one for which an implementation might use "MapReduce" or something similar? I don't want to re-invent the wheel, of course.
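For what it's worth, here is a rough sketch of the kind of off-line repair pass I have in mind (again using plain dicts as stand-ins for the keysets; the `repair` helper is hypothetical): updates land optimistically in real time, and a periodic batch job restores the cross-map invariant afterwards. In practice the scan could be a MapReduce job over both keysets.

```python
def repair(map_a, map_b):
    """Re-establish the reverse mapping for every hash reachable from map_a."""
    # Fill in any reverse entry that the real-time writes missed.
    for key, h in map_a.items():
        map_b.setdefault(h, key)
    # Drop reverse entries whose hash no longer appears in map_a.
    live = set(map_a.values())
    for h in list(map_b):
        if h not in live:
            del map_b[h]

map_a = {"obj1": "h1", "obj2": "h2"}
map_b = {"h1": "obj1", "h3": "obj9"}   # "h2" missing, "h3" dangling
repair(map_a, map_b)
print(sorted(map_b))  # -> ['h1', 'h2']
```

The point isn't the code itself, but whether this "write loosely now, reconcile in bulk later" pattern is an established practice or a wheel I'd be re-inventing.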