First, I apologise for sending this to the 'dev' mailing list - I couldn't find one for Cassandra users - and also for the basic nature of my questions...
I'm trying to get my head around the possibility of using Cassandra as the back-end to a project... and while, in most respects, Cassandra looks absolutely ideal, I'm finding it difficult to work out an appropriate strategy to ensure consistency (which would come 'for free' with a traditional, transactional back end).

As a sufficient abstraction of my storage requirements, imagine two (application-oriented) universes of SHA-512 hashes - SRC and DST (each will, in practice, be tagged with other application data). I need to support a remote API to manage a one-to-many mapping from SRC to DST, and a consistent (efficiently addressable) one-to-one mapping from DST to SRC. I need to envisage millions of clients and tens of billions of mappings, with billions of updates and lookups daily...

    newAssoc(s:SRC, d:DST)
    listAssoc(s:SRC) => List<d:DST>
    delAssoc(d:DST)
    lookupAssoc(d:DST) => s:SRC

If I were using BDB, I'd have two maps - the first with s:SRC as key and d:DST as value, the second with (d:DST, s:SRC) as key and no value - and I'd update both maps in a transaction. If I were in SQL land, I'd need a table a bit like this:

    create table Assoc(
        src binary(64),
        dst binary(64) unique,
        primary key (src, dst)
    )

The implementations of the API calls would be trivial insert, delete and select operations - and consistency between the primary key and the implicit (unique-constraint) index would arise as a consequence of transactions.

I realise that, with Cassandra, I need a different approach, since I don't have the same notion of transactions on which to rely... and, in any case, given a desire for scalability, relying upon such fine-grained transactions would definitely prove a bottleneck. That said, the uniqueness of DST values is systemically critical - so, even though I do not anticipate duplicate hashes in practice, I need uniqueness to be verified: a second SRC value asking to associate with an existing DST value must fail without violating the one-to-one mapping from DST to SRC, and that failure must be notified ASAP.

It strikes me that a plausible design might be one that writes a log of 'insert'/'delete' operations with pairs of hashes, which some background process eventually indexes in a big batch before clearing the log. If this is "The Cassandra Way", I'm surprised not to have found any examples... am I looking in the wrong places for them? Is my log of 'insert' and 'delete' operations something I'd implement myself using ad-hoc techniques, or is there explicit support for this in Cassandra? Do I need to develop my own process (from scratch) to merge updates with on-disk data, or is there a neat way to get Cassandra to do that for me?

Another issue I'm considering is whether I should map from SRC to a list of DST values as my low-level representation in Cassandra, or map each pair individually. A potential problem is that one SRC value can map to arbitrarily many DST values. At the level of the RPC API, I can address this by returning an object resembling a scrollable cursor instead of a static list - but, I guess, I'd need to be concerned about resource limitations (memory, etc.) for the on-disk representation? I presume that there's a significant advantage to storing the one-to-many map explicitly (locality of reference, for example), as well as minimising the size of the encoded data... I'm guessing that there is no prefix-compression for keys? Key compression would likely lead to the opposite architectural decisions from a resource-use perspective, and would eliminate concerns about maps from single SRC values to very large numbers of DST values.
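To make the consistency concern concrete, here is a rough in-memory sketch of the two-map layout in plain Java - ConcurrentHashMap is only standing in for the two column families, hashes are hex strings for brevity, and none of this is real Cassandra API:

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // In-memory stand-in for the two maps; it only exists to make the
    // consistency question concrete - there are no Cassandra calls here.
    public class AssocSketch {

        // Forward map: one SRC -> many DST values.
        private final ConcurrentHashMap<String, Set<String>> srcToDst =
                new ConcurrentHashMap<>();

        // Reverse map: one DST -> exactly one SRC (the uniqueness-critical one).
        private final ConcurrentHashMap<String, String> dstToSrc =
                new ConcurrentHashMap<>();

        // newAssoc(s, d): must fail if d is already claimed by a different s.
        public boolean newAssoc(String src, String dst) {
            // putIfAbsent is an atomic check-and-set within one JVM; with two
            // separate Cassandra writes there would be a window between this
            // check and the forward-map insert below.
            String existing = dstToSrc.putIfAbsent(dst, src);
            if (existing != null && !existing.equals(src)) {
                return false; // DST already mapped to another SRC
            }
            srcToDst.computeIfAbsent(src, k -> ConcurrentHashMap.newKeySet()).add(dst);
            return true;
        }

        // lookupAssoc(d) => s
        public String lookupAssoc(String dst) {
            return dstToSrc.get(dst);
        }

        // delAssoc(d): remove from both maps - again two steps, not one transaction.
        public void delAssoc(String dst) {
            String src = dstToSrc.remove(dst);
            if (src != null) {
                Set<String> dsts = srcToDst.get(src);
                if (dsts != null) {
                    dsts.remove(dst);
                }
            }
        }

        // listAssoc(s) => all DSTs currently associated with a SRC
        public Set<String> listAssoc(String src) {
            return srcToDst.getOrDefault(src, Set.of());
        }
    }

In the real system, each of those map operations would become a read or write against a different store, and it's the window between the reverse-map check and the forward-map insert - closed above by putIfAbsent - that I can't see how to close without something transaction-like.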
Any hints, tips, comments, pointers to relevant documentation, etc. would be much appreciated... I'm guessing many others have tackled a problem similar to this?
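P.S. In case it clarifies what I mean by a log of 'insert'/'delete' operations merged in the background, here is an equally rough single-JVM sketch. It reuses the AssocSketch class above; the BlockingQueue is only a stand-in for whatever durable, partitioned log structure would really be needed, and again none of this is Cassandra API:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Ad-hoc operation log plus background merge - purely illustrative.
    public class AssocLogSketch {

        enum Op { INSERT, DELETE }

        // One log record per newAssoc/delAssoc call.
        static final class LogEntry {
            final Op op;
            final String src; // null for deletes - delAssoc only needs the DST side
            final String dst;
            LogEntry(Op op, String src, String dst) {
                this.op = op;
                this.src = src;
                this.dst = dst;
            }
        }

        private final BlockingQueue<LogEntry> pending = new LinkedBlockingQueue<>();
        private final AssocSketch index = new AssocSketch(); // the two maps above

        // The front end only appends; whether the uniqueness check succeeds is
        // discovered later, which is exactly the "notified ASAP" problem.
        public void submitInsert(String src, String dst) {
            pending.add(new LogEntry(Op.INSERT, src, dst));
        }

        public void submitDelete(String dst) {
            pending.add(new LogEntry(Op.DELETE, null, dst));
        }

        // Background batch: drain the log and apply it to the two maps.
        public void applyBatch() {
            LogEntry e;
            while ((e = pending.poll()) != null) {
                if (e.op == Op.INSERT) {
                    if (!index.newAssoc(e.src, e.dst)) {
                        // Conflict: DST already owned by another SRC - this
                        // somehow has to be reported back to the original caller.
                    }
                } else {
                    index.delAssoc(e.dst);
                }
            }
        }
    }

Whether something like applyBatch is a process I'd have to write from scratch, or whether Cassandra's own write path already gives me the equivalent, is the crux of my question above.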