You could try the following:

i:20110728 {
   tx00001="va1",
   tx00002="va1",
   tx00003="va1",
   tx00004="va1",
   tx00005="va1",
   tx00006="va1",
}


The value could be either a blob (e.g. a JSON-serialized POJO) or a
reference to another row that stores the columns representing the value.
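
As a rough illustration of those two options, here is a minimal Python
sketch. The function names and the "tx:" prefix for the detail row key are
made up for this example and are not part of any Cassandra API.

```python
import json

def inline_value(tx: dict) -> str:
    """Option 1: store the whole transaction as a JSON blob in the column value."""
    return json.dumps(tx, sort_keys=True)

def reference_value(tx_id: str) -> str:
    """Option 2: store only the key of a separate row holding the details.

    The "tx:" prefix is a hypothetical naming convention for that detail row.
    """
    return f"tx:{tx_id}"
```

With the blob, one read of the index row returns everything. With the
reference, each transaction needs a second lookup, but the index row stays
small.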

Taking it further, appending hash(val) % 512 to the row key will
distribute the logical row across the cluster for added resilience.

The above becomes:
i:20110728:A9 {
   tx00001="va1",
   tx00008="va1",
}
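
A minimal Python sketch of how such a bucketed row key could be built,
assuming an MD5 digest as the hash and an uppercase hex suffix (both are
my assumptions; the scheme above only says hash(val) % 512). The helper
names are hypothetical.

```python
import hashlib

NUM_BUCKETS = 512  # from the hash(val) % 512 scheme

def bucket_for(val: bytes, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a value to a bucket number in [0, num_buckets).

    A stable digest is used instead of Python's built-in hash(), which is
    salted per process for str/bytes and so would differ between writers.
    """
    digest = hashlib.md5(val).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets

def row_key(day: str, val: bytes) -> str:
    """Build a key like i:20110728:A9 (day plus hex bucket suffix)."""
    return f"i:{day}:{bucket_for(val):X}"

def all_row_keys(day: str, num_buckets: int = NUM_BUCKETS) -> list:
    """Every physical row key that makes up the logical row for one day."""
    return [f"i:{day}:{b:X}" for b in range(num_buckets)]
```

Note that a reader who wants the whole day then has to fetch all 512 row
keys from all_row_keys(day) and merge the columns client-side.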


p
On 28/07/11 15:45, Kent Narling wrote:
> Hi! 
> 
> I am considering to use cassandra for clustered transaction logging in a
> project. 
> 
> What I need are in principle 3 functions: 
> 
> 1 - Log transaction with a unique (but possibly non-sequential) id 
> 2 - Fetch transaction with a specific id 
> 3 - Fetch X new transactions "after" a specific cursor/transaction 
>      This function must be guaranteed to: 
>      A, eventually return all known transactions 
>      B, Not return the same transaction more than once 
>      The order of the fetched transactions does not have to be strictly
>      time-sorted, but in practice it probably has to be based on some
>      time-oriented order to be able to support cursors. 
> 
> I can see that 1 & 2 are trivial to solve in Cassandra, but is there any
> elegant way to solve 3? 
> Since there might be multiple nodes logging transactions, their clocks
> might not be perfectly synchronized (to millisec level) etc so sorting
> on time is not stable. 
> Possibly creating a synchronized incremental id might be one option but
> that could create a cluster bottleneck etc? 
> 
> Another alternative might be to use cassandra for 1 & 2 and then store
> an ordered list of id:s in a standard DB. This might be a reasonable
> compromise since 3 is less critical from a HA point of view, but maybe
> someone can point me to a more elegant solution using Cassandra? 
