You could try the following:
i:20110728 {
    tx00001="va1",
    tx00002="va1",
    tx00003="va1",
    tx00004="va1",
    tx00005="va1",
    tx00006="va1",
}

The value could either be a blob / JSON POJO, or a reference off to another row storing the columns that represent the value.

Taking it further, adding hash(val) % 512 to the row key will distribute the logical row across the cluster for added resilience. The above becomes:

i:20110728:A9 {
    tx00001="va1",
    tx00008="va1",
}

p

On 28/07/11 15:45, Kent Narling wrote:
> Hi!
>
> I am considering using Cassandra for clustered transaction logging in a
> project.
>
> What I need are, in principle, 3 functions:
>
> 1 - Log a transaction with a unique (but possibly non-sequential) id
> 2 - Fetch a transaction with a specific id
> 3 - Fetch X new transactions "after" a specific cursor/transaction
>     This function must be guaranteed to:
>     A, eventually return all known transactions
>     B, not return the same transaction more than once
>     The order of the transactions fetched does not have to be strictly
>     time-sorted, but in practice it probably has to be based on some
>     time-oriented order to be able to support cursors.
>
> I can see that 1 & 2 are trivial to solve in Cassandra, but is there any
> elegant way to solve 3?
> Since there might be multiple nodes logging transactions, their clocks
> might not be perfectly synchronized (to the millisecond level), so
> sorting on time is not stable.
> Creating a synchronized incremental id might be one option, but that
> could create a cluster bottleneck.
>
> Another alternative might be to use Cassandra for 1 & 2 and then store
> an ordered list of ids in a standard DB. This might be a reasonable
> compromise, since 3 is less critical from an HA point of view, but maybe
> someone can point me to a more elegant solution using Cassandra?
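A minimal sketch of the bucketed row-key idea from the top of this reply (the helper names, the hash function, and hex-encoding of the bucket are my assumptions, not from the post; only the `i:<day>:<bucket>` shape and the 512-bucket count come from the example):

```python
import hashlib

NUM_BUCKETS = 512  # bucket count taken from the hash(val) % 512 example


def row_key(day: str, tx_id: str) -> str:
    """Compute a bucketed row key of the form 'i:20110728:A9'.

    Using a stable hash of the transaction id (md5 here, as an
    illustrative choice) means every writer derives the same bucket
    without coordination, and one logical day-row is spread over up
    to 512 physical rows in the cluster.
    """
    digest = hashlib.md5(tx_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest, "big") % NUM_BUCKETS
    return f"i:{day}:{bucket:X}"


def all_row_keys(day: str):
    """Enumerate every bucket row for a day.

    A consumer answering "fetch X new transactions after a cursor"
    would read columns from each of these rows and merge them,
    carrying the last-seen (day, column name) as its cursor.
    """
    return [f"i:{day}:{b:X}" for b in range(NUM_BUCKETS)]
```

Since the bucket is a pure function of the transaction id, writes need no shared counter, and a reader that walks all 512 rows for a day eventually sees every transaction exactly once, which is what requirement 3 asks for.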