A strategy that should cover at least some use cases is roughly this:

Given that CFs A and B should be in sync:
On the write of 'a' to CF A, add another column 'Synchronisation_token' and
write a TimeUUID 'T' (or a timestamp, or some other value that allows
time-based ordering) as its value.
On the related write to CF B, write the same token as well.
When reading, check on the client side whether the tokens match, and reread
the data with the lower token until they do.
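
A minimal sketch of the token scheme described above, for illustration only. It assumes two in-memory dicts standing in for CF A and CF B; the helper names (new_token, write_both, read_in_sync) are hypothetical and not part of any Cassandra client. A real implementation would replace the dict accesses with client reads and writes.

```python
import time
import uuid

# Assumption: plain dicts stand in for the two column families.
cf_a = {}
cf_b = {}

TOKEN = "Synchronisation_token"


def new_token():
    # Version-1 UUIDs embed a timestamp, exposed as .time, which allows ordering.
    return uuid.uuid1()


def write_both(key, value_a, value_b):
    """Write the related rows to both CFs, tagged with the same token."""
    token = new_token()
    cf_a[key] = {"value": value_a, TOKEN: token}
    cf_b[key] = {"value": value_b, TOKEN: token}
    return token


def read_in_sync(key, max_attempts=5, delay=0.01):
    """Read both rows; reread the side with the lower token until they match."""
    row_a, row_b = cf_a.get(key), cf_b.get(key)
    for _ in range(max_attempts):
        if row_a is not None and row_b is not None:
            if row_a[TOKEN] == row_b[TOKEN]:
                return row_a["value"], row_b["value"]
            # The row carrying the older (lower) token is the stale one.
            a_is_stale = row_a[TOKEN].time < row_b[TOKEN].time
        else:
            # A missing row counts as stale.
            a_is_stale = row_a is None
        time.sleep(delay)
        if a_is_stale:
            row_a = cf_a.get(key)
        else:
            row_b = cf_b.get(key)
    raise RuntimeError("rows for %r did not converge" % (key,))
```

The reader gives up after max_attempts because the second write may never arrive (for example, if the writer crashed), in which case the caller has to decide how to handle the missing relation.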


Roland


On 10.04.2011 at 03:53, "aaron morton"
<aa...@thelastpickle.com> wrote:

My understanding of what they did with locking (based on the examples) was to
achieve a level of transaction isolation:
http://en.wikipedia.org/wiki/Isolation_(database_systems)

I think the issue here is more about atomicity:
http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

We cannot guarantee that all or none of the mutations in your batch are
completed. There is some work in this area though:
https://issues.apache.org/jira/browse/CASSANDRA-1684

AFAIK the best approach now is to work at QUORUM and write your code to handle
missing relations. Also, Cassandra does a lot of work up front, before the
write starts, to ensure it will succeed; failures during a write will probably
be due to a SW/HW failure or overload on a node that gossip has not picked up.

Retrying is the recommended approach when a request fails.
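
A rough sketch of client-side retrying, for illustration only. WriteFailed and with_retries are hypothetical names, not part of Hector, Pelops, or any other client; a real client would raise its own timeout or unavailable errors. Retrying is only safe when the operation is idempotent, for instance when it rewrites the same columns with the same timestamp.

```python
import time


class WriteFailed(Exception):
    """Hypothetical stand-in for a client's timeout/unavailable error."""


def with_retries(operation, attempts=3, base_delay=0.05):
    """Call operation(); on WriteFailed, back off exponentially and retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except WriteFailed:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle it
            time.sleep(base_delay * 2 ** attempt)
```

Each of the related writes would be wrapped separately, e.g. with_retries(lambda: do_write(...)), where do_write is whatever call the client library provides.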

Hope that helps.
Aaron

On 9 Apr 2011, at 15:58, Dan Washusen wrote:

Here's a good writeup on how fightmymonster.com does it...

http://ria101.wordpress.com/category/nosql-databases/locking/

--
Dan Washusen
Make big files fly
visit digitalpigeon.com

On Saturday, 9 April 2011 at 11:53 AM, Alex Araujo wrote:

On 4/8/11 5:46 PM, Drew Kutcharian wrote:
I'm interested in this too, but I don't think this can be done with Cassandra
alone. Cassandra doesn't support transactions. I think Hector can retry
operations, but I'm not sure about the atomicity of the whole thing.



On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote:

Hi, I was wondering if there are any patterns/best practices for creating 
atomic units of work when dealing with several column families and their 
inverted indices.

For example, if I have Users and Groups column families and did something like:

Users.insert( user_id, columns )
UserGroupTimeline.insert( group_id, { timeuuid() : user_id } )
UserGroupStatus.insert( group_id + ":" + user_id, { "Active" : "True" } )
UserEvents.insert( timeuuid(), { "user_id" : user_id, "group_id" : group_id, 
"event_type" : "join" } )

Would I want the client to retry all subsequent operations that failed against 
other nodes after n succeeded, maintain an "undo" queue of operations to run, 
batch the mutations and choose a strong consistency level, some combination of 
these/others, etc?

Thanks,
Alex
Thanks Drew. I'm familiar with the lack of transactions and have read about
people using ZK (possibly Cages as well?) to accomplish this, but since
it seems that inverted indices are commonplace, I'm interested in how
anyone is mitigating the lack of atomicity to any extent without the use of
such tools. It appears that Hector and Pelops have retrying built into
their APIs, and I'm fairly confident that proper use of those
capabilities may help. Just trying to cover all bases. Hopefully
someone can share their approaches and/or experiences. Cheers, Alex.

