Re: Atomicity Strategies

Roland Gude Sun, 10 Apr 2011 11:47:17 -0700

A strategy that should cover at least some use cases is roughly like this:

Given CFs A and B that should be in sync:
In the write 'a' to CF A, add another column 'Synchronisation_token' and write a
TimeUUID 'T' (or a timestamp or some other value that allows (time-based)
ordering) as its value.
On the related write to CF B, write the token as well.
When reading, check client side whether the tokens match, and reread the data
with the lower token until they do.
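
The token scheme above can be sketched roughly as follows. This is only a
minimal Python illustration under assumptions, not a client implementation:
plain dicts stand in for the two column families, and the names cf_a, cf_b,
write_pair and read_pair are made up for the example. A real client would
issue the reads and writes against the cluster instead.

```python
import uuid

# Hypothetical in-memory stand-ins for column families A and B:
# row key -> {column name: value}
cf_a = {}
cf_b = {}

TOKEN = "Synchronisation_token"


def write_pair(key, cols_a, cols_b):
    """Write the related rows to both stores, tagged with one shared token."""
    token = uuid.uuid1()  # time-based UUID; token.time is an orderable timestamp
    row_a = dict(cols_a)
    row_a[TOKEN] = token
    row_b = dict(cols_b)
    row_b[TOKEN] = token
    cf_a[key] = row_a
    cf_b[key] = row_b
    return token


def read_pair(key, max_attempts=5):
    """Read both rows, rereading the one with the lower token until they match."""
    row_a, row_b = cf_a[key], cf_b[key]
    for _ in range(max_attempts):
        tok_a, tok_b = row_a[TOKEN], row_b[TOKEN]
        if tok_a == tok_b:
            return row_a, row_b
        if tok_a.time < tok_b.time:
            row_a = cf_a[key]  # A is behind: reread A
        else:
            row_b = cf_b[key]  # B is behind: reread B
    raise RuntimeError("rows for %r did not converge" % (key,))
```

The attempt limit is an addition for the sketch: if the second write was lost
rather than delayed, the tokens never match, so the reader needs some way to
give up and report the inconsistency.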

Roland

On 10.04.2011 at 03:53, "aaron morton" <aa...@thelastpickle.com> wrote:

My understanding of what they did with locking (based on the examples) was to
achieve a level of transaction isolation:
http://en.wikipedia.org/wiki/Isolation_(database_systems)

I think the issue here is more about atomicity:
http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

We cannot guarantee that all or none of the mutations in your batch are
completed. There is some work in this area though:
https://issues.apache.org/jira/browse/CASSANDRA-1684

AFAIK the best approach for now is to work at QUORUM and write your code to
handle missing relations. Also, Cassandra does a lot of work up front, before
the write starts, to ensure it will succeed; failures during a write will
probably be due to a SW/HW failure or overload on a node that gossip has not
picked up.

Retrying is the recommended approach when a request fails.
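
As a rough illustration of retrying, here is a minimal Python sketch. The
retry helper and its parameters are hypothetical and not part of any
Cassandra client; libraries such as Hector ship their own retry handling.
The exponential backoff is an assumption of the sketch, not something the
thread prescribes.

```python
import time


def retry(operation, attempts=3, base_delay=0.05, retry_on=(Exception,)):
    """Call operation() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Retrying like this is only safe when the operation can be repeated without
changing the outcome. Any generated value, such as a timeuuid() used as a
column name, should be created once outside the retried call; otherwise each
attempt may add a new column.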

Hope that helps.
Aaron

On 9 Apr 2011, at 15:58, Dan Washusen wrote:

Here's a good writeup on how fightmymonster.com (http://www.fightmymonster.com/)
does it...

http://ria101.wordpress.com/category/nosql-databases/locking/

--
Dan Washusen
Make big files fly
visit digitalpigeon.com <http://digitalpigeon.com/>

On Saturday, 9 April 2011 at 11:53 AM, Alex Araujo wrote:

On 4/8/11 5:46 PM, Drew Kutcharian wrote:
I'm interested in this too, but I don't think this can be done with Cassandra
alone. Cassandra doesn't support transactions. I think Hector can retry
operations, but I'm not sure about the atomicity of the whole thing.

On Apr 8, 2011, at 1:26 PM, Alex Araujo wrote:

Hi, I was wondering if there are any patterns/best practices for creating
atomic units of work when dealing with several column families and their
inverted indices.

For example, if I have Users and Groups column families and did something like:

Users.insert( user_id, columns )
UserGroupTimeline.insert( group_id, { timeuuid() : user_id } )
UserGroupStatus.insert( group_id + ":" + user_id, { "Active" : "True" } )
UserEvents.insert( timeuuid(), { "user_id" : user_id, "group_id" : group_id,
"event_type" : "join" } )

Would I want the client to retry all subsequent operations that failed against
other nodes after n succeeded, maintain an "undo" queue of operations to run,
batch the mutations and choose a strong consistency level, some combination of
these/others, etc?
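
For the "undo" queue option, a minimal Python sketch could look like the
following. The function run_with_undo and the (do, undo) pairing are
hypothetical, chosen only to illustrate the idea. It does not provide
isolation: readers can still see partial state before the undo runs, and an
undo step can itself fail.

```python
import logging


def run_with_undo(steps):
    """Run (do, undo) pairs in order; on failure, undo finished steps in reverse."""
    undo_stack = []
    try:
        for do, undo in steps:
            do()
            undo_stack.append(undo)  # only steps that completed get undone
    except Exception:
        while undo_stack:
            undo = undo_stack.pop()
            try:
                undo()
            except Exception:
                # Best effort: record the failed undo so it can be repaired later.
                logging.exception("undo step failed")
        raise  # re-raise the original failure for the caller
```

Each step from the example would be passed as a pair, for instance the
Users insert together with a matching remove for the same user_id.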

Thanks,
Alex

Thanks Drew. I'm familiar with the lack of transactions and have read about
people using ZK (possibly Cages as well?) to accomplish this, but since
inverted indices seem to be commonplace, I'm interested in how anyone is
mitigating the lack of atomicity to any extent without the use of such
tools. It appears that Hector and Pelops have retrying built in to their
APIs, and I'm fairly confident that proper use of those capabilities may
help. Just trying to cover all bases. Hopefully someone can share their
approaches and/or experiences. Cheers, Alex.
