FWIW I would recommend first trying to solve the issue in your application rather than with Cages or ZooKeeper. I have no experience with Cages or ZooKeeper, but either one would be another major server component in your stack.

If you really do have a queue with multiple simultaneous readers, consider using something like RabbitMQ http://www.rabbitmq.com/ . Or try something like Redis http://code.google.com/p/redis/ or Gearman http://gearman.org/ to get a quick prototype going.
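
To illustrate the "queue with multiple simultaneous readers" idea, here is a minimal in-process sketch using only the Python standard library. It is not RabbitMQ, Redis, or Gearman code; the function name drain, the job-N worker names, and the row-N keys are all made up for the example. The point it shows is that when jobs pull row keys from a single queue, each key is handed to exactly one job, so no explicit row lock is needed.

```python
import queue
import threading


def drain(row_keys, num_workers=4):
    """Hand each row key to exactly one of num_workers competing consumers.

    Hypothetical helper for illustration only. Returns a dict mapping
    each row key to the list of workers that handled it.
    """
    work = queue.Queue()
    for key in row_keys:
        work.put(key)

    handled = {}
    handled_lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # Queue.get_nowait() is thread-safe: a given item is
                # returned to only one caller.
                key = work.get_nowait()
            except queue.Empty:
                return  # all keys were enqueued up front, so we are done
            # Real per-row processing would go here.
            with handled_lock:
                handled.setdefault(key, []).append(name)

    threads = [
        threading.Thread(target=worker, args=(f"job-{i}",))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


if __name__ == "__main__":
    result = drain([f"row-{i}" for i in range(20)], num_workers=4)
    # Every row should have been handled by exactly one job.
    print(all(len(workers) == 1 for workers in result.values()))
```

A real message broker gives the same one-consumer-per-message behaviour across processes and machines, which is the part this single-process sketch does not cover.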

Hope that helps. 
Aaron

On 08 Nov, 2010, at 02:05 PM, Mubarak Seyed <mubarak.se...@gmail.com> wrote:

Hi All,

Can someone please validate and recommend a solution for the given design problem?

Problem statement: We need to de-queue data from Cassandra (from a Standard ColumnFamily) using a job. Multiple instances of the job can run simultaneously (kind of like multiple threads) and may try to access the same row, but we need to make sure that only one instance of the job (thread) can access a given row at a time. That is, if job A is accessing Row #1, then job B can't access Row #1.

Possible solutions:

Solution #1: Using Cages (and ZooKeeper) to make sure that only one job at a time can access a row in the CF. How do we make sure that Cages (a transaction coordinator using ZooKeeper) is not a Single Point of Failure? What is the performance impact on writes/reads on the nodes? There is a blog post on building a distributed concurrent queue at http://www.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/

Solution #2: Using some home-grown approach to store/maintain who is accessing what, meaning which job is accessing which row.

Are there any other solutions to the above problem?  

Can someone please help me validate the design?

--
Thanks,
Mubarak Seyed.
