I want to answer the first question, why one might use Cassandra as a queuing solution:

- It's the only open-source distributed persistence layer (i.e. no SPOF) that you can run over a WAN and that provides LAN/WAN-specific quorum controls.

I know it's suboptimal, as the deletions impose additional compaction/repair penalties, but there is no other solution I am aware of. Most AMQP solutions are broker-based and clustering is a pain, while things like Riak only support WAN-based clusters in their commercial offering. I would love to know about other alternatives.
And thanks for sharing the Ruby-based priority queue prototype; it helps people like me (sysadmin :-) ) explore these concepts better.

cheers
ranjib

On Mon, Oct 6, 2014 at 1:35 PM, Jan Algermissen <jan.algermis...@nordsc.com> wrote:

> Shane,
>
> On 06 Oct 2014, at 16:34, Shane Hansen <shanemhan...@gmail.com> wrote:
>
> > Sorry if I'm hijacking the conversation, but why in the world would you want
> > to implement a queue on top of Cassandra? It seems like using a proper queuing service
> > would make your life a lot easier.
>
> Agreed - however, the use case simply does not justify the additional operations.
>
> > That being said, there might be a better way to play to the strengths of C*. Ideally everything you do
> > is append only with few deletes or updates. So an interesting way to implement a queue might be
> > to do one insert to put the job in the queue and another insert to mark the job as done or in process
> > or whatever. This would also give you the benefit of being able to replay the state of the queue.
>
> Thanks, I’ll try that, too.
>
> Jan
>
> > On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <jan.algermis...@nordsc.com> wrote:
> > Chris,
> >
> > thanks for taking a look.
> >
> > On 06 Oct 2014, at 04:44, Chris Lohfink <clohf...@blackbirdit.com> wrote:
> >
> > > It appears you are aware of the tombstone effect that leads people to label this an anti-pattern. Without "due" or any time-based value being part of the partition key, you will still get a lot of buildup. You only have 1 partition per shard, which just linearly decreases the tombstones. That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards.
> >
> > Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal with are not really high throughput; it is unlikely we’ll ever reach multiple messages per second. The emphasis is also more on coordinating producer and consumer than on high-volume capacity problems.
> >
> > Your comment seems to suggest including larger time frames (e.g. the due-hour) in the partition keys and using the current time to select the active partitions (e.g. the shards of the hour). Once an hour has passed, the corresponding shards will never be touched again.
> >
> > Am I understanding this correctly?
> >
> > > You may want to consider switching to LCS from the default STCS since you are re-writing to the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios, with low enough gc_grace, when it never gets higher than L1 it will be sameish write throughput. In scenarios where you get more, LCS will shine, I suspect, by reducing the number of obsolete tombstones. Would be hard to identify the difference in small tests, I think.
> >
> > Thanks, I’ll try to explore the various effects.
> >
> > > What's the plan to prevent two consumers from reading the same message off of a queue? You mention in the docs you will address it at a later point in time but it's kinda a biggy. Big lock & batch reads like the astyanax recipe?
> >
> > I have included a static column per shard to act as a lock (the 'lock' column in the examples) in combination with conditional updates.
> >
> > I must admit, I have not quite understood what Netflix is doing in terms of coordination - but since performance isn’t our concern, CAS should do fine, I guess(?)
> >
> > Thanks again,
> >
> > Jan
> >
> > > ---
> > > Chris Lohfink
> > >
> > > On Oct 5, 2014, at 6:03 PM, Jan Algermissen <jan.algermis...@nordsc.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> I have put together some thoughts on realizing simple queues with Cassandra.
> > >>
> > >> https://github.com/algermissen/cassandra-ruby-queue
> > >>
> > >> The design is inspired by (the much more sophisticated) Netflix approach[1] but very reduced.
> > >>
> > >> Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took.
> > >>
> > >> Jan
> > >>
> > >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
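
For anyone who wants to play with the append-only idea Shane describes in the quoted thread (one insert to enqueue a job, another insert for each state change, no deletes), here is a minimal in-memory sketch in Python. It is not Cassandra code and it is not taken from Jan's prototype: the class AppendOnlyQueue, its method names, and the state names "queued", "in_process" and "done" are all made up for illustration, and a plain list stands in for an append-only table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """One immutable row in the log: a job and the state it entered."""
    job_id: str
    state: str  # hypothetical states: "queued", "in_process", "done"


class AppendOnlyQueue:
    """Toy queue whose only write operation is an append (no updates, no deletes)."""

    def __init__(self) -> None:
        self.log: List[Event] = []  # stands in for an append-only table

    def enqueue(self, job_id: str) -> None:
        # First insert: put the job in the queue.
        self.log.append(Event(job_id, "queued"))

    def mark(self, job_id: str, state: str) -> None:
        # Further inserts: record a state change instead of deleting the job.
        self.log.append(Event(job_id, state))

    def replay(self, upto: Optional[int] = None) -> Dict[str, str]:
        """Rebuild the latest state per job from the first `upto` events (all if None)."""
        states: Dict[str, str] = {}
        for event in self.log[:upto]:
            states[event.job_id] = event.state  # later events win
        return states

    def pending(self) -> List[str]:
        """Jobs whose latest recorded state is still "queued", in enqueue order."""
        return [job for job, state in self.replay().items() if state == "queued"]
```

Because nothing is ever overwritten, replay(upto=n) rebuilds the queue as it looked after the first n inserts, which is the replay benefit mentioned in the thread. The log in this sketch only ever grows; how to expire old entries in a real table is deliberately left out.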