Hi Jan,

Both Chris and Shane have said what I believe is the correct thinking.
Just to let you know: if you base your implementation on Netflix's queue recipe, there are many issues with it. In general we don't advise people to use that recipe, so I suggest you save your time by not going down that same route again.

Minh

On Mon, Oct 6, 2014 at 7:34 AM, Shane Hansen <shanemhan...@gmail.com> wrote:
> Sorry if I'm hijacking the conversation, but why in the world would you
> want to implement a queue on top of Cassandra? It seems like using a
> proper queuing service would make your life a lot easier.
>
> That being said, there might be a better way to play to the strengths of
> C*. Ideally everything you do is append-only, with few deletes or
> updates. So an interesting way to implement a queue might be to do one
> insert to put the job in the queue and another insert to mark the job as
> done or in process or whatever. This would also give you the benefit of
> being able to replay the state of the queue.
>
> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <jan.algermis...@nordsc.com> wrote:
>> Chris,
>>
>> thanks for taking a look.
>>
>> On 06 Oct 2014, at 04:44, Chris Lohfink <clohf...@blackbirdit.com> wrote:
>>
>>> It appears you are aware of the tombstone effect that leads people to
>>> label this an anti-pattern. Without "due" or any time-based value
>>> being part of the partition key, you will still get a lot of buildup.
>>> You only have one partition per shard, which just linearly decreases
>>> the tombstones. That isn't likely to be enough to really help in a
>>> situation of high queue throughput, especially with the default of 4
>>> shards.
>>
>> Yes, dealing with the tombstone effect is the whole point. The
>> workloads I have to deal with are not really high throughput; it is
>> unlikely we'll ever reach multiple messages per second. The emphasis is
>> also more on coordinating producer and consumer than on high-volume
>> capacity problems.
>>
>> Your comment seems to suggest including larger time frames (e.g. the
>> due-hour) in the partition keys and using the current time to select
>> the active partitions (e.g. the shards of the hour). Once an hour has
>> passed, the corresponding shards will never be touched again.
>>
>> Am I understanding this correctly?
>>
>>> You may want to consider switching to LCS from the default STCS,
>>> since you are re-writing the same partitions a lot. It will still use
>>> STCS in L0, so in high write/delete scenarios with a low enough
>>> gc_grace, as long as data never gets higher than L1 the write
>>> throughput will be about the same. In scenarios where you get more
>>> data, I suspect LCS will shine by reducing the number of obsolete
>>> tombstones. It would be hard to identify the difference in small
>>> tests, I think.
>>
>> Thanks, I'll try to explore the various effects.
>>
>>> What's the plan to prevent two consumers from reading the same
>>> message off of a queue? You mention in the docs that you will address
>>> it at a later point in time, but it's kind of a biggy. A big lock and
>>> batch reads, like the Astyanax recipe?
>>
>> I have included a static column per shard to act as a lock (the 'lock'
>> column in the examples) in combination with conditional updates.
>>
>> I must admit, I have not quite understood what Netflix is doing in
>> terms of coordination - but since performance isn't our concern, CAS
>> should do fine, I guess(?)
>>
>> Thanks again,
>>
>> Jan
>>
>>> ---
>>> Chris Lohfink
>>>
>>> On Oct 5, 2014, at 6:03 PM, Jan Algermissen <jan.algermis...@nordsc.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have put together some thoughts on realizing simple queues with
>>>> Cassandra.
>>>>
>>>> https://github.com/algermissen/cassandra-ruby-queue
>>>>
>>>> The design is inspired by (the much more sophisticated) Netflix
>>>> approach [1], but very reduced.
>>>>
>>>> Given that I am still a C* newbie, I'd be very glad to hear some
>>>> thoughts on the design path I took.
>>>>
>>>> Jan
>>>>
>>>> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
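
For what it's worth, the time-bucketed sharding discussed above (a due-hour plus a shard number in the partition key, so expired hours' tombstones are never read again) could be sketched in CQL roughly like this. All table and column names here are illustrative assumptions, not taken from Jan's repo:

```sql
-- Sketch only: partition key combines the due-hour bucket with a shard
-- number, so consumers only ever touch partitions for the current hour.
-- Once an hour has passed, its partitions (and their tombstones) are
-- simply never queried again.
CREATE TABLE queue_messages (
    due_hour  timestamp,   -- e.g. '2014-10-06 07:00:00', truncated to the hour
    shard     int,         -- 0..3 with the default of 4 shards
    msg_id    timeuuid,    -- message id, also gives arrival order
    payload   text,
    lock      text static, -- per-shard lock column (one value per partition)
    PRIMARY KEY ((due_hour, shard), msg_id)
);
```

A consumer would compute the current hour client-side and iterate over the shards of that hour, so no partition grows without bound and old tombstones fall out of the read path naturally.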
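
Shane's append-only suggestion, recording completion with a second insert rather than a delete, might look something like the following sketch (again with assumed, hypothetical names):

```sql
-- Sketch only: state changes are themselves inserts, so nothing is ever
-- deleted, no tombstones accumulate, and the full history of the queue
-- can be replayed.
CREATE TABLE queue_events (
    msg_id     timeuuid,
    event_time timeuuid,
    state      text,       -- e.g. 'queued', 'processing', 'done'
    PRIMARY KEY (msg_id, event_time)
);

-- Enqueue and completion are both plain inserts; the msg_id below is a
-- made-up example value carried over from the enqueue step.
INSERT INTO queue_events (msg_id, event_time, state)
VALUES (adf9f420-4d2f-11e4-916c-0800200c9a66, now(), 'done');
```

The consumer then treats the latest event per msg_id as the current state; the trade-off is that reads must filter out already-completed messages client-side or via a separate index.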
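
Jan's static-column lock with conditional updates could be sketched along these lines; the statements are an assumption about how such a lock might work, not code from the repo:

```sql
-- Sketch only: a consumer tries to take the shard lock with a lightweight
-- transaction (CAS). The static column is shared by all rows in the
-- partition, so one successful conditional update locks the whole shard.
-- A TTL lets the lock expire if the consumer crashes.
UPDATE queue_messages USING TTL 30
SET lock = 'consumer-42'
WHERE due_hour = '2014-10-06 07:00:00' AND shard = 0
IF lock = null;
```

If the result row shows `[applied] = true`, the consumer owns the shard and can read and process a batch of messages before the TTL runs out; if not, it moves on to the next shard. As Jan notes, LWTs cost extra round trips, which is acceptable here since throughput is not the concern.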