Shane,

On 06 Oct 2014, at 16:34, Shane Hansen <shanemhan...@gmail.com> wrote:
> Sorry if I'm hijacking the conversation, but why in the world would you
> want to implement a queue on top of Cassandra? It seems like using a
> proper queuing service would make your life a lot easier.

Agreed - however, the use case simply does not justify the additional
operational effort.

> That being said, there might be a better way to play to the strengths of
> C*. Ideally everything you do is append-only, with few deletes or
> updates. So an interesting way to implement a queue might be to do one
> insert to put the job in the queue and another insert to mark the job as
> done or in process or whatever. This would also give you the benefit of
> being able to replay the state of the queue.

Thanks, I’ll try that, too.

Jan
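A minimal sketch of the insert-only pattern Shane describes, using the DataStax Python driver. The queue_events table, its columns, and the event names are illustrative assumptions, not code from the cassandra-ruby-queue repo:

```python
from uuid import uuid1
from collections import defaultdict
from cassandra.cluster import Cluster

# Assumed schema (illustrative only):
# CREATE TABLE queue_events (
#     queue   text,
#     job_id  timeuuid,
#     event   text,      -- 'enqueued' or 'done'
#     payload text,
#     PRIMARY KEY ((queue), job_id, event)
# );
session = Cluster(['127.0.0.1']).connect('queues')

def enqueue(queue, payload):
    # One insert puts the job in the queue ...
    job_id = uuid1()
    session.execute(
        "INSERT INTO queue_events (queue, job_id, event, payload) "
        "VALUES (%s, %s, 'enqueued', %s)", (queue, job_id, payload))
    return job_id

def mark_done(queue, job_id):
    # ... and a second insert marks it done: no deletes, no tombstones.
    session.execute(
        "INSERT INTO queue_events (queue, job_id, event) "
        "VALUES (%s, %s, 'done')", (queue, job_id))

def pending_jobs(queue):
    # Replay the event log: a job is pending until a 'done' event exists.
    seen = defaultdict(set)
    for row in session.execute(
            "SELECT job_id, event FROM queue_events WHERE queue = %s",
            (queue,)):
        seen[row.job_id].add(row.event)
    return [job_id for job_id, events in seen.items()
            if 'done' not in events]
```

Because every state change is an insert, the full history stays queryable, which is what makes replaying the state of the queue possible.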
> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen
> <jan.algermis...@nordsc.com> wrote:
> Chris,
>
> thanks for taking a look.
>
> On 06 Oct 2014, at 04:44, Chris Lohfink <clohf...@blackbirdit.com> wrote:
>
> > It appears you are aware of the tombstone effect that leads people to
> > label this an anti-pattern. Without "due" or any time-based value being
> > part of the partition key, you will still get a lot of build-up. You
> > only have one partition per shard, which just linearly decreases the
> > tombstones. That isn't likely to be enough to really help in a
> > situation of high queue throughput, especially with the default of 4
> > shards.
>
> Yes, dealing with the tombstone effect is the whole point. The workloads
> I have to deal with are not really high-throughput; it is unlikely we’ll
> ever reach multiple messages per second. The emphasis is also more on
> coordinating producer and consumer than on high-volume capacity
> problems.
>
> Your comment seems to suggest including larger time frames (e.g. the
> due-hour) in the partition keys and using the current time to select the
> active partitions (e.g. the shards of the hour). Once an hour has
> passed, the corresponding shards will never be touched again.
>
> Am I understanding this correctly?
>
> > You may want to consider switching to LCS from the default STCS, since
> > you are re-writing the same partitions a lot. It will still use STCS in
> > L0, so in high write/delete scenarios with low enough gc_grace, when it
> > never gets higher than L1, write throughput will be about the same. In
> > scenarios where you get more levels, I suspect LCS will shine by
> > reducing the number of obsolete tombstones. It would be hard to
> > identify the difference in small tests, I think.
>
> Thanks, I’ll explore the various effects.
>
> > What's the plan to prevent two consumers from reading the same message
> > off of a queue? You mention in the docs that you will address it at a
> > later point in time, but it's kind of a biggie. Big lock & batch reads
> > like the astyanax recipe?
>
> I have included a static column per shard to act as a lock (the ’lock’
> column in the examples) in combination with conditional updates.
>
> I must admit, I have not quite understood what Netflix is doing in terms
> of coordination - but since performance isn’t our concern, CAS should do
> fine, I guess(?)
>
> Thanks again,
>
> Jan
>
> > ---
> > Chris Lohfink
> >
> > On Oct 5, 2014, at 6:03 PM, Jan Algermissen
> > <jan.algermis...@nordsc.com> wrote:
> >
> >> Hi,
> >>
> >> I have put together some thoughts on realizing simple queues with
> >> Cassandra.
> >>
> >> https://github.com/algermissen/cassandra-ruby-queue
> >>
> >> The design is inspired by (the much more sophisticated) Netflix
> >> approach [1], but very reduced.
> >>
> >> Given that I am still a C* newbie, I’d be very glad to hear some
> >> thoughts on the design path I took.
> >>
> >> Jan
> >>
> >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
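For illustration, a sketch of the due-hour sharding Jan and Chris discuss above, combined with Chris's suggestion to switch to LCS. The queue_messages table and its layout are assumptions for the example, not the repo's actual schema:

```python
from datetime import datetime, timezone
from cassandra.cluster import Cluster

SHARDS = 4  # the default shard count Chris mentions

session = Cluster(['127.0.0.1']).connect('queues')

# The due-hour is part of the partition key, so consumers only ever read
# the partitions of the current hour; once an hour has passed, its
# partitions -- and their tombstones -- are never touched again. LCS
# replaces the default STCS so obsolete tombstones are compacted away
# sooner.
session.execute("""
    CREATE TABLE IF NOT EXISTS queue_messages (
        queue    text,
        due_hour text,     -- e.g. '2014-10-06T14'
        shard    int,
        msg_id   timeuuid,
        payload  text,
        PRIMARY KEY ((queue, due_hour, shard), msg_id)
    ) WITH compaction = {'class': 'LeveledCompactionStrategy'}
""")

def active_partitions(queue):
    # The current time selects the active shards; older partitions are
    # simply never queried again.
    due_hour = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H')
    return [(queue, due_hour, shard) for shard in range(SHARDS)]
```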
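And a sketch of the lock Jan describes: a static column per shard partition, claimed with a conditional update (CAS). The column name, the TTL, and the table are again assumptions rather than the repo's actual code:

```python
import uuid
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('queues')

# Assumed addition to the shard table: a static column shared by all
# rows of a partition, e.g.
#   ALTER TABLE queue_messages ADD lock uuid static;

def try_lock_shard(queue, due_hour, shard, owner):
    # The conditional update succeeds only if no other consumer holds
    # the lock; was_applied reports whether we won the race.
    result = session.execute(
        "UPDATE queue_messages USING TTL 30 SET lock = %s "
        "WHERE queue = %s AND due_hour = %s AND shard = %s "
        "IF lock = null",
        (owner, queue, due_hour, shard))
    return result.was_applied

owner = uuid.uuid4()
if try_lock_shard('jobs', '2014-10-06T14', 0, owner):
    pass  # this consumer now owns the shard and may read its messages
```

The TTL means a crashed consumer's lock expires on its own, so no explicit unlock path is strictly required; as Jan notes, with throughput this low the cost of CAS should not matter.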