Shane,

On 06 Oct 2014, at 16:34, Shane Hansen <shanemhan...@gmail.com> wrote:
> Sorry if I'm hijacking the conversation, but why in the world would you
> want to implement a queue on top of Cassandra? It seems like using a
> proper queuing service would make your life a lot easier.

Agreed - however, the use case simply does not justify the additional
operational effort.

> That being said, there might be a better way to play to the strengths of
> C*. Ideally everything you do is append-only, with few deletes or
> updates. So an interesting way to implement a queue might be to do one
> insert to put the job in the queue and another insert to mark the job as
> done or in process or whatever. This would also give you the benefit of
> being able to replay the state of the queue.

Thanks, I’ll try that, too.

Jan
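A minimal sketch of the insert-only pattern Shane describes, using the DataStax Python driver. The queue_events table, its columns, and the event names are illustrative assumptions, not code from the cassandra-ruby-queue repo:

```python
from uuid import uuid1
from collections import defaultdict
from cassandra.cluster import Cluster

# Assumed schema (illustrative only):
# CREATE TABLE queue_events (
#     queue   text,
#     job_id  timeuuid,
#     event   text,      -- 'enqueued' or 'done'
#     payload text,
#     PRIMARY KEY ((queue), job_id, event)
# );
session = Cluster(['127.0.0.1']).connect('queues')

def enqueue(queue, payload):
    # One insert puts the job in the queue ...
    job_id = uuid1()
    session.execute(
        "INSERT INTO queue_events (queue, job_id, event, payload) "
        "VALUES (%s, %s, 'enqueued', %s)", (queue, job_id, payload))
    return job_id

def mark_done(queue, job_id):
    # ... and a second insert marks it done: no deletes, no tombstones.
    session.execute(
        "INSERT INTO queue_events (queue, job_id, event) "
        "VALUES (%s, %s, 'done')", (queue, job_id))

def pending_jobs(queue):
    # Replay the event log: a job is pending until a 'done' event exists.
    seen = defaultdict(set)
    for row in session.execute(
            "SELECT job_id, event FROM queue_events WHERE queue = %s",
            (queue,)):
        seen[row.job_id].add(row.event)
    return [job_id for job_id, events in seen.items()
            if 'done' not in events]
```

Because every state change is an insert, the full history stays queryable, which is what makes replaying the state of the queue possible.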
> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen
> <jan.algermis...@nordsc.com> wrote:
> Chris,
>
> thanks for taking a look.
>
> On 06 Oct 2014, at 04:44, Chris Lohfink <clohf...@blackbirdit.com> wrote:
>
> > It appears you are aware of the tombstone effect that leads people to
> > label this an anti-pattern. Without "due" or any time-based value being
> > part of the partition key, you will still get a lot of build-up. You
> > only have one partition per shard, which just linearly decreases the
> > tombstones. That isn't likely to be enough to really help in a
> > situation of high queue throughput, especially with the default of 4
> > shards.
>
> Yes, dealing with the tombstone effect is the whole point. The workloads
> I have to deal with are not really high-throughput; it is unlikely we’ll
> ever reach multiple messages per second. The emphasis is also more on
> coordinating producer and consumer than on high-volume capacity
> problems.
>
> Your comment seems to suggest including larger time frames (e.g. the
> due-hour) in the partition keys and using the current time to select the
> active partitions (e.g. the shards of the hour). Once an hour has
> passed, the corresponding shards will never be touched again.
>
> Am I understanding this correctly?
>
> > You may want to consider switching to LCS from the default STCS, since
> > you are re-writing the same partitions a lot. It will still use STCS in
> > L0, so in high write/delete scenarios with low enough gc_grace, when it
> > never gets higher than L1, write throughput will be about the same. In
> > scenarios where you get more levels, I suspect LCS will shine by
> > reducing the number of obsolete tombstones. It would be hard to
> > identify the difference in small tests, I think.
>
> Thanks, I’ll explore the various effects.
>
> > What's the plan to prevent two consumers from reading the same message
> > off of a queue? You mention in the docs that you will address it at a
> > later point in time, but it's kind of a biggie. Big lock & batch reads
> > like the astyanax recipe?
>
> I have included a static column per shard to act as a lock (the ’lock’
> column in the examples) in combination with conditional updates.
>
> I must admit, I have not quite understood what Netflix is doing in terms
> of coordination - but since performance isn’t our concern, CAS should do
> fine, I guess(?)
>
> Thanks again,
>
> Jan
>
> > ---
> > Chris Lohfink
> >
> > On Oct 5, 2014, at 6:03 PM, Jan Algermissen
> > <jan.algermis...@nordsc.com> wrote:
> >
> >> Hi,
> >>
> >> I have put together some thoughts on realizing simple queues with
> >> Cassandra.
> >>
> >> https://github.com/algermissen/cassandra-ruby-queue
> >>
> >> The design is inspired by (the much more sophisticated) Netflix
> >> approach [1], but very reduced.
> >>
> >> Given that I am still a C* newbie, I’d be very glad to hear some
> >> thoughts on the design path I took.
> >>
> >> Jan
> >>
> >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
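For illustration, a sketch of the due-hour sharding Jan and Chris discuss above, combined with Chris's suggestion to switch to LCS. The queue_messages table and its layout are assumptions for the example, not the repo's actual schema:

```python
from datetime import datetime, timezone
from cassandra.cluster import Cluster

SHARDS = 4  # the default shard count Chris mentions

session = Cluster(['127.0.0.1']).connect('queues')

# The due-hour is part of the partition key, so consumers only ever read
# the partitions of the current hour; once an hour has passed, its
# partitions -- and their tombstones -- are never touched again. LCS
# replaces the default STCS so obsolete tombstones are compacted away
# sooner.
session.execute("""
    CREATE TABLE IF NOT EXISTS queue_messages (
        queue    text,
        due_hour text,     -- e.g. '2014-10-06T14'
        shard    int,
        msg_id   timeuuid,
        payload  text,
        PRIMARY KEY ((queue, due_hour, shard), msg_id)
    ) WITH compaction = {'class': 'LeveledCompactionStrategy'}
""")

def active_partitions(queue):
    # The current time selects the active shards; older partitions are
    # simply never queried again.
    due_hour = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H')
    return [(queue, due_hour, shard) for shard in range(SHARDS)]
```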
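And a sketch of the lock Jan describes: a static column per shard partition, claimed with a conditional update (CAS). The column name, the TTL, and the table are again assumptions rather than the repo's actual code:

```python
import uuid
from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('queues')

# Assumed addition to the shard table: a static column shared by all
# rows of a partition, e.g.
#   ALTER TABLE queue_messages ADD lock uuid static;

def try_lock_shard(queue, due_hour, shard, owner):
    # The conditional update succeeds only if no other consumer holds
    # the lock; was_applied reports whether we won the race.
    result = session.execute(
        "UPDATE queue_messages USING TTL 30 SET lock = %s "
        "WHERE queue = %s AND due_hour = %s AND shard = %s "
        "IF lock = null",
        (owner, queue, due_hour, shard))
    return result.was_applied

owner = uuid.uuid4()
if try_lock_shard('jobs', '2014-10-06T14', 0, owner):
    pass  # this consumer now owns the shard and may read its messages
```

The TTL means a crashed consumer's lock expires on its own, so no explicit unlock path is strictly required; as Jan notes, with throughput this low the cost of CAS should not matter.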