Some ideas to throw in here:

"The delay Y will be at least 1 minute, and at most 90 days with a
resolution per minute" --> Use the delay (with format YYYYMMDDHHMM as
integer) as your partition key.
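For illustration, here is a minimal table sketch along those lines using
the Python driver (keyspace, table and column names are placeholders I
made up):

    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('scheduler')

    # One partition per due minute, one clustered row per action in it.
    session.execute("""
        CREATE TABLE IF NOT EXISTS delayed_actions (
            bucket    bigint,    -- due time as a YYYYMMDDHHMM integer
            action_id timeuuid,  -- clustering column: one row per action
            payload   text,
            PRIMARY KEY (bucket, action_id)
        )
    """)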

Example: today, March 24th at 12:00 (201503241200), you need to delay 3
actions: action A in exactly 3 days, action B in 10 hours and action C in 5
minutes. Thus you will create 3 partitions (computing the keys is sketched
right after the list):

- for A, partition key = 201503271200
- for B, partition key = 201503242200
- for C, partition key = 201503241205
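Computing the partition key is plain date arithmetic on the target time; a
small sketch reproducing the example above:

    from datetime import datetime, timedelta

    def bucket_for(due_time):
        # Truncate to the minute and format as a YYYYMMDDHHMM integer.
        return int(due_time.strftime('%Y%m%d%H%M'))

    now = datetime(2015, 3, 24, 12, 0)
    print(bucket_for(now + timedelta(days=3)))     # 201503271200 (action A)
    print(bucket_for(now + timedelta(hours=10)))   # 201503242200 (action B)
    print(bucket_for(now + timedelta(minutes=5)))  # 201503241205 (action C)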

In each partition you'll have one row (one clustering key value) per action
to execute. According to your estimate the average is a few hundred
thousand and the max is a few million per partition, so it's fine. You
would then have a pool of workers that, every minute, load the whole
partition (with paging when necessary) and process the actions.
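A rough sketch of such a worker with the Python driver (same placeholder
names as above; process() just stands in for whatever executes an action;
the driver pages through the partition transparently as you iterate):

    import time
    from datetime import datetime

    def process(action_id, payload):
        ...  # execute the action

    load_stmt = session.prepare(
        "SELECT action_id, payload FROM delayed_actions WHERE bucket = ?")

    while True:
        bucket = int(datetime.utcnow().strftime('%Y%m%d%H%M'))
        # Fetches the partition page by page while we iterate over it.
        for row in session.execute(load_stmt, [bucket]):
            process(row.action_id, row.payload)
        time.sleep(60)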

Once all the actions in a partition have been executed, you can either
delete the whole partition or keep it around for archiving.
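The delete is a single partition-level statement, for example (same
placeholder schema as above):

    # One partition tombstone removes every action row for that minute.
    session.execute("DELETE FROM delayed_actions WHERE bucket = %s", [bucket])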

Duy Hai DOAN

On Tue, Mar 24, 2015 at 9:19 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Mar 24, 2015 at 5:05 AM, Robin Verlangen <ro...@us2.nl> wrote:
>
>> - for every point in the future there are probably hundreds of actions
>> which have to be processed
>> - all actions for a point in time will be processed at once (thus not
>> removing action by action as a typical queue would do)
>> - once all actions have been processed we remove the entire row (by key,
>> not the individual columns)
>>
>
> I've used Cassandra for similar queue-like things, and it's "fine." Not
> ideal, but number of objects and access patterns are "fine."
>
>
> https://engineering.eventbrite.com/replayable-pubsub-queues-with-cassandra-and-zookeeper/
>
> This design never truncates history, but if you can tolerate throwing away
> history, that problem goes away.
>
> =Rob
>
>
