Ideas: Use a checkpoint that moves forward in time for each logical partition of the workload.
Establish a way of dividing up jobs between clients that doesn't require synchronization. One way of doing this would be to modulo the key by the number of logical workers, allowing them to graze directly on the job data. Doing it this way means that you have to make the workers smart enough to checkpoint properly, handle exceptions, and so on. Jobs may be dispatched out of order in this scheme, so you would have to decide how to handle explicit sequencing requirements. Some jobs have idempotent results only when executed in the same order, and keeping operations idempotent allows for simpler failure recovery. If your workers are capable of absorbing the workload, then backlogging won't hurt too much. Otherwise, you'll see strange ordering of things in your application where they would otherwise need to look more consistent. You might find it easier to just take the hit of having a synchronized dispatcher, but make it as lean as possible. Another way to break the workload up is to have logical groupings of jobs along a natural boundary in your domain model, and to run a synchronized dispatcher for each of those.

Using the "job" columns to keep track of who owns a job may not be the best approach. You may have to do row scans on column data, which is a Cassandra anti-pattern. Without an atomic check-and-modify operation, there is no way to do it that avoids possible race conditions or extra state management. This may be one of the strongest arguments for putting such an operation into Cassandra.

You can set up your job naming/keying such that every job result is logically ordered to come immediately after the job definition. Row key range scans would still be close to optimal, but would carry a marker for jobs which had been completed. This would allow clients to self-checkpoint, as long as result insertions are atomic row-wise. (I think they are.) Another worker could clean up rows which were subsequently consumed (results no longer needed) after some gap in time.
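The modulo scheme above can be sketched in a few lines. This is only an illustration (claim_jobs and the fixed worker count are my inventions for the example, not any real client API); one thing worth noting is that the hash must be stable across processes, since a per-process randomized hash would give each worker a different partitioning.

```python
# Illustrative sketch of coordination-free job partitioning via modulo,
# as described above. claim_jobs() and NUM_WORKERS are assumptions for
# this example, not part of any Cassandra client API.
import zlib

NUM_WORKERS = 4  # must be agreed on by all workers out of band


def owner_of(job_key: str) -> int:
    # Use a stable hash (CRC32) so every worker computes the same
    # partitioning; Python's built-in hash() is randomized per process.
    return zlib.crc32(job_key.encode("utf-8")) % NUM_WORKERS


def claim_jobs(all_job_keys, worker_id: int):
    """Each worker grazes directly on the job data, taking only its
    slice. No locks or dispatcher, but no global ordering either."""
    return [k for k in all_job_keys if owner_of(k) == worker_id]
```

Every key lands with exactly one worker, so no synchronization is needed; the trade-off, as noted, is that sequencing across workers is lost.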
The client can avoid lots of tombstones by only looking where there should be additional work (the checkpoint time). Pick a character that is not natural for your keys and make it a delimiter, and require that all keys in the job CF be aggregate and fully qualified. Clients might be able to remove job rows that allow for it after completion, but jobs which were dispatched to multiple workers may end up with orphaned result rows to be cleaned up.

.. just some drive-by ramblings ..

Jonathan

On Sat, Jun 26, 2010 at 3:56 PM, Andrew Miklas <and...@pagerduty.com> wrote:
> Hi all,
>
> Has anyone written a work-queue implementation using Cassandra?
>
> There's a section in the UseCase wiki page for "A distributed Priority Job
> Queue" which looks perfect, but unfortunately it hasn't been filled in yet.
> http://wiki.apache.org/cassandra/UseCases#A_distributed_Priority_Job_Queue
>
> I've been thinking about how best to do this, but every solution I've
> thought of seems to have some serious drawback. The "range ghost" problem
> in particular creates some issues. I'm assuming each job has a row within
> some column family, where the row's key is the time at which the job should
> be run. To find the next job, you'd do a range query with a start a few
> hours in the past, and an end at the current time. Once a job is completed,
> you delete the row.
>
> The problem here is that you have to scan through deleted-but-not-yet-GCed
> rows each time you run the query. Is there a better way?
>
> Preventing more than one worker from starting the same job seems like it
> would be a problem too. You'd either need an external locking manager, or
> have to use some other protocol where workers write their ID into the row
> and then immediately read it back to confirm that they are the owner of the
> job.
>
> Any ideas here? Has anyone come up with a nice implementation? Is
> Cassandra not well suited for queue-like tasks?
>
> Thanks,
>
> Andrew
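P.S. Here is a rough sketch of the self-checkpointing key layout described above, with a sorted list standing in for a row-key range scan. The "|" delimiter, the key shapes, and the helper names are all illustrative assumptions, not a prescribed schema.

```python
# Sketch of the aggregate-key layout described above: a job's result row
# sorts immediately after its job definition row, and clients scan
# forward from a checkpoint instead of over tombstoned rows. The "|"
# delimiter and key shapes are assumptions for this example only.
import bisect

DELIM = "|"  # a character that never appears in natural key parts


def job_key(partition: str, ts: int, job_id: str) -> str:
    # Fully qualified aggregate key: zero-padded time keeps sort order.
    return f"{partition}{DELIM}{ts:013d}{DELIM}{job_id}"


def result_key(jk: str) -> str:
    # Appending a suffix makes the result sort directly after its job row.
    return jk + DELIM + "result"


def pending_jobs(sorted_keys, checkpoint: str):
    """Scan only past the checkpoint; a job row immediately followed by
    its result row is complete, letting the client self-checkpoint."""
    pending = []
    start = bisect.bisect_right(sorted_keys, checkpoint)
    window = sorted_keys[start:]
    for i, k in enumerate(window):
        if k.endswith(DELIM + "result"):
            continue  # a completion marker, not a job
        done = i + 1 < len(window) and window[i + 1] == result_key(k)
        if not done:
            pending.append(k)
    return pending
```

A client that finishes a contiguous prefix of jobs can advance its checkpoint to the last completed key and never look at those rows again, which sidesteps the tombstone scan Andrew describes.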