I think most people will tell you what Sean did: queues are considered an anti-pattern in Cassandra for many reasons, and while building one is possible, you may want to consider something better suited to the job (RabbitMQ and Redis queues are just a few ideas that come to mind).

If you're sold on the idea of using Cassandra for this, you will likely need a few tables, as Sean points out. Also, you should try to avoid the situation where you have a 1:1 ratio of writes to deletes (everything written will be deleted, and quickly); every delete writes a tombstone that reads have to skip until compaction purges it, so this will exercise a great many limitations in Cassandra's design.

For queues, especially if you 1) want to act quickly on the contents of the queue, 2) cannot miss a message, and 3) do not want duplicate actions, you're going to have trouble with a distributed system like Cassandra.

One common approach is to persist the actual data (message contents) in Cassandra keyed by e.g. msgid timeuuid, or (userid, msgid), and to use a durable queue like RabbitMQ to hold the uuids for the actual queue behavior, circumventing the need for deletes and simplifying the failure/retry logic. Then you get historical lookups (give me all emails I've sent to userid) as well.

On Fri, Mar 4, 2016 at 1:36 PM, I PVP <i...@hotmail.com> wrote:

> Thanks for answering.
>
> Yes, it is mainly a queue, but it also has some functionality to allow
> resending the messages.
>
> Does anyone have experience handling this kind of scenario, within (or
> without) Cassandra?
>
> Thanks
>
> --
> IPVP
>
> From: sean_r_dur...@homedepot.com <sean_r_dur...@homedepot.com>
> Reply: user@cassandra.apache.org <user@cassandra.apache.org>
> Date: March 4, 2016 at 11:48:56 AM
> To: user@cassandra.apache.org <user@cassandra.apache.org>
> Subject: RE: Modeling transactional messages
>
> As you have it, this is not a good model for Cassandra. Your partition key
> has only 2 specific values. You would end up with only 2 partitions
> (perhaps owned by just 2 nodes) that would quickly get huge (and slow).
> Also, secondary indexes are generally a bad idea.
> You would either want to
> create a new table to support additional queries or look at the materialized
> views in the 3.x versions.
>
> You are setting up something like a queue, which is typically an
> anti-pattern for Cassandra.
>
> However, I will at least toss out an idea for the rest of the community to
> improve (or utterly reject):
>
> You could have an unsent mail table and a sent mail table.
>
> For unsent mail, just use the objectID as the partition key. The drivers
> can page through results, though if it gets very large, you might see
> problems. Delete the row from unsent mail once it is sent. Try leveled
> compaction with a short gc_grace. There would be a lot of churn on this
> table, so it may still be less than ideal.
>
> Then you could do the sent email table with objectID and all the email
> details. Add separate lookup tables for:
>
> - (emailaddr), objectID (if this is going to be large/wide, perhaps add a
>   time bucket to the partition key, like yyyymm)
> - (domain, time bucket), objectID
>
> Set TTL on these rows (either default or with the insert) to get the purge
> to be automatic.
>
> Sean Durity
>
> *From:* I PVP [mailto:i...@hotmail.com]
> *Sent:* Thursday, March 03, 2016 7:51 PM
> *To:* user@cassandra.apache.org
> *Subject:* Modeling transactional messages
>
> Hi everyone,
>
> Can anyone please let me know if I am heading to an anti-pattern or
> something else bad?
>
> How would you model the following ... ?
>
> I am migrating from MySQL to Cassandra. I have a scenario in which I need to
> store the content of "to be sent" transactional email messages that the
> customer will receive on events like: an order was created, an order was
> updated, an order was canceled, an order was shipped, an account was
> created, an account was confirmed, an account was locked and so on.
>
> On MySQL there is a table for each email message "type", like: a table to
> store messages of "order-created", a table to store messages of
> "order-updated" and so on.
>
> The messages are sent by a non-parallelized Java worker, scheduled to run
> every X seconds, that pushes the messages to a service like
> Sendgrid/Mandrill/Mailjet.
>
> For better performance, easy purging and overall code maintenance I am
> looking to have all message "types" in a single table/column family as
> follows:
>
> CREATE TABLE communication.transactional_email (
>   objectid timeuuid,
>   subject text,
>   content text,
>   fromname text,
>   fromaddr text,
>   toname text,
>   toaddr text,
>   wassent boolean,
>   createdate timestamp,
>   sentdate timestamp,
>   type text, // example: order_created, order_canceled
>   domain text, // example: hotmail.com, in case we need to stop sending to a specific domain
>   PRIMARY KEY (wassent, objectid)
> );
>
> CREATE INDEX ON communication.transactional_email (toaddr);
> CREATE INDEX ON communication.transactional_email (sentdate);
> CREATE INDEX ON communication.transactional_email (domain);
> CREATE INDEX ON communication.transactional_email (type);
>
> The requirements are:
>
> 1) select * from transactional_email where wassent = false and objectid <
> minTimeuuid(current timestamp) limit <number>
>
> (to get the messages that need to be sent)
>
> 2) update transactional_email set wassent = true where objectid =
> <timeuuid>
>
> (to update the message right after it was sent)
>
> 3) select * from transactional_email where toaddr = <emailaddr>
>
> (to get all messages that were sent to a specific emailaddr)
>
> 4) select * from transactional_email where domain = <domain>
>
> (to get all messages that were sent to a specific domain)
>
> 5) delete from transactional_email where wassent = true and objectid <
> minTimeuuid(a timestamp)
>
> (to purge, i.e. delete all messages sent before the last X days)
>
> 6) delete from transactional_email where toaddr = <emailaddr>
>
> (to be able to delete all messages
> when a user account is closed)
>
> Thanks
>
> IPVP
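
P.S. A rough sketch of the split suggested at the top of this reply: message bodies stay in Cassandra, and only the ids travel through the external queue. The table name email_by_user, the userid column, and the 90-day TTL are assumptions made for illustration; none of them come from the original schema, and the queue side (RabbitMQ or similar) is not shown.

```cql
-- Hypothetical table: one partition per user, newest messages first.
-- The worker never deletes rows; expiry is left to the table-level TTL.
CREATE TABLE communication.email_by_user (
    userid     uuid,
    msgid      timeuuid,
    type       text,       -- e.g. order_created, order_canceled
    subject    text,
    content    text,
    toaddr     text,
    createdate timestamp,
    sentdate   timestamp,  -- stays null until the worker has sent the message
    PRIMARY KEY ((userid), msgid)
) WITH CLUSTERING ORDER BY (msgid DESC)
  AND default_time_to_live = 7776000;  -- 90 days, an arbitrary example value

-- Worker, after taking (userid, msgid) off the external queue:
SELECT subject, content, toaddr
  FROM communication.email_by_user
 WHERE userid = ? AND msgid = ?;

-- Worker, once the mail provider has accepted the message:
UPDATE communication.email_by_user
   SET sentdate = ?
 WHERE userid = ? AND msgid = ?;

-- Historical lookup: everything sent to one user.
SELECT msgid, type, subject, sentdate
  FROM communication.email_by_user
 WHERE userid = ?;
```

With this layout the "is it sent yet" state lives in the queue (plus the sentdate column for the record), so the wassent partition key and the write-then-delete churn both go away. Lookups by address or domain would still need their own tables, along the lines Sean describes.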