I think most people will tell you what Sean did: queues are considered an
anti-pattern in Cassandra for many reasons, and while it's possible, you
may want to consider something better suited to the job (RabbitMQ and
Redis queues are just a few ideas that come to mind).

If you're sold on the idea of using Cassandra for this, you will likely
need a few tables, as Sean points out. Also, you should try to avoid the
situation where you have a 1:1 ratio of writes to deletes (everything
written will be deleted, and quickly); this will exercise a great many
limitations in Cassandra's design, since every delete leaves a tombstone
that reads must skip until compaction removes it after gc_grace_seconds.
For queues, especially if you 1) want to act quickly on the contents of
the queue, 2) cannot miss a message, and 3) do not want duplicate actions,
you're going to have trouble with a distributed system like Cassandra.

One common approach is to persist the actual data (message contents) in
Cassandra keyed by e.g. a msgid timeuuid, or (userid, msgid), and to use a
durable queue like RabbitMQ to hold the uuids for the actual queue
behavior, circumventing the need for deletes and simplifying the
failure/retry logic. Then you get historical lookups (give me all emails
I've sent to userid) as well.
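
A rough sketch of the storage side of this approach might look like the
following. The table name email_by_user, the userid column, and the choice
of columns are assumptions for illustration only; the msgid is assumed to
be a timeuuid generated by the application so that the same value can be
published to the queue.

```sql
-- Hypothetical table: one partition per user, messages clustered by msgid.
CREATE TABLE IF NOT EXISTS communication.email_by_user (
    userid   uuid,
    msgid    timeuuid,   -- generated client-side, also published to the queue
    type     text,       -- e.g. order_created, order_canceled
    subject  text,
    content  text,
    toaddr   text,
    sentdate timestamp,  -- null until the worker has sent the message
    PRIMARY KEY ((userid), msgid)
) WITH CLUSTERING ORDER BY (msgid DESC);

-- Producer: persist the message, then publish (userid, msgid) to the queue.
INSERT INTO communication.email_by_user
    (userid, msgid, type, subject, content, toaddr)
VALUES (?, ?, ?, ?, ?, ?);

-- Consumer: load the message named by the queue entry, send it, record it.
SELECT subject, content, toaddr
FROM communication.email_by_user
WHERE userid = ? AND msgid = ?;

UPDATE communication.email_by_user
SET sentdate = ?
WHERE userid = ? AND msgid = ?;

-- Historical lookup: all messages for one user, newest first.
SELECT msgid, type, subject, sentdate
FROM communication.email_by_user
WHERE userid = ?;
```

Nothing is deleted in the normal send path here: the queue entry is
acknowledged on the queue side once the send succeeds, and a failed send
simply leaves the entry on the queue for a retry.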

On Fri, Mar 4, 2016 at 1:36 PM, I PVP <i...@hotmail.com> wrote:

> Thanks for answering.
>
> Yes, it is mainly a queue, but it also has some functionality to allow
> resending the messages.
>
> Does anyone have experience handling this kind of scenario, within (or
> without) Cassandra?
>
> Thanks
>
> --
> IPVP
>
>
> From: sean_r_dur...@homedepot.com
> Reply: user@cassandra.apache.org
> Date: March 4, 2016 at 11:48:56 AM
> To: user@cassandra.apache.org
> Subject: RE: Modeling transactional messages
>
> As you have it, this is not a good model for Cassandra. Your partition key
> has only 2 specific values. You would end up with only 2 partitions
> (perhaps owned by just 2 nodes) that would quickly get huge (and slow).
> Also, secondary indexes are generally a bad idea. You would either want to
> create new tables to support the additional queries or look at the
> materialized views in the 3.x versions.
>
>
>
> You are setting up something like a queue, which is typically an
> anti-pattern for Cassandra.
>
>
>
> However, I will at least toss out an idea for the rest of the community to
> improve (or utterly reject):
>
>
>
> You could have an unsent mail table and a sent mail table.
>
> For unsent mail, just use the objectID as the partition key. The drivers
> can page through results, though if it gets very large, you might see
> problems. Delete the row from unsent mail once it is sent. Try leveled
> compaction with a short gc_grace. There would be a lot of churn on this
> table, so it may still be less than ideal.
>
>
>
> Then you could do the sent email table with objectID and all the email
> details. Add separate lookup tables for:
>
> - (emailaddr), objectID (if this is going to be large/wide, perhaps add a
> time bucket to the partition key, like yyyymm)
>
> - (domain, time bucket), objectID
>
>
>
> Set TTL on these rows (either default or with the insert) to get the purge
> to be automatic.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* I PVP [mailto:i...@hotmail.com]
> *Sent:* Thursday, March 03, 2016 7:51 PM
> *To:* user@cassandra.apache.org
> *Subject:* Modeling transactional messages
>
>
>
> Hi everyone,
>
>
>
> Can anyone please let me know if I am heading toward an anti-pattern or
> something else bad?
>
>
>
> How would you model the following ... ?
>
>
>
> I am migrating from MySQL to Cassandra. I have a scenario in which I need
> to store the content of "to be sent" transactional email messages that the
> customer will receive on events like: an order was created, an order was
> updated, an order was canceled, an order was shipped, an account was
> created, an account was confirmed, an account was locked, and so on.
>
>
>
> In MySQL there is one table per email message "type", e.g. a table to
> store "order-created" messages, a table to store "order-updated" messages,
> and so on.
>
>
>
> The messages are sent by a non-parallelized Java worker, scheduled to run
> every X seconds, that pushes the messages to a service like
> Sendgrid/Mandrill/Mailjet.
>
>
>
> For better performance, easier purging, and overall code maintenance, I am
> looking to have all message "types" in a single table/column family, as
> follows:
>
>
>
> CREATE TABLE communication.transactional_email (
>     objectid timeuuid,
>     subject text,
>     content text,
>     fromname text,
>     fromaddr text,
>     toname text,
>     toaddr text,
>     wassent boolean,
>     createdate timestamp,
>     sentdate timestamp,
>     type text,    // example: order_created, order_canceled
>     domain text,  // example: hotmail.com, in case sending to a specific domain must be stopped
>     PRIMARY KEY (wassent, objectid)
> );
>
>
>
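> CREATE INDEX ON communication.transactional_email (toaddr);
> CREATE INDEX ON communication.transactional_email (sentdate);
> CREATE INDEX ON communication.transactional_email (domain);
> CREATE INDEX ON communication.transactional_email (type);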
>
>
>
>
>
> The requirements are :
>
>
>
> 1) select * from transactional_email where wassent = false and objectid <
> minTimeuuid(current timestamp) limit <number>
>
>
>
> (to get the messages that need to be sent)
>
>
>
> 2) update transactional_email set wassent = true where objectid =
> <timeuuid>
>
>
>
> (to update the message  right after it was sent)
>
>
>
> 3) select * from transactional_email where toaddr = <emailaddr>
>
>
>
> (to get all messages that were sent to a specific emailaddr)
>
>
>
> 4) select * from transactional_email where domain = <domain>
>
>
>
> (to get all messages that were sent to a specific domain)
>
>
>
> 5) delete from transactional_email where wassent = true and objectid <
> minTimeuuid(a timestamp)
>
>
>
> (to purge: delete all messages sent more than X days ago)
>
>
>
> 6) delete from transactional_email where toaddr = <emailaddr>
>
>
>
> (to be able to delete all messages when a user account is closed)
>
>
>
>
>
> Thanks
>
>
>
> IPVP
>
