Re: Best approach in Cassandra (+ Spark?) for Continuous Queries?

Peter Lin Sat, 03 Jan 2015 03:59:56 -0800

It looks like you're using the wrong tool and architecture.

If the use case really needs continuous-query-style event processing, use an 
event stream processing (ESP) product for that. You can still store the data 
in Cassandra for persistence.


The design you want has two paths: event stream and persistence. At the 
entry point, the system makes two parallel calls: one goes to a messaging 
system that feeds the ESP, and the other writes to Cassandra.
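
A minimal sketch of that two-path entry point, in Python, might look like the following. Everything here is a stand-in chosen for illustration: `event_bus` is an in-memory queue playing the role of the messaging system, `persisted_rows` is a dict playing the role of a Cassandra table, and `esp_matches` is a toy filter in place of a real ESP engine's continuous query. The function names, the event fields (`ship_id`, `lat`, `lon`), and the rough latitude band are all assumptions, not part of any real API.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: a real system would use a message broker
# and a Cassandra driver session instead of these in-memory objects.
event_bus = queue.Queue()   # path 1: feeds the ESP engine
persisted_rows = {}         # path 2: plays the role of a Cassandra table


def publish_to_stream(event):
    """Hand the event to the messaging system that feeds the ESP."""
    event_bus.put(event)


def persist(event):
    """Write the event to the persistent store, keyed by ship id."""
    persisted_rows[event["ship_id"]] = event


def ingest(event, pool):
    """Entry point: issue both calls in parallel and wait for both."""
    futures = [
        pool.submit(publish_to_stream, event),
        pool.submit(persist, event),
    ]
    for future in futures:
        future.result()  # re-raises any exception from either path


def esp_matches(event):
    """Toy continuous query: is the ship inside a rough latitude band?"""
    return 32.5 <= event["lat"] <= 42.0


def drain_matches():
    """Pull queued events and return those the continuous query selects."""
    matches = []
    while not event_bus.empty():
        event = event_bus.get()
        if esp_matches(event):
            matches.append(event)
    return matches
```

Calling `ingest(...)` inside a `with ThreadPoolExecutor(max_workers=2) as pool:` block sends each event down both paths. Clients are then notified from the stream side (here, whatever `drain_matches()` returns) rather than by polling the store, which is only read for history or recovery.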



> On Jan 3, 2015, at 5:46 AM, Hugo José Pinto <hugo.pi...@inovaworks.com> wrote:
> 
> Hello.
> 
> We're currently using Hazelcast (http://hazelcast.org/) as a distributed 
> in-memory data grid. That's been working sort-of-well for us, but going 
> solely in-memory has exhausted its path in our use case, and we're 
> considering porting our application to a NoSQL persistent store. After the 
> usual comparisons and evaluations, we're borderline close to picking 
> Cassandra, plus eventually Spark for analytics.
> 
> Nonetheless, there is a gap in our architectural needs that we're still not 
> grasping how to solve in Cassandra (with or without Spark): Hazelcast allows 
> us to create a Continuous Query such that, whenever a row is added to, 
> removed from, or modified in the clause's result set, Hazelcast calls us back 
> with the corresponding notification. We use this to continuously update the 
> clients via AJAX streaming with the new/changed rows.
> 
> This is probably a conceptual mismatch we're making, so - how to best address 
> this use case in Cassandra (with or without Spark's help)? Is there something 
> in the API that allows for Continuous Queries on key/clause changes (haven't 
> found it)? Is there some other way to get a stream of key/clause updates? 
> Events of some sort?
> 
> I'm aware that we could, eventually, periodically poll Cassandra, but in our 
> use case, the client is potentially interested in a large number of table 
> clause notifications (think "all changes to Ship positions on California's 
> coastline"), and iterating out of the store would kill the streamer's 
> scalability.
> 
> Hence, the magic question: what are we missing? Is Cassandra the wrong tool 
> for the job? Are we not aware of a particular part of the API or external 
> library in/outside the apache realm that would allow for this?
> 
> Many thanks for any assistance!
> 
> Hugo
