That depends on what you mean by real-time analytics. For things like continuous data streams, neither is an appropriate platform for doing the analytics itself. They're good for storing the results (the output) of the streaming analytics. Before you decide between Cassandra and HBase, I would suggest first figuring out exactly what kind of analytics you need to do. Start with prototyping and look at what kinds of queries and patterns you need to support.
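As a rough illustration of "analytics outside the store, results inside it", here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the event shape (timestamp, page), the 60-second window, and the plain dict standing in for a Cassandra or HBase table keyed by (window start, page). It is not how any particular streaming product does it, just the general shape.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size, purely illustrative


def window_start(ts, size=WINDOW_SECONDS):
    """Round a Unix timestamp down to the start of its window."""
    return ts - (ts % size)


def aggregate(events, size=WINDOW_SECONDS):
    """Count events per (window_start, key). This is the analytics step,
    done in memory by the stream processor, not by the database."""
    counts = defaultdict(int)
    for ts, key in events:
        counts[(window_start(ts, size), key)] += 1
    return dict(counts)


def store_results(table, results):
    """Upsert aggregated rows into the result store.
    `table` is a dict standing in for a wide-row table."""
    for row_key, count in results.items():
        table[row_key] = table.get(row_key, 0) + count


# Hypothetical events: (unix_timestamp, page)
events = [(0, "home"), (10, "home"), (59, "cart"), (60, "home"), (125, "cart")]

result_table = {}
store_results(result_table, aggregate(events))

for row_key in sorted(result_table):
    print(row_key, result_table[row_key])
```

The store only ever sees small, pre-aggregated rows that are looked up by key, which is the access pattern both databases handle well. Prototyping like this also tells you quickly which queries you actually need.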
Neither HBase nor Cassandra is good for complex patterns that do joins or cross joins (MDX-style queries), so with either one you have to re-invent stuff. Most of the event processing and stream processing products out there also don't support joins or cross joins very well, so any solution is going to need several different components. Typically, stream processing does filtering, which feeds another system that does simple joins. The output of that second step can then go to another system that does MDX-style queries. Spark Streaming has basic support, but it's not as mature or feature-rich as other stream processing products.

On Wed, Dec 17, 2014 at 11:20 PM, Ajay <ajay.ga...@gmail.com> wrote:
>
> Hi,
>
> Can Cassandra be used or best fit for Real Time Analytics? I went through
> couple of benchmark between Cassandra Vs HBase (most of it was done 3 years
> ago) and it mentioned that Cassandra is designed for intensive writes and
> Cassandra has higher latency for reads than HBase. In our case, we will
> have writes and reads (but reads will be more say 40% writes and 60%
> reads). We are planning to use Spark as the in memory computation engine.
>
> Thanks
> Ajay