Duy, if you are not already working for Datastax, they should hire you. :) Great response. You have given me some good points to think about. I will do the rest of the research.
Thanks.

On Fri, Jul 4, 2014 at 10:10 PM, DuyHai Doan <doanduy...@gmail.com> wrote:

> I would answer your question this way:
>
> 1) Why should I choose C*?
>
> a. Linear scalability: throughput scales "almost" linearly with the number of nodes.
>
> b. Almost unbounded extensibility (there is no limit, or at least a very high limit, on the number of nodes you can have in a cluster).
>
> c. Operational simplicity due to the masterless architecture. This feature, although quite transparent for developers, is a key selling point. Having suffered when manually installing a Hadoop cluster, I happen to love the deployment simplicity of C*: only one process per node, no moving parts.
>
> d. High availability. C* clearly trades consistency for availability, so you can expect something like 99.99% uptime. A strong selling point for critical businesses that need to be up all the time.
>
> e. Support for multiple data centers out of the box. Again, on the operational side, it's a great feature if you plan a worldwide deployment.
>
> That's all I can see for now.
>
> 2) Why shouldn't I choose C*?
>
> a. Need for strong consistency most of the time. Although you can perform all requests with consistency level ALL, it's clearly not the best use of C*. You'll suffer from higher latency and reduced availability. Even the new "lightweight transaction" feature is not meant to be used on a large scale.
>
> b. Very complicated and changing queries. Denormalizing is great when you know ahead of time exactly how you'll query your data. Once done, any new way of querying will require new code and new tables to support it.
>
> c. Ridiculously small data load. I've seen people in prod using C* for only 200 GB because they want to be trendy and use bleeding-edge technologies.
> They'd be better off using a classical RDBMS solution that fits their load perfectly.
>
> Hope that helps
>
> Duy Hai DOAN
>
> On Fri, Jul 4, 2014 at 9:31 PM, Prem Yadav <ipremya...@gmail.com> wrote:
>
>> Thanks Manoj. Great post for those who already have Cassandra in production.
>> However, it brings me back to my original post.
>> All the points you have mentioned apply to any big data technology.
>> Storage: all of them.
>> Query: all of them. In fact, a lot of them perform better. I agree that the CQL structure is better, but Hive and Mongo are all good.
>> Availability: many of them.
>>
>> So my question is basically to Cassandra support people, e.g. Datastax, or the developers.
>> What makes Cassandra special?
>> If I have to convince my CTO to spend a million dollars on a cluster and support, his first question would be: why Cassandra? Why not this or that?
>>
>> So I am still not sure what Cassandra brings to the table that is special.
>>
>> Sorry about the rant. But in the enterprise world, decisions are made by taking into account stability, convincing managers, and whatnot. The chosen technology has to be stable for years. People should be convinced that the engineers are not going to do a lot of firefighting.
>>
>> Any input appreciated.
>>
>> On Fri, Jul 4, 2014 at 7:07 PM, Manoj Khangaonkar <khangaon...@gmail.com> wrote:
>>
>>> These are my personal opinions based on a few months of using Cassandra. Others may have different opinions.
>>>
>>> http://khangaonkar.blogspot.com/2014/06/apache-cassandra-things-to-consider.html
>>>
>>> regards
>>>
>>> On Fri, Jul 4, 2014 at 7:37 AM, Prem Yadav <ipremya...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I have seen in a lot of replies that Cassandra is not designed for this and that. I don't want to sound rude; I just need some info about this so that I can compare it to technologies like HBase, Mongo, Elasticsearch, Solr, etc.
>>>>
>>>> 1) What is Cassandra designed for? Heavy writes, yes. So is HBase. Or Elasticsearch.
>>>> What are the use cases that suit Cassandra?
>>>>
>>>> 2) What kinds of queries are best suited for Cassandra?
>>>> I ask this because I have seen people asking about queries and getting replies that it's not suited for Cassandra. For example: queries where a large number of rows are requested and a timeout happens. Or range queries or aggregate queries.
>>>>
>>>> 3) Where does Cassandra excel compared to other technologies?
>>>>
>>>> I have been working on Cassandra for some time. I know how it works and I like it very much.
>>>> We are moving towards building a big cluster. But at this point, I am not sure if it's the right decision.
>>>>
>>>> A lot of people in my company, including me, like Cassandra. But it has more to do with CQL than with the internals or the use cases. Until now, there have been small PoCs and people enjoyed them. But for a large-scale project, we are not so sure.
>>>>
>>>> Please guide us.
>>>> Please note that the drawbacks of other technologies do not interest me; it's the strengths and weaknesses of Cassandra I am interested in.
>>>> Thanks
>>>
>>> --
>>> http://khangaonkar.blogspot.com/
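
For anyone weighing point 2a in DuyHai's reply (consistency level ALL costing availability), here is a rough sketch of the arithmetic behind it. It is only an illustration under simplifying assumptions: a single data center, a fixed replication factor, and only the ONE, QUORUM and ALL levels. The function names are made up for this note and are not part of any Cassandra driver.

```python
def required_replicas(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def tolerated_failures(level: str, replication_factor: int) -> int:
    """How many replicas of a row can be down while requests still succeed."""
    return replication_factor - required_replicas(level, replication_factor)


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when read and write replica sets must overlap (R + W > RF)."""
    r = required_replicas(read_level, replication_factor)
    w = required_replicas(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, chosen only for illustration
    for level in ("ONE", "QUORUM", "ALL"):
        print(level,
              "needs", required_replicas(level, rf),
              "tolerates", tolerated_failures(level, rf), "replica(s) down")
```

With a replication factor of 3, ALL cannot tolerate a single replica being down, while QUORUM reads combined with QUORUM writes still overlap on at least one replica and survive one failure. That is the latency and availability cost DuyHai describes.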