Dotan, I think that if you're in the early stages, you have a basic idea of what your product is going to be, architecturally speaking. While you may change your business model, or features at the display layer, I would think the data model itself would remain relatively similar throughout...otherwise you'd have another product on your hands, no?
But even if your requirements radically shift, Cassandra is schemaless, so you'd be able to make 'structural' changes to your data without as much risk as in a traditional RDBMS, e.g. MySQL.

At the end of the day, I don't think you've given enough information about your proposed data models for anyone to say, "Yes, Cassandra would or would not be the right choice for your startup." If well administered, and depending on the services offered, MySQL or Oracle could support a site with 200M users, and a poorly designed Cassandra data store could work very poorly for a site supporting 200 users.

I will say that I think it makes a lot of sense to use a traditional RDBMS for relational data and a Cassandra-like system when there is a need for larger data storage, or for something that lends itself well to a structureless design.

If you are using a framework that supports a good ORM layer (e.g. Hibernate for Java), you can have your build process update your database schema as you build out your application. I haven't done much work in Rails or Django, but I understand those support transparent schema updates as well. That sort of setup can work very effectively in early development...but that is more a discussion for other communities.

If you're interested in doing Map/Reduce jobs with Cassandra, look into Brisk, the system created by DataStax (which is also open source) that allows you to run Hadoop on top of your Cassandra cluster.

This may not be exactly what you were looking for when asking this question...but it might give you the insights you're after.

Hope this has been at least somewhat helpful.

David

On Sun, Nov 20, 2011 at 1:06 PM, Dotan N. <dip...@gmail.com> wrote:

> Hi all,
> my question may be more philosophical than related technically
> to Cassandra, but please bear with me.
>
> Given that a young startup may not know its product full at the early
> stages, but that it definitely points to ~200M users,
> would Cassandra will be the right way to go?
>
> That is, the requirement is for a large data store, that can move with
> product changes and requirements swiftly.
>
> Given that in Cassandra one thinks hard about the queries, and then builds
> a model to suit it best, I was thinking of
> this situation as problematic.
>
> So here are some questions:
>
> - would it be wiser to start with a more agile data store (such as
> mongodb) and then progress onto Cassandra, when the product itself
> solidifies?
> - given that we start with Cassandra from the get go, what is a common
> (and quick in terms of development) way or practice to change data, change
> schemas, as the product evolves?
> - is it even smart to start with Cassandra? would only startups whose core
> business is big data start with it from the get go?
> - how would you do map/reduce with Cassandra? how agile is that? (for
> example, can you run map/reduce _very_ frequently?)
>
> Thanks!
>
> --
> Dotan, @jondot <http://twitter.com/jondot>
>

--
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.*