Ariel, DSE lets you specify an "Analytics" virtual data center. You can then replicate your keyspaces to that data center and run your analytics jobs against it. As long as both workloads use the LOCAL_ consistency levels (for example LOCAL_QUORUM), the analytics jobs won't hit your real-time nodes, and vice versa. In other words, Cassandra's multiple data center capabilities are used to keep your OLTP and analytics workloads from interfering with each other, while the data is replicated between the two data centers so that both stay up to date without you having to write ETL code.
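As a rough sketch of what that looks like in practice, the snippet below builds the CQL that replicates a keyspace to both data centers and works out how many replicas a LOCAL_QUORUM request waits for in each one. The keyspace name `app` and the replica counts are made up for illustration, and the data center names `Cassandra` and `Analytics` are an assumption; use whatever names your snitch actually reports. The helper functions are hypothetical, not part of any driver.

```python
def replication_cql(keyspace, replicas_per_dc):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replica count.
    The data center names must match the ones the cluster's snitch reports.
    """
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def local_quorum(replication_factor):
    """Replicas that must respond for LOCAL_QUORUM in one data center.

    Only replicas in the coordinator's own data center are counted,
    which is why OLTP requests never wait on the analytics nodes.
    """
    return replication_factor // 2 + 1


# Hypothetical layout: 3 replicas for real-time traffic, 2 for analytics.
layout = {"Cassandra": 3, "Analytics": 2}

print(replication_cql("app", layout))
for dc, rf in layout.items():
    print(dc, "LOCAL_QUORUM needs", local_quorum(rf), "of", rf, "replicas")
```

With this kind of layout, your OLTP clients connect to nodes in the real-time data center and read and write at LOCAL_QUORUM, while the analytics jobs run against the Analytics data center with their own LOCAL_ level.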
Does that answer your question?

-Jeremiah

On Mar 11, 2014, at 10:27 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> I am doing a presentation at Big Data Boston about how people are
> bridging the gap between OLTP and ingest side databases and their
> analytic storage and queries. One class of systems I am talking about
> are things like HBase and DSE that let you run map reduce against your
> OLTP dataset.
>
> I remember reading at some point that DSE allows you to provision
> dedicated hardware for map reduce, but the docs didn't seem to fully
> explain how that works. I looked at
> http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/ana/anaStrt.html
>
> My question is what kind of provisioning can I do? Can I provision
> dedicated hardware for just the filesystem or can I also provision
> replicas that are dedicated to the file system and also serving reads
> for map reduce jobs? What kind of support is there for keeping OLTP
> reads from hitting the Hadoop storage nodes and how does this relate to
> doing quorum reads and writes?
>
> Thanks,
> Ariel