I would spin it as Cassandra being the right choice where your primary need is OLTP, with a secondary need for analytics. IOW, where you would otherwise need to use two separate databases for the same data.
-- Jack Krupansky

On Tue, Mar 1, 2016 at 12:40 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> Spark & Cassandra work just fine together, but, as I said, Cassandra is
> *primarily* used for OLTP. If your main use case is analytics, I would use
> something that's built for analytics. If 90%+ of your queries are going to
> be 1-10ms & customer facing, then you're good to go. If you're building
> something to replace OLAP cubes, I'd look at something else.
>
> On Tue, Mar 1, 2016 at 8:52 AM Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
>> OLAP using Cassandra and Spark:
>>
>> http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark
>>
>> What is the cardinality of your cube dimensions? Obviously any
>> multi-dimensional data must be flattened.
>>
>> Cassandra tables have fixed named columns, but... the map datatype with
>> string key values effectively gives you extensible columns.
>>
>> -- Jack Krupansky
>>
>> On Tue, Mar 1, 2016 at 11:22 AM, Andrés Ivaldi <iaiva...@gmail.com>
>> wrote:
>>
>>> Jonathan thanks for the link,
>>> I believe that maybe is good as Data Store part, because is fast for I/O
>>> and handles Time Series, for analytics could be with Apache Ignite and/or
>>> Apache Spark
>>> what it worries me is that looks very complex create the structure for
>>> each Fact table and then extend
>>>
>>> regards.
>>>
>>> On Sun, Feb 28, 2016 at 12:28 PM, Jonathan Haddad <j...@jonhaddad.com>
>>> wrote:
>>>
>>>> Cassandra is primarily used as an OLTP database, not analytics. You
>>>> should watch this 30 min video discussing Cassandra core concepts (coming
>>>> from a relational background):
>>>> https://academy.datastax.com/courses/ds101-introduction-cassandra
>>>>
>>>> On Sun, Feb 28, 2016 at 5:40 AM Andrés Ivaldi <iaiva...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello, At my work we are looking for new technologies for an Analysis
>>>>> Engine, and we are evaluating different technologies one of them is
>>>>> Cassandra as our Data repository.
>>>>>
>>>>> Now we can execute query analysis against an OLAP Cube and RDBMS, using
>>>>> MSSQL as our data repository. Cube is obsolete and SQL server engine is
>>>>> slow as data repository.
>>>>>
>>>>> I don't know much about Cassandra, I read some books, and looks to fit
>>>>> well on what we are needing, but there are some things that looks like a
>>>>> problem for us.
>>>>>
>>>>> Our engine is designed to be scalable, flexible and dynamic, any user
>>>>> can add new dimensions or measures from any source, all the data is stored
>>>>> on Cube(this is fixed data) and MSSQL(dynamic data) so we have decoupled
>>>>> tables with the dimension values.
>>>>>
>>>>> Ok, with the context given I'll like to clear some doubts
>>>>>
>>>>> - I able to flat the table with all the possible dimension values to
>>>>> Cassandra, creating the pk against the dimension columns? this will give me
>>>>> the "sensation" of data pivot over the PK columns? If correct, what if I
>>>>> want to select the order of the columns, or add another or reduce them?
>>>>> - It's possible to extend the values of a row dynamically? What we do
>>>>> often is join row against a value of a mapped external data value to extend
>>>>> the dimensions hierarchical value structure (ie state->Country->Continent)
>>>>>
>>>>> I know we can do some of this things in the core of our engine, like
>>>>> the dimension extension of the values or reduce columns, but as we are
>>>>> evaluating different technologies is good to know.
>>>>>
>>>>> Regards!!
>>>>>
>>>>> --
>>>>> Ing. Ivaldi Andres
>>>
>>> --
>>> Ing. Ivaldi Andres
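
For what it's worth, the two ideas discussed in the thread (flattening the dimensions into the primary key, and using a map column with string keys for user-added dimensions) could be sketched roughly as below. This is only an illustration, not anyone's actual schema: the table name `sales_by_state`, the columns `state`, `product`, `amount` and `extra_dims`, the `GEO_HIERARCHY` lookup and both helper functions are all made up for the example. The script only builds a CQL string and a denormalized row in plain Python; it does not talk to a cluster.

```python
from typing import Dict, List

# Hypothetical lookup for the state -> country -> continent hierarchy.
GEO_HIERARCHY: Dict[str, Dict[str, str]] = {
    "Cordoba": {"country": "Argentina", "continent": "South America"},
    "Texas": {"country": "USA", "continent": "North America"},
}


def create_table_cql(table: str, key_dims: List[str], measure: str) -> str:
    """Build a CQL CREATE TABLE statement for a flattened fact table.

    The first key dimension becomes the partition key and the rest become
    clustering columns. User-added dimensions go into a map<text, text> column.
    """
    columns = [f"{d} text" for d in key_dims]
    columns.append(f"{measure} double")
    columns.append("extra_dims map<text, text>")
    partition, clustering = key_dims[0], key_dims[1:]
    key = ", ".join([f"({partition})"] + clustering)
    return f"CREATE TABLE {table} ({', '.join(columns)}, PRIMARY KEY ({key}))"


def flatten_fact(state: str, product: str, amount: float) -> Dict[str, object]:
    """Denormalize one fact, copying the hierarchy levels into the row at write time."""
    row: Dict[str, object] = {"state": state, "product": product, "amount": amount}
    # An unknown state simply gets an empty map instead of failing.
    row["extra_dims"] = dict(GEO_HIERARCHY.get(state, {}))
    return row


print(create_table_cql("sales_by_state", ["state", "product"], "amount"))
print(flatten_fact("Texas", "widget", 12.5))
```

Note that a Cassandra primary key cannot be altered after the table is created, so pivoting over a different column order would generally mean writing the same facts into a second table with a different key, whereas new entries in the map column can be added per row without a schema change.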