Thanks all for the tips. Mainly we are replacing an OLAP cube, but our engine works fine directly against an RDBMS, so with the low latency of Cassandra it could work nicely (its extensibility is what worries me). We will give Cassandra + Spark a try.
Thanks again!!

On Tue, Mar 1, 2016 at 2:59 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> I would spin it as Cassandra being the right choice where your primary
> need is OLTP, with a secondary need for analytics. IOW, where you would
> otherwise need to use two separate databases for the same data.
>
> -- Jack Krupansky
>
> On Tue, Mar 1, 2016 at 12:40 PM, Jonathan Haddad <j...@jonhaddad.com>
> wrote:
>
>> Spark & Cassandra work just fine together, but, as I said, Cassandra is
>> *primarily* used for OLTP. If your main use case is analytics, I would use
>> something that's built for analytics. If 90%+ of your queries are going to
>> be 1-10ms & customer facing, then you're good to go. If you're building
>> something to replace OLAP cubes, I'd look at something else.
>>
>> On Tue, Mar 1, 2016 at 8:52 AM Jack Krupansky <jack.krupan...@gmail.com>
>> wrote:
>>
>>> OLAP using Cassandra and Spark:
>>>
>>> http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark
>>>
>>> What is the cardinality of your cube dimensions? Obviously any
>>> multi-dimensional data must be flattened.
>>>
>>> Cassandra tables have fixed named columns, but... the map datatype with
>>> string key values effectively gives you extensible columns.
>>>
>>> -- Jack Krupansky
>>>
>>> On Tue, Mar 1, 2016 at 11:22 AM, Andrés Ivaldi <iaiva...@gmail.com>
>>> wrote:
>>>
>>>> Jonathan, thanks for the link.
>>>> I believe it may be good as the data store part, because it is fast
>>>> for I/O and handles time series; the analytics could be done with
>>>> Apache Ignite and/or Apache Spark.
>>>> What worries me is that it looks very complex to create the structure
>>>> for each fact table and then extend it.
>>>>
>>>> Regards.
>>>>
>>>> On Sun, Feb 28, 2016 at 12:28 PM, Jonathan Haddad <j...@jonhaddad.com>
>>>> wrote:
>>>>
>>>>> Cassandra is primarily used as an OLTP database, not analytics. You
>>>>> should watch this 30 min video discussing Cassandra core concepts
>>>>> (coming from a relational background):
>>>>> https://academy.datastax.com/courses/ds101-introduction-cassandra
>>>>>
>>>>> On Sun, Feb 28, 2016 at 5:40 AM Andrés Ivaldi <iaiva...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello, at my work we are looking for new technologies for an
>>>>>> analysis engine, and we are evaluating different technologies; one
>>>>>> of them is Cassandra as our data repository.
>>>>>>
>>>>>> Right now we can execute analysis queries against an OLAP cube and
>>>>>> an RDBMS, using MSSQL as our data repository. The cube is obsolete
>>>>>> and the SQL Server engine is slow as a data repository.
>>>>>>
>>>>>> I don't know much about Cassandra. I have read some books, and it
>>>>>> looks like a good fit for what we need, but there are some things
>>>>>> that look like a problem for us.
>>>>>>
>>>>>> Our engine is designed to be scalable, flexible and dynamic: any
>>>>>> user can add new dimensions or measures from any source. All the
>>>>>> data is stored in the cube (this is fixed data) and in MSSQL
>>>>>> (dynamic data), so we have decoupled tables with the dimension
>>>>>> values.
>>>>>>
>>>>>> OK, with the context given, I'd like to clear up some doubts:
>>>>>>
>>>>>> - Am I able to flatten the table with all the possible dimension
>>>>>> values into Cassandra, creating the PK on the dimension columns?
>>>>>> Will this give me the "sensation" of a data pivot over the PK
>>>>>> columns? If so, what if I want to select the order of the columns,
>>>>>> or add or remove some?
>>>>>> - Is it possible to extend the values of a row dynamically? What we
>>>>>> often do is join a row against a mapped external data value to
>>>>>> extend the dimension's hierarchical value structure (i.e.
>>>>>> state->Country->Continent).
>>>>>>
>>>>>> I know we can do some of these things in the core of our engine,
>>>>>> like the dimension extension of the values or reducing columns, but
>>>>>> as we are evaluating different technologies it is good to know.
>>>>>>
>>>>>> Regards!!
>>>>>>
>>>>>> --
>>>>>> Ing. Ivaldi Andres
>>>>
>>>> --
>>>> Ing. Ivaldi Andres

--
Ing. Ivaldi Andres
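
For anyone reading this thread later, here is a rough sketch of the two modelling ideas discussed above: flattening the dimensions into the primary key, and using map columns with string keys for extensible values. Everything named here is hypothetical and made up for illustration (the `analytics.sales_facts` table, the `measures` and `extra_dims` columns, the `HIERARCHY` lookup and both helper functions). The code only builds CQL text and flattens a row in plain Python; it does not connect to a cluster and it is not something anyone in the thread proposed verbatim.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup for the state -> country -> continent hierarchy
# mentioned in the thread; real values would come from the external source.
HIERARCHY: Dict[str, Tuple[str, str]] = {
    "Cordoba": ("Argentina", "South America"),
    "Texas": ("United States", "North America"),
}


def build_fact_table_cql(keyspace: str, table: str,
                         partition_dims: List[str],
                         clustering_dims: List[str]) -> str:
    """Return a CREATE TABLE statement with the dimensions as the primary key.

    Measures go into a map<text, double> and extra attributes into a
    map<text, text>, so new keys can be written without ALTER TABLE.
    """
    columns = [f"{d} text" for d in partition_dims + clustering_dims]
    columns.append("measures map<text, double>")
    columns.append("extra_dims map<text, text>")
    # Partition key columns are grouped in their own parentheses,
    # clustering columns follow in the order they should be sorted.
    pk = "(" + ", ".join(partition_dims) + ")"
    if clustering_dims:
        pk += ", " + ", ".join(clustering_dims)
    columns.append(f"PRIMARY KEY ({pk})")
    return (f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} ("
            + ", ".join(columns) + ")")


def flatten_row(state: str, product: str, measures: Dict[str, float]) -> dict:
    """Denormalise one fact: resolve the hierarchy before the row is written."""
    country, continent = HIERARCHY[state]
    return {
        "continent": continent,
        "country": country,
        "state": state,
        "product": product,
        "measures": dict(measures),
        "extra_dims": {},
    }


if __name__ == "__main__":
    print(build_fact_table_cql("analytics", "sales_facts",
                               ["continent", "country"],
                               ["state", "product"]))
    print(flatten_row("Texas", "widget", {"amount": 10.0}))
```

One thing to keep in mind with this layout: as far as I know, the primary key of an existing Cassandra table cannot be altered, so querying the same facts by a different dimension order generally means writing them to a second table with a different key, which is part of the complexity raised earlier in the thread.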