I would spin it as Cassandra being the right choice where your primary need
is OLTP, with a secondary need for analytics. IOW, where you would
otherwise need to use two separate databases for the same data.


-- Jack Krupansky

On Tue, Mar 1, 2016 at 12:40 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> Spark & Cassandra work just fine together, but, as I said, Cassandra is
> *primarily* used for OLTP.  If your main use case is analytics, I would use
> something that's built for analytics.  If 90%+ of your queries are going to
> be 1-10ms & customer facing, then you're good to go.  If you're building
> something to replace OLAP cubes, I'd look at something else.
>
> On Tue, Mar 1, 2016 at 8:52 AM Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
>> OLAP using Cassandra and Spark:
>>
>> http://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark
>>
>> What is the cardinality of your cube dimensions? Obviously any
>> multi-dimensional data must be flattened.
>>
>> Cassandra tables have fixed named columns, but... the map datatype with
>> string key values effectively gives you extensible columns.
>>
>>
>>
>> -- Jack Krupansky
>>
>> On Tue, Mar 1, 2016 at 11:22 AM, Andrés Ivaldi <iaiva...@gmail.com>
>> wrote:
>>
>>> Jonathan, thanks for the link.
>>> I believe it may be a good fit as the data store, because it is fast for I/O
>>> and handles time series; the analytics could be done with Apache Ignite and/or
>>> Apache Spark.
>>> What worries me is that it looks very complex to create the structure for
>>> each fact table and then extend it.
>>>
>>> regards.
>>>
>>> On Sun, Feb 28, 2016 at 12:28 PM, Jonathan Haddad <j...@jonhaddad.com>
>>> wrote:
>>>
>>>> Cassandra is primarily used as an OLTP database, not analytics. You
>>>> should watch this 30 min video discussing Cassandra core concepts (coming
>>>> from a relational background):
>>>> https://academy.datastax.com/courses/ds101-introduction-cassandra
>>>>
>>>> On Sun, Feb 28, 2016 at 5:40 AM Andrés Ivaldi <iaiva...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello. At my work we are looking for new technologies for an analysis
>>>>> engine, and we are evaluating different technologies; one of them is
>>>>> Cassandra as our data repository.
>>>>>
>>>>> Right now we can run analysis queries against an OLAP cube and an RDBMS,
>>>>> using MSSQL as our data repository. The cube is obsolete and the SQL Server
>>>>> engine is slow as a data repository.
>>>>>
>>>>> I don't know much about Cassandra. I have read some books, and it looks
>>>>> like a good fit for what we need, but there are some things that look like
>>>>> a problem for us.
>>>>>
>>>>> Our engine is designed to be scalable, flexible and dynamic: any user
>>>>> can add new dimensions or measures from any source. All the data is stored
>>>>> in the cube (fixed data) and in MSSQL (dynamic data), so we have decoupled
>>>>> tables with the dimension values.
>>>>>
>>>>>
>>>>> OK, with that context given, I'd like to clear up some doubts:
>>>>>
>>>>> - Am I able to flatten the table with all the possible dimension values
>>>>> into Cassandra, creating the PK from the dimension columns? Will this give
>>>>> me the "sensation" of pivoting the data over the PK columns? If so, what if
>>>>> I want to choose the order of the columns, or add or remove some of them?
>>>>> - Is it possible to extend the values of a row dynamically? What we often
>>>>> do is join a row against a mapped external data value to extend the
>>>>> dimension's hierarchical value structure (i.e. state->Country->Continent).
>>>>>
>>>>> I know we can do some of these things in the core of our engine, like
>>>>> extending the dimension values or reducing columns, but as we are
>>>>> evaluating different technologies it is good to know.
>>>>>
>>>>> Regards!!
>>>>>
>>>>>
>>>>> --
>>>>> Ing. Ivaldi Andres
>>>>>
>>>>
>>>
>>>
>>> --
>>> Ing. Ivaldi Andres
>>>
>>
>>

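For anyone reading this thread later, here is a rough sketch of the two ideas discussed above: flattening a fact into a row keyed by a chosen set of dimensions, and using a string-keyed map to hold extensible attributes such as a state->Country->Continent hierarchy. It is plain Python with no Cassandra driver involved. All names and sample data (`flatten_fact`, `extend_hierarchy`, the lookup tables) are hypothetical and only illustrate the shape of the data, not anyone's actual schema.

```python
# Hypothetical lookup tables for the hierarchy; not taken from the thread.
STATE_TO_COUNTRY = {"Cordoba": "Argentina", "Texas": "USA"}
COUNTRY_TO_CONTINENT = {"Argentina": "South America", "USA": "North America"}


def flatten_fact(fact, dimension_order):
    """Flatten one fact into a row.

    The key tuple (built from the chosen dimensions, in order) plays the
    role of a primary key. Remaining dimensions go into a string-keyed
    dict, which plays the role of a map<text, text> column.
    """
    key = tuple(fact["dims"][d] for d in dimension_order)
    extras = {k: v for k, v in fact["dims"].items() if k not in dimension_order}
    return {"key": key, "extras": extras, "measures": dict(fact["measures"])}


def extend_hierarchy(row, state):
    """Add country and continent to the extensible attributes, client side."""
    country = STATE_TO_COUNTRY[state]
    row["extras"]["country"] = country
    row["extras"]["continent"] = COUNTRY_TO_CONTINENT[country]
    return row


fact = {
    "dims": {"year": "2016", "state": "Texas", "product": "widget"},
    "measures": {"sales": 120.0},
}
row = flatten_fact(fact, ["year", "state"])
row = extend_hierarchy(row, fact["dims"]["state"])
print(row)
```

Note that choosing a different `dimension_order` produces a different key. In Cassandra the primary key of a table cannot be changed after creation, so a different ordering would typically mean a separate table per query pattern, while the hierarchy lookup would be done in the application or in Spark rather than as a join in the database.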