Re: Datastore HDFS vs Cassandra

Mike Trienis Wed, 11 Feb 2015 21:28:24 -0800

Thanks everyone for your responses. I'll definitely think carefully about
the data models, querying patterns and fragmentation side-effects.


Cheers, Mike.

On Wed, Feb 11, 2015 at 1:14 AM, Franc Carter <[email protected]>
wrote:

>
> I forgot to mention that if you do decide to use Cassandra I'd highly
> recommend jumping on the Cassandra mailing list. If we had taken in some of
> the advice on that list, things would have been considerably smoother.
>
> cheers
>
> On Wed, Feb 11, 2015 at 8:12 PM, Christian Betz <
> [email protected]> wrote:
>
>>   Hi
>>
>>  Regarding the Cassandra Data model, there's an excellent post on the
>> ebay tech blog:
>> http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/.
>> There's also a slideshare for this somewhere.
>>
>>  Happy hacking
>>
>>  Chris
>>
>>   From: Franc Carter <[email protected]>
>> Date: Wednesday, 11 February 2015 10:03
>> To: Paolo Platter <[email protected]>
>> Cc: Mike Trienis <[email protected]>, "[email protected]" <
>> [email protected]>
>> Subject: Re: Datastore HDFS vs Cassandra
>>
>>
>> One additional comment I would make is that you should be careful with
>> updates in Cassandra. It does support them, but large numbers of updates
>> (i.e. changing existing keys) tend to cause fragmentation. If you are
>> (mostly) adding new keys (e.g. new records in the time series) then
>> Cassandra can be excellent.
>>
>>  cheers
>>
>>
>> On Wed, Feb 11, 2015 at 6:13 PM, Paolo Platter <[email protected]
>> > wrote:
>>
>>>   Hi Mike,
>>>
>>> I developed a solution with Cassandra and Spark, using DSE.
>>> The main difficulty is with Cassandra: you need to understand its data
>>> model and its query patterns very well.
>>> Cassandra has better performance than HDFS, and it has DR and stronger
>>> availability.
>>> HDFS is a filesystem; Cassandra is a DBMS.
>>> Cassandra supports full CRUD without ACID.
>>> HDFS is more flexible than Cassandra.
>>>
>>> In my opinion, if you have a real time series, go with Cassandra, paying
>>> attention to your reporting data access patterns.
>>>
>>> Paolo
>>>
>>> Sent from my Windows Phone
>>>  ------------------------------
>>> From: Mike Trienis <[email protected]>
>>> Sent: 11/02/2015 05:59
>>> To: [email protected]
>>> Subject: Datastore HDFS vs Cassandra
>>>
>>>   Hi,
>>>
>>> I am considering implementing Apache Spark on top of a Cassandra database
>>> after listening to a related talk and reading through the slides from
>>> DataStax. It seems to fit well with our time-series data and reporting
>>> requirements.
>>>
>>>
>>> http://www.slideshare.net/patrickmcfadin/apache-cassandra-apache-spark-for-time-series-data
>>>
>>> Does anyone have any experience using Apache Spark and Cassandra,
>>> including limitations and/or technical difficulties? How does Cassandra
>>> compare with HDFS, and what use cases would make HDFS more suitable?
>>>
>>> Thanks, Mike.
>>>
>>>
>>>
>>>
>>>
>>
>>
>>  --
>>
>> *Franc Carter* | Systems Architect | Rozetta Technology
>>
>> [email protected]  <[email protected]>|
>> www.rozettatechnology.com
>>
>> Tel: +61 2 8355 2515
>>
>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>
>> PO Box H58, Australia Square, Sydney NSW 1215
>>
>> AUSTRALIA
>>
>>
>
>
