Thanks everyone for your responses. I'll definitely think carefully about the data models, querying patterns and fragmentation side-effects.
Cheers,
Mike.

On Wed, Feb 11, 2015 at 1:14 AM, Franc Carter <[email protected]> wrote:

> I forgot to mention that if you do decide to use Cassandra, I'd highly
> recommend joining the Cassandra mailing list. If we had taken in some of
> the advice on that list, things would have been considerably smoother.
>
> cheers
>
> On Wed, Feb 11, 2015 at 8:12 PM, Christian Betz <[email protected]> wrote:
>
>> Hi
>>
>> Regarding the Cassandra data model, there's an excellent post on the
>> eBay tech blog:
>> http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/.
>> There's also a SlideShare deck for this somewhere.
>>
>> Happy hacking
>>
>> Chris
>>
>> From: Franc Carter <[email protected]>
>> Date: Wednesday, 11 February 2015 10:03
>> To: Paolo Platter <[email protected]>
>> Cc: Mike Trienis <[email protected]>, "[email protected]" <[email protected]>
>> Subject: Re: Datastore HDFS vs Cassandra
>>
>> One additional comment I would make is that you should be careful with
>> updates in Cassandra. It does support them, but large numbers of updates
>> (i.e. changing existing keys) tend to cause fragmentation. If you are
>> (mostly) adding new keys (e.g. new records in the time series), then
>> Cassandra can be excellent.
>>
>> cheers
>>
>> On Wed, Feb 11, 2015 at 6:13 PM, Paolo Platter <[email protected]> wrote:
>>
>>> Hi Mike,
>>>
>>> I developed a solution with Cassandra and Spark, using DSE.
>>> The main difficulty is with Cassandra: you need to understand its
>>> data model and its query patterns very well.
>>> Cassandra has better performance than HDFS, and it has DR and stronger
>>> availability.
>>> HDFS is a filesystem; Cassandra is a DBMS.
>>> Cassandra supports full CRUD without ACID.
>>> HDFS is more flexible than Cassandra.
>>>
>>> In my opinion, if you have a real time series, go with Cassandra, paying
>>> attention to your reporting data access patterns.
>>>
>>> Paolo
>>>
>>> Sent from my Windows Phone
>>> ------------------------------
>>> From: Mike Trienis <[email protected]>
>>> Sent: 11/02/2015 05:59
>>> To: [email protected]
>>> Subject: Datastore HDFS vs Cassandra
>>>
>>> Hi,
>>>
>>> I am considering implementing Apache Spark on top of a Cassandra
>>> database after listening to a related talk and reading through the
>>> slides from DataStax. It seems to fit well with our time-series data
>>> and reporting requirements.
>>>
>>> http://www.slideshare.net/patrickmcfadin/apache-cassandra-apache-spark-for-time-series-data
>>>
>>> Does anyone have any experience using Apache Spark and Cassandra,
>>> including limitations and/or technical difficulties? How does Cassandra
>>> compare with HDFS, and what use cases would make HDFS more suitable?
>>>
>>> Thanks, Mike.
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Datastore-HDFS-vs-Cassandra-tp21590.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> --
>>
>> *Franc Carter* | Systems Architect | Rozetta Technology
>>
>> [email protected] | www.rozettatechnology.com
>>
>> Tel: +61 2 8355 2515
>>
>> Level 4, 55 Harrington St, The Rocks NSW 2000
>>
>> PO Box H58, Australia Square, Sydney NSW 1215
>>
>> AUSTRALIA
