I'm using Cassandra as a big graph database, loading large volumes of data live and linking on the fly. The number of edges grows geometrically with the data added, and the edges need to be read to continue linking the graph on the fly.
Consequently, my problem is constrained by:

* Predominantly read - especially when data gets large and reads are quasi random
* I have lots of data to plow in, to be read
* Although the problem could scale out and possibly all be held in RAM, that requires too much kit to be viable

So, my findings with Cassandra are:

* Compaction is expensive. I need it, but
  1) It takes away disk IO from my reads
  2) It destroys the file cache
  I've not had the chance to do extensive tests with the LevelDB compaction
* Compaction has been too hard to configure historically
* Memory hungry

So for me the biggest features would be:

* Cheaper compaction
* Lower memory usage
* Indexing dynamic colnames (e.g. Lucene TermEnum against rowkey:colkey). I do a lot of checking against dynamic colnames

The great features are that redundancy and live addition of shards are available out of the box.

I've also experimented with GoldenOrb and triggered updates. I think there is a fair bit that can be achieved in my problem with local data access. Through GoldenOrb and Hadoop Writables I managed to get both a BigTable and a Pregel access model onto my Cassandra data. It was schema specific, but provided a local compute model.

p

________________________________
From: Jonathan Ellis <jbel...@gmail.com>
To: user <user@cassandra.apache.org>
Sent: Tuesday, 1 November 2011, 22:59
Subject: Second Cassandra users survey

Hi all,

Two years ago I asked for Cassandra use cases and feature requests. [1] The results [2] have been extremely useful in setting and prioritizing goals for Cassandra development. But with the release of 1.0 we've accomplished basically everything from our original wish list. [3]

I'd love to hear from modern Cassandra users again, especially if you're usually a quiet lurker. What does Cassandra do well? What are your pain points? What's your feature wish list?
As before, if you're in stealth mode or don't want to say anything in public, feel free to reply to me privately and I will keep it off the record.

[1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
[2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
[3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com