Re: Second Cassandra users survey

Peter Tillotson Thu, 03 Nov 2011 08:51:05 -0700

>>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>>    I do a lot of checking against dynamic colnames
>
>I agree, some kind of integration with a search engine is required to
>support ad hoc queries as well as searching on column names. This would
>be really helpful.
>
>Currently, one of the options is to write in 2 places. Cassandra +
>search engine.
>


I was thinking of a disk-backed skip list, with every nth rowkey:colkey pulled
into memory per SSTable, much as Lucene's TermEnum does.
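
To make the idea concrete, here is a rough Python sketch of the "every nth key
in memory" part. It is only an illustration under stated assumptions: the class
name SparseKeyIndex, the interval value and the in-memory list standing in for
the sorted on-disk file are all hypothetical, and it is a sampled (sparse)
index rather than a true skip list. It is not how Cassandra or Lucene
implement this.

```python
from bisect import bisect_right


class SparseKeyIndex:
    """Hypothetical sketch: keep every nth sorted key in memory, scan the rest."""

    def __init__(self, sorted_keys, interval=128):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        # Stand-in for one sorted on-disk file of "rowkey:colkey" entries.
        self._keys = sorted_keys
        self._interval = interval
        # Only these sampled keys are assumed to be held in RAM.
        self._samples = sorted_keys[::interval]

    def contains(self, row_key, col_key):
        target = f"{row_key}:{col_key}"
        # Locate the last sampled key that is <= target.
        slot = bisect_right(self._samples, target) - 1
        if slot < 0:
            return False  # target sorts before the first key
        start = slot * self._interval
        end = min(start + self._interval, len(self._keys))
        # Scan at most `interval` entries, as a reader would scan one disk block.
        for key in self._keys[start:end]:
            if key == target:
                return True
            if key > target:
                break
        return False


keys = sorted(f"row{r:03d}:col{c:03d}" for r in range(50) for c in range(20))
index = SparseKeyIndex(keys, interval=16)
found = index.contains("row007", "col013")
missing = index.contains("row007", "col999")
```

The memory cost is roughly one key per `interval` entries, and a lookup costs
one binary search over the samples plus a scan of at most `interval` entries.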


________________________________
From: Mohit Anchlia <mohitanch...@gmail.com>
To: user@cassandra.apache.org; Peter Tillotson <slatem...@yahoo.co.uk>
Sent: Thursday, 3 November 2011, 14:15
Subject: Re: Second Cassandra users survey

On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson <slatem...@yahoo.co.uk> wrote:
> I'm using Cassandra as a big graph database, loading large volumes of data
> live and linking on the fly.

Not sure if Cassandra is the right fit to model complex vertexes and edges.

> The number of edges grows geometrically with data added, and the edges need
> to be read to continue linking the graph on the fly.
>
> Consequently, my problem is constrained by:
>  * Predominantly read - especially when data gets large and reads are quasi
> random
>  * I have lots of data to plow in, to be read
>  * Although the problem could scale out and possibly all be held in RAM, it
> requires too much kit for that to be viable
> So, my findings with Cassandra are:
>  * Compaction is expensive, I need it but
>    1) It takes away disk IO from my reads
>    2) Destroys the file cache
>    I've not had the chance to do extensive tests with the LevelDB-style compaction
>  * Compaction has been too hard to configure historically
>  * Memory hungry
> So for me the biggest features would be
>  * Cheaper compaction
>  * Lower memory usage
>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>    I do a lot of checking against dynamic colnames

I agree, some kind of integration with a search engine is required to
support ad hoc queries as well as searching on column names. This would
be really helpful.

Currently, one of the options is to write in 2 places. Cassandra +
search engine.
>
> The great features are that redundancy and live addition of shards are
> available out of the box.
>
> I've also experimented with GoldenOrb and triggered updates; I think there
> is a fair bit that can be achieved in my problem with local data access.
> Through GoldenOrb and Hadoop writables I managed to get both a BigTable and
> Pregel access model onto my Cassandra data. It was schema specific, but
> provided a local compute model.
> p
> ________________________________
> From: Jonathan Ellis <jbel...@gmail.com>
> To: user <user@cassandra.apache.org>
> Sent: Tuesday, 1 November 2011, 22:59
> Subject: Second Cassandra users survey
>
> Hi all,
>
> Two years ago I asked for Cassandra use cases and feature requests.
> [1]  The results [2] have been extremely useful in setting and
> prioritizing goals for Cassandra development.  But with the release of
> 1.0 we've accomplished basically everything from our original wish
> list. [3]
>
> I'd love to hear from modern Cassandra users again, especially if
> you're usually a quiet lurker.  What does Cassandra do well?  What are
> your pain points?  What's your feature wish list?
>
> As before, if you're in stealth mode or don't want to say anything in
> public, feel free to reply to me privately and I will keep it off the
> record.
>
> [1]
> http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
> [2]
> http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
>
>
