I am not sure about the collection case. But for compact storage you can specify multiple ranges in a slice query.
https://issues.apache.org/jira/browse/CASSANDRA-3885

I am not sure this will get you all the way to bitmap indexes, but in a wide-row scenario it seems like you could support an "event contains 1 or event contains 2 or event contains 3" query. I am not sure how arbitrarily complex the CQL query handler can or will become.

For intravert (something I am dabbling with), the concept is to apply a server-side function to the result of a slice.

https://github.com/zznate/intravert-ug/wiki/Filter-mode

There is a huge win in having multiple indexes behind the pluggable index support, but not all of the pluggable indexes and query options will be easy to CQL-ify.

On Fri, Apr 12, 2013 at 10:52 AM, Jonathan Ellis <jbel...@gmail.com> wrote:

> Something like this?
>
> SELECT * FROM users
> WHERE user_id IN (select user_id from events where type in (1, 2, 3))
> AND user_id NOT IN (select user_id from events where type=4)
>
> This doesn't really look like a Cassandra query to me. More like a
> query for Hive (or Drill, or Impala).
>
> But, I know Sylvain is looking forward to adding index support to
> Collections [1], so something like this might fit:
>
> SELECT * FROM users
> WHERE (events CONTAINS 1 OR events CONTAINS 2 OR events CONTAINS 3)
> AND NOT (events CONTAINS 4)
>
> However, even this is more than our current query planner can handle;
> we don't really handle disjunctions at all, except for the special
> case of IN on the partition key (which translates to multiget), let
> alone arbitrary logical predicates.
>
> I think that between "bitmap indexes" and "query planning," the latter
> is actually the hard part. QueryProcessor is about at the limits of
> tractable complexity already; I think we'd need a new approach if we
> want to handle arbitrarily complex predicates like that.
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-4511
>
> On Wed, Apr 10, 2013 at 4:40 PM, mrevilgnome <mrevilgn...@gmail.com> wrote:
> > What do you think about set manipulation via indexes in Cassandra? I'm
> > interested in answering queries such as give me all users that performed
> > event 1, 2, and 3, but not 4. If the answer is yes then I can make a case
> > for spending my time on C*. The only downside for us would be our current
> > prototype is in C++ so we would lose some performance and the ability to
> > dedicate an entire machine to caching/performing queries.
> >
> > On Wed, Apr 10, 2013 at 11:57 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >
> >> If you mean, "Can someone help me figure out how to get started updating
> >> these old patches to trunk and cleaning out the Avro?" then yes, I've been
> >> knee-deep in indexing code recently.
> >>
> >> On Wed, Apr 10, 2013 at 11:34 AM, mrevilgnome <mrevilgn...@gmail.com> wrote:
> >>
> >> > I'm currently building a distributed cluster on top of cassandra to perform
> >> > fast set manipulation via bitmap indexes. This gives me the ability to
> >> > perform unions, intersections, and set subtraction across sub-queries.
> >> > Currently I'm storing index information for thousands of dimensions as
> >> > cassandra rows, and my cluster keeps this information cached, distributed
> >> > and replicated in order to answer queries.
> >> >
> >> > Every couple of days I think to myself this should really exist in C*.
> >> > Given all the benefits would there be any interest in
> >> > reviving CASSANDRA-1472?
> >> >
> >> > Some downsides are that this is very memory intensive, even for sparse
> >> > bitmaps.
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder, http://www.datastax.com
> >> @spyced
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
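
For reference, the set manipulation discussed in this thread (users who performed events 1, 2, and 3, but not 4) can be sketched with plain bitmaps. This is only an illustration, not Cassandra code and not the CASSANDRA-1472 patches: it assumes small integer user ids, uses Python integers as uncompressed bitmaps, and the `EVENTS` data and the helper names `build_bitmaps` and `users_in` are made up for the example.

```python
from typing import Dict, Iterable, List, Tuple


def build_bitmaps(events: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Build one bitmap per event type; bit u is set if user u performed it."""
    bitmaps: Dict[int, int] = {}
    for user_id, event_type in events:
        bitmaps[event_type] = bitmaps.get(event_type, 0) | (1 << user_id)
    return bitmaps


def users_in(bitmap: int) -> List[int]:
    """Decode a bitmap back into a sorted list of user ids."""
    return [u for u in range(bitmap.bit_length()) if (bitmap >> u) & 1]


# Hypothetical (user_id, event_type) pairs, invented for this sketch.
EVENTS = [
    (0, 1), (0, 2), (0, 3),          # user 0: events 1, 2, 3
    (1, 1), (1, 2), (1, 3), (1, 4),  # user 1: events 1, 2, 3, 4
    (2, 1), (2, 3),                  # user 2: events 1, 3
    (3, 2),                          # user 3: event 2
]

bitmaps = build_bitmaps(EVENTS)
excluded = bitmaps.get(4, 0)

# Intersection, then subtraction: performed 1 AND 2 AND 3, but NOT 4.
all_three_not_4 = bitmaps[1] & bitmaps[2] & bitmaps[3] & ~excluded

# Union, then subtraction: performed 1 OR 2 OR 3, but NOT 4
# (the shape of the CONTAINS query quoted above).
any_not_4 = (bitmaps[1] | bitmaps[2] | bitmaps[3]) & ~excluded

print(users_in(all_three_not_4))  # [0]
print(users_in(any_not_4))        # [0, 2, 3]
```

The bitwise operators map directly onto the operations named in the thread: `|` is union, `&` is intersection, and `& ~` is set subtraction. A real index would presumably use a compressed bitmap format, given the memory concern raised above for sparse bitmaps.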