Is it a problem for me to have millions of columns in a supercolumn?
You will have a problem, because a supercolumn has no index for its
subcolumns, so reading even a single subcolumn means deserializing the
whole supercolumn.
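
A rough way to picture the cost is the toy model below. It is not Cassandra code: the byte layout and the names serialize and read_subcolumn are made up for illustration. It only shows that when subcolumns are stored back to back with no index, fetching one of them means decoding all of them.

```python
# Toy model only: NOT Cassandra's storage code. The layout and the
# function names are hypothetical, chosen to illustrate an unindexed blob.
import struct


def serialize(subcolumns):
    """Pack (name, value) byte pairs back to back, each length-prefixed."""
    out = bytearray()
    for name, value in sorted(subcolumns.items()):
        out += struct.pack(">H", len(name)) + name
        out += struct.pack(">I", len(value)) + value
    return bytes(out)


def read_subcolumn(blob, wanted):
    """Return (value, entries_decoded) for one subcolumn name.

    With no index there is no offset to seek to, so this model decodes
    the whole blob before it can answer a lookup for a single name.
    """
    decoded = {}
    pos = 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len]
        pos += name_len
        (value_len,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        decoded[name] = blob[pos:pos + value_len]
        pos += value_len
    return decoded.get(wanted), len(decoded)


if __name__ == "__main__":
    # 100,000 subcolumns standing in for "millions of keys per type".
    blob = serialize({b"key%d" % i: b"x" for i in range(100_000)})
    value, touched = read_subcolumn(blob, b"key7")
    print(value, touched)
```

In this model the work per lookup grows with the total number of subcolumns, not with the number you asked for, which is why a supercolumn holding millions of entries becomes expensive to read.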

On Tue, May 11, 2010 at 10:03 PM, David Boxenhorn <da...@lookin2.com> wrote:

> I have a similar issue, but I can't create a CF per type, because types are
> an open-ended set in my case (they are geographical locations). So I wanted
> to have one CF for types, and a supercolumn for each type, with the keys as
> columns per supercolumn.
>
> Is it a problem for me to have millions of columns in a supercolumn?
>
>
> On Tue, May 11, 2010 at 4:29 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
>> multiget performs in O(N) with the number of rows requested.  so will
>> range scanning.
>>
>> if you want to query millions of records of one type i would create a
>> CF per type and use hadoop to parallelize the computation.
>>
>> On Fri, May 7, 2010 at 6:16 PM, James <rent.lupin.r...@gmail.com> wrote:
>> > Hi all,
>> > Apologies if I'm still stuck in RDBMS mentality - first project using
>> > Cassandra!
>> > I'll be using Cassandra to store quite a lot (10s of millions) of records,
>> > each of which has a type.
>> > I'll want to query the records to get all of a certain type; it's an
>> > analogous situation to the TaggedPosts schema from Arin's blog post
>> > (http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model).
>> > The thing is, each type (or tag) row key will be pointing at millions of
>> > records. I know I can use multiget_slice with all those record IDs as one
>> > request, but is this The Right Way of "filtering" a large column family by
>> > type?
>> > Coming from an RDBMS-ingrained mindset, it seems kind of awkward...
>> > Thanks!
>> > James
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>
>
