Yes, this is what I am worried about.

2011/8/24 Ryan King <r...@twitter.com>
> On Tue, Aug 23, 2011 at 10:03 AM, Alvin UW <alvi...@gmail.com> wrote:
> > Hello,
> >
> > As mentioned by Ed Anuff in his blog and slides, one way to build a
> > customized secondary index is:
> > We use one CF, with each row representing a secondary index and the
> > secondary index name as the row key.
> > For example,
> >
> > Indexes = {
> >   "User_Keys_By_Last_Name" : {
> >     "adams" : "e5d61f2b-…",
> >     "alden" : "e80a17ba-…",
> >     "anderson" : "e5d61f2b-…",
> >     "davis" : "e719962b-…",
> >     "doe" : "e78ece0f-…",
> >     "franks" : "e66afd40-…",
> >     … : …,
> >   }
> > }
> >
> > But the whole secondary index is placed on a single node, because of
> > the row key.
> > All the queries against this secondary index will go to this node. Of
> > course, there are some replica nodes.
> >
> > Do you think this is a scalability problem, or is there a better
> > solution?
>
> It's certainly a scalability problem in that this solution has a hard
> ceiling (this index can't get larger than the capacity of any single
> node). It will probably work on small datasets, but if your dataset is
> small then why are you using Cassandra?
>
> -ryan
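
To make the concern concrete, here is a rough Python sketch of the layout quoted above. It is only a toy model, not Cassandra code: the cluster size, the node_for_row_key and lookup helpers, and the modulo placement are all assumptions made up for illustration (a real cluster assigns token ranges on a ring, and replicas are ignored here). The point is just that placement depends on the row key alone, so every entry of one index row resolves to the same node.

```python
import hashlib

# Hypothetical cluster size, chosen only for illustration.
NUM_NODES = 4


def node_for_row_key(row_key, num_nodes=NUM_NODES):
    """Toy placement: hash the row key and map it onto a node number.

    MD5 plus modulo is a stand-in for real token-range assignment.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


# One CF, one row per index; columns map last name -> user key.
# The truncated keys are kept exactly as they appear in the quoted example.
indexes = {
    "User_Keys_By_Last_Name": {
        "adams": "e5d61f2b-…",
        "alden": "e80a17ba-…",
        "anderson": "e5d61f2b-…",
        "davis": "e719962b-…",
        "doe": "e78ece0f-…",
        "franks": "e66afd40-…",
    }
}


def lookup(index_name, column):
    """Return (node, value); the node is derived from the row key only."""
    node = node_for_row_key(index_name)
    return node, indexes[index_name][column]


def nodes_touched(index_name):
    """Collect the set of nodes hit when reading every column of the index row."""
    return {lookup(index_name, column)[0] for column in indexes[index_name]}


if __name__ == "__main__":
    # The set has exactly one element, however many columns the row holds.
    print(nodes_touched("User_Keys_By_Last_Name"))
```

In this model the column name never enters the placement function, which is why adding more last names grows one row on one node instead of spreading across the cluster.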