What I am trying to ask is: what if there are billions of row keys (e.g. abc, def, xyz in the example below) and a client does a lookup/query on one row, say xyz (get all columns for row xyz)? Since there are billions of rows, is lookup via the hash mechanism going to be slow? What algorithm is used to retrieve row xyz, which could be anywhere among those billions of rows on a particular node?
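To make the question concrete: a node does not scan its billions of rows. The general technique (which Cassandra's SSTables also rely on, via per-file Bloom filters and a sorted partition index) is a Bloom-filter check followed by a binary search over sorted keys. The sketch below is a toy illustration of that idea, not Cassandra's actual code; the class name `SSTableIndex` and its layout are made up for the example.

```python
import bisect
import hashlib

class SSTableIndex:
    """Toy sketch: sorted key index plus a crude Bloom-filter stand-in."""

    def __init__(self, rows):
        # rows: dict of row_key -> columns; keys kept sorted for binary search
        self.keys = sorted(rows)
        self.rows = rows
        # Stand-in for a Bloom filter: a set of short hashes of every key.
        # A real Bloom filter uses a bit array, but the effect is the same:
        # "definitely absent" answers without touching the index on disk.
        self.bloom = {self._h(k) for k in self.keys}

    @staticmethod
    def _h(key):
        return hashlib.md5(key.encode()).hexdigest()[:8]

    def get(self, key):
        if self._h(key) not in self.bloom:
            return None  # definitely not here; no index probe needed
        i = bisect.bisect_left(self.keys, key)  # O(log n), not a scan
        if i < len(self.keys) and self.keys[i] == key:
            return self.rows[key]
        return None  # Bloom false positive; the index settles it

index = SSTableIndex({"abc": {"username": "phatduckk"},
                      "xyz": {"username": "ieure"}})
print(index.get("xyz"))  # the columns for row xyz
print(index.get("zzz"))  # None
```

So even with billions of keys the cost per lookup is a handful of hashes plus an O(log n) probe, which is why the row count alone does not make the read slow.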
Would an index on the row keys (e.g. abc, xyz) help here?

> UserProfile = { // this is a ColumnFamily
>     abc: { // this is the key to this Row inside the CF
>         // now we have an infinite # of columns in this row
>         username: "phatduckk",
>         email: "phatdu...@example.com",
>         phone: "(900) 976-6666"
>     }, // end row
>     def: { // this is the key to another row in the CF
>         // now we have another infinite # of columns in this row
>         username: "ieure",
>         email: "ie...@example.com",
>         phone: "(888) 555-1212",
>         age: "66",
>         gender: "undecided"
>     },
> }

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Understanding-Indexes-tp6058238p6061356.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
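The "hash mechanism" in the question also covers the step before the per-node lookup: picking which node owns a row key at all. With a hash partitioner that is a token computed from the key plus a binary search on the ring, independent of how many rows exist. A minimal sketch, assuming a simple MD5-based ring (the `Ring` class and node names are hypothetical, not Cassandra's API):

```python
import hashlib
from bisect import bisect_right

def token(key):
    """Hash a row key to a position on the ring (toy stand-in for a partitioner)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        # Each node owns the arc ending at its token; keep tokens sorted.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def node_for(self, key):
        # First node whose token is >= the key's token, wrapping around.
        i = bisect_right(self.tokens, token(key)) % len(self.entries)
        return self.entries[i][1]

ring = Ring(["node1", "node2", "node3"])
owner = ring.node_for("xyz")  # same key always maps to the same node
```

The point is that routing is O(log n) in the number of nodes, not the number of rows, so the billions of keys never have to be consulted to find where xyz lives.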