> I created the following model: a UserCF, whose key is a userID generated by
> TimeUUID, and a RequestCF, whose key is composite: UserUUID + timestamp. For
> each user, I will store basic data and, for each request, I will insert a lot
> of columns.

I would consider:

# User CF
* row_key: user_id
* columns: user properties, key=value

# UserRequests CF
* row_key: <user_id : partition_start> where partition_start is the start of a time partition that makes sense in your domain, e.g. partition monthly. Generally you want to avoid rows that grow forever; as a rule of thumb, avoid rows of more than a few 10's of MB.
* columns: two possible approaches:
  1) If the requests are immutable and you generally want all of the data, store the request in a single column using JSON or similar, with the column name a timestamp.
  2) Otherwise use a composite column name of <timestamp : request_property> to store the request in many columns.
* In either case consider using Reversed comparators so the most recent columns are first, see http://thelastpickle.com/2011/10/03/Reverse-Comparators/

# GlobalRequests CF
* row_key: partition_start - time partition as above. It may be easier to use the same partition scheme.
* column name: <timestamp : user_id>
* column value: empty

> - Select all the requests for a user

Work out the current partition client side, get the first N columns. Then page.

> - Select all the users which have new requests, since date D

Work out the current partition client side, get the first N columns from GlobalRequests, then make a multi get call to UserRequests.
NOTE: Assuming the size of the global requests space is not huge.

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/09/2012, at 11:19 AM, Marcelo Elias Del Valle <mvall...@gmail.com> wrote:

> In your first email, you get a request and seem to shove it and a user in
> generating the ids which means that user never generates a request ever
> again??? If a user sends multiple requests in, how are you looking up his
> TimeUUID row key from your first email (I would do the same in my
> implementation)?
>
> Actually, I don't get it from Cassandra.
> I am using Cassandra for the writes,
> but to find the userId I look it up in a pre-indexed structure, because I
> think the reads would be faster this way. I need to find the userId by some
> key fields, so I use an index like this:
>
> user ID 5596 -> { name -> "john denver", phone -> "5555 5555", field3 ->
> "field 3 data"...., field 10 -> "field 10 data"}
>
> The values are just examples. This part is not implemented yet and I am
> looking for alternatives. Currently we have some similar indexes in SOLR, but
> we are thinking of keeping the index in memory and replicating manually in
> the cluster, or using Voldemort, etc.
> I might be wrong, but I think Cassandra is great for writes, but a solution
> like this would be better for reads.
>
> If you had an ldap unique username, I would just use that as the primary
> key meaning you NEVER have to do reads. If you have a username and need
> to lookup a UUID, you would have to do that in both implementations... not a
> real big deal though... a quick lookup table does the trick there and
> in most cases is still fast enough (i.e. read before write here is ok in a
> lot of cases).
>
> That X-ref table would simply be rowkey=username and value=user's real
> primary key.
>
> Though again, we use ldap and know no one's username is really going to
> change so username is our primary key.
>
> In my case, a single user can have thousands of requests. In my userCF, I
> will have just 1 user with uuid X, but I am not sure about what to have in my
> requestCF.
>
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
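
For reference, the "work out the current partition client side" step from the reply at the top of this message could look roughly like the sketch below. It is only an illustration under assumptions that are not in the thread: monthly partitions, string row keys of the form user_id:YYYY-MM, timezone-aware timestamps, and hypothetical function names. It does not call Cassandra at all; it only computes the keys you would pass to whatever client you use.

```python
from datetime import datetime, timezone


def partition_start(ts):
    """Return the first instant (UTC) of the month containing ts.

    Assumes ts is timezone-aware and that partitions are monthly.
    """
    ts = ts.astimezone(timezone.utc)
    return datetime(ts.year, ts.month, 1, tzinfo=timezone.utc)


def global_requests_row_key(ts):
    """Row key for GlobalRequests: the partition start, rendered as YYYY-MM."""
    return f"{partition_start(ts):%Y-%m}"


def user_requests_row_key(user_id, ts):
    """Row key for UserRequests: <user_id : partition_start>."""
    return f"{user_id}:{global_requests_row_key(ts)}"


def partitions_since(since, now):
    """GlobalRequests row keys covering [since, now], newest partition first."""
    keys = []
    cur = partition_start(now)
    first = partition_start(since)
    while cur >= first:
        keys.append(global_requests_row_key(cur))
        # Step back to the first day of the previous month.
        if cur.month == 1:
            cur = datetime(cur.year - 1, 12, 1, tzinfo=timezone.utc)
        else:
            cur = datetime(cur.year, cur.month - 1, 1, tzinfo=timezone.utc)
    return keys
```

For "all requests for a user" you would read the first N columns of user_requests_row_key(user_id, now) and page from there. For "users with new requests since date D" you would read the rows returned by partitions_since(D, now) from GlobalRequests and then multi get the matching UserRequests rows, which only works well if D is not too far in the past.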