This sounds reasonable to me; the general rule of thumb is, "a row should be data that you access together."
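As a rough illustration of that rule applied to the layout in the question (one row per group, holding time-ordered post supercolumns plus a single members supercolumn), here is a minimal in-memory sketch. It is not a Cassandra client API: the row is modeled as a plain dict, integer timestamps stand in for timeUUIDs, and `MEMBERS_KEY`, `make_group_row`, `add_post`, and `read_group` are hypothetical names chosen for the example.

```python
# In-memory stand-in for one Cassandra row (supercolumn name -> columns).
# Assumption: integer timestamps replace timeUUIDs to keep the sketch simple.

# Hypothetical sentinel name that sorts after every real timestamp,
# mirroring the "max. time in future" idea from the question.
MEMBERS_KEY = (float("inf"), "members")


def make_group_row(member_ids):
    """Create one row per group, seeded with the members supercolumn."""
    return {MEMBERS_KEY: {uid: "" for uid in member_ids}}


def add_post(row, timestamp, user_id, columns):
    """Add a post supercolumn named (timestamp, user_id)."""
    row[(timestamp, user_id)] = dict(columns)


def read_group(row, latest=10):
    """Simulate one reversed slice over the row: the sentinel comes
    first, followed by the newest `latest` posts."""
    names = sorted(row, reverse=True)[: latest + 1]
    members = sorted(row[MEMBERS_KEY])
    posts = [(name, row[name]) for name in names if name != MEMBERS_KEY]
    return members, posts


row = make_group_row(["alice", "bob"])
add_post(row, 100, "alice", {"title": "first"})
add_post(row, 200, "bob", {"title": "second"})
members, posts = read_group(row, latest=1)
```

Because both kinds of data live under one row key, a single read returns the member list and the latest posts together, which is the access pattern the question is optimizing for.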
The tricky part is when you have data that is accessed multiple ways for multiple queries. Sometimes the answer is "denormalize," sometimes the answer is "accept that the queries you do less often will be slower," depending on your workload (e.g. ratio of reads to writes).

On Fri, Jan 28, 2011 at 12:20 PM, Ertio Lew <ertio...@gmail.com> wrote:
> Hi,
>
> I have two kinds of data that I would like to fit in one super column
> family; I am trying this, for the reasons of implementing fast
> database retrievals by combining the data of two rows into just one
> row.
>
> First kind of data, in supercolumn family, is named with timeUUIDs as
> supercolumn names; Think of this as, the postIds of posts in a Group.
> These posts will need to be sorted by time (so that list of latest
> posts is retrieved). Thus each post has one supercolumn each with name
> as (timeUUID+userID) and sorted by timeUUIDtype.
>
> Second kind of data would be just a single supercolumn containing
> columns of userId of all members in a group(very small). (The no of
> members in group will be around 40-50 max). The name of this single
> supercolumn may be kept suitable(perhaps max. time in future ) so as
> to keep this supercolumn to the beginning.
>
> (The supercolumns are required as we need to store some additional
> data in the columns of 1st kind of data).
>
> So is it recommended to store these two types of data (not related to
> each other but need to be retrieved together) in one super column
> family ?
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com