"A related question is whether it is a good idea to denormalize on read-heavy part of data while normalize on other less frequently-accessed data?"
Heavy read -> denormalize
Less frequently accessed data -> it depends on how "less frequent" it is and whether it is complicated to denormalize in your code.

"We will also have a like board for each user containing pins that they like, which can be somewhat private and only viewed by the owner."

"Since a pin can potentially be liked by thousands of users, if we also denormalize the like board, every time that pin is liked by another user we would have to update the like count in thousands of like boards."

If I understand your use case correctly, a pin consists of a "description" and a "like count", doesn't it? It then makes sense to use the counter type for the "like count", but in that case you cannot denormalize the counter, because you cannot mix a counter column family with a normal column family (the one containing the pin description and other properties).

*If you are sure* that the like board is accessed rarely, or not very frequently, by users, then normalization could be the answer. You can further mitigate the effect of the N+1 selects on the like board by paging pins (not showing all of them at once, but in pages of 10, for example).

On Sat, May 17, 2014 at 2:37 AM, ziju feng <pkdog...@gmail.com> wrote:
> Thanks for your answer, I really like the frequency-of-update-vs-read way
> of thinking.
>
> A related question is whether it is a good idea to denormalize the
> read-heavy part of the data while normalizing other, less frequently
> accessed data?
>
> Our app will have a limited number of system-managed boards that are
> viewed by every user, so it makes sense to denormalize and propagate
> updates of pins to these boards.
>
> We will also have a like board for each user containing pins that they
> like, which can be somewhat private and only viewed by the owner.
>
> Since a pin can potentially be liked by thousands of users, if we also
> denormalize the like board, every time that pin is liked by another user
> we would have to update the like count in thousands of like boards.
>
> Does normalization work better in this case, or can Cassandra handle
> this kind of write load?
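
To make the counter limitation concrete, here is a minimal CQL sketch (the table and column names are hypothetical, not from the original thread):

    -- Regular (non-counter) pin properties live in their own table.
    CREATE TABLE pins (
        pin_id uuid PRIMARY KEY,
        description text
    );

    -- A counter table may contain only counter columns besides its key,
    -- so the like count has to be kept separately.
    CREATE TABLE pin_like_count (
        pin_id uuid PRIMARY KEY,
        like_count counter
    );

    -- Liking a pin is then a single-row increment, no matter how many
    -- like boards reference the pin.
    UPDATE pin_like_count SET like_count = like_count + 1 WHERE pin_id = ?;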
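
And a sketch of the paging idea, assuming a normalized like board keyed by owner with likes clustered newest-first (again, a hypothetical schema):

    CREATE TABLE like_board (
        user_id uuid,
        liked_at timeuuid,
        pin_id uuid,
        PRIMARY KEY (user_id, liked_at)
    ) WITH CLUSTERING ORDER BY (liked_at DESC);

    -- First page: the 10 most recently liked pins.
    SELECT pin_id, liked_at FROM like_board WHERE user_id = ? LIMIT 10;

    -- Next page: resume below the last liked_at seen on the previous page.
    SELECT pin_id, liked_at FROM like_board
    WHERE user_id = ? AND liked_at < ? LIMIT 10;

The N+1 selects (fetching each pin's description and like count) then only happen for the 10 pins on the current page.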