"A related question is whether it is a good idea to denormalize on read-heavy part of data while normalize on other less frequently-accessed data?"
Heavy read -> denormalize
Less frequently accessed data -> it depends on how "less frequent" it is and whether it is complicated to denormalize in your code.

"We will also have a like board for each user containing pins that they like, which can be somewhat private and only viewed by the owner."

"Since a pin can potentially be liked by thousands of users, if we also denormalize the like board, every time that pin is liked by another user we would have to update the like count in thousands of like boards."

If I understand your use case correctly, a pin consists of a "description" and a "like count", doesn't it? It then makes sense to use the counter type for the "like count", but in that case you cannot denormalize the counter, because you cannot mix a counter column family with a normal column family (the one containing the pin description and other properties).

*If you are sure* that the like board is accessed rarely, or not very frequently, by users, then normalization could be the answer. You can further mitigate the effect of the N+1 selects on the like board by paging pins (not showing all of them at once, but in pages of 10, for example).

On Sat, May 17, 2014 at 2:37 AM, ziju feng <pkdog...@gmail.com> wrote:
> Thanks for your answer, I really like the frequency-of-update-vs-read way
> of thinking.
>
> A related question is whether it is a good idea to denormalize the
> read-heavy part of the data while normalizing other, less frequently
> accessed data?
>
> Our app will have a limited number of system-managed boards that are
> viewed by every user, so it makes sense to denormalize and propagate
> updates of pins to these boards.
>
> We will also have a like board for each user containing pins that they
> like, which can be somewhat private and only viewed by the owner.
>
> Since a pin can potentially be liked by thousands of users, if we also
> denormalize the like board, every time that pin is liked by another user
> we would have to update the like count in thousands of like boards.
>
> Does normalization work better in this case, or can Cassandra handle
> this kind of write load?
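
To make the counter limitation concrete, here is a minimal CQL sketch (the table and column names are hypothetical, not from the original thread):

    -- Regular (non-counter) pin properties live in their own table.
    CREATE TABLE pins (
        pin_id uuid PRIMARY KEY,
        description text
    );

    -- A counter table may contain only counter columns besides its key,
    -- so the like count has to be kept separately.
    CREATE TABLE pin_like_count (
        pin_id uuid PRIMARY KEY,
        like_count counter
    );

    -- Liking a pin is then a single-row increment, no matter how many
    -- like boards reference the pin.
    UPDATE pin_like_count SET like_count = like_count + 1 WHERE pin_id = ?;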
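
And a sketch of the paging idea, assuming a normalized like board keyed by owner with likes clustered newest-first (again, a hypothetical schema):

    CREATE TABLE like_board (
        user_id uuid,
        liked_at timeuuid,
        pin_id uuid,
        PRIMARY KEY (user_id, liked_at)
    ) WITH CLUSTERING ORDER BY (liked_at DESC);

    -- First page: the 10 most recently liked pins.
    SELECT pin_id, liked_at FROM like_board WHERE user_id = ? LIMIT 10;

    -- Next page: resume below the last liked_at seen on the previous page.
    SELECT pin_id, liked_at FROM like_board
    WHERE user_id = ? AND liked_at < ? LIMIT 10;

The N+1 selects (fetching each pin's description and like count) then only happen for the 10 pins on the current page.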