Hello, I'm working on the data model for a Pinterest-like project. There are two main concepts, Pin and Board: a pin is an item containing an image, a description, and some other metadata such as a like count, and each board contains a date-sorted list of pins.
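To make that concrete, here is roughly the schema I have in mind, sketched as CQL run through the Python cassandra-driver (we haven't settled on the exact stack, so the driver calls and all keyspace/table/column names below are just illustrative):

    from cassandra.cluster import Cluster

    # Assumes a 'pinboard' keyspace already exists; names are illustrative.
    session = Cluster(['127.0.0.1']).connect('pinboard')

    # One row per pin, keyed by pin_id. like_count is a plain bigint here;
    # a separate Cassandra counter table would be another option.
    session.execute("""
        CREATE TABLE IF NOT EXISTS pin (
            pin_id      uuid PRIMARY KEY,
            image_url   text,
            description text,
            like_count  bigint
        )
    """)

    # One row per (board, pin) pairing; created_at is the first clustering
    # column, so a board's pins come back sorted by date (newest first).
    session.execute("""
        CREATE TABLE IF NOT EXISTS board_pins (
            board_id   uuid,
            created_at timestamp,
            pin_id     uuid,
            PRIMARY KEY (board_id, created_at, pin_id)
        ) WITH CLUSTERING ORDER BY (created_at DESC)
    """)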
The board table can then be modeled with primary key (board_id, created_at, pin_id), as in the sketch above, where created_at sorts a board's pins by date. The question is whether I should denormalize pin details into the board table, or just fetch a page of pin_ids from the board (page size would be 10~20) and then multi-get the details from the pin table by pin_id.

Since some boards are read very frequently (like the home board), denormalization seems a reasonable way to improve read performance. However, whenever a pin is updated we would then have to update not only the pin table but also every board row that contains that pin, and some updates can be quite frequent (e.g. bumping the like count). A single pin may appear on many boards (potentially thousands), so denormalization looks like it would add a lot of write load as well as application-code complexity.

Any suggestions on whether our data model should go the denormalized way, or the normalized/multi-get way, which would probably need a separate caching layer on the read side?

Thanks,
Ziju
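P.S. In case it helps, here is a rough sketch of the normalized/multi-get read path I'm describing, again assuming the Python cassandra-driver and the illustrative tables above (fetch_board_page is a hypothetical helper):

    def fetch_board_page(session, board_id, page_size=20):
        # Step 1: one single-partition read for a page of pin_ids,
        # newest first thanks to the DESC clustering order.
        rows = session.execute(
            "SELECT pin_id FROM board_pins WHERE board_id = %s LIMIT %s",
            (board_id, page_size),
        )
        pin_ids = [row.pin_id for row in rows]

        # Step 2: the "multi-get" -- issue the 10~20 point reads in
        # parallel, so each stays a cheap single-row lookup.
        futures = [
            session.execute_async("SELECT * FROM pin WHERE pin_id = %s", (pid,))
            for pid in pin_ids
        ]
        return [future.result().one() for future in futures]

With this shape every pin update touches exactly one row; the cost is the extra round-trips per board view, which is why we'd likely put a cache in front of it for the hot boards.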