I was hoping someone could share their opinions on the following CF designs or suggest a better way of doing it. My app is constantly receiving new data that contains URLs. I was thinking of hashing this URL to form a key. The data is a JSON object with several properties. For now many of its properties will be ignored and only 4 are of interests, URL, title, username, user_rating. Often times the same URL is received but shared by a different user. I’m wondering if anyone can suggest a better approach to what I propose below which will be able answer the following .
Queries: I’ll be asking the following questions: 1. Give me the N most frequently shared items over : a) The last 30 minutes b) The last 24hrs c) Between D1 and D2 (where D1 and D2 represents the start and end date of interest) 2) Give me the N most shared item over the 3 time periods above WHERE the average user rating is above 5 3) Give me X for the item with the ID 123 (where X is a property for the item with the ID 123) Proposed approach Use timestamps as keys in the CF, that should take care of queries under 1 and partially handle 2 and use each column to store the JSON data, minus the common fields such as the title which will be the same no matter how many users share the same link (they’ll have their own columns in the row) other column names will be the user’s username and the value for those columns will be any JSON left over that’s not specific to the user. For the rest of 2, I can get the N items we’re interested in and calculate the average user rating for each item client side. Of course using timestamp as key means we need to maintain an index of the “real” keys/IDs to each item which would allow us to answer “Give me the item with the ID 123” Finally to address 3, I was thinking; Using the index get the timestamp of the item, and on the client side find the property of interest. CF1 Timestamp1 Title value ID ID1 Username3 {“rating”:5} Username2 {“rating”:0} Username2 {“rating”:4} Timestamp2 Title Value1 ID ID2 Username24 {“rating”:1} Username87 {“rating”:9} Username7 {“rating”:2} CF2 ID1 Timestamp1 ID2 Timestamp2 In the Username column, I'd ideally like to avoid storing the other properties as a JSON but I couldn't think of a way of doing it sensibly when that JSON grows into having 10 other properties.Does this sound like a sensible approach to designing my CFs?