> 1. Does either one take more space when storing one Tweet?

If you store the data in one blob, then we store only one column name and the blob. If the fields are in separate columns, then we store each column name along with its value.

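A rough way to see the size point above is to count the bytes of column names and values in each layout. This is only a sketch under loud assumptions: the `user` record and both helper functions are hypothetical, the uuid is counted as text rather than in its binary form, and any per-column storage overhead is ignored. It does not use or measure Cassandra itself.

```python
import json
import uuid

# Hypothetical user record; the field names follow the example in the question.
user = {"user_id": str(uuid.UUID(int=1)), "user_name": "alan"}

def blob_layout_bytes(user):
    """One column named 'user' whose value is the JSON-encoded object."""
    blob = json.dumps(user, separators=(",", ":")).encode("utf-8")
    return len(b"user") + len(blob)

def flat_layout_bytes(user):
    """One column per field: each column name plus its value."""
    return sum(
        len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
        for name, value in user.items()
    )

print("blob layout:", blob_layout_bytes(user), "bytes")
print("flat layout:", flat_layout_bytes(user), "bytes")
```

Note that a JSON blob carries its own field names inside the value, so "one column name" does not by itself mean fewer bytes. A more compact serialization would change the comparison.
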
> 2. Does either choice have an impact on read/write performance at large scale?

If you store data in a blob, you can only read and update it as a blob, so chances are you will waste effort doing read-modify-write operations.

Unless you have a good reason to use a blob, split things up and store them as columns.

cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 1:08 PM, Alan Ristić <alan.ris...@gmail.com> wrote:

> Hi guys,
>
> Here is an example (fictional) model I have for learning purposes...
>
> I'm currently storing the "User" object in a Tweet as a blob value. So I'm
> taking the JSON of 'User' and storing it as a blob. I'm wondering why this is
> better vs. just prefixing and flattening column names?
>
> Tweet {
>   id uuid,
>   user blob
> }
>
> vs.
>
> Tweet {
>   id uuid,
>   user_id uuid,
>   user_name text,
>   ....
> }
>
> In one or the other:
>
> 1. Does either one take more space when storing one Tweet?
> 2. Does either choice have an impact on read/write performance at large scale?
> 3. Anything else I should be considering here? Your view/thinking would be
> great.
>
> Here is my understanding:
> For 'ease' of update: if, for example, a user changes their name, I'm aware I
> need to (re)write the whole object in all Tweets in the first "blob" example,
> and only the user_name column in the second 'flattened' example. Which means
> that if I actually wanted to do this "updating/rewriting" for every Tweet,
> I'd use the second 'flattened' example, since the payload of only user_name
> is smaller than the whole User blob for every Tweet, right?
>
> Nothing urgent, any input is valuable, tnx guys :)
>
> Thanks and best regards,
> Alan Ristić
>
> w: personal blog
> t: @alanristic
> l: linkedin.com/alanristic
> m: 068 15 73 88
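
To illustrate the read-modify-write point from the reply, here is a minimal sketch that uses plain Python dicts as stand-ins for the two row layouts. Everything in it is hypothetical (the rows, the `rename_in_blob` and `rename_in_columns` helpers, the byte counts). A real client would issue reads and writes against Cassandra instead of mutating a dict.

```python
import json

# In-memory stand-ins for one Tweet row in each layout (hypothetical data).
blob_row = {"id": "t1", "user": json.dumps({"user_id": "u1", "user_name": "alan"})}
flat_row = {"id": "t1", "user_id": "u1", "user_name": "alan"}

def rename_in_blob(row, new_name):
    """Blob layout: read the whole blob, modify one field, write it all back."""
    user = json.loads(row["user"])       # read the full object
    user["user_name"] = new_name         # modify a single field
    row["user"] = json.dumps(user)       # write the full object again
    return len(row["user"].encode("utf-8"))  # bytes of value rewritten

def rename_in_columns(row, new_name):
    """Flat layout: overwrite only the one column that changed, no read needed."""
    row["user_name"] = new_name
    return len(new_name.encode("utf-8"))     # bytes of value rewritten
```

Calling both helpers with the same new name shows the difference in this toy model: the blob version has to read first and then rewrites the whole serialized user, while the column version writes only the new `user_name` value.
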