> What is the downside, anyway?

Your code is now the only thing that can read the data, which makes it harder to inspect in a CLI tool.
IMHO just store the data in columns.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 7:04 AM, Chidambaran Subramanian <chi...@gmail.com> wrote:

>
> On Thu, Apr 4, 2013 at 6:58 AM, aaron morton <aa...@thelastpickle.com> wrote:
> > 1. Is the size bigger in either one when storing one Tweet?
> If you store the data in one blob then we only store one column name and the
> blob. If they are in different cols then we store the column names and their
> values.
>
> > 2. Does either choice have an impact on read/write performance at large scale?
> If you store data in a blob you can only read and update it as a blob, so
> chances are you will be wasting effort as you do read-modify-write
> operations. Unless you have a good reason, split things up and store them as
> columns.
>
> If it's mostly read-only data that can be cached outside Cassandra, storing it
> in one column looks like a good idea to me. What is the downside, anyway?
>
>
> cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/04/2013, at 1:08 PM, Alan Ristić <alan.ris...@gmail.com> wrote:
>
> > Hi guys,
> >
> > Here is an example (fictional) model I have for learning purposes...
> >
> > I'm currently storing the "User" object in a Tweet as a blob value. So taking
> > the JSON of 'User' and storing it as a blob. I'm wondering why this is better vs.
> > just prefixing and flattening column names?
> >
> > Tweet {
> >   id uuid,
> >   user blob
> > }
> >
> > vs.
> >
> > Tweet {
> >   id uuid,
> >   user_id uuid,
> >   user_name text,
> >   ....
> > }
> >
> > In one or the other
> >
> > 1. Is the size bigger in either one when storing one Tweet?
> > 2. Does either choice have an impact on read/write performance at large scale?
> > 3. Anything else I should be considering here? Your view/thinking would be
> > great.
> >
> > Here is my understanding:
> > For 'ease' of update, if for example a user changes their name, I'm aware I need
> > to (re)write the whole object in all Tweets in the first "blob" example and only
> > the user_name column in the second 'flattened' example. So if I
> > wanted to actually do this "updating/rewriting" for every Tweet, I'd use the
> > second 'flattened' example, since the payload of only user_name is smaller than
> > the whole User blob for every Tweet, right?
> >
> > Nothing urgent, any input is valuable, tnx guys :)
> >
> > Thanks and best regards,
> > Alan Ristić
> >
> > w: personal blog
> > t: @alanristic
> > l: linkedin.com/alanristic
> > m: 068 15 73 88
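
For anyone following the thread, the read-modify-write point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain dicts stand in for Cassandra rows, no driver is involved, and the helper names (make_blob_tweet, make_flat_tweet, rename_in_blob, rename_in_flat) as well as the sample user fields are hypothetical, not part of any Cassandra API. The byte counts it returns cover only the value payload, not column names or storage overhead.

```python
import json
import uuid


def make_blob_tweet(user: dict) -> dict:
    """Layout 1 (hypothetical): the whole User object as one JSON blob."""
    return {"id": uuid.uuid4(), "user": json.dumps(user).encode("utf-8")}


def make_flat_tweet(user: dict) -> dict:
    """Layout 2 (hypothetical): one prefixed column per User field."""
    row = {"id": uuid.uuid4()}
    for field, value in user.items():
        row[f"user_{field}"] = value
    return row


def rename_in_blob(tweet: dict, new_name: str) -> int:
    """Read-modify-write: decode the blob, change one field, rewrite it all.

    Returns the number of value bytes that had to be written back.
    """
    user = json.loads(tweet["user"].decode("utf-8"))  # read
    user["name"] = new_name                           # modify
    payload = json.dumps(user).encode("utf-8")
    tweet["user"] = payload                           # write the whole blob
    return len(payload)


def rename_in_flat(tweet: dict, new_name: str) -> int:
    """Column update: only the user_name value is rewritten.

    Returns the number of value bytes that had to be written.
    """
    tweet["user_name"] = new_name
    return len(new_name.encode("utf-8"))


if __name__ == "__main__":
    # Sample data is made up for illustration only.
    user = {"id": str(uuid.uuid4()), "name": "alan", "bio": "learning Cassandra"}
    blob_tweet = make_blob_tweet(user)
    flat_tweet = make_flat_tweet(user)
    print("blob rewrite bytes:", rename_in_blob(blob_tweet, "alan_r"))
    print("flat rewrite bytes:", rename_in_flat(flat_tweet, "alan_r"))
```

The blob update has to read the existing value before it can write anything, and it rewrites every field; the flattened update touches one value and needs no prior read. That matches the trade-off described above, although real on-disk sizes also depend on column names and storage overhead that this sketch ignores.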