Re: Data Modeling - JSON vs Composite columns

Brian O'Neill Wed, 19 Sep 2012 06:32:42 -0700

Roshni,

We're going through the same debate right now.

I believe native support for JSON (or collections) is on the docket
for Cassandra.
Here is a discussion we had a few months ago on the topic:
http://comments.gmane.org/gmane.comp.db.cassandra.devel/5233

We presently store JSON, but we're considering a change to composite keys.

Presently, each client has to parse the JSON value.  If you are
retrieving lots of values, that's a lot of parsing.  Also, storing the
raw values allows for better integration with other tools, such as
reporting engines (e.g. JasperSoft).  And if you want to update a
single value inside the JSON, you get into real trouble, because you
first need to read the value, update the field, then write the whole
column again.  The read before write is a problem, especially if you
have a lot of concurrency in your system.  (Two clients could read the
old value, then update different fields, and the second write would
overwrite the first's change.)
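
To make that race concrete, here is a minimal Python sketch.  It is
not Cassandra client code: the rows are plain dicts standing in for
last-write-wins columns, and the helper names (update_json_field,
update_composite_field) are made up for illustration.  The item values
are borrowed from Roshni's example below.

```python
import json

# Hypothetical in-memory stand-in for a single row: column name -> value.
# Each write replaces the whole column value (last write wins).


def update_json_field(row, item_id, field, value, snapshot):
    """Option B: change one field by rewriting the entire JSON column,
    starting from a value the client read earlier (the snapshot)."""
    doc = json.loads(snapshot)
    doc[field] = value
    row[item_id] = json.dumps(doc)


def update_composite_field(row, item_id, field, value):
    """Option A: each field is its own column, so no prior read is needed."""
    row[(item_id, field)] = value


# Option B: two clients read the same old value, then each writes back.
json_row = {
    "itemid1": json.dumps({"name": "Betty Crocker", "descr": "Cake", "qty": 5})
}
snapshot_a = json_row["itemid1"]  # client A reads
snapshot_b = json_row["itemid1"]  # client B reads the same old value
update_json_field(json_row, "itemid1", "qty", 6, snapshot_a)
update_json_field(json_row, "itemid1", "descr", "Cake mix", snapshot_b)
json_result = json.loads(json_row["itemid1"])
# Client B wrote back its stale copy, so client A's qty change is gone.

# Option A: the same two updates touch different columns and both survive.
composite_row = {
    ("itemid1", "name"): "Betty Crocker",
    ("itemid1", "descr"): "Cake",
    ("itemid1", "qty"): 5,
}
update_composite_field(composite_row, "itemid1", "qty", 6)
update_composite_field(composite_row, "itemid1", "descr", "Cake mix")
```

In the JSON case the final document still has the old quantity,
because the second client wrote back a copy that never saw the first
update.  With one column per field, the two writes never collide.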

As a side note, JSON values also complicated our wide-row indexing
mechanism (https://github.com/hmsonline/cassandra-indexing).

For those reasons, we're considering a data model shift away from JSON.

That said, I'm keeping a close watch on:
https://issues.apache.org/jira/browse/CASSANDRA-3647

But if this is CQL only, I'm not sure how much use it will be for us
since we're coming in from different clients.
Anyone know how/if collections will be available from other clients?

-brian

On Wed, Sep 19, 2012 at 8:00 AM, Roshni Rajagopal
<roshni_rajago...@hotmail.com> wrote:
> Hi,
>
> There was a conversation on this some time earlier, and to continue it:
>
> Suppose I want to associate a user with an item, and I also want to store 3
> commonly used attributes without needing to go to an entity item column
> family. I have 2 options:
>
> A) use composite columns
> UserId1 : {
>  <itemid1>:<Name> = Betty Crocker,
>  <itemid1>:<Descr> = Cake,
>  <itemid1>:<Qty> = 5,
>  <itemid2>:<Name> = Nutella,
>  <itemid2>:<Descr> = Choc spread,
>  <itemid2>:<Qty> = 15
> }
>
> B) use a json with the data
> UserId1 : {
>  <itemid1> = {name: Betty Crocker, descr: Cake, Qty: 5},
>  <itemid2> = {name: Nutella, descr: Choc spread, Qty: 15}
> }
>
> Essentially, A is better if one wants to update individual fields, while B
> is better if one wants easier paging or to read multiple items at once in
> one read. The details are in this discussion thread:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Data-Modeling-another-question-td7581967.html
>
> I had an additional question.
> It's being said that CQL is the direction in which Cassandra is moving,
> and there's a lot of effort going into making CQL the standard.
>
> How does approach B work in CQL? Can we read/write JSON easily in CQL? Can
> we extract a field from a JSON value in CQL, or would that need to be done
> via the client code?
>
>
> Regards,
> Roshni

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
Apache Cassandra MVP
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42
