Re: Correct model

Hiller, Dean Wed, 19 Sep 2012 13:33:31 -0700

Uhm, unless I am mistaken, a NEW request implies a new UUID, so you can just 
write it both to the index of request rows and to the row of the user that 
request was for, all in one shot with no need to read, right?


(Also, read before write is not necessarily bad…it really depends on your 
situation but in this case, I don't think you need read before write).
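
A minimal sketch of that write path, assuming plain Python dicts as stand-ins 
for the column families. The names request_cf, user_cf, index_cf, 
insert_request and the index row key "requests_by_time" are made up for 
illustration; this is not a Cassandra or playOrm API.

```python
import uuid
from collections import defaultdict

# Hypothetical in-memory stand-ins for column families:
# row key -> {column name: value}
request_cf = defaultdict(dict)
user_cf = defaultdict(dict)
index_cf = defaultdict(dict)


def insert_request(user_id, timestamp, columns, index_row="requests_by_time"):
    """Write a new request without reading any existing value first."""
    request_id = uuid.uuid4()  # a NEW request gets a new unique id
    # 1. The request row itself, with whatever columns the request carries.
    request_cf[request_id].update(columns)
    # 2. One new column per request on the user row, instead of rewriting
    #    a single "request_ids" value (which would need a read first).
    user_cf[user_id][("request", request_id)] = b""
    # 3. One composite column <time>.<request>.<user> on the index row.
    index_cf[index_row][(timestamp, request_id, user_id)] = b""
    return request_id
```

Each of the three writes only adds columns, so none of them depends on the 
current contents of the row it touches.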

For your structured data comment…
Actually, playOrm stores both structured and unstructured data.  It follows the 
pattern Cassandra is adopting more and more of "partial" schemas, and plans to 
hold to that path.  It is a complete break from JPA because NoSQL is so 
different.

and each request would have its own id, right?

Yes, in my design, I chose to give each request its own id.

Wouldn't it be faster to have a composite key in the requestCF itself?

In CQL, don't you have to have an equality (=) on the first part of the key in 
the WHERE clause? That means you would have to select by user id, BUT you 
wanted requests > date no matter which user, so the indices I gave you provide 
that information with a simple column slice of the data.  The indices I gave 
you look like this (composite column names): 
<time1>.<req1>.<user1>, <time2>.<req2>.<user1>, <time3>.<req3>.<user2> 
NOTE that each item in the <> is a UUID, so they are unique.
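
As a rough illustration of that column slice, here is a sketch that keeps the 
composite column names in a sorted Python list. The integers and strings are 
stand-ins for the UUIDs, and users_with_requests_after is a made-up helper 
name, not a Cassandra call.

```python
import bisect

# Hypothetical index row: composite column names (time, request, user),
# kept sorted the way a wide row keeps its columns sorted.
index_row = sorted([
    (10, "req1", "user1"),
    (20, "req2", "user1"),
    (30, "req3", "user2"),
])


def users_with_requests_after(columns, date):
    """Slice every column with time > date and collect the distinct users."""
    times = [time for time, _request, _user in columns]
    start = bisect.bisect_right(times, date)  # first column with time > date
    return {user for _time, _request, user in columns[start:]}
```

Because the time is the first component of the column name, one contiguous 
slice answers "requests > date" across all users without knowing any user id.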

Maybe there is a way, but I am not sure how to get all the latest requests > 
date for every user…. I guess you could always map/reduce, but that is 
generally reserved for analytics or maybe for updating new index tables you 
are creating for faster reads.

Later,
Dean

From: Marcelo Elias Del Valle <mvall...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, September 19, 2012 1:47 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Correct model

2012/9/19 Hiller, Dean <dean.hil...@nrel.gov>
Thinking out loud, and I think a bit towards playOrm's model, though you don't 
need to use playOrm for this.

1. I would probably have a User with the requests either embedded in it or the 
foreign keys to the requests…either is fine as long as you get the user, get 
ALL the FKs, and make one request to get the requests for that user

This was my first option. However, every time I have a new request I would need 
to read the column "request_ids", update its value, and then write the result. 
This would be a read-before-write, which is bad in Cassandra, right? Or were 
you talking about other kinds of FKs?

2. I would create rows for the index and index each month of data OR maybe 
index each day of data (depends on your system).  Then, I can just query into 
the index for that one month.  With playOrm S-SQL, this is a simple PARTITIONS 
r(:thismonthParititonId) SELECT r FROM Request r where r.date > :date OR you 
just do a column range query doing the same thing into your index.  The index 
is basically the wide row pattern ;) with composite keys of <date>.<rowkey of 
request>
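
One way to picture the per-month index rows is the sketch below. It assumes a 
row key of the form "request_index:YYYY-MM"; that key format and the helper 
names index_row_key and row_keys_since are invented for illustration only.

```python
from datetime import datetime


def index_row_key(ts):
    """Row key of the monthly index row that a request at time ts goes into."""
    return f"request_index:{ts:%Y-%m}"


def row_keys_since(since, now):
    """Monthly index rows to read for 'requests > since', oldest first."""
    keys = []
    year, month = since.year, since.month
    while (year, month) <= (now.year, now.month):
        keys.append(f"request_index:{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys
```

If the date falls in the current month, only one index row has to be read; an 
older date means one column range query per month in between.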

I would consider playOrm in a later step of my project, as my understanding now 
is that it is good for storing relational, structured data. I cannot predict 
which columns I am going to store in requestCF. But regardless, even in 
Cassandra, you would still use a composite key, but it seems you would create 
an indexCf using the wide row pattern, and each request would have its own id, 
right? But why? Wouldn't it be faster to have a composite key in the requestCF 
itself?


From: Marcelo Elias Del Valle <mvall...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, September 19, 2012 1:02 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Correct model

I am new to Cassandra and to NoSQL in general.
I built my first model and any comments would be of great help. I am describing 
my thoughts below.

It's a very simple model. I will need to store several users and, for each 
user, I will need to store several requests. Each request has its insertion 
time. As the query comes first, here are the only queries I will need to run 
against this model:
- Select all the requests for a user
- Select all the users who have new requests since date D

I created the following model: a UserCF, whose key is a userID generated by 
TimeUUID, and a RequestCF, whose key is composite: UserUUID + timestamp. For 
each user, I will store basic data and, for each request, I will insert a lot 
of columns.

My questions:
- Is the strategy of using a composite key good for this case? I thought of 
other solutions, but this one seemed to be the best. Another solution would be 
to have a non-composite key of type UUID for the requests, and have another CF 
to relate user and request.
- To perform the second query, instead of checking whether each user has a 
request inserted after date D, I thought of storing the last request insertion 
date in the userCF every time I have a new insert for the user. It would be 
data replication, but I would have no read-before-write and I am guessing the 
second query would perform faster.

Any thoughts?

--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr



