Thanks for the suggestion. I don't care the history of the update time to a file, BUT I do want to ordered by it.
Reason for that is, without that, and if I have 10k+ file belongs to a user, I have to fetch all the last modified time of all these 10k+ file and sort through them in my application and only return the top N. Kind of expensive. I would like to see if it is possible to rely on Cassandra native storage to achieve this. CREATE TABLE user_file ( user_id uuid, file_id timeuuid, last_modified_time timestamp, PRIMARY KEY(user_id, file_id) ); select * from user_file where user_id=user1_uuid order by last_modified_time limit 10 Above CQL would be invalid, because last_modified_time is not part of the compound key, and is not allowed to used for order by purpose. On Thu, Jul 11, 2013 at 12:51 AM, Lohith Samaga M <lohith.sam...@mphasis.com > wrote: > ** > > Hi, > Do you need to store the history of updates to a file? > If this is not required, then you can make the userid and file id as the > row key. You need to simply update the modified_date timestamp. There will > be only one row per file per user. > > Thanks and Regards > M. Lohith Samaga > > > > > > > -----Original Message----- > From: y2k...@gmail.com on behalf of Jimmy Lin > Sent: Thu 11-Jul-13 13:09 > To: user@cassandra.apache.org > Subject: Re: data model question : finding out the n most recent changes > items > > what I mean is, I really just want the last modified date instead of series > of timestamp and still able to sort or order by it. > (maybe I should rephrase my question as how to sort or order by last > modified column in a row) > > CREATE TABLE user_file ( > user_id uuid, > modified_date timestamp, > file_id timeuuid, > PRIMARY KEY(user_id, modified_date) > ); > > e.g user1 update file A 3 times in a row, and update file B, then update > file A again. > insert into user_file values(user1_uuid, date1, file_a_uuid); > insert into user_file values(user1_uuid, date2, file_a_uuid); > insert into user_file values(user1_uuid, date3, file_a_uuid); > insert into user_file values(user1_uuid, date4, file_b_uuid); > insert into user_file values(user1_uuid, date5, file_a_uuid); > > #trying to get top 3 most recent changed files > select * from user_file where user_id=user1_uuid limit 3 > > using CQL, I will get 3 rows back(all file a) > (user1_uuid, date1, file_a_uuid); > (user1_uuid, date2, file_a_uuid); > (user1_uuid, date3, file_a_uuid); > > what I want is (file a AND file b) > user1_uuid, date1, file_a_uuid > user1_uuid, date4, file_b_uuid > > So how do I order by/sort by last modified column in a row? > > thanks > > > > > On Thu, Jul 11, 2013 at 12:00 AM, aaron morton <aa...@thelastpickle.com > >wrote: > > > What you described this sounds like the most appropriate: > > > > CREATE TABLE user_file ( > > user_id uuid, > > modified_date timestamp, > > file_id timeuuid, > > PRIMARY KEY(user_id, modified_date) > > ); > > > > If you normally need more information about the file then either store > > that as additional fields or pack the data using something like JSON or > > Protobuf. > > > > > my return list may still not accurate because a single directory could > > have lot of modification changes. I basically end up pulling out series > of > > modification timestamp for the same directory. > > Not sure I understand the problem. > > > > Cheers > > > > > > ----------------- > > Aaron Morton > > Freelance Cassandra Consultant > > New Zealand > > > > @aaronmorton > > http://www.thelastpickle.com > > > > On 10/07/2013, at 6:51 PM, Jimmy Lin <y2klyf+w...@gmail.com> wrote: > > > > > I have an application that need to find out the n most recent modified > > files for a given user id. I started out few tables but still couldn't > get > > what i want, I hope someone get point to some right direction... > > > > > > See my tables below. > > > > > > #1 won't work, because file_id's timeuuid contains creation time, not > > the modification time. > > > > > > #2 won't work, because i can't order by a non primary key > > column(modified_date) > > > #3,#4 although i can now get a time series of modification time of > each > > file belongs to a user, my return list may still not accurate because a > > single directory could have lot of modification changes. I basically end > up > > pulling out series of modification timestamp for the same directory. > > > > > > Any suggestion? > > > > > > Thanks > > > > > > > > > > > > #1 > > > > > > CREATE TABLE user_file ( > > > > > > user_id uuid, > > > > > > file_id timeuuid, > > > > > > PRIMARY KEY(user_id, file_id) > > > > > > ); > > > > > > > > > > > > #2 > > > > > > CREATE TABLE user_file ( > > > > > > user_id uuid, > > > > > > file_id timeuuid, > > > > > > modified_date timestamp, > > > > > > PRIMARY KEY(user_id, file_id) > > > > > > ); > > > > > > > > > > > > #3 > > > > > > CREATE TABLE user_file ( > > > > > > user_id uuid, > > > > > > file_id timeuuid, > > > > > > modified_date timestamp, > > > > > > PRIMARY KEY(user_id, file_id, modified_date) > > > > > > ); > > > > > > > > > > > > #4 > > > > > > CREATE TABLE user_file ( > > > > > > user_id uuid, > > > > > > modified_date timestamp, > > > > > > file_id timeuuid, > > > > > > PRIMARY KEY(user_id, modified_date, file_id) > > > > > > ); > > > > > > > > > > > > > > > > > > > > > > Information transmitted by this e-mail is proprietary to MphasiS, its > associated companies and/ or its customers and is intended > for use only by the individual or entity to which it is addressed, and may > contain information that is privileged, confidential or > exempt from disclosure under applicable law. If you are not the intended > recipient or it appears that this mail has been forwarded > to you without proper authority, you are notified that any use or > dissemination of this information in any manner is strictly > prohibited. In such cases, please notify us immediately at > mailmas...@mphasis.com and delete this mail from your records. >