Re: 答复: 答复: A difficult data model with C*

Benjamin Roth Thu, 10 Nov 2016 02:51:07 -0800

I pretty much guess the CQL you posted is invalid. You cannot set a
clustering column that is not part of the primary key.
But you can use a materialized view to append the last_time to the primary
key and still preserver uniqueness of username + vedio_id (guess it is a
typo in vedio).


2016-11-10 10:47 GMT+00:00 Carlos Alonso <i...@mrcalonso.com>:

> What about having something like
>
> CREATE TABLE user_views (
>   user_name text,
>   video_id text,
>   position int,
>   last_time timestamp,
>   PRIMARY KEY(user_name, last_time)
> ) WITH CLUSTERING ORDER BY (last_time DESC);
>
> Where you insert a record everytime a user watches a video and then having
> a batch task (every night maybe?) that deletes the extra rows that are not
> needed anymore.
> The query pattern for this is quite efficient as something like SELECT *
> FROM user_views WHERE user_name = ? LIMIT 10;
>
> Regards
>
> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>
> On 10 November 2016 at 09:19, Vladimir Yudovin <vla...@winguzone.com>
> wrote:
>
>> >Do you mean the oldest one should be removed when a new play is added?
>> Sure. As you described the issue "the last ten items may be adequate for
>> the business"
>>
>> Best regards, Vladimir Yudovin,
>>
>> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud
>> CassandraLaunch your cluster in minutes.*
>>
>>
>> ---- On Wed, 09 Nov 2016 20:47:05 -0500*Diamond ben
>> <diamond....@outlook.com <diamond....@outlook.com>>* wrote ----
>>
>> The solution maybe work. However, the play list will grow over time and
>> somebody maybe has ten thousands that will slow down the query and sort .
>> Do you mean the oldest one should be removed when a new play is added?
>>
>> BTW, the version is 2.1.16 in our live system.
>>
>>
>> BRs,
>>
>> BEN
>> ------------------------------
>>
>> *发件人:* Vladimir Yudovin <vla...@winguzone.com>
>> *发送时间:* 2016年11月9日 18:11:26
>> *收件人:* user
>> *主题:* Re: 答复: A difficult data model with C*
>>
>> You are welcome! )
>>
>> >recent ten movies watched by the user within 30 days.
>> In this case you can't use PRIMARY KEY (user_name, video_id), as video_id
>> is demanded to fetch row, so all this stuff may be
>>
>> CREATE TYPE play (video_id text, position int, last_time timestamp);
>> CREATE TABLE recent (user_name text PRIMARY KEY, play_list
>> LIST<frozen<play>>);
>>
>> You can easily retrieve play list for specific user by his ID. Instead of
>> LIST you can use MAP, I don't think that for ten entries it matters.
>>
>>
>> Best regards, Vladimir Yudovin,
>>
>> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra
>> Launch your cluster in minutes.*
>>
>>
>> ---- On Tue, 08 Nov 2016 22:29:48 -0500*ben ben <diamond....@outlook.com
>> <diamond....@outlook.com>>* wrote ----
>>
>> Hi Vladimir Yudovin,
>>
>>
>>     Thank you very much for your detailed explaining. Maybe I didn't
>> describe the requirement clearly. The use cases should be:
>>
>> 1. a user login our app.
>>
>> 2. show the recent ten movies watched by the user within 30 days.
>>
>> 3. the user can click any one of the ten movie and continue to watch
>> from the last position she/he did. BTW, a movie can be watched several
>> times by a user and the last positon is needed indeed.
>>
>> BRs,
>>
>> BEN
>> ------------------------------
>>
>> *发件人:* Vladimir Yudovin <vla...@winguzone.com>
>> *发送时间:* 2016年11月8日 22:35:48
>> *收件人:* user
>> *主题:* Re: A difficult data model with C*
>>
>> Hi Ben,
>>
>> if need very limited number of positions (as you said ten) may be you can
>> store them in LIST of UDT? Or just as JSON string?
>> So you'll have one row per each pair user-video.
>>
>> It can be something like this:
>>
>> CREATE TYPE play (position int, last_time timestamp);
>> CREATE TABLE recent (user_name text, video_id text, review
>> LIST<frozen<play>>, PRIMARY KEY (user_name, video_id));
>>
>> UPDATE recent set review = review + [(1234,12345)] where user_name='some
>> user' AND video_id='great video';
>> UPDATE recent set review = review + [(1234,123456)] where user_name='some
>> user' AND video_id='great video';
>> UPDATE recent set review = review + [(1234,1234567)] where
>> user_name='some user' AND video_id='great video';
>>
>> You can delete the oldest entry by index:
>> DELETE review[0] FROM recent WHERE user_name='some user' AND
>> video_id='great video';
>>
>> or by value, if you know the oldest entry:
>>
>> UPDATE recent SET review = review - [(1234,12345)]  WHERE user_name='some
>> user' AND video_id='great video';
>>
>> Best regards, Vladimir Yudovin,
>>
>> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra
>> Launch your cluster in minutes.*
>>
>>
>> ---- On Mon, 07 Nov 2016 21:54:08 -0500*ben ben <diamond....@outlook.com
>> <diamond....@outlook.com>>* wrote ----
>>
>>
>> Hi guys,
>>
>> We are maintaining a system for an on-line video service. ALL users'
>> viewing records of every movie are stored in C*. So she/he can continue to
>> enjoy the movie from the last point next time. The table is designed as
>> below:
>> CREATE TABLE recent (
>> user_name text,
>> vedio_id text,
>> position int,
>> last_time timestamp,
>> PRIMARY KEY (user_name, vedio_id)
>> )
>>
>> It worked well before. However, the records increase every day and the
>> last ten items may be adequate for the business. The current model use
>> vedio_id as cluster key to keep a row for a movie, but as you know, the
>> business prefer to order by the last_time desc. If we use last_time as
>> cluster key, there will be many records for a singe movie and the recent
>> one is actually desired. So how to model that? Do you have any suggestions?
>> Thanks!
>>
>>
>> BRs,
>> BEN
>>
>>
>>
>>
>>
>>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

Re: 答复: 答复: A difficult data model with C*

Reply via email to