In general I advise people avoid lists and use Maps or Sets instead. Using this data model, for instance, it's easy to remove a specific Address from a user:
CREATE TYPE address ( street text, city text, zip_code int, ); CREATE TABLE user ( user_id int primary key, addresses map<text, frozen <address>> ); When I want to remove one of the addresses from a user, I can do this: cqlsh:test> delete addresses['home'] from user where user_id = 1; Hope that helps, Jon On Mon, Jan 11, 2016 at 1:20 PM I PVP <i...@hotmail.com> wrote: > Well…the way it is now it is not possible to delete a specific contact > row from the base table at all. Because a DELETE statement only works with > PK in the WHERE clausule. Non PK columns cannot be in the DELETE WHERE > clausule. > https://docs.datastax.com/en/cql/3.3/cql/cql_reference/delete_r.html > > The way it is now It is only possible to delete the entire contact list > for that specific user. > Looks like will need to: > 1)SELECT all rows from user_contact excluding the one that the user wants > to get rid of. > 2) DELETE all the user_contact rows for that particular user . > 3) INSERT the result of 1). > > Is that the proper way to achieve it or Am I missing some point in the > modeling that would allow to delete a specific contact row and still able > to comply with the select requirements? > > Thanks > -- > IPVP > > > From: Jack Krupansky <jack.krupan...@gmail.com> <jack.krupan...@gmail.com> > Reply: user@cassandra.apache.org <user@cassandra.apache.org>> > <user@cassandra.apache.org> > Date: January 11, 2016 at 7:00:04 PM > > To: user@cassandra.apache.org <user@cassandra.apache.org>> > <user@cassandra.apache.org> > Subject: Re: Modeling contact list, plain table or List > > That's the beauty of MV - Cassandra automatically updates the MVs when the > base table changes, including deletions, which is why all of the PK columns > from the base table needed to be in the MV PK. > > -- Jack Krupansky > > On Mon, Jan 11, 2016 at 3:41 PM, I PVP <i...@hotmail.com> wrote: > >> The below table and materialized view will solve the SELECT requirements >> of my current application . >> The challenge now is when the user decides to DELETE one specific contact >> from his contact list. I could add the objectid to a composite partition >> key together with the userid. But that would make the SELECT inviable. >> >> Any ideas/suggestions? >> >> >> CREATE TABLE communication.user_contact ( >> userid int, >> contactname text, >> contactid int, >> createdat timeuuid, >> favoriteat timestamp, >> isfavorite boolean, >> objectid timeuuid, >> PRIMARY KEY (userid, contactname) >> ) WITH CLUSTERING ORDER BY ( contactname DESC ) >> >> >> CREATE MATERIALIZED VIEW communication.user_contact_by_favorite AS >> SELECT userid, isfavorite, contactname, contactid, createdat, favoriteat, >> objectid >> FROM user_contact >> WHERE userid IS NOT NULL AND isfavorite IS NOT NULL AND contactname IS >> NOT NULL >> PRIMARY KEY ( ( userid, isfavorite ), contactname ) >> WITH CLUSTERING ORDER BY ( contactname DESC ) >> >> Thanks >> >> -- >> IPVP >> >> >> From: DuyHai Doan <doanduy...@gmail.com> <doanduy...@gmail.com> >> Reply: user@cassandra.apache.org <user@cassandra.apache.org>> >> <user@cassandra.apache.org> >> Date: January 11, 2016 at 11:14:10 AM >> >> To: user@cassandra.apache.org <user@cassandra.apache.org>> >> <user@cassandra.apache.org> >> Subject: Re: Modeling contact list, plain table or List >> >> In the current iteration of materialized view, it is still not possible >> to have WHERE clause other than IS NOT NULL so is_favourite IS TRUE >> won't work. >> >> Still there is a JIRA created to support this feature : >> https://issues.apache.org/jira/browse/CASSANDRA-10368 >> >> About cardinality of favorite vs non-favorites, it doesn't matter in this >> case because the OP said "Less then one hundred contacts by user is the >> normal." >> >> So even if all contacts are stuck in one unique favorite state, the >> materialized view partition for one user is at most 100. Even for extreme >> edge case with users having 10 000 contacts, it's still a manageable >> partition size for C*. >> >> But I agree it is important to know before-hand the favorite/non-favorite >> update frequency since it will impact the write throughput on the MV. >> >> For more details on materialized view impl and performance: >> http://www.doanduyhai.com/blog/?p=1930 >> >> On Mon, Jan 11, 2016 at 1:36 PM, Jack Krupansky <jack.krupan...@gmail.com >> > wrote: >> >>> The new Materialized View feature is just an automated way of creating >>> and maintaining what people used to call a "query table", which is the >>> traditional Cassandra data modeling technique for performing queries on on >>> than the primary key for a table - you store the same columns in different >>> tables using different columns for the primary key. >>> >>> One also needs to be careful to include all columns of the original >>> primary key in each MV primary key - in addition to whatever column(s) are >>> to be used for indexing in each MV (so that Cassandra can find the old row >>> when it needs to update the MV when the base table row changes, such as on >>> a deletion.) >>> >>> But before creating MVs, you first need to answer questions about how >>> the app needs to query the data. Even with MV, conceptualizing queries >>> needs to precede data modeling. >>> >>> For example, what is the cardinality of favorites vs. non-favorites, >>> does the app even need to query by favorates, as opposed to querying all >>> contacts and retrieving is_favorite as simply a non-key column value, >>> whether favorites need to be retrieved separately from non-favorites, the >>> frequency and latency requirements for query by favorite status, etc. Once >>> these questions are answered, decisions can be made about data modeling. >>> >>> -- Jack Krupansky >>> >>> On Mon, Jan 11, 2016 at 5:13 AM, Carlos Alonso <i...@mrcalonso.com> >>> wrote: >>> >>>> I have never used Materialized Views so maybe this suggestion is not >>>> possible, but in this case, wouldn't it make sense to define the >>>> materialized view as >>>> >>>> is_favourite IS TRUE >>>> instead of >>>> is_favourite IS NOT NULL? >>>> >>>> Carlos Alonso | Software Engineer | @calonso >>>> <https://twitter.com/calonso> >>>> >>>> On 10 January 2016 at 09:59, DuyHai Doan <doanduy...@gmail.com> wrote: >>>> >>>>> Try this >>>>> >>>>> CREATE TABLE communication.user_contact_list ( >>>>> user_id uuid, >>>>> contact_id uuid, >>>>> contact_name text, >>>>> created_at timeuuid, >>>>> is_favorite boolean, >>>>> favorite_at timestamp, >>>>> PRIMARY KEY (user_id, contact_name, contact_id) >>>>> ); >>>>> >>>>> CREATE MATERIALIZED VIEW communication.user_favorite_contact_list >>>>> AS SELECT * FROM communication.user_contact_list >>>>> WHERE user_id IS NOT NULL AND contact_name IS NOT NULL >>>>> AND contact_id IS NOT NULL AND is_favorite IS NOT NULL >>>>> PRIMARY KEY(user_id, is_favorite, contact_name, contact_id) >>>>> >>>>> If the flag is_favorite is not updated very often the write perf hit >>>>> due to materialized view is acceptable. >>>>> >>>>> On Sat, Jan 9, 2016 at 11:57 PM, Isaac P. <i...@hotmail.com> wrote: >>>>> >>>>>> Jack/ Michael, >>>>>> >>>>>> Thanks for answering. >>>>>> >>>>>> How big?: Less then one hundred contacts by user is the normal. >>>>>> >>>>>> Update requirements: The UPDATE requirements are all around each >>>>>> user “favoriting/unfavoriting” the contacts . Deleting is not very >>>>>> frequent. >>>>>> >>>>>> Does that mean that in C* 3.02 , for this use case to work, the >>>>>> contact name must be part of a composite partition key in order to >>>>>> allow >>>>>> sorting by contact_name like this ? : >>>>>> >>>>>> CREATE TABLE communication.user_contact_list ( >>>>>> user_id uuid, >>>>>> contact_name text, >>>>>> is_favorite boolean, >>>>>> contact_id uuid, >>>>>> created_at timeuuid, >>>>>> favorite_at timestamp, >>>>>> PRIMARY KEY ((user_id, contact_name), is_favorite) >>>>>> ) WITH CLUSTERING ORDER BY (contact_name ASC); >>>>>> >>>>>> Query: Select * from user_contact_list where user_id = :userid and >>>>>> is_favorite = true order by contact_name asc; >>>>>> >>>>>> Looks like each contact as a row/clustering key will be the way to go. >>>>>> >>>>>> Thanks >>>>>> >>>>>> IPVP >>>>>> >>>>>> >>>>>> From: Laing, Michael <michael.la...@nytimes.com> >>>>>> <michael.la...@nytimes.com> >>>>>> Reply: user@cassandra.apache.org <user@cassandra.apache.org>> >>>>>> <user@cassandra.apache.org> >>>>>> Date: January 9, 2016 at 11:51:27 AM >>>>>> To: user@cassandra.apache.org <user@cassandra.apache.org>> >>>>>> <user@cassandra.apache.org> >>>>>> Subject: Re: Modeling contact list, plain table or List >>>>>> >>>>>> Note that in C* 3.02 the second query is invalid: >>>>>> >>>>>> cqlsh> Select * from communication.user_contact_list where user_id = >>>>>> 98f50f00-b6d5-11e5-afec-6003089bf572 and is_favorite = true order >>>>>> by contact_name asc; >>>>>> >>>>>> *InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY >>>>>> column "is_favorite" cannot be restricted as preceding column >>>>>> "contact_name" is not restricted"* >>>>>> >>>>>> On Fri, Jan 8, 2016 at 6:50 PM, Jack Krupansky < >>>>>> jack.krupan...@gmail.com> wrote: >>>>>> >>>>>>> How big is each contact list expected to be? Dozens? Hundreds? >>>>>>> Thousands? If just dozens, a simple list column would seem sufficient. >>>>>>> If >>>>>>> thousands, the row (not partition) would get kind of bloated. >>>>>>> >>>>>>> What requirements do you have for updating? If updating contacts and >>>>>>> lots of contacts, I think I'd prefer each contact as a row/clustering >>>>>>> key. >>>>>>> Nice to be able to do selective queries to return slices of the >>>>>>> clustering >>>>>>> key values, which is not so easy if they are all just a single list >>>>>>> column. >>>>>>> >>>>>>> -- Jack Krupansky >>>>>>> >>>>>>> On Fri, Jan 8, 2016 at 6:31 PM, Isaac P. <i...@hotmail.com> wrote: >>>>>>> >>>>>>>> Hi everyone >>>>>>>> >>>>>>>> What would perform better while modeling a simple user contact list >>>>>>>> that will be used mainly to select the recipients for/from/to >>>>>>>> messages ? >>>>>>>> >>>>>>>> a) Individual rows to each (user, contact) pair so a select would >>>>>>>> fetch all the rows to retrieve all the contacts from a given user. >>>>>>>> >>>>>>>> or >>>>>>>> >>>>>>>> b) A single row for each user containing the List<Contact> UDT. >>>>>>>> >>>>>>>> Aside of the basic CRUD, the queries will be the following ones: >>>>>>>> >>>>>>>> Select * from user_contact_list where user_id = :userid order by >>>>>>>> contact_name asc >>>>>>>> >>>>>>>> Select * from user_contact_list where user_id = :userid and >>>>>>>> is_favorite = true order by contact_name asc >>>>>>>> >>>>>>>> After reading this >>>>>>>> https://docs.datastax.com/en/cql/3.0/cql/ddl/ddl_compound_keys_c.html >>>>>>>> the table is looking like this: >>>>>>>> >>>>>>>> CREATE TABLE communication.user_contact_list ( >>>>>>>> user_id uuid, >>>>>>>> contact_id uuid, >>>>>>>> contact_name text, >>>>>>>> created_at timeuuid, >>>>>>>> is_favorite boolean, >>>>>>>> favorite_at timestamp, >>>>>>>> PRIMARY KEY (user_id, contact_name, is_favorite) >>>>>>>> ); >>>>>>>> >>>>>>>> Any guidance will be appreciated. >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> -- >>>>>>>> IPVP >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >