Hi Ferit, I have implemented the follower/following relationship in a manner similar to what you describe. However, I can't tell from the description of your tables if you are using the same convention I used. The two tables I used were symmetric, as with yours, but with the ids swapped. I inserted and removed in tandem, as you mention, and I used batches, though they probably aren't strictly necessary for this case.
so for the followers table (which I interpret as "who are my followers?"), the table was defined as: followers: pk = (uid,fid) + fields used for display and for the following table (which I interpret as "who am I following?"), it was defined as: following: pk = (fid,uid) + fields relevant to that display so when user 87654321 follows user 12345678: insert into followers (12345678,87654321, ...) and insert into following (87654321,12345678, ...) Secondary indexes aren't needed for the 2 queries, though bulk deletes (i.e. remove all followers when a user is removed) take a bit of housekeeping. On Wed, Mar 29, 2017 at 10:35 AM, Durity, Sean R < sean_r_dur...@homedepot.com> wrote: > This looks more like a problem for a graph-based model. Have you looked at > DSE Graph as a possibility? > > > > > > Sean Durity > > *From:* ferit baver elhuseyni [mailto:feritba...@gmail.com] > *Sent:* Tuesday, March 14, 2017 11:40 AM > *To:* user@cassandra.apache.org > *Subject:* results differ on two queries, based on secondary index key > and partition key > > > > Hi all, > > > > > > We are using a C* 2.2.8 cluster in our production system, composed of 5 > nodes in 1 DC with RF=3. Our clients mostly write with CL.ALL and read with > CL.ONE (both will be switched to quorum soon). > > > > We face several problems while trying to persist classical "follow > relationship". Did anyone of you have similar problems / or have any idea > on what could be wrong? > > > > 1) First our model. It is based on two tables : follower and following, > that should be identical. First one is for queries on getting followers of > a user, latter is for getting who a user is following. > > > > followings (uid bigint, ts timeuuid, fid bigint, PRIMARY KEY (uid, ts)) > WITH CLUSTERING ORDER BY (ts DESC); > > > > followers (uid bigint, ts timeuuid, fid bigint, PRIMARY KEY (uid, ts)) > WITH CLUSTERING ORDER BY (ts DESC); > > > > > > 2) Both tables have secondary indexes on fid columns. > > > > 3) Definitely, a new follow relationship should insert one row to each > table and delete should work on both too. > > > > > > > > *Problems :* > > > > 1) We have a serious discrepancy problems between tables. With "nodetool > cfstats" followings is 18mb, follower is 19mb in total. For demonstration > purposes of this problem, I got followers of the most-followed user from > both tables. > > > > A) select * from followers where uid=12345678 > > B) select * from followings where fid=12345678 > > > > using a small script on unix, i could find out this info on sets A and B: > > count( A < B ) = 1247 > > count( B < A ) = 185 > > count( A ∩ B ) = 20894 > > > > > > 2) Even more interesting than that is, if I query follower table on > secondary index, I don't get a row that I normally get with filtering just > on partition key. Let me try to visualize it : > > > > select uid,ts,fid from followers where fid=X (cannot find uid=12345678) > > A | BBB | X > > C | DDD | X > > E | FFF | X > > > > select uid,ts,fid from followers where uid=12345678 | grep X > > 12345678 | GGG | *X* > > > > > > *My thoughts :* > > > > 1) Currently, we don't use batches during inserts and deletes to both > tables. Would this help with our problems? > > > > 2) I was first suspicious of a corruption in secondary indexes. But > actually, through the use of secondary index, I get consistent results. > > > > 3) I also thought, there could be the case of zombie rows. However we > didn't have any long downtimes with our nodes. But, to our shame, we > haven't been running any scheduled repairs on the cluster. > > > > 4) Finally, do you think that there may be problem with our modelling? > > > > > > Thanks in advance. > > ------------------------------ > > The information in this Internet Email is confidential and may be legally > privileged. It is intended solely for the addressee. Access to this Email > by anyone else is unauthorized. If you are not the intended recipient, any > disclosure, copying, distribution or any action taken or omitted to be > taken in reliance on it, is prohibited and may be unlawful. When addressed > to our clients any opinions or advice contained in this Email are subject > to the terms and conditions expressed in any applicable governing The Home > Depot terms of business or client engagement letter. The Home Depot > disclaims all responsibility and liability for the accuracy and content of > this attachment and for any damages or losses arising from any > inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other > items of a destructive nature, which may be contained in this attachment > and shall not be liable for direct, indirect, consequential or special > damages in connection with this e-mail message or its attachment. > -- * Steve Robenalt Software Architect, HighWire Press, Inc. * www.highwire.org| Los Gatos, CA| Belfast, NI| Brighton, UK <https://www.linkedin.com/company-beta/1797/> <http://twitter.com/highwirepress> *HighWire Spring Publishers' Meeting, April 4-6. Registration now open! <omenji...@highwire.org?subject=Please%20register%20me%20for%20HighWire%20Spring%20Publishers'%20Meeting>* *Register here for HighWire webinars <https://t.xink.io/Tracking/Index/XaoBAMZ_AAB-KCEA0>*