[1] i'm not particularly worried about transient conditions so that's ok. i think there's still the possibility of a non-transient false positive...if 2 writes were to happen at exactly the same time (highly unlikely), eg
1) A reads previous location (L1) from index entries
2) B reads previous location (L1) from index entries
3) A deletes previous location (L1) from index entries
4) B deletes previous location (L1) from index entries
5) A deletes previous location (L1) from index
6) B deletes previous location (L1) from index
7) A enters new location (L2) into index entries
8) B enters new location (L3) into index entries
9) A enters new location (L2) into index
10) B enters new location (L3) into index
11) A sets new location (L2) on users
12) B sets new location (L3) on users
after this, don't i end up with an incorrect L2 location in index 
entries and in the index, that won't be resolved until the next write of 
location for that user?
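the interleaving above can be played out in a toy in-memory model (plain Python dicts standing in for the two index CFs and the Users row; the names and timestamp strings are mine, not from the slides) to confirm the stale entry:

```python
# Toy model of Users_Index_Entries, the Users_By_Location index row, and Users.
index_entries = {("location", "tsA0"): "L1"}
index = {("L1", "user", "tsA0"): None}
users = {"location": "L1"}

# steps 1-2: A and B both read the same previous location
prev_a = prev_b = [k for k in index_entries if index_entries[k] == "L1"]

# steps 3-6: both delete the previous entries (the second delete is a no-op)
for k in prev_a + prev_b:
    index_entries.pop(k, None)
index.pop(("L1", "user", "tsA0"), None)

# steps 7-10: A inserts L2, B inserts L3, each under its own timestamp
index_entries[("location", "tsA1")] = "L2"
index_entries[("location", "tsB1")] = "L3"
index[("L2", "user", "tsA1")] = None
index[("L3", "user", "tsB1")] = None

# steps 11-12: last write to Users wins
users["location"] = "L2"
users["location"] = "L3"

# L2 is now stale: present in both index CFs but not in Users,
# and nothing will remove it until the user's location is written again.
stale = [v for v in index_entries.values() if v != users["location"]]
print(stale)  # ['L2']
```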
[2] ah i see...so the client would continuously retry until the update 
works.  that's fine provided the client doesn't bomb out with some other 
error; if that were to happen, i'd have potentially deleted the index 
entry columns without deleting the corresponding index columns.
i can handle both of the above for my use case, i just want to clarify 
whether they are possible (however unlikely) scenarios.
On 13/11/2011 02:41, Ed Anuff wrote:
1) The index updates should be eventually consistent.  This does mean
that you can get a transient false-positive on your search results.
If this doesn't work for you, then you either need to use ZK or some
other locking solution or do "read repair" by making sure that the row
you retrieve contains the value you're searching for before passing it
on to the rest of your application.
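the "read repair" check Ed describes can be sketched like this (a toy model, not real Cassandra; the function and dict names are mine): after pulling candidate keys out of the index row, only return a user if the row you actually retrieve agrees with the value you searched for.

```python
def search_by_location(indexes, users, wanted):
    """Return users whose *actual* location matches, dropping false positives.

    `indexes` maps location -> set of user keys (a stand-in for the
    Users_By_Location row); `users` maps user key -> row dict.
    """
    results = []
    for user_key in indexes.get(wanted, set()):
        row = users.get(user_key, {})
        # Read-repair check: only trust the index entry if the row agrees.
        if row.get("location") == wanted:
            results.append(user_key)
    return results

# A stale index entry claims alice is in "L2", but her row says "L3".
indexes = {"L2": {"alice"}, "L3": {"alice", "bob"}}
users = {"alice": {"location": "L3"}, "bob": {"location": "L3"}}
print(search_by_location(indexes, users, "L2"))          # []
print(sorted(search_by_location(indexes, users, "L3")))  # ['alice', 'bob']
```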

2)  You should be able to reapply the batch updates until they succeed.
The update is idempotent.  One thing that's important that the slides
don't make clear is that this requires using time-based uuids as your
timestamp components.  Take a look at the sample code.
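the idempotency point can be sketched in a toy model (Python dicts, not the actual sample code; names are mine): because the time-based uuid is generated once, before the first attempt, and reused verbatim on every retry, reapplying the batch writes exactly the same columns and deletes exactly the same columns, so the end state is the same no matter how many times it runs.

```python
import uuid

def apply_index_batch(index_entries, index, users, user_key,
                      old_cols, new_value, ts):
    """Reapplying with the same `ts` always produces the same columns,
    so a failed batch can simply be retried."""
    for (name, old_ts), old_value in old_cols:
        index_entries.pop((name, old_ts), None)
        index.pop((old_value, user_key, old_ts), None)
    index_entries[("location", ts)] = new_value
    index[(new_value, user_key, ts)] = None
    users[user_key] = {"location": new_value}

index_entries, index, users = {}, {}, {}
ts = uuid.uuid1()  # time-based uuid, generated ONCE before the first attempt
old = []           # no previous entries in this example
apply_index_batch(index_entries, index, users, "alice", old, "L2", ts)
snapshot = (dict(index_entries), dict(index), dict(users))
# Simulate a retry after a partial failure: same arguments, same ts.
apply_index_batch(index_entries, index, users, "alice", old, "L2", ts)
assert (index_entries, index, users) == snapshot  # idempotent
```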

Hope this helps,

Ed

On Sat, Nov 12, 2011 at 3:59 PM, Guy Incognito <dnd1...@gmail.com> wrote:
help?

On 10/11/2011 19:34, Guy Incognito wrote:
hi,

i've been looking at the model below from Ed Anuff's presentation at
Cassandra CF (http://www.slideshare.net/edanuff/indexing-in-cassandra).
Couple of questions:

1) Isn't there still the chance that two concurrent updates may end up
with the index containing two entries for the given user, only one of which
would match the actual value in the Users cf?

2) What happens if your batch fails partway through the update?  If i
understand correctly there are no guarantees about ordering when a batch is
executed, so isn't it possible that eg the previous
value entries in Users_Index_Entries may have been deleted, and then the
batch fails before the entries in Indexes are deleted, ie the mechanism has
'lost' those values?  I assume this can be addressed
by not deleting the old entries until the batch has succeeded (ie put the
previous entry deletion into a separate, subsequent batch).  this at least
lets you retry at a later time.
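that workaround can be sketched with the same toy dict model (names and timestamp strings are mine): insert the new entries in one batch, and delete the old ones only in a separate, subsequent batch. a crash between the two then leaves a harmless stale entry (a retryable false positive) rather than a 'lost' index entry.

```python
def batch_insert_new(index_entries, index, user_key, new_value, ts):
    # Batch 1: add the new entries first.
    index_entries[("location", ts)] = new_value
    index[(new_value, user_key, ts)] = None

def batch_delete_old(index_entries, index, user_key, old_cols):
    # Batch 2: only after batch 1 succeeds, remove the previous entries.
    for (name, old_ts), old_value in old_cols:
        index_entries.pop((name, old_ts), None)
        index.pop((old_value, user_key, old_ts), None)

index_entries = {("location", "ts1"): "L1"}
index = {("L1", "alice", "ts1"): None}

batch_insert_new(index_entries, index, "alice", "L2", "ts2")
# Suppose the client crashes HERE: the index now holds both L1 and L2 --
# a false positive you can clean up later, but nothing has been lost.
assert ("L2", "alice", "ts2") in index and ("L1", "alice", "ts1") in index

batch_delete_old(index_entries, index, "alice",
                 [(("location", "ts1"), "L1")])
print(sorted(index))  # [('L2', 'alice', 'ts2')]
```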

perhaps i'm missing something?

SELECT {"location"}..{"location", *}
FROM Users_Index_Entries WHERE KEY = <user_key>;

BEGIN BATCH

DELETE {"location", ts1}, {"location", ts2}, ...
FROM Users_Index_Entries WHERE KEY = <user_key>;

DELETE {<value1>, <user_key>, ts1}, {<value2>, <user_key>, ts2}, ...
FROM Indexes WHERE KEY = "Users_By_Location";

UPDATE Users_Index_Entries SET {"location", ts3} = <value3>
WHERE KEY = <user_key>;

UPDATE Indexes SET {<value3>, <user_key>, ts3} = null
WHERE KEY = "Users_By_Location";

UPDATE Users SET location = <value3>
WHERE KEY = <user_key>;

APPLY BATCH
