Hello, What are your best practices for avoiding collisions when updating duplicate (derived) data? For example if I have tables like this:
users — (username, email, phone) — PK (username) users_by_email — (username, email, phone) — PK (email) users_by_phone — (username, email, phone) — PK ((phone), username) … and I want to change the user’s email address ... What is your strategy for ensuring that two writers don’t try to update the same records at the same time? Some sort of lock? 1) Do you have a global “locks” table in Cassandra, and use a LWT to lock it? insert into locks (what, locked_by) values ('users:maxc', ’server1_pid1234') if not exists; # Verify insert succeeded # Now that I have the lock, grab the latest select * from users where username=‘maxc’; begin batch # … apply changes ... apply batch; # release lock delete from locks where what=‘users:maxc’; 2) Do you add a “locked_by” column to the master (“users”) table, and then use a LWT to lock it? update users set locked_by=‘server1_pid1234’’ where username=“maxc” if locked_by = null; # Verify update succeeded # Now that I have the lock, grab the latest select * from users where username=‘maxc’; begin batch # … apply changes ... apply batch; # release lock update users set locked_by=null where username=‘maxc’; 3) Do you use something outside of Cassandra to manage the locks? Zoo keeper? ## Acquire external lock ## # Now that I have the lock, grab the latest select * from users where username=‘maxc’; begin batch # … apply changes ... apply batch; ## release external lock ## Or is there some other way to do this that I’m totally missing?? Materialized views in 3.x, I suppose. Other ideas? Thanks! - Max