Hi Stefan,

I was hoping we could avoid the cost of a serial read (which I assume is much more expensive than a regular read because of the Paxos round it requires). I actually do a serial read at line #9 (i.e., when we lose the LWT and have to read the winning value), and that still fails to ensure uniqueness. Under what circumstances would we be reading inconsistent results? Is there a case where we end up reading a value that actually ends up never being written?
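
To be concrete, by a serial read I mean a regular SELECT issued at consistency level SERIAL, roughly like the following with the DataStax Java driver 2.x (a minimal sketch, not my exact code; keyspace, contact point and the hash value are placeholders):

    import java.nio.ByteBuffer;
    import com.datastax.driver.core.*;

    public class SerialReadExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            try {
                Session session = cluster.connect("my_keyspace");
                ByteBuffer computedHash = ByteBuffer.wrap(new byte[16]); // placeholder 128-bit hash

                // A "serial read" is a normal SELECT whose consistency level is SERIAL,
                // so the read goes through a Paxos round and takes any in-progress LWT
                // for this partition into account before returning.
                Statement select = session.prepare("SELECT id FROM hash_id WHERE hash = ?")
                        .bind(computedHash)
                        .setConsistencyLevel(ConsistencyLevel.SERIAL);

                Row row = session.execute(select).one();
                System.out.println(row == null ? "no id yet" : "id = " + row.getLong("id"));
            } finally {
                cluster.close();
            }
        }
    }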

Thanks!

--
Mahdi.

On 2/9/18 12:52 PM, Stefan Podkowinski wrote:

I'd not recommend using any consistency level but serial for reading tables updated by LWT operations. Otherwise you might end up reading inconsistent results.


On 09.02.18 08:06, Mahdi Ben Hamida wrote:

Hello,

I'm running a Cassandra 2.0.17 cluster (I know, I know, need to upgrade) with 46 nodes across 3 racks (RF=3). I'm seeing that, under high contention, LWT may not actually guarantee uniqueness. Out of a total of 16 million LWT transactions (with peak LWT concurrency around 5k/sec), I found 38 conflicts that should have been impossible. I was wondering if there are any known issues that make LWT broken in this old version of Cassandra.

I use LWT to guarantee that a 128-bit number (hash) maps to a unique 64-bit number (id). There can be a large number of threads trying to allocate an id for a given hash.

I do the following logic (slightly more complicated than this in practice due to timeout handling); a rough driver-level sketch follows the listing:

 1  existing_id = SELECT id FROM hash_id WHERE hash=computed_hash | consistency = ONE
 2  if existing_id != null:
 3    return existing_id
 4  new_id = generateUniqueId()
 5  result = INSERT INTO hash_id (hash, id) VALUES (computed_hash, new_id) IF NOT EXISTS | consistency = QUORUM, serialConsistency = SERIAL
 6  if result == [applied]  // i.e., we won the LWT
 7    return new_id
 8  else  // we lost the LWT, fetch the winning value
 9    existing_id = SELECT id FROM hash_id WHERE hash=computed_hash | consistency = ONE
10    return existing_id
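
Concretely, with the DataStax Java driver 2.x the logic might look roughly like this (a minimal sketch, not our actual code; it assumes a hash_id table defined as hash blob PRIMARY KEY, id bigint, and it leaves out the timeout handling mentioned above):

    import java.nio.ByteBuffer;
    import com.datastax.driver.core.*;

    public class HashIdAllocator {

        // Assumed schema (illustrative): CREATE TABLE hash_id (hash blob PRIMARY KEY, id bigint);
        private final Session session;
        private final PreparedStatement selectId;
        private final PreparedStatement insertIfNotExists;

        public HashIdAllocator(Session session) {
            this.session = session;
            this.selectId = session.prepare("SELECT id FROM hash_id WHERE hash = ?");
            this.insertIfNotExists = session.prepare(
                    "INSERT INTO hash_id (hash, id) VALUES (?, ?) IF NOT EXISTS");
        }

        public long allocate(ByteBuffer computedHash) {
            // Lines 1-3: plain read first; return the id if one was already allocated.
            Row existing = session.execute(
                    selectId.bind(computedHash)
                            .setConsistencyLevel(ConsistencyLevel.ONE)).one();
            if (existing != null) {
                return existing.getLong("id");
            }

            // Lines 4-5: try to claim the hash with an LWT insert.
            long newId = generateUniqueId();
            Row result = session.execute(
                    insertIfNotExists.bind(computedHash, newId)
                            .setConsistencyLevel(ConsistencyLevel.QUORUM)
                            .setSerialConsistencyLevel(ConsistencyLevel.SERIAL)).one();

            // Lines 6-7: the "[applied]" column tells us whether we won the LWT race.
            if (result.getBool("[applied]")) {
                return newId;
            }

            // Lines 8-10: we lost the race; read back the winning id.
            Row winner = session.execute(
                    selectId.bind(computedHash)
                            .setConsistencyLevel(ConsistencyLevel.ONE)).one();
            return winner.getLong("id");
        }

        private long generateUniqueId() {
            // Placeholder for the real id generator.
            return java.util.concurrent.ThreadLocalRandom.current().nextLong();
        }

        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            try {
                Session session = cluster.connect("my_keyspace");
                long id = new HashIdAllocator(session).allocate(ByteBuffer.wrap(new byte[16]));
                System.out.println("id = " + id);
            } finally {
                cluster.close();
            }
        }
    }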

Is there anything flawed about this?
I do the reads at lines #1 and #9 at consistency ONE. Would that cause uncommitted changes to be seen (i.e., dirty reads)? Should they use SERIAL consistency instead? My understanding is that only one transaction will be able to apply the write (at QUORUM), so a read at consistency ONE will either return null or return the id that won the LWT race.

Any help is appreciated. I've been banging my head on this issue (thinking it was a bug in the code) for some time now.

--
Mahdi.

