Re: Is Cassandra really Strong consistency?

Jeff Jirsa Sun, 06 Sep 2015 09:58:55 -0700

In cases where NTP and client timestamps with microsecond resolution are 
insufficient, LWT (“IF EXISTS”, “IF NOT EXISTS”) is generally used.
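
As a rough illustration of why a conditional write sidesteps the timestamp 
problem, here is a toy in-memory model in Python. It is only a sketch: the 
class and method names (ToyTable, insert_if_not_exists, update_if_exists) are 
made up for this example, and a local lock stands in for the coordination that 
Cassandra's lightweight transactions perform across replicas.

```python
import threading


class ToyTable:
    """In-memory stand-in for a table. NOT how Cassandra implements LWT."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, key, value):
        """Apply the write only if the key is absent; return True if applied."""
        with self._lock:
            if key in self._rows:
                return False
            self._rows[key] = value
            return True

    def update_if_exists(self, key, value):
        """Apply the write only if the key is present; return True if applied."""
        with self._lock:
            if key not in self._rows:
                return False
            self._rows[key] = value
            return True

    def get(self, key):
        with self._lock:
            return self._rows.get(key)


table = ToyTable()
first = table.insert_if_not_exists("k1", "v1")   # applied
second = table.insert_if_not_exists("k1", "v2")  # rejected: k1 already exists
```

The outcome depends on the order in which the conditions are evaluated, not on 
any writer's clock, which is the property the conditional forms are used for.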

From:  ibrahim El-sanosi
Reply-To:  "user@cassandra.apache.org"
Date:  Sunday, September 6, 2015 at 7:40 AM
To:  "user@cassandra.apache.org"
Subject:  Re: Is Cassandra really Strong consistency?

I have done some research about “timestamps could jump back and forth 
arbitrarily if you talk to different nodes”.

To summarise, the following sequence of events is possible in Cassandra:

1. Process A writes w1 with timestamp t=2
2. Process B reads w1
3. Process B writes w2 with timestamp t=1
4. Process B reads w1, but expected w2

If the system clock goes backwards for any reason, Cassandra’s session 
consistency guarantees no longer hold, even when the consistency level is 
write/read CL = QUORUM, or write CL = ALL and read CL = ONE.

Moreover, even if we use NTP, the problem above can still occur. The 
timestamps for writes are derived either from a single Cassandra server clock 
or from a single app server clock. These clocks can flow backwards for a number 
of reasons:

- Hardware wonkiness can push clocks days or centuries into the future or past.
- Virtualization can wreak havoc on kernel timekeeping.
- Misconfigured nodes may not have NTP enabled, or may not be able to reach 
  upstream sources.
- Upstream NTP servers can lie.
- When the problem is identified and fixed, NTP corrects large time 
  differentials by jumping the clock discontinuously to the correct time.
- Even when perfectly synchronized, POSIX time itself is not monotonic.
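
One common mitigation for a clock that steps backwards inside a single writer 
is to make that writer's timestamps monotonic. The Python sketch below assumes 
a hypothetical helper class, MonotonicMicros, written for this example; it is 
not taken from Cassandra or any driver. It only protects one process from its 
own clock and does nothing about skew between different clients or nodes.

```python
import threading
import time


class MonotonicMicros:
    """Hypothetical helper: microsecond timestamps that never repeat or
    go backwards within one process, even if the wall clock does."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so a fake clock can be used
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            now = int(self._clock() * 1_000_000)
            # If the wall clock stalled or stepped back, bump past the
            # last value handed out instead of reusing an older time.
            self._last = max(now, self._last + 1)
            return self._last


# A fake clock (an assumption for the demo) that jumps back by 2 seconds
# on its second reading.
readings = iter([100.0, 98.0, 98.5])
gen = MonotonicMicros(clock=lambda: next(readings))
stamps = [gen.next() for _ in range(3)]
```

While the wall clock is behind, the generator hands out values just above the 
last one it issued, so the writes of this one process keep their order.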

If you want to read more, this link can give you a lot of hints.

Regards,

Ibrahim

On Sun, Sep 6, 2015 at 2:01 PM, Edouard COLE <edouard.c...@rgsystem.com> wrote:
@ibrahim: When saying "clocks should be synchronized", it includes Cassandra 
nodes AND clients

NTP is the way to go

Le 6 sept. 2015 à 14:56, Laing, Michael <michael.la...@nytimes.com> a écrit :

https://en.wikipedia.org/wiki/Network_Time_Protocol

On Sun, Sep 6, 2015 at 8:23 AM, ibrahim El-sanosi <ibrahimsaba...@gmail.com> 
wrote:
Assume the Cassandra cluster is located somewhere in the US. Clients that 
connect from different parts of the world will have different timestamps (if we 
rely on client timestamps to store writes). Alternatively, if a coordinator is 
responsible for generating the timestamp during the write, its clock may also 
differ from those of the replicas, so write conflicts can occur that are 
impossible to resolve.

When you say “Clocks should be synchronized”, does Cassandra synchronize the 
clocks itself? If so, how? Can you refer me to any related article?

Regards,

Ibrahim

On Sun, Sep 6, 2015 at 1:23 PM, Daniel Schulz <danielschulz2...@hotmail.com> 
wrote:
Cassandra does not change clock settings; it does use the clock to omit TTL'ed 
rows in compaction phases. So make sure your nodes agree on the very same time, 
using e.g. NTP. This is crucial for data integrity in most distributed systems.

Date: Sun, 6 Sep 2015 13:10:14 +0100
Subject: Re: Is Cassandra really Strong consistency?
From: ibrahimsaba...@gmail.com
To: user@cassandra.apache.org

Do you mean Cassandra synchronizes the clocks across the whole cluster? If yes, 
how does it do so, or could you refer me to any related article?

Thank you

Ibrahim

On Sun, Sep 6, 2015 at 1:00 PM, Laing, Michael <michael.la...@nytimes.com> 
wrote:
I think I saw this before. 

Clocks must be synchronized.

On Sun, Sep 6, 2015 at 7:28 AM, ibrahim El-sanosi <ibrahimsaba...@gmail.com> 
wrote:
Hi folks,

Assume we have a 4-node cluster (N1, N2, N3, and N4) with a replication factor 
of 3, write CL = ALL, and read CL = ONE:

Client c1 sends W1 = [K1, V1] to N1 (a coordinator). The coordinator (N1) 
generates the timestamp Mon 05-09-2015 11:30:40,200 (according to its local 
clock), assigns it to W1, and sends W1 to N2, N3, and N4. A few seconds later, 
client c2 sends W2 = [K1, V2] to N4 (a coordinator). The coordinator (N4) 
generates the timestamp Mon 05-09-2015 11:30:38,200 (according to its local 
clock; assume here that N4's clock is a bit behind, by nearly 2 seconds), 
assigns it to W2, and sends W2 to N2, N3, and N4 (itself). 

Since write CL = ALL and read CL = ONE, client c2 now wants to read K1 and 
connects to coordinator N1. The coordinator sends the read for K1 to N2, which 
returns the value with the latest timestamp, [K1, V1]: Mon 05-09-2015 
11:30:40,200.

So in this scenario, the latest data written to the replicas is [K1, V2], 
which should be the correct value, but the read returns [K1, V1] because of the 
diverging clocks. 
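
The scenario can be sketched as a toy last-write-wins model in Python. This is 
an illustration under stated assumptions, not Cassandra code: the Replica class 
and write_all helper are invented for the example, each replica simply keeps 
the value with the highest timestamp, and tie-breaking between equal 
timestamps is not modelled.

```python
from datetime import datetime


class Replica:
    """Toy replica: per key, keep the value with the highest timestamp."""

    def __init__(self):
        self.cells = {}

    def apply(self, key, value, ts):
        current = self.cells.get(key)
        if current is None or ts > current[1]:
            self.cells[key] = (value, ts)

    def read(self, key):
        return self.cells.get(key)


replicas = {name: Replica() for name in ("N2", "N3", "N4")}


def write_all(key, value, ts):
    """Write CL = ALL: every replica receives the write."""
    for replica in replicas.values():
        replica.apply(key, value, ts)


# Timestamps assigned by the two coordinators from their local clocks.
t_w1 = datetime(2015, 9, 5, 11, 30, 40, 200000)  # N1's clock
t_w2 = datetime(2015, 9, 5, 11, 30, 38, 200000)  # N4's clock, running behind

write_all("K1", "V1", t_w1)
write_all("K1", "V2", t_w2)  # arrives later, but carries a smaller timestamp

# Read CL = ONE served by N2.
value, ts = replicas["N2"].read("K1")
```

In this model every replica ends up holding V1, so the read returns V1 no 
matter which replica serves it; W2 is silently discarded on arrival.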

Can such a scenario occur?

Thank you
