On Oct 22, 2010, at 7:41 AM, Jérôme Verstrynge wrote:
> Let's imagine that A initiates its column write at: 334450 ms with 'AAA' and
> timestamp 334450 ms
> Let's imagine that E initiates its column write at: 334451 ms with 'ZZZ'and
> timestamp 334450 ms
> (E is the latest write)
>
> Let's imagine that A reaches C at 334455 ms and performs its write.
> Let's imagine that E reaches C at 334456 ms and attempts to perform its
> write. It will lose the timestamp-tie ('AAA' is greater than 'ZZZ').
How is this any different from E's perspective than if A had come along a
moment later with timestamp 334452?
What you describe is an application in *desperate* need of either a serious
redesign, or a distributed locking mechanism.
This really isn't a Cassandra-specific problem; Cassandra just happens to be
the distributed storage system at issue. Any such system without a locking
mechanism will present some form of this problem, and the answer will be the
same: Avoid it in the application design, or incorporate a locking mechanism
into the application.
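For reference, the tie-break behavior under discussion can be modeled roughly
like this (a minimal sketch, not Cassandra's actual code; `Column` and
`reconcile` are illustrative names, and the model assumes that on a timestamp
tie the lexically greater value wins):

```python
# Minimal model of last-write-wins reconciliation with a value
# tie-break. Illustrative only; not Cassandra's actual API.
from collections import namedtuple

Column = namedtuple("Column", ["value", "timestamp"])

def reconcile(existing, incoming):
    """Higher timestamp wins; on a timestamp tie, the
    lexically greater value wins."""
    if incoming.timestamp != existing.timestamp:
        return incoming if incoming.timestamp > existing.timestamp else existing
    return incoming if incoming.value > existing.value else existing

# A's write lands first, then E's write with the same timestamp:
a = Column("AAA", 334450)
e = Column("ZZZ", 334450)
winner = reconcile(a, e)  # same timestamp, so the greater value wins
```

Which writer "loses" is fully determined by the inputs; the problem is that
neither writer controls what the other's input will be.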
> If there is a timestamp-tie, then the context becomes uncertain for E, out of
> the blue.
> If application E can't be sure about what has been saved in Cassandra, it
> cannot rely on what it has in memory. It is a vicious circle. It cannot
> anticipate the potential actions of A on the column either.
And how is this different from E's data being overwritten with a later
timestamp? Either way, what E thinks is in Cassandra really isn't.
If you need to make sure you have consistency at this level, you *need* a
locking mechanism.
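To sketch the point: once the read-modify-write is serialized by a lock, the
outcome is deterministic regardless of scheduling. (This uses a process-local
`threading.Lock` purely for illustration; a real multi-node deployment would
need a *distributed* lock, e.g. via ZooKeeper.)

```python
# Sketch: a lock makes concurrent read-modify-write deterministic.
# Process-local illustration only; distributed writers need a
# distributed lock service.
import threading

store = {"col": 0}
lock = threading.Lock()

def locked_increment(n):
    for _ in range(n):
        with lock:  # the read-modify-write happens as one unit
            store["col"] = store["col"] + 1

threads = [threading.Thread(target=locked_increment, args=(10000,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock held, the final value is always 20000.
```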
> This is unusual for any application, but maybe this is the price to pay for
> using Cassandra. Fair enough.
Hardly. Any non-serial application that doesn't use some form of locking has
this exact same problem at all levels of storage, possibly even in its internal
variables.
>
> If E is not informed of the timestamp tie, then it is left alone in the dark.
> Hence, this is why I say Cassandra is not deterministic to E. The result of a
> write is potentially non-deterministic in what it actually performs.
Cassandra is deterministic for a given input. What you're saying is you aren't
properly controlling the input that your application is giving it.
> If E was aware that it lost a timestamp-tie, it would know that there is a
> possible gap between its internal memory representation and what it tried to
> save into Cassandra. That is, EVEN if there is no further write on that same
> column (or, in other words, regardless of any potential subsequent races).
What is the significance of this?
>
> If E was informed it lost a timestamp-tie, it could re-read the column (and
> let's assume that there is no further write in between, but this does not
> change anything to the argument). It could spot that its write for timestamp
> value 334450 ms failed, and also the reason why ('AAA' greater than 'ZZZ'). It
> could operate a new write, which eventually could result in another
> timestamp-tie, but at least it would be informed about it too... It would
> have a safety net.
To what end? A and E would apparently get into some sort of never-ending fight.
The application as described is broken and needs to be fixed.
>
> The case I am trying to cover is the case where the context for application E
> becomes invalid because of a successful write call to Cassandra without
> registration of 'ZZZ'. How can Cassandra call it a successful write, when in
> fact, it isn't for application E? I believe Cassandra should notify
> application E one way or another. This is why I mentioned an extra
> timestamp-tie flag in the write ACK sent by nodes back to node E.
Here's part of the problem. You're seeing E as a distinct application from A
which can behave completely independently. You need to stop thinking like that;
it leads to broken architectures.
Even if the E and A processes come from entirely different code bases, you need
to start by thinking of them as one application. That application is broken.
>
> The subsequent question I have is:
>
> If 'value breaks timestamp-tie', how does Cassandra behave in case of
> updates? If there is a column with value 'AAA' at 334450 ms and an
> application explicitly wants to update this value to 'ZZZ' for 334450 ms, it
> seems like the timestamp-tie will prevent that. Hence, the update/mutation
> would be nondeterministic to E. It seems like one should first delete the
> existing record and write a new one (and that could lead to race conditions
> and timestamp-ties too).
You need a locking mechanism. Timestamps aren't the droids you're looking for.
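To make the update question concrete: under the assumption that a timestamp tie
resolves toward the greater value, re-writing a column at the *same* timestamp
only takes effect when the new value compares higher, so an update to a smaller
value is silently dropped. Bumping the timestamp sidesteps the tie entirely.
(An illustrative model, not Cassandra's API.)

```python
# Sketch of same-timestamp updates under greater-value-wins
# tie-breaking. Illustrative model only.
def write(store, key, value, ts):
    cur = store.get(key)
    if cur is None or ts > cur[1] or (ts == cur[1] and value > cur[0]):
        store[key] = (value, ts)

store = {}
write(store, "col", "ZZZ", 334450)
write(store, "col", "AAA", 334450)  # same timestamp, smaller value: dropped
dropped = store["col"]              # still ("ZZZ", 334450)
write(store, "col", "AAA", 334451)  # higher timestamp: update takes effect
updated = store["col"]              # ("AAA", 334451)
```

Which is exactly why monotonically increasing timestamps, or an external lock,
are the tools for in-place updates, rather than relying on tie-break behavior.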
> I think this should be documented, because engineers will hit that 'local'
> nondeterministic issue for sure if two instances of their applications perform
> 'completed writes' in the same column family. Completed does not mean
> successful, even with quorum (or ALL). They ought to know it.
I'm honestly not sure why they wouldn't. One need only perform a very cursory
investigation of Cassandra to realize that addition of a locking mechanism is
necessary for many applications, such as the one described here.
-NK