Hi Gudmundur, each write and overwrite has a timestamp associated with it (you can see these timestamps using the WRITETIME function). This timestamp is provided by the Cassandra server if you don't explicitly supply it yourself (which, judging by your queries, you are not). When two writes touch the same column, Cassandra keeps the one with the newer timestamp, so if the timestamp of the overwrite is older than that of the original write, the overwrite will have no effect. My guess is that the clocks on your Cassandra nodes aren't synchronized, and this is causing your overwrite to get an older timestamp than the original write.
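
To check this, something along the following lines should work. It is only a sketch: it reuses the table and key values from your message, and the timestamp in the second statement is a made-up value in microseconds since the epoch that you would need to replace with something larger than what WRITETIME reports.

```
-- Show the stored value together with the timestamp of the write that produced it
-- (WRITETIME works on regular columns, not on primary key columns).
SELECT stored_value, WRITETIME(stored_value)
FROM mytable
WHERE key_a = 'the_value' AND key_b = '43052960';

-- Overwrite with an explicit timestamp instead of the server-assigned one.
-- 1426089600000000 is a hypothetical value; pick one newer than the WRITETIME above.
INSERT INTO mytable (key_a, key_b, created_timestamp, created_yyyymmdd, stored_value)
VALUES ('the_value', '43052960', dateof(now()), '20150312', 'test6')
USING TIMESTAMP 1426089600000000;
```

If the WRITETIME of the old row turns out to be in the future relative to the clock on the node handling your insert, that would be consistent with the clock skew explanation.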

Ciao, Duncan.

On 11/03/15 15:58, Guðmundur Örn Jóhannsson wrote:
I have a 3 node cluster of Cassandra version 2.0.9. My keyspace replication
factor is 3 and I'm querying with consistency level ALL.

Problem: When I do an insert to overwrite an old row, the old data still appears
in all select queries! This does not apply to all cases, just a subset of the
rows in this table. Some are a day old, some are 3 hours old. It happens both
using command-line cqlsh and via the DataStax Java CQL driver (in that case with
CL=2)

Of course this is a huge concern since the insert is accepted but has no effect,
meaning that the write is effectively lost!

Here is how it happens:

First I do a select to see the original data:

cqlsh> use mykeyspace;
cqlsh:mykeyspace> consistency all;
Consistency level set to ALL.
cqlsh:mykeyspace> select created_timestamp, stored_value from mytable where
key_a = 'the_value' and key_b = '43052960';

It returns the original row, as expected.


Then I do the insert to overwrite it:

cqlsh:mykeyspace> insert into mytable (key_a, key_b, created_timestamp,
created_yyyymmdd, stored_value)
values('the_value','43052960',dateof(now()),'20150312','test6');

And re-run the previous select query, expecting to see the overwritten values,
but it returns the old values for all columns.


If I use a totally different value for the value of key_b, then the problem does
not appear instantly, but it seems that in at least some cases, the issue
appears over time. Not sure.



My table looks like this:

CREATE TABLE mytable (
   key_a text,
   key_b text,
   value_a text,
   created_timestamp timestamp,
   created_yyyymmdd text,
   stored_value text,
   PRIMARY KEY ((key_a), key_b)
) WITH
   bloom_filter_fp_chance=0.010000 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.000000 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.100000 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};
CREATE INDEX mytable_idx_yyyymmdd ON mytable (created_yyyymmdd);

--
regards,
Gudmundur Johannsson
