Hi my two cents - before doing anything else, make sure clocks are synchronized to the millisecond. ntp will do so.
Nick. On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com>wrote: > Hi all, > > We have encountered the following problem with cassandra. > > * We use *cassandra v2.0.0* from *Datastax* community repo. > > * We have *3 nodes* in a cluster, all of them are seed providers. > > * We have a *single keyspace* with *replication factor = 3*: > > *CREATE KEYSPACE bof WITH replication = {* > * 'class': 'SimpleStrategy',* > * 'replication_factor': '3'* > *};* > > * We use *Datastax Java CQL Driver v1.0.3* in our application. > > * We have not modified any *consistency settings* in our app, so I assume > we have the *default QUORUM* (2 out of 3 in our case) consistency *for > reads and writes*. > > * We have 400+ tables which can be divided in two groups (*main* and *uids > *). All tables in a group have the same definition, they vary only by > name. The sample definitions are: > > *CREATE TABLE bookingfile (* > * key text,* > * entity_created timestamp,* > * entity_createdby text,* > * entity_entitytype text,* > * entity_modified timestamp,* > * entity_modifiedby text,* > * entity_status text,* > * entity_uid text,* > * entity_updatepolicy text,* > * version_created timestamp,* > * version_createdby text,* > * version_data blob,* > * version_dataformat text,* > * version_datasource text,* > * version_modified timestamp,* > * version_modifiedby text,* > * version_uid text,* > * version_versionnotes text,* > * version_versionnumber int,* > * versionscount int,* > * PRIMARY KEY (key)* > *) WITH* > * bloom_filter_fp_chance=0.010000 AND* > * caching='KEYS_ONLY' AND* > * comment='' AND* > * dclocal_read_repair_chance=0.000000 AND* > * gc_grace_seconds=864000 AND* > * index_interval=128 AND* > * read_repair_chance=0.100000 AND* > * replicate_on_write='true' AND* > * populate_io_cache_on_flush='false' AND* > * default_time_to_live=0 AND* > * speculative_retry='NONE' AND* > * memtable_flush_period_in_ms=0 AND* > * compaction={'class': 'SizeTieredCompactionStrategy'} AND* > * compression={'sstable_compression': 'LZ4Compressor'};* > > *CREATE TABLE bookingfile_uids (* > * date text,* > * timeanduid text,* > * deleted boolean,* > * PRIMARY KEY (date, timeanduid)* > *) WITH* > * bloom_filter_fp_chance=0.010000 AND* > * caching='KEYS_ONLY' AND* > * comment='' AND* > * dclocal_read_repair_chance=0.000000 AND* > * gc_grace_seconds=864000 AND* > * index_interval=128 AND* > * read_repair_chance=0.100000 AND* > * replicate_on_write='true' AND* > * populate_io_cache_on_flush='false' AND* > * default_time_to_live=0 AND* > * speculative_retry='NONE' AND* > * memtable_flush_period_in_ms=0 AND* > * compaction={'class': 'SizeTieredCompactionStrategy'} AND* > * compression={'sstable_compression': 'LZ4Compressor'};* > * > * > *CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);* > > * We don't have any problems with the tables from the *main* group. > > * As for the tables from the *uids* group we have noticed that sometimes > deletes from these tables do not do their job. They don't fail, they just > do nothing. We have confirmed this by adding a select query after deletes. > Most times everything is ok and select returns 0 records. But sometimes (~5 > out of 100,000) it returns the supposedly deleted row. > > * We have logged the ExecutionInfo objects with query tracing that are > returned by Datastax's driver. Here are the details > > *DELETE FROM bookingfile_uids WHERE date=C20131006 AND > timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e* > > *ExecutionInfo: [* > *triedHosts=/10.10.30.23;* > *queriedHost=/10.10.30.23;* > *achievedConsistencyLevel=null;* > *queryTrace=* > * Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun > Oct 06 19:55:57 MSK 2013* > * Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at Sun > Oct 06 19:55:57 MSK 2013* > * Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06 > 19:55:57 MSK 2013* > * Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] > at Sun Oct 06 19:55:57 MSK 2013* > * Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] at > Sun Oct 06 19:55:57 MSK 2013* > * Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at > Sun Oct 06 19:55:57 MSK 2013* > * Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun > Oct 06 19:55:57 MSK 2013* > * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun > Oct 06 19:55:57 MSK 2013* > * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 > 19:55:57 MSK 2013* > * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] > at Sun Oct 06 19:55:57 MSK 2013* > * Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] at > Sun Oct 06 19:55:57 MSK 2013* > * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at > Sun Oct 06 19:55:57 MSK 2013* > * Determining replicas for mutation on > /10.10.30.23[Native-Transport-Requests:1387368] > at Sun Oct 06 19:55:57 MSK 2013* > * Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at > Sun Oct 06 19:55:57 MSK 2013* > * Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at Sun > Oct 06 19:55:57 MSK 2013* > * Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at > Sun Oct 06 19:55:57 MSK 2013* > * Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun Oct > 06 19:55:57 MSK 2013* > * Processing response from /10.10.30.20 on > /10.10.30.23[RequestResponseStage:4] > at Sun Oct 06 19:55:57 MSK 2013* > * Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun Oct > 06 19:55:57 MSK 2013* > * Processing response from /10.10.30.19 on > /10.10.30.23[RequestResponseStage:4] > at Sun Oct 06 19:55:57 MSK 2013;* > *]* > > *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND > timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 record > * > > the same query 1 second later: > > *DELETE FROM bookingfile_uids WHERE date=C20131006 AND > timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e* > * > * > *ExecutionInfo: [* > *triedHosts=/10.10.30.20;* > *queriedHost=/10.10.30.20;* > *achievedConsistencyLevel=null;* > *queryTrace=* > * Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun > Oct 06 19:55:58 MSK 2013* > * Determining replicas for mutation on > /10.10.30.20[Native-Transport-Requests:1387705] > at Sun Oct 06 19:55:58 MSK 2013* > * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun > Oct 06 19:55:58 MSK 2013* > * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 > 19:55:58 MSK 2013* > * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] > at Sun Oct 06 19:55:58 MSK 2013* > * Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at > Sun Oct 06 19:55:58 MSK 2013* > * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at > Sun Oct 06 19:55:58 MSK 2013* > * Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun Oct > 06 19:55:58 MSK 2013* > * Processing response from /10.10.30.19 on > /10.10.30.20[RequestResponseStage:6] > at Sun Oct 06 19:55:58 MSK 2013* > * Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun > Oct 06 19:55:58 MSK 2013* > *]* > * > * > *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND > timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0 > records.* > > * Cassandra's system.log on all 3 nodes lists nothing special during these > queries - just some compaction related *INFO* entries. > > Can anyone help with this? What is our next step? > > Thanks in advance, > Alexander >