Hi Nick,

Thanks for the note! We have our cassanra instances installed on virtual
hosts in VMWare and the clock synchronization is handled by the latter, so
I can't use ntpdate (says that NTP socket is in use). Is there any way to
check if the clocks are really synchronized? My best attempt was using
three shell windows with commands already typed thus requiring only
clicking on the window and hitting enter. The results varied by 100-200
msec which I guess is just about the time I need to click and press enter :)

Thanks in advance,
Alexander


2013/10/7 Nikolay Mihaylov <n...@nmmm.nu>

> Hi
>
> my two cents - before doing anything else, make sure clocks are
> synchronized to the millisecond.
> ntp will do so.
>
> Nick.
>
>
> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com>wrote:
>
>> Hi all,
>>
>> We have encountered the following problem with cassandra.
>>
>> * We use *cassandra v2.0.0* from *Datastax* community repo.
>>
>> * We have *3 nodes* in a cluster, all of them are seed providers.
>>
>> * We have a *single keyspace* with *replication factor = 3*:
>>
>> *CREATE KEYSPACE bof WITH replication = {*
>> *  'class': 'SimpleStrategy',*
>> *  'replication_factor': '3'*
>> *};*
>>
>> * We use *Datastax Java CQL Driver v1.0.3* in our application.
>>
>> * We have not modified any *consistency settings* in our app, so I
>> assume we have the *default QUORUM* (2 out of 3 in our case) consistency
>> *for reads and writes*.
>>
>> * We have 400+ tables which can be divided in two groups (*main* and *
>> uids*). All tables in a group have the same definition, they vary only
>> by name. The sample definitions are:
>>
>> *CREATE TABLE bookingfile (*
>> *  key text,*
>> *  entity_created timestamp,*
>> *  entity_createdby text,*
>> *  entity_entitytype text,*
>> *  entity_modified timestamp,*
>> *  entity_modifiedby text,*
>> *  entity_status text,*
>> *  entity_uid text,*
>> *  entity_updatepolicy text,*
>> *  version_created timestamp,*
>> *  version_createdby text,*
>> *  version_data blob,*
>> *  version_dataformat text,*
>> *  version_datasource text,*
>> *  version_modified timestamp,*
>> *  version_modifiedby text,*
>> *  version_uid text,*
>> *  version_versionnotes text,*
>> *  version_versionnumber int,*
>> *  versionscount int,*
>> *  PRIMARY KEY (key)*
>> *) WITH*
>> *  bloom_filter_fp_chance=0.010000 AND*
>> *  caching='KEYS_ONLY' AND*
>> *  comment='' AND*
>> *  dclocal_read_repair_chance=0.000000 AND*
>> *  gc_grace_seconds=864000 AND*
>> *  index_interval=128 AND*
>> *  read_repair_chance=0.100000 AND*
>> *  replicate_on_write='true' AND*
>> *  populate_io_cache_on_flush='false' AND*
>> *  default_time_to_live=0 AND*
>> *  speculative_retry='NONE' AND*
>> *  memtable_flush_period_in_ms=0 AND*
>> *  compaction={'class': 'SizeTieredCompactionStrategy'} AND*
>> *  compression={'sstable_compression': 'LZ4Compressor'};*
>>
>> *CREATE TABLE bookingfile_uids (*
>> *  date text,*
>> *  timeanduid text,*
>> *  deleted boolean,*
>> *  PRIMARY KEY (date, timeanduid)*
>> *) WITH*
>> *  bloom_filter_fp_chance=0.010000 AND*
>> *  caching='KEYS_ONLY' AND*
>> *  comment='' AND*
>> *  dclocal_read_repair_chance=0.000000 AND*
>> *  gc_grace_seconds=864000 AND*
>> *  index_interval=128 AND*
>> *  read_repair_chance=0.100000 AND*
>> *  replicate_on_write='true' AND*
>> *  populate_io_cache_on_flush='false' AND*
>> *  default_time_to_live=0 AND*
>> *  speculative_retry='NONE' AND*
>> *  memtable_flush_period_in_ms=0 AND*
>> *  compaction={'class': 'SizeTieredCompactionStrategy'} AND*
>> *  compression={'sstable_compression': 'LZ4Compressor'};*
>> *
>> *
>> *CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);*
>>
>> * We don't have any problems with the tables from the *main* group.
>>
>> * As for the tables from the *uids* group we have noticed that sometimes
>> deletes from these tables do not do their job. They don't fail, they just
>> do nothing. We have confirmed this by adding a select query after deletes.
>> Most times everything is ok and select returns 0 records. But sometimes (~5
>> out of 100,000) it returns the supposedly deleted row.
>>
>> * We have logged the ExecutionInfo objects with query tracing that are
>> returned by Datastax's driver. Here are the details
>>
>> *DELETE FROM bookingfile_uids WHERE date=C20131006 AND
>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e*
>>
>> *ExecutionInfo: [*
>> *triedHosts=/10.10.30.23;*
>> *queriedHost=/10.10.30.23;*
>>  *achievedConsistencyLevel=null;*
>> *queryTrace=*
>> * Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun
>> Oct 06 19:55:57 MSK 2013*
>> * Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06
>> 19:55:57 MSK 2013*
>> * Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49]
>> at Sun Oct 06 19:55:57 MSK 2013*
>> * Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49]
>> at Sun Oct 06 19:55:57 MSK 2013*
>> * Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun
>> Oct 06 19:55:57 MSK 2013*
>> * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06
>> 19:55:57 MSK 2013*
>> * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43]
>> at Sun Oct 06 19:55:57 MSK 2013*
>> * Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43]
>> at Sun Oct 06 19:55:57 MSK 2013*
>> * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Determining replicas for mutation on 
>> /10.10.30.23[Native-Transport-Requests:1387368]
>> at Sun Oct 06 19:55:57 MSK 2013*
>> * Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at
>> Sun Oct 06 19:55:57 MSK 2013*
>> * Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun
>> Oct 06 19:55:57 MSK 2013*
>> * Processing response from /10.10.30.20 on 
>> /10.10.30.23[RequestResponseStage:4]
>> at Sun Oct 06 19:55:57 MSK 2013*
>> * Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun
>> Oct 06 19:55:57 MSK 2013*
>> * Processing response from /10.10.30.19 on 
>> /10.10.30.23[RequestResponseStage:4]
>> at Sun Oct 06 19:55:57 MSK 2013;*
>> *]*
>>
>> *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND
>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 record
>> *
>>
>> the same query 1 second later:
>>
>> *DELETE FROM bookingfile_uids WHERE date=C20131006 AND
>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e*
>> *
>> *
>> *ExecutionInfo: [*
>> *triedHosts=/10.10.30.20;*
>> *queriedHost=/10.10.30.20;*
>> *achievedConsistencyLevel=null;*
>> *queryTrace=*
>> * Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun
>> Oct 06 19:55:58 MSK 2013*
>> * Determining replicas for mutation on 
>> /10.10.30.20[Native-Transport-Requests:1387705]
>> at Sun Oct 06 19:55:58 MSK 2013*
>> * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at
>> Sun Oct 06 19:55:58 MSK 2013*
>> * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06
>> 19:55:58 MSK 2013*
>> * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43]
>> at Sun Oct 06 19:55:58 MSK 2013*
>> * Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at
>> Sun Oct 06 19:55:58 MSK 2013*
>> * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at
>> Sun Oct 06 19:55:58 MSK 2013*
>> * Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun
>> Oct 06 19:55:58 MSK 2013*
>> * Processing response from /10.10.30.19 on 
>> /10.10.30.20[RequestResponseStage:6]
>> at Sun Oct 06 19:55:58 MSK 2013*
>> * Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun
>> Oct 06 19:55:58 MSK 2013*
>> *]*
>> *
>> *
>> *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND
>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0
>> records.*
>>
>> * Cassandra's system.log on all 3 nodes lists nothing special during
>> these queries - just some compaction related *INFO* entries.
>>
>> Can anyone help with this? What is our next step?
>>
>> Thanks in advance,
>> Alexander
>>
>
>

Reply via email to