Hi Michał,

I didn't notice your message at first.. Well this seems like a real cause
candidate.. I'll add an explicit consistency level QUORUM and see if that
helps. Thanks


2013/10/7 Alexander Shutyaev <shuty...@gmail.com>

> Hi Nick,
>
> Thanks for the note! We have our cassanra instances installed on virtual
> hosts in VMWare and the clock synchronization is handled by the latter, so
> I can't use ntpdate (says that NTP socket is in use). Is there any way to
> check if the clocks are really synchronized? My best attempt was using
> three shell windows with commands already typed thus requiring only
> clicking on the window and hitting enter. The results varied by 100-200
> msec which I guess is just about the time I need to click and press enter :)
>
> Thanks in advance,
> Alexander
>
>
> 2013/10/7 Nikolay Mihaylov <n...@nmmm.nu>
>
>> Hi
>>
>> my two cents - before doing anything else, make sure clocks are
>> synchronized to the millisecond.
>> ntp will do so.
>>
>> Nick.
>>
>>
>> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com>wrote:
>>
>>> Hi all,
>>>
>>> We have encountered the following problem with cassandra.
>>>
>>> * We use *cassandra v2.0.0* from *Datastax* community repo.
>>>
>>> * We have *3 nodes* in a cluster, all of them are seed providers.
>>>
>>> * We have a *single keyspace* with *replication factor = 3*:
>>>
>>> *CREATE KEYSPACE bof WITH replication = {*
>>> *  'class': 'SimpleStrategy',*
>>> *  'replication_factor': '3'*
>>> *};*
>>>
>>> * We use *Datastax Java CQL Driver v1.0.3* in our application.
>>>
>>> * We have not modified any *consistency settings* in our app, so I
>>> assume we have the *default QUORUM* (2 out of 3 in our case)
>>> consistency *for reads and writes*.
>>>
>>> * We have 400+ tables which can be divided in two groups (*main* and *
>>> uids*). All tables in a group have the same definition, they vary only
>>> by name. The sample definitions are:
>>>
>>> *CREATE TABLE bookingfile (*
>>> *  key text,*
>>> *  entity_created timestamp,*
>>> *  entity_createdby text,*
>>> *  entity_entitytype text,*
>>> *  entity_modified timestamp,*
>>> *  entity_modifiedby text,*
>>> *  entity_status text,*
>>> *  entity_uid text,*
>>> *  entity_updatepolicy text,*
>>> *  version_created timestamp,*
>>> *  version_createdby text,*
>>> *  version_data blob,*
>>> *  version_dataformat text,*
>>> *  version_datasource text,*
>>> *  version_modified timestamp,*
>>> *  version_modifiedby text,*
>>> *  version_uid text,*
>>> *  version_versionnotes text,*
>>> *  version_versionnumber int,*
>>> *  versionscount int,*
>>> *  PRIMARY KEY (key)*
>>> *) WITH*
>>> *  bloom_filter_fp_chance=0.010000 AND*
>>> *  caching='KEYS_ONLY' AND*
>>> *  comment='' AND*
>>> *  dclocal_read_repair_chance=0.000000 AND*
>>> *  gc_grace_seconds=864000 AND*
>>> *  index_interval=128 AND*
>>> *  read_repair_chance=0.100000 AND*
>>> *  replicate_on_write='true' AND*
>>> *  populate_io_cache_on_flush='false' AND*
>>> *  default_time_to_live=0 AND*
>>> *  speculative_retry='NONE' AND*
>>> *  memtable_flush_period_in_ms=0 AND*
>>> *  compaction={'class': 'SizeTieredCompactionStrategy'} AND*
>>> *  compression={'sstable_compression': 'LZ4Compressor'};*
>>>
>>> *CREATE TABLE bookingfile_uids (*
>>> *  date text,*
>>> *  timeanduid text,*
>>> *  deleted boolean,*
>>> *  PRIMARY KEY (date, timeanduid)*
>>> *) WITH*
>>> *  bloom_filter_fp_chance=0.010000 AND*
>>> *  caching='KEYS_ONLY' AND*
>>> *  comment='' AND*
>>> *  dclocal_read_repair_chance=0.000000 AND*
>>> *  gc_grace_seconds=864000 AND*
>>> *  index_interval=128 AND*
>>> *  read_repair_chance=0.100000 AND*
>>> *  replicate_on_write='true' AND*
>>> *  populate_io_cache_on_flush='false' AND*
>>> *  default_time_to_live=0 AND*
>>> *  speculative_retry='NONE' AND*
>>> *  memtable_flush_period_in_ms=0 AND*
>>> *  compaction={'class': 'SizeTieredCompactionStrategy'} AND*
>>> *  compression={'sstable_compression': 'LZ4Compressor'};*
>>> *
>>> *
>>> *CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);*
>>>
>>> * We don't have any problems with the tables from the *main* group.
>>>
>>> * As for the tables from the *uids* group we have noticed that
>>> sometimes deletes from these tables do not do their job. They don't fail,
>>> they just do nothing. We have confirmed this by adding a select query after
>>> deletes. Most times everything is ok and select returns 0 records. But
>>> sometimes (~5 out of 100,000) it returns the supposedly deleted row.
>>>
>>> * We have logged the ExecutionInfo objects with query tracing that are
>>> returned by Datastax's driver. Here are the details
>>>
>>> *DELETE FROM bookingfile_uids WHERE date=C20131006 AND
>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e*
>>>
>>> *ExecutionInfo: [*
>>> *triedHosts=/10.10.30.23;*
>>> *queriedHost=/10.10.30.23;*
>>>  *achievedConsistencyLevel=null;*
>>> *queryTrace=*
>>> * Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun
>>> Oct 06 19:55:57 MSK 2013*
>>> * Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at
>>> Sun Oct 06 19:55:57 MSK 2013*
>>> * Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct
>>> 06 19:55:57 MSK 2013*
>>> * Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun
>>> Oct 06 19:55:57 MSK 2013*
>>> * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at
>>> Sun Oct 06 19:55:57 MSK 2013*
>>> * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct
>>> 06 19:55:57 MSK 2013*
>>> * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Determining replicas for mutation on 
>>> /10.10.30.23[Native-Transport-Requests:1387368]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at
>>> Sun Oct 06 19:55:57 MSK 2013*
>>> * Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun
>>> Oct 06 19:55:57 MSK 2013*
>>> * Processing response from /10.10.30.20 on 
>>> /10.10.30.23[RequestResponseStage:4]
>>> at Sun Oct 06 19:55:57 MSK 2013*
>>> * Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun
>>> Oct 06 19:55:57 MSK 2013*
>>> * Processing response from /10.10.30.19 on 
>>> /10.10.30.23[RequestResponseStage:4]
>>> at Sun Oct 06 19:55:57 MSK 2013;*
>>> *]*
>>>
>>> *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND
>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 record
>>> *
>>>
>>> the same query 1 second later:
>>>
>>> *DELETE FROM bookingfile_uids WHERE date=C20131006 AND
>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e*
>>> *
>>> *
>>> *ExecutionInfo: [*
>>> *triedHosts=/10.10.30.20;*
>>> *queriedHost=/10.10.30.20;*
>>> *achievedConsistencyLevel=null;*
>>> *queryTrace=*
>>> * Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun
>>> Oct 06 19:55:58 MSK 2013*
>>> * Determining replicas for mutation on 
>>> /10.10.30.20[Native-Transport-Requests:1387705]
>>> at Sun Oct 06 19:55:58 MSK 2013*
>>> * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at
>>> Sun Oct 06 19:55:58 MSK 2013*
>>> * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct
>>> 06 19:55:58 MSK 2013*
>>> * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43]
>>> at Sun Oct 06 19:55:58 MSK 2013*
>>> * Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19]
>>> at Sun Oct 06 19:55:58 MSK 2013*
>>> * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23]
>>> at Sun Oct 06 19:55:58 MSK 2013*
>>> * Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun
>>> Oct 06 19:55:58 MSK 2013*
>>> * Processing response from /10.10.30.19 on 
>>> /10.10.30.20[RequestResponseStage:6]
>>> at Sun Oct 06 19:55:58 MSK 2013*
>>> * Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun
>>> Oct 06 19:55:58 MSK 2013*
>>> *]*
>>> *
>>> *
>>> *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND
>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0
>>> records.*
>>>
>>> * Cassandra's system.log on all 3 nodes lists nothing special during
>>> these queries - just some compaction related *INFO* entries.
>>>
>>> Can anyone help with this? What is our next step?
>>>
>>> Thanks in advance,
>>> Alexander
>>>
>>
>>
>

Reply via email to