Hi Daniel, Nate. Thanks for your answers. We have *gc_grace_seconds=864000 *(which is the default, I believe). We've also checked the clocks - they are synchronized.
2013/10/16 Nate McCall <n...@thelastpickle.com> > This is almost a guaranteed sign that the clocks are off in your cluster. > If you run the select query a couple of times in a row right after > deletion, do you see the data appear again? > > > On Wed, Oct 16, 2013 at 12:12 AM, Alexander Shutyaev > <shuty...@gmail.com>wrote: > >> Hi all, >> >> Unfortunately, we still have a problem. I've modified my code, so that it >> explicitly sets the consistency level to *QUORUM* for each query. >> However, we found out a few cases when the record is deleted on only *1 >> node of 3*. In this cases the *delete* query executed ok, and the *select >> * query that we do right after delete returned *0* rows. Later when we >> ran a global daily check *select* returned *1* row. How can that be? >> What can we be missing? >> >> >> 2013/10/7 Jon Haddad <j...@jonhaddad.com> >> >>> I haven't used VMWare but it seems odd that it would lock up the ntp >>> port. try "ps aux | grep ntp" to see if ntpd it's already running. >>> >>> On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev <shuty...@gmail.com> >>> wrote: >>> >>> Hi Michał, >>> >>> I didn't notice your message at first.. Well this seems like a real >>> cause candidate.. I'll add an explicit consistency level QUORUM and see if >>> that helps. Thanks >>> >>> >>> 2013/10/7 Alexander Shutyaev <shuty...@gmail.com> >>> >>>> Hi Nick, >>>> >>>> Thanks for the note! We have our cassanra instances installed on >>>> virtual hosts in VMWare and the clock synchronization is handled by the >>>> latter, so I can't use ntpdate (says that NTP socket is in use). Is there >>>> any way to check if the clocks are really synchronized? My best attempt was >>>> using three shell windows with commands already typed thus requiring only >>>> clicking on the window and hitting enter. The results varied by 100-200 >>>> msec which I guess is just about the time I need to click and press enter >>>> :) >>>> >>>> Thanks in advance, >>>> Alexander >>>> >>>> >>>> 2013/10/7 Nikolay Mihaylov <n...@nmmm.nu> >>>> >>>>> Hi >>>>> >>>>> my two cents - before doing anything else, make sure clocks are >>>>> synchronized to the millisecond. >>>>> ntp will do so. >>>>> >>>>> Nick. >>>>> >>>>> >>>>> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com >>>>> > wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> We have encountered the following problem with cassandra. >>>>>> >>>>>> * We use *cassandra v2.0.0* from *Datastax* community repo. >>>>>> >>>>>> * We have *3 nodes* in a cluster, all of them are seed providers. >>>>>> >>>>>> * We have a *single keyspace* with *replication factor = 3*: >>>>>> >>>>>> *CREATE KEYSPACE bof WITH replication = {* >>>>>> * 'class': 'SimpleStrategy',* >>>>>> * 'replication_factor': '3'* >>>>>> *};* >>>>>> >>>>>> * We use *Datastax Java CQL Driver v1.0.3* in our application. >>>>>> >>>>>> * We have not modified any *consistency settings* in our app, so I >>>>>> assume we have the *default QUORUM* (2 out of 3 in our case) >>>>>> consistency *for reads and writes*. >>>>>> >>>>>> * We have 400+ tables which can be divided in two groups (*main* and >>>>>> *uids*). All tables in a group have the same definition, they vary >>>>>> only by name. The sample definitions are: >>>>>> >>>>>> *CREATE TABLE bookingfile (* >>>>>> * key text,* >>>>>> * entity_created timestamp,* >>>>>> * entity_createdby text,* >>>>>> * entity_entitytype text,* >>>>>> * entity_modified timestamp,* >>>>>> * entity_modifiedby text,* >>>>>> * entity_status text,* >>>>>> * entity_uid text,* >>>>>> * entity_updatepolicy text,* >>>>>> * version_created timestamp,* >>>>>> * version_createdby text,* >>>>>> * version_data blob,* >>>>>> * version_dataformat text,* >>>>>> * version_datasource text,* >>>>>> * version_modified timestamp,* >>>>>> * version_modifiedby text,* >>>>>> * version_uid text,* >>>>>> * version_versionnotes text,* >>>>>> * version_versionnumber int,* >>>>>> * versionscount int,* >>>>>> * PRIMARY KEY (key)* >>>>>> *) WITH* >>>>>> * bloom_filter_fp_chance=0.010000 AND* >>>>>> * caching='KEYS_ONLY' AND* >>>>>> * comment='' AND* >>>>>> * dclocal_read_repair_chance=0.000000 AND* >>>>>> * gc_grace_seconds=864000 AND* >>>>>> * index_interval=128 AND* >>>>>> * read_repair_chance=0.100000 AND* >>>>>> * replicate_on_write='true' AND* >>>>>> * populate_io_cache_on_flush='false' AND* >>>>>> * default_time_to_live=0 AND* >>>>>> * speculative_retry='NONE' AND* >>>>>> * memtable_flush_period_in_ms=0 AND* >>>>>> * compaction={'class': 'SizeTieredCompactionStrategy'} AND* >>>>>> * compression={'sstable_compression': 'LZ4Compressor'};* >>>>>> >>>>>> *CREATE TABLE bookingfile_uids (* >>>>>> * date text,* >>>>>> * timeanduid text,* >>>>>> * deleted boolean,* >>>>>> * PRIMARY KEY (date, timeanduid)* >>>>>> *) WITH* >>>>>> * bloom_filter_fp_chance=0.010000 AND* >>>>>> * caching='KEYS_ONLY' AND* >>>>>> * comment='' AND* >>>>>> * dclocal_read_repair_chance=0.000000 AND* >>>>>> * gc_grace_seconds=864000 AND* >>>>>> * index_interval=128 AND* >>>>>> * read_repair_chance=0.100000 AND* >>>>>> * replicate_on_write='true' AND* >>>>>> * populate_io_cache_on_flush='false' AND* >>>>>> * default_time_to_live=0 AND* >>>>>> * speculative_retry='NONE' AND* >>>>>> * memtable_flush_period_in_ms=0 AND* >>>>>> * compaction={'class': 'SizeTieredCompactionStrategy'} AND* >>>>>> * compression={'sstable_compression': 'LZ4Compressor'};* >>>>>> * >>>>>> * >>>>>> *CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted); >>>>>> * >>>>>> >>>>>> * We don't have any problems with the tables from the *main* group. >>>>>> >>>>>> * As for the tables from the *uids* group we have noticed that >>>>>> sometimes deletes from these tables do not do their job. They don't fail, >>>>>> they just do nothing. We have confirmed this by adding a select query >>>>>> after >>>>>> deletes. Most times everything is ok and select returns 0 records. But >>>>>> sometimes (~5 out of 100,000) it returns the supposedly deleted row. >>>>>> >>>>>> * We have logged the ExecutionInfo objects with query tracing that >>>>>> are returned by Datastax's driver. Here are the details >>>>>> >>>>>> *DELETE FROM bookingfile_uids WHERE date=C20131006 AND >>>>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e* >>>>>> >>>>>> *ExecutionInfo: [* >>>>>> *triedHosts=/10.10.30.23;* >>>>>> *queriedHost=/10.10.30.23;* >>>>>> *achievedConsistencyLevel=null;* >>>>>> *queryTrace=* >>>>>> * Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at >>>>>> Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun >>>>>> Oct 06 19:55:57 MSK 2013* >>>>>> * Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at >>>>>> Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun >>>>>> Oct 06 19:55:57 MSK 2013* >>>>>> * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Determining replicas for mutation on >>>>>> /10.10.30.23[Native-Transport-Requests:1387368] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at >>>>>> Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Processing response from /10.10.30.20 on >>>>>> /10.10.30.23[RequestResponseStage:4] >>>>>> at Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at >>>>>> Sun Oct 06 19:55:57 MSK 2013* >>>>>> * Processing response from /10.10.30.19 on >>>>>> /10.10.30.23[RequestResponseStage:4] >>>>>> at Sun Oct 06 19:55:57 MSK 2013;* >>>>>> *]* >>>>>> >>>>>> *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND >>>>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 >>>>>> record >>>>>> * >>>>>> >>>>>> the same query 1 second later: >>>>>> >>>>>> *DELETE FROM bookingfile_uids WHERE date=C20131006 AND >>>>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e* >>>>>> * >>>>>> * >>>>>> *ExecutionInfo: [* >>>>>> *triedHosts=/10.10.30.20;* >>>>>> *queriedHost=/10.10.30.20;* >>>>>> *achievedConsistencyLevel=null;* >>>>>> *queryTrace=* >>>>>> * Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at >>>>>> Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Determining replicas for mutation on >>>>>> /10.10.30.20[Native-Transport-Requests:1387705] >>>>>> at Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] >>>>>> at Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun >>>>>> Oct 06 19:55:58 MSK 2013* >>>>>> * Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] >>>>>> at Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] >>>>>> at Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] >>>>>> at Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at >>>>>> Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Processing response from /10.10.30.19 on >>>>>> /10.10.30.20[RequestResponseStage:6] >>>>>> at Sun Oct 06 19:55:58 MSK 2013* >>>>>> * Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at >>>>>> Sun Oct 06 19:55:58 MSK 2013* >>>>>> *]* >>>>>> * >>>>>> * >>>>>> *SELECT * FROM bookingfile_uids WHERE date=C20131006 AND >>>>>> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0 >>>>>> records.* >>>>>> >>>>>> * Cassandra's system.log on all 3 nodes lists nothing special during >>>>>> these queries - just some compaction related *INFO* entries. >>>>>> >>>>>> Can anyone help with this? What is our next step? >>>>>> >>>>>> Thanks in advance, >>>>>> Alexander >>>>>> >>>>> >>>>> >>>> >>> >>> >> > > > -- > ----------------- > Nate McCall > Austin, TX > @zznate > > Co-Founder & Sr. Technical Consultant > Apache Cassandra Consulting > http://www.thelastpickle.com >