I haven't used VMware, but it seems odd that it would lock up the NTP port. Try "ps aux | grep ntp" to see if ntpd is already running.
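On the question of whether the clocks really agree, here is a rough sketch that takes the clicking out of the measurement. It assumes you can ssh from one machine to each node and that GNU date (for %3N) is installed on both ends; those are my assumptions, and the function names clock_offset_ms and report_offsets are made up for illustration. The idea is to read the local clock before and after asking the remote node for its time. The remote reading was taken somewhere between those two local readings, so comparing it with their midpoint gives an offset that is good to within half the round trip.

```shell
#!/usr/bin/env bash

# Estimate a remote clock's offset from the local clock, in milliseconds.
# Args: local time before the query, remote reading, local time after the
# query (all epoch milliseconds). Prints "<offset> +/- <error bound>".
clock_offset_ms() {
    local before=$1 remote=$2 after=$3
    echo "$(( remote - (before + after) / 2 )) +/- $(( (after - before) / 2 ))"
}

# Hypothetical helper: query each host given as an argument over ssh and
# print its estimated offset. Assumes passwordless ssh and GNU date.
report_offsets() {
    local host before remote after
    for host in "$@"; do
        before=$(date +%s%3N)
        remote=$(ssh "$host" date +%s%3N)
        after=$(date +%s%3N)
        echo "$host: $(clock_offset_ms "$before" "$remote" "$after") ms"
    done
}

# Example call (not run here), using the node addresses from the trace:
# report_offsets 10.10.30.19 10.10.30.20 10.10.30.23
```

The error bound will be as large as the ssh connection time, so this can only rule out gross skew. If ntpd does turn out to be running, "ntpq -p" on each node reports the offset to its peers in milliseconds and is the more precise check.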

On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev <shuty...@gmail.com> wrote:

> Hi Michał,
>
> I didn't notice your message at first.. Well this seems like a real cause
> candidate.. I'll add an explicit consistency level QUORUM and see if that
> helps. Thanks
>
>
> 2013/10/7 Alexander Shutyaev <shuty...@gmail.com>
> Hi Nick,
>
> Thanks for the note! We have our Cassandra instances installed on virtual
> hosts in VMWare and the clock synchronization is handled by the latter, so I
> can't use ntpdate (says that NTP socket is in use). Is there any way to check
> if the clocks are really synchronized? My best attempt was using three shell
> windows with commands already typed thus requiring only clicking on the
> window and hitting enter. The results varied by 100-200 msec which I guess is
> just about the time I need to click and press enter :)
>
> Thanks in advance,
> Alexander
>
>
> 2013/10/7 Nikolay Mihaylov <n...@nmmm.nu>
> Hi
>
> my two cents - before doing anything else, make sure clocks are synchronized
> to the millisecond.
> ntp will do so.
>
> Nick.
>
>
> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com> wrote:
> Hi all,
>
> We have encountered the following problem with cassandra.
>
> * We use cassandra v2.0.0 from Datastax community repo.
>
> * We have 3 nodes in a cluster, all of them are seed providers.
>
> * We have a single keyspace with replication factor = 3:
>
> CREATE KEYSPACE bof WITH replication = {
>   'class': 'SimpleStrategy',
>   'replication_factor': '3'
> };
>
> * We use Datastax Java CQL Driver v1.0.3 in our application.
>
> * We have not modified any consistency settings in our app, so I assume we
> have the default QUORUM (2 out of 3 in our case) consistency for reads and
> writes.
>
> * We have 400+ tables which can be divided in two groups (main and uids). All
> tables in a group have the same definition, they vary only by name. The
> sample definitions are:
>
> CREATE TABLE bookingfile (
>   key text,
>   entity_created timestamp,
>   entity_createdby text,
>   entity_entitytype text,
>   entity_modified timestamp,
>   entity_modifiedby text,
>   entity_status text,
>   entity_uid text,
>   entity_updatepolicy text,
>   version_created timestamp,
>   version_createdby text,
>   version_data blob,
>   version_dataformat text,
>   version_datasource text,
>   version_modified timestamp,
>   version_modifiedby text,
>   version_uid text,
>   version_versionnotes text,
>   version_versionnumber int,
>   versionscount int,
>   PRIMARY KEY (key)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='NONE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
>
> CREATE TABLE bookingfile_uids (
>   date text,
>   timeanduid text,
>   deleted boolean,
>   PRIMARY KEY (date, timeanduid)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='NONE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
>
> CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);
>
> * We don't have any problems with the tables from the main group.
>
> * As for the tables from the uids group we have noticed that sometimes
> deletes from these tables do not do their job. They don't fail, they just do
> nothing. We have confirmed this by adding a select query after deletes. Most
> times everything is ok and select returns 0 records. But sometimes (~5 out of
> 100,000) it returns the supposedly deleted row.
>
> * We have logged the ExecutionInfo objects with query tracing that are
> returned by Datastax's driver. Here are the details
>
> DELETE FROM bookingfile_uids WHERE date=C20131006 AND
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>
> ExecutionInfo: [
>   triedHosts=/10.10.30.23;
>   queriedHost=/10.10.30.23;
>   achievedConsistencyLevel=null;
>   queryTrace=
>     Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun Oct 06 19:55:57 MSK 2013
>     Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>     Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>     Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>     Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>     Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at Sun Oct 06 19:55:57 MSK 2013
>     Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun Oct 06 19:55:57 MSK 2013
>     Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>     Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>     Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>     Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>     Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at Sun Oct 06 19:55:57 MSK 2013
>     Determining replicas for mutation on /10.10.30.23[Native-Transport-Requests:1387368] at Sun Oct 06 19:55:57 MSK 2013
>     Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at Sun Oct 06 19:55:57 MSK 2013
>     Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at Sun Oct 06 19:55:57 MSK 2013
>     Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at Sun Oct 06 19:55:57 MSK 2013
>     Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun Oct 06 19:55:57 MSK 2013
>     Processing response from /10.10.30.20 on /10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013
>     Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun Oct 06 19:55:57 MSK 2013
>     Processing response from /10.10.30.19 on /10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013;
> ]
>
> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 record
>
> the same query 1 second later:
>
> DELETE FROM bookingfile_uids WHERE date=C20131006 AND
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>
> ExecutionInfo: [
>   triedHosts=/10.10.30.20;
>   queriedHost=/10.10.30.20;
>   achievedConsistencyLevel=null;
>   queryTrace=
>     Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun Oct 06 19:55:58 MSK 2013
>     Determining replicas for mutation on /10.10.30.20[Native-Transport-Requests:1387705] at Sun Oct 06 19:55:58 MSK 2013
>     Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK 2013
>     Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK 2013
>     Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK 2013
>     Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at Sun Oct 06 19:55:58 MSK 2013
>     Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at Sun Oct 06 19:55:58 MSK 2013
>     Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun Oct 06 19:55:58 MSK 2013
>     Processing response from /10.10.30.19 on /10.10.30.20[RequestResponseStage:6] at Sun Oct 06 19:55:58 MSK 2013
>     Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun Oct 06 19:55:58 MSK 2013
> ]
>
> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0 records.
>
> * Cassandra's system.log on all 3 nodes lists nothing special during these
> queries - just some compaction related INFO entries.
>
> Can anyone help with this? What is our next step?
>
> Thanks in advance,
> Alexander