I haven't used VMWare but it seems odd that it would lock up the ntp port.  try 
"ps aux | grep ntp" to see if ntpd it's already running.

On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev <shuty...@gmail.com> wrote:

> Hi Michał,
> 
> I didn't notice your message at first.. Well this seems like a real cause 
> candidate.. I'll add an explicit consistency level QUORUM and see if that 
> helps. Thanks
> 
> 
> 2013/10/7 Alexander Shutyaev <shuty...@gmail.com>
> Hi Nick,
> 
> Thanks for the note! We have our cassanra instances installed on virtual 
> hosts in VMWare and the clock synchronization is handled by the latter, so I 
> can't use ntpdate (says that NTP socket is in use). Is there any way to check 
> if the clocks are really synchronized? My best attempt was using three shell 
> windows with commands already typed thus requiring only clicking on the 
> window and hitting enter. The results varied by 100-200 msec which I guess is 
> just about the time I need to click and press enter :)
> 
> Thanks in advance,
> Alexander
> 
> 
> 2013/10/7 Nikolay Mihaylov <n...@nmmm.nu>
> Hi
> 
> my two cents - before doing anything else, make sure clocks are synchronized 
> to the millisecond.
> ntp will do so.
> 
> Nick.
> 
> 
> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com> wrote:
> Hi all,
> 
> We have encountered the following problem with cassandra.
> 
> * We use cassandra v2.0.0 from Datastax community repo.
> 
> * We have 3 nodes in a cluster, all of them are seed providers.
> 
> * We have a single keyspace with replication factor = 3:
> 
> CREATE KEYSPACE bof WITH replication = {
>   'class': 'SimpleStrategy',
>   'replication_factor': '3'
> };
> 
> * We use Datastax Java CQL Driver v1.0.3 in our application.
> 
> * We have not modified any consistency settings in our app, so I assume we 
> have the default QUORUM (2 out of 3 in our case) consistency for reads and 
> writes.
> 
> * We have 400+ tables which can be divided in two groups (main and uids). All 
> tables in a group have the same definition, they vary only by name. The 
> sample definitions are:
> 
> CREATE TABLE bookingfile (
>   key text,
>   entity_created timestamp,
>   entity_createdby text,
>   entity_entitytype text,
>   entity_modified timestamp,
>   entity_modifiedby text,
>   entity_status text,
>   entity_uid text,
>   entity_updatepolicy text,
>   version_created timestamp,
>   version_createdby text,
>   version_data blob,
>   version_dataformat text,
>   version_datasource text,
>   version_modified timestamp,
>   version_modifiedby text,
>   version_uid text,
>   version_versionnotes text,
>   version_versionnumber int,
>   versionscount int,
>   PRIMARY KEY (key)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='NONE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> 
> CREATE TABLE bookingfile_uids (
>   date text,
>   timeanduid text,
>   deleted boolean,
>   PRIMARY KEY (date, timeanduid)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='NONE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> 
> CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);
> 
> * We don't have any problems with the tables from the main group.
> 
> * As for the tables from the uids group we have noticed that sometimes 
> deletes from these tables do not do their job. They don't fail, they just do 
> nothing. We have confirmed this by adding a select query after deletes. Most 
> times everything is ok and select returns 0 records. But sometimes (~5 out of 
> 100,000) it returns the supposedly deleted row.
> 
> * We have logged the ExecutionInfo objects with query tracing that are 
> returned by Datastax's driver. Here are the details
> 
> DELETE FROM bookingfile_uids WHERE date=C20131006 AND 
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
> 
> ExecutionInfo: [
> triedHosts=/10.10.30.23;
> queriedHost=/10.10.30.23;
> achievedConsistencyLevel=null;
> queryTrace=
>       Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun 
> Oct 06 19:55:57 MSK 2013
>       Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at Sun 
> Oct 06 19:55:57 MSK 2013
>       Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06 
> 19:55:57 MSK 2013
>       Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] 
> at Sun Oct 06 19:55:57 MSK 2013
>       Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] at 
> Sun Oct 06 19:55:57 MSK 2013
>       Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at 
> Sun Oct 06 19:55:57 MSK 2013
>       Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun 
> Oct 06 19:55:57 MSK 2013
>       Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun 
> Oct 06 19:55:57 MSK 2013
>       Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 
> 19:55:57 MSK 2013
>       Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] 
> at Sun Oct 06 19:55:57 MSK 2013
>       Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] at 
> Sun Oct 06 19:55:57 MSK 2013
>       Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at 
> Sun Oct 06 19:55:57 MSK 2013
>       Determining replicas for mutation on 
> /10.10.30.23[Native-Transport-Requests:1387368] at Sun Oct 06 19:55:57 MSK 
> 2013
>       Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at 
> Sun Oct 06 19:55:57 MSK 2013
>       Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at Sun 
> Oct 06 19:55:57 MSK 2013
>       Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at 
> Sun Oct 06 19:55:57 MSK 2013
>       Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun Oct 
> 06 19:55:57 MSK 2013
>       Processing response from /10.10.30.20 on 
> /10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013
>       Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun Oct 
> 06 19:55:57 MSK 2013
>       Processing response from /10.10.30.19 on 
> /10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013;
> ]
> 
> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND 
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 record
> 
> the same query 1 second later:
> 
> DELETE FROM bookingfile_uids WHERE date=C20131006 AND 
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
> 
> ExecutionInfo: [
> triedHosts=/10.10.30.20;
> queriedHost=/10.10.30.20;
> achievedConsistencyLevel=null;
> queryTrace=
>       Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun 
> Oct 06 19:55:58 MSK 2013
>       Determining replicas for mutation on 
> /10.10.30.20[Native-Transport-Requests:1387705] at Sun Oct 06 19:55:58 MSK 
> 2013
>       Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun 
> Oct 06 19:55:58 MSK 2013
>       Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 
> 19:55:58 MSK 2013
>       Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] 
> at Sun Oct 06 19:55:58 MSK 2013
>       Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at 
> Sun Oct 06 19:55:58 MSK 2013
>       Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at 
> Sun Oct 06 19:55:58 MSK 2013
>       Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun Oct 
> 06 19:55:58 MSK 2013
>       Processing response from /10.10.30.19 on 
> /10.10.30.20[RequestResponseStage:6] at Sun Oct 06 19:55:58 MSK 2013
>       Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun 
> Oct 06 19:55:58 MSK 2013
> ]
> 
> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND 
> timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0 records.
> 
> * Cassandra's system.log on all 3 nodes lists nothing special during these 
> queries - just some compaction related INFO entries.
> 
> Can anyone help with this? What is our next step?
> 
> Thanks in advance,
> Alexander
> 
> 
> 

Reply via email to