Re: DELETE does not delete :)

Michael Theroux Thu, 17 Oct 2013 09:31:40 -0700

A couple questions:

1) How did you determine that the record is deleted on only one node? Are you 
looking for tombstones, or the original entry that was inserted? Note that when 
an item is deleted, the original entry can still be in an SSTABLE somewhere, 
and the tombstone can be in another SSTABLE until those tables are compacted 
together.


2) When you did the global daily check, are you sure you are not getting range 
ghosts? I assume they are still possible on 2.0 
(http://www.datastax.com/docs/0.7/getting_started/using_cli, search for range 
ghosts).

Thanks,
-Mike




On Thursday, October 17, 2013 6:36 AM, Alexander Shutyaev <shuty...@gmail.com> 
wrote:
 
Hi Daniel, Nate.

Thanks for your answers. We have gc_grace_seconds=864000 (which is the default, 
I believe). We've also checked the clocks - they are synchronized.



2013/10/16 Nate McCall <n...@thelastpickle.com>

This is almost a guaranteed sign that the clocks are off in your cluster. If 
you run the select query a couple of times in a row right after deletion, do 
you see the data appear again?
>
>
>
>On Wed, Oct 16, 2013 at 12:12 AM, Alexander Shutyaev <shuty...@gmail.com> 
>wrote:
>
>Hi all,
>>
>>
>>Unfortunately, we still have a problem. I've modified my code, so that it 
>>explicitly sets the consistency level to QUORUM for each query. However, we 
>>found out a few cases when the record is deleted on only 1 node of 3. In this 
>>cases the delete query executed ok, and the select query that we do right 
>>after delete returned 0 rows. Later when we ran a global daily check select 
>>returned 1 row. How can that be? What can we be missing?
>>
>>
>>
>>2013/10/7 Jon Haddad <j...@jonhaddad.com>
>>
>>I haven't used VMWare but it seems odd that it would lock up the ntp port.  
>>try "ps aux | grep ntp" to see if ntpd it's already running.
>>>
>>>
>>>On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev <shuty...@gmail.com> wrote:
>>>
>>>Hi Michał,
>>>>
>>>>
>>>>I didn't notice your message at first.. Well this seems like a real cause 
>>>>candidate.. I'll add an explicit consistency level QUORUM and see if that 
>>>>helps. Thanks
>>>>
>>>>
>>>>
>>>>2013/10/7 Alexander Shutyaev <shuty...@gmail.com>
>>>>
>>>>Hi Nick,
>>>>>
>>>>>Thanks for the note! We have our cassanra instances installed on virtual 
>>>>>hosts in VMWare and the clock synchronization is handled by the latter, so 
>>>>>I can't use ntpdate (says that NTP socket is in use). Is there any way to 
>>>>>check if the clocks are really synchronized? My best attempt was using 
>>>>>three shell windows with commands already typed thus requiring only 
>>>>>clicking on the window and hitting enter. The results varied by 100-200 
>>>>>msec which I guess is just about the time I need to click and press enter 
>>>>>:)
>>>>>
>>>>>
>>>>>Thanks in advance,
>>>>>Alexander
>>>>>
>>>>>
>>>>>
>>>>>2013/10/7 Nikolay Mihaylov <n...@nmmm.nu>
>>>>>
>>>>>Hi
>>>>>>
>>>>>>
>>>>>>my two cents - before doing anything else, make sure clocks are 
>>>>>>synchronized to the millisecond.
>>>>>>ntp will do so.
>>>>>>
>>>>>>
>>>>>>Nick.
>>>>>>
>>>>>>
>>>>>>
>>>>>>On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shuty...@gmail.com> 
>>>>>>wrote:
>>>>>>
>>>>>>Hi all,
>>>>>>>
>>>>>>>
>>>>>>>We have encountered the following problem with cassandra.
>>>>>>>
>>>>>>>
>>>>>>>* We use cassandra v2.0.0 from Datastax community repo.
>>>>>>>
>>>>>>>
>>>>>>>* We have 3 nodes in a cluster, all of them are seed providers.
>>>>>>>
>>>>>>>
>>>>>>>* We have a single keyspace with replication factor = 3:
>>>>>>>
>>>>>>>
>>>>>>>CREATE KEYSPACE bof WITH replication = {
>>>>>>>  'class': 'SimpleStrategy',
>>>>>>>  'replication_factor': '3'
>>>>>>>};
>>>>>>>
>>>>>>>
>>>>>>>* We use Datastax Java CQL Driver v1.0.3 in our application.
>>>>>>>
>>>>>>>
>>>>>>>* We have not modified any consistency settings in our app, so I assume 
>>>>>>>we have the default QUORUM (2 out of 3 in our case) consistency for 
>>>>>>>reads and writes.
>>>>>>>
>>>>>>>
>>>>>>>* We have 400+ tables which can be divided in two groups (main and 
>>>>>>>uids). All tables in a group have the same definition, they vary only by 
>>>>>>>name. The sample definitions are:
>>>>>>>
>>>>>>>
>>>>>>>CREATE TABLE bookingfile (
>>>>>>>  key text,
>>>>>>>  entity_created timestamp,
>>>>>>>  entity_createdby text,
>>>>>>>  entity_entitytype text,
>>>>>>>  entity_modified timestamp,
>>>>>>>  entity_modifiedby text,
>>>>>>>  entity_status text,
>>>>>>>  entity_uid text,
>>>>>>>  entity_updatepolicy text,
>>>>>>>  version_created timestamp,
>>>>>>>  version_createdby text,
>>>>>>>  version_data blob,
>>>>>>>  version_dataformat text,
>>>>>>>  version_datasource text,
>>>>>>>  version_modified timestamp,
>>>>>>>  version_modifiedby text,
>>>>>>>  version_uid text,
>>>>>>>  version_versionnotes text,
>>>>>>>  version_versionnumber int,
>>>>>>>  versionscount int,
>>>>>>>  PRIMARY KEY (key)
>>>>>>>) WITH
>>>>>>>  bloom_filter_fp_chance=0.010000 AND
>>>>>>>  caching='KEYS_ONLY' AND
>>>>>>>  comment='' AND
>>>>>>>  dclocal_read_repair_chance=0.000000 AND
>>>>>>>  gc_grace_seconds=864000 AND
>>>>>>>  index_interval=128 AND
>>>>>>>  read_repair_chance=0.100000 AND
>>>>>>>  replicate_on_write='true' AND
>>>>>>>  populate_io_cache_on_flush='false' AND
>>>>>>>  default_time_to_live=0 AND
>>>>>>>  speculative_retry='NONE' AND
>>>>>>>  memtable_flush_period_in_ms=0 AND
>>>>>>>  compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>>>>>  compression={'sstable_compression': 'LZ4Compressor'};
>>>>>>>
>>>>>>>
>>>>>>>CREATE TABLE bookingfile_uids (
>>>>>>>  date text,
>>>>>>>  timeanduid text,
>>>>>>>  deleted boolean,
>>>>>>>  PRIMARY KEY (date, timeanduid)
>>>>>>>) WITH
>>>>>>>  bloom_filter_fp_chance=0.010000 AND
>>>>>>>  caching='KEYS_ONLY' AND
>>>>>>>  comment='' AND
>>>>>>>  dclocal_read_repair_chance=0.000000 AND
>>>>>>>  gc_grace_seconds=864000 AND
>>>>>>>  index_interval=128 AND
>>>>>>>  read_repair_chance=0.100000 AND
>>>>>>>  replicate_on_write='true' AND
>>>>>>>  populate_io_cache_on_flush='false' AND
>>>>>>>  default_time_to_live=0 AND
>>>>>>>  speculative_retry='NONE' AND
>>>>>>>  memtable_flush_period_in_ms=0 AND
>>>>>>>  compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>>>>>  compression={'sstable_compression': 'LZ4Compressor'};
>>>>>>>
>>>>>>>
>>>>>>>CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);
>>>>>>>
>>>>>>>
>>>>>>>* We don't have any problems with the tables from the main group.
>>>>>>>
>>>>>>>
>>>>>>>* As for the tables from the uids group we have noticed that sometimes 
>>>>>>>deletes from these tables do not do their job. They don't fail, they 
>>>>>>>just do nothing. We have confirmed this by adding a select query after 
>>>>>>>deletes. Most times everything is ok and select returns 0 records. But 
>>>>>>>sometimes (~5 out of 100,000) it returns the supposedly deleted row.
>>>>>>>
>>>>>>>
>>>>>>>* We have logged the ExecutionInfo objects with query tracing that are 
>>>>>>>returned by Datastax's driver. Here are the details
>>>>>>>
>>>>>>>
>>>>>>>DELETE FROM bookingfile_uids WHERE date=C20131006 AND 
>>>>>>>timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>>>>>>>
>>>>>>>
>>>>>>>ExecutionInfo: [
>>>>>>>triedHosts=/10.10.30.23;
>>>>>>>queriedHost=/10.10.30.23;
>>>>>>>achievedConsistencyLevel=null;
>>>>>>>queryTrace=
>>>>>>>Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun Oct 
>>>>>>>06 19:55:57 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at Sun 
>>>>>>>Oct 06 19:55:57 MSK 2013
>>>>>>>Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06 
>>>>>>>19:55:57 MSK 2013
>>>>>>>Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun Oct 
>>>>>>>06 19:55:57 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun 
>>>>>>>Oct 06 19:55:57 MSK 2013
>>>>>>>Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 
>>>>>>>19:55:57 MSK 2013
>>>>>>>Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Determining replicas for mutation on 
>>>>>>>/10.10.30.23[Native-Transport-Requests:1387368] at Sun Oct 06 19:55:57 
>>>>>>>MSK 2013
>>>>>>>Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at Sun 
>>>>>>>Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at 
>>>>>>>Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun Oct 
>>>>>>>06 19:55:57 MSK 2013
>>>>>>>Processing response from /10.10.30.20 on 
>>>>>>>/10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun Oct 
>>>>>>>06 19:55:57 MSK 2013
>>>>>>>Processing response from /10.10.30.19 on 
>>>>>>>/10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013;
>>>>>>>]
>>>>>>>
>>>>>>>
>>>>>>>SELECT * FROM bookingfile_uids WHERE date=C20131006 AND 
>>>>>>>timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 
>>>>>>>record
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>the same query 1 second later:
>>>>>>>
>>>>>>>
>>>>>>>DELETE FROM bookingfile_uids WHERE date=C20131006 AND 
>>>>>>>timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>>>>>>>
>>>>>>>
>>>>>>>ExecutionInfo: [
>>>>>>>triedHosts=/10.10.30.20;
>>>>>>>queriedHost=/10.10.30.20;
>>>>>>>achievedConsistencyLevel=null;
>>>>>>>queryTrace=
>>>>>>>Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun Oct 
>>>>>>>06 19:55:58 MSK 2013
>>>>>>>Determining replicas for mutation on 
>>>>>>>/10.10.30.20[Native-Transport-Requests:1387705] at Sun Oct 06 19:55:58 
>>>>>>>MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun 
>>>>>>>Oct 06 19:55:58 MSK 2013
>>>>>>>Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 
>>>>>>>19:55:58 MSK 2013
>>>>>>>Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at 
>>>>>>>Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at 
>>>>>>>Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at 
>>>>>>>Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun Oct 
>>>>>>>06 19:55:58 MSK 2013
>>>>>>>Processing response from /10.10.30.19 on 
>>>>>>>/10.10.30.20[RequestResponseStage:6] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun Oct 
>>>>>>>06 19:55:58 MSK 2013
>>>>>>>]
>>>>>>>
>>>>>>>
>>>>>>>SELECT * FROM bookingfile_uids WHERE date=C20131006 AND 
>>>>>>>timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0 
>>>>>>>records.
>>>>>>>
>>>>>>>
>>>>>>>* Cassandra's system.log on all 3 nodes lists nothing special during 
>>>>>>>these queries - just some compaction related INFO entries.
>>>>>>>
>>>>>>>
>>>>>>>Can anyone help with this? What is our next step?
>>>>>>>
>>>>>>>
>>>>>>>Thanks in advance,
>>>>>>>Alexander
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
>-- 
>
>-----------------
>Nate McCall
>Austin, TX
>@zznate
>
>Co-Founder & Sr. Technical Consultant
>Apache Cassandra Consulting
>http://www.thelastpickle.com

Re: DELETE does not delete :)

Reply via email to