Re: Docs: "Why do deleted keys show up during range scans?"

aaron morton Tue, 14 Jun 2011 15:38:49 -0700

> While you can delete a row, if I understand correctly, what happens is a
> tombstone is created which matches every column, so in effect it is
> deleting the columns, not the whole row.

A tombstone is created at the level of the delete, rather than for every 
column. Otherwise imagine deleting a row with 1 million columns.

Tombstones are created at the Column, Super Column and Row level. Deleting at 
the row level writes a row level tombstone. All these different tombstones are 
resolved during the read process. 
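
As a rough illustration of that resolution step, here is a toy model in Python. 
It is a sketch only and not Cassandra's actual code: the row layout, the 
"deleted_at" field and the live_columns helper are made up for the example, 
super columns are left out, and the rule that a row tombstone wins a timestamp 
tie is an assumption of the sketch.

```python
# Toy model, not Cassandra's real data structures.
# A row has an optional row-level tombstone timestamp ("deleted_at") and a
# map of column name -> (value, timestamp). A value of None stands for a
# column-level tombstone.

def live_columns(row):
    """Return the columns that survive tombstone resolution on read."""
    row_deleted_at = row.get("deleted_at")  # row-level tombstone, or None
    result = {}
    for name, (value, ts) in row["columns"].items():
        if value is None:
            continue  # column-level tombstone: this column was deleted
        if row_deleted_at is not None and ts <= row_deleted_at:
            continue  # written before the row delete, so it is shadowed
        result[name] = value
    return result


row = {
    "deleted_at": 100,          # one tombstone for the whole row
    "columns": {
        "a": ("old", 50),       # older than the row tombstone
        "b": ("new", 150),      # written after the row delete
        "c": (None, 160),       # deleted on its own
    },
}
print(live_columns(row))
```

Note the row delete is a single marker no matter how many columns the row 
has; the per column work only happens when the row is read.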

My understanding of "So to special case leaving out result entries for 
deletions, we would have to check the entire rest of the row to make sure there 
is no undeleted data anywhere else either (in which case leaving the key out 
would be an error)." is...

Resolving the predicate to determine if a row contains the specified columns is 
a (somewhat) bounded operation. Determining if a row has ANY non-deleted columns 
is a potentially unbounded operation that could involve lots-o-io. Imagine a 
row with 1 million columns, where the first 100,000 have been deleted. 

For each row in the result set you can say one of:

1) It has 1 or more of the columns I requested.
2) It has none of the columns I requested. 
3) It has no columns, but Cassandra decided it was too much work to 
conclusively prove that. After all, I asked if it had some specific 
columns, not if it had any columns.
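
To make the three cases concrete, here is a small sketch in Python. It is a toy 
model under my own assumptions, not how Cassandra implements range scans: 
range_scan and the dict layout are hypothetical, and None stands in for a 
tombstoned column.

```python
# Toy model: rows maps row key -> {column name: value}, where a value of
# None represents a deleted (tombstoned) column.

def range_scan(rows, wanted):
    """Return (key, columns) for every key, checking only the wanted columns."""
    results = []
    for key, columns in rows.items():
        # Bounded work: only the requested columns are examined. The scan
        # never walks the rest of the row to prove that it is empty.
        found = {n: columns[n] for n in wanted if columns.get(n) is not None}
        results.append((key, found))
    return results


rows = {
    "k1": {"a": 1, "b": 2},        # case 1: has the requested column
    "k2": {"b": 2},                # case 2: live row, but no column "a"
    "k3": {"a": None, "b": None},  # case 3: every column deleted
}
for key, columns in range_scan(rows, ["a"]):
    print(key, columns)
```

In this sketch k2 and k3 both come back with no columns, so the caller cannot 
tell a deleted row from a live row that just lacks the requested columns.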

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 15 Jun 2011, at 04:25, Jeremiah Jordan wrote:

> Also, tombstones are not "attached" anywhere.  A tombstone is just a
> column with a special value which says "I was deleted".  And I am pretty
> sure they go into SSTables etc. the exact same way regular columns do.
> 
> -----Original Message-----
> From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com] 
> Sent: Tuesday, June 14, 2011 11:22 AM
> To: user@cassandra.apache.org
> Subject: RE: Docs: "Why do deleted keys show up during range scans?"
> 
> I am pretty sure how Cassandra works will make sense to you if you think
> of it that way, that rows do not get deleted, columns get deleted.
> While you can delete a row, if I understand correctly, what happens is a
> tombstone is created which matches every column, so in effect it is
> deleting the columns, not the whole row.  A row key will not be
> forgotten/deleted until there are no columns or tombstones which
> reference it.  Until there are no references to that row key in any
> SSTables you can still get that key back from the API.
> 
> -Jeremiah
> 
> -----Original Message-----
> From: AJ [mailto:a...@dude.podzone.net]
> Sent: Monday, June 13, 2011 12:11 PM
> To: user@cassandra.apache.org
> Subject: Re: Docs: "Why do deleted keys show up during range scans?"
> 
> On 6/13/2011 10:14 AM, Stephen Connolly wrote:
>> 
>> store the query inverted.
>> 
>> that way empty ->  deleted
>> 
> I don't know what that means... get the other columns?  Can you
> elaborate?  Is there docs for this or is this a hack/workaround?
> 
>> the tombstones are stored for each column that had data IIRC... but at
>> this point my grok of C* is lacking
> I suspected this, but wasn't sure.  It sounds like when a row is
> deleted, a tombstone is not "attached" to the row, but to each column???
> So, if all columns are deleted then the row is considered deleted?
> Hmmm, that doesn't sound right, but that doesn't mean it isn't ! ;o)
