It sounds like either there is a fairly obvious bug, or you're doing something wrong. :)
Can you reproduce against a single node?

On Tue, Apr 27, 2010 at 5:14 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
> Update: I ran a test whereby I deleted ALL the rows in a column
> family, using a consistency level of ALL. To do this, I mapped the
> ColumnFamily and called remove on each row id. There were 1.5 million
> rows, so 1.5 million rows were deleted.
>
> I ran a counter job immediately after. This job maps the same column
> family and tests if any data is returned. If not, it considers the
> row a "tombstone". If yes, it considers the row not deleted. Below
> are the hadoop counters for those jobs. Note the fluctuation in the
> number of rows with data over time, and the increase in time to map
> the column family after the destroy job. No other clients were
> accessing cassandra during this time.
>
> I'm thoroughly confused.
>
> Count: started 13:02:30 EDT, finished 13:11:33 EDT (9 minutes 2 seconds):
> ROWS: 1,542,479
> TOMBSTONES: 69
>
> Destroy: started 16:48:45 EDT, finished 17:07:36 EDT (18 minutes 50 seconds)
> DESTROYED: 1,542,548
>
> Count: started 17:15:42 EDT, finished 17:31:03 EDT (15 minutes 21 seconds)
> ROWS 876,464
> TOMBSTONES 666,084
>
> Count: started 17:31:32, finished 17:47:16 (15mins, 44 seconds)
> ROWS 1,451,665
> TOMBSTONES 90,883
>
> Count: started 17:52:34, finished 18:10:28 (17mins, 53 seconds)
> ROWS 1,425,644
> TOMBSTONES 116,904
>
> On Tue, Apr 27, 2010 at 5:37 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
>> Clocks are in sync:
>>
>> cluster04:~/cassandra$ dsh -g development "date"
>> Tue Apr 27 17:36:33 EDT 2010
>> Tue Apr 27 17:36:33 EDT 2010
>> Tue Apr 27 17:36:33 EDT 2010
>> Tue Apr 27 17:36:33 EDT 2010
>> Tue Apr 27 17:36:34 EDT 2010
>> Tue Apr 27 17:36:34 EDT 2010
>> Tue Apr 27 17:36:34 EDT 2010
>> Tue Apr 27 17:36:34 EDT 2010
>> Tue Apr 27 17:36:34 EDT 2010
>> Tue Apr 27 17:36:35 EDT 2010
>> Tue Apr 27 17:36:35 EDT 2010
>> Tue Apr 27 17:36:35 EDT 2010
>>
>> On Tue, Apr 27, 2010 at 5:35 PM, Nathan McCall <n...@vervewireless.com> wrote:
>>> Have you confirmed that your clocks are all synced in the cluster?
>>> This may be the result of an unintentional read-repair occurring if
>>> that were the case.
>>>
>>> -Nate
>>>
>>> On Tue, Apr 27, 2010 at 2:20 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
>>>> Hmm... Even after deleting with cl.ALL, I'm getting data back for some
>>>> rows after having deleted them. Which rows return data is
>>>> inconsistent from one run of the job to the next.
>>>>
>>>> On Tue, Apr 27, 2010 at 1:44 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
>>>>> To check that rows are gone, I check that KeySlice.columns is empty.
>>>>> And as I mentioned, immediately after the delete job, this returns
>>>>> the expected number.
>>>>> Unfortunately I reproduced with QUORUM this morning. No node outages.
>>>>> I am going to try ALL to see if that changes anything, but I am
>>>>> starting to wonder if I'm doing something else wrong.
>>>>> On Mon, Apr 26, 2010 at 9:45 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>>>>
>>>>>> How are you checking that the rows are gone?
>>>>>>
>>>>>> Are you experiencing node outages during this?
>>>>>>
>>>>>> DC_QUORUM is unfinished code right now, you should avoid using it.
>>>>>> Can you reproduce with normal QUORUM?
>>>>>>
>>>>>> On Sat, Apr 24, 2010 at 12:23 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
>>>>>> > I'm having trouble deleting rows in Cassandra. After running a job
>>>>>> > that deletes hundreds of rows, I run another job that verifies that
>>>>>> > the rows are gone. Both jobs run correctly. However, when I run the
>>>>>> > verification job an hour later, the rows have re-appeared. This is
>>>>>> > not a case of "ghosting" because the verification job actually
>>>>>> > checks that there is data in the columns.
>>>>>> >
>>>>>> > I am running a cluster with 12 nodes and a replication factor of 3.
>>>>>> > I am using DC_QUORUM consistency when deleting.
>>>>>> >
>>>>>> > Any ideas?
>>>>>> > Joost.
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jonathan Ellis
>>>>>> Project Chair, Apache Cassandra
>>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>>> http://riptano.com
>>>>>
>>>>>
>>>>
>>>
>>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com