On Sun, Nov 13, 2011 at 5:57 PM, Maxim Potekhin <potek...@bnl.gov> wrote:
> I've done more experimentation and the behavior persists: I start with a
> normal dataset which is searchable by a secondary index. I select by that
> index the entries that match a certain criterion, then delete those. I tried
> two methods of deletion -- individual cf.remove() calls as well as batch
> removal in Pycassa.
> What happens after that is as follows: attempts to read the same CF using
> the same index values start to time out in the Pycassa client (there is a
> Thrift message about the timeout). The entries not touched by the attempted
> deletion are still read just fine.
>
> Has anyone seen such behavior?
What you're probably running into is a huge amount of tombstone filtering on
the read (see http://wiki.apache.org/cassandra/DistributedDeletes). Since
you're dealing with time-series data, a row-bucketing technique like
http://rubyscale.com/2011/basic-time-series-with-cassandra/ might help by
eliminating the need for a secondary index entirely.

-Brandon
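For illustration, the row-bucketing idea can be sketched as follows: instead of indexing events, compose the row key from an entity id plus the start of a fixed-width time bucket, so a read addresses the relevant rows directly and never scans tombstones. This is only a sketch; the key format, the `bucketed_row_key` helper, and the one-day bucket width are assumptions, not anything from the thread or from Pycassa itself.

```python
from datetime import datetime, timezone

def bucketed_row_key(entity_id, ts, bucket_seconds=86400):
    """Build a row key of the (hypothetical) form '<entity>:<bucket start epoch>'.

    All events whose timestamp falls inside the same bucket share one row;
    the column name would be the full event timestamp, giving an ordered
    time slice per row. Deleting old data then means dropping whole
    bucket rows rather than index entries.
    """
    ts = int(ts)
    bucket_start = ts - (ts % bucket_seconds)  # round down to bucket boundary
    return "%s:%d" % (entity_id, bucket_start)

# Example: two events on the same UTC day map to the same row key,
# so a day's worth of data is read with a single row fetch.
k1 = bucketed_row_key("sensor42", 1321221420)
k2 = bucketed_row_key("sensor42", 1321221480)
```

With a scheme like this, a query for "entity X between t1 and t2" becomes a multiget over the handful of bucket keys covering [t1, t2], which avoids both the secondary index and the tombstone-heavy index scan described above.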