Hi Dan,

Thanks for the reply. We're on 2.0.13. In fact, I already solved this exactly the way you described: by changing the compaction strategy to STCS via JMX and letting compactions collect the tombstones.
Roman

On Fri, Jul 10, 2015 at 8:57 PM, Dan Kinder <dkin...@turnitin.com> wrote:

> On Sun, Jul 5, 2015 at 1:40 PM, Roman Tkachenko <ro...@mailgunhq.com>
> wrote:
>
>> Hey guys,
>>
>> I have a table with RF=3 and LCS. Data model makes use of "wide rows". A
>> certain query run against this table times out and tracing reveals the
>> following error on two out of three nodes:
>>
>> *Scanned over 100000 tombstones; query aborted (see
>> tombstone_failure_threshold)*
>>
>> This basically means every request with CL higher than "one" fails.
>>
>> I have two questions:
>>
>> * How could it happen that only two out of three nodes have overwhelming
>> tombstones? For the third node tracing shows sensible *"Read 815 live
>> and 837 tombstoned cells"* traces.
>>
>
> One theory: before 2.1.6 compactions on wide rows with lots of tombstones
> could take forever or potentially never finish. What version of Cassandra
> are you on? It may be that you got lucky with one node that has been able
> to keep up but the others haven't been able to.
>
>>
>> * Anything I can do to fix those two nodes? I have already set gc_grace
>> to 1 day and tried to make compaction strategy more aggressive
>> (unchecked_tombstone_compaction - true, tombstone_threshold - 0.01) to no
>> avail - a couple of days have already passed and it still gives the same
>> error.
>>
>
> You probably want major compaction which is coming soon for LCS
> (https://issues.apache.org/jira/browse/CASSANDRA-7272) but not here yet.
>
> The alternative is, if you have enough time and headroom (this is going to
> do some pretty serious compaction so be careful), alter your table to STCS,
> let it compact into one SSTable, then convert back to LCS. It's pretty
> heavy-handed but as long as your gc_grace is low enough it'll do the job.
> Definitely do NOT do this if you have many tombstones in single wide rows
> and are not >2.1.6
>
>>
>> Thanks!
>>
>> Roman
>>
>
> --
> Dan Kinder
> Senior Software Engineer
> Turnitin – www.turnitin.com
> dkin...@turnitin.com
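
For anyone finding this thread later, the STCS round trip Dan describes can be written out as a sequence of CQL statements. The following is only a rough sketch, not something taken from the thread: the keyspace and table names (`my_keyspace`, `wide_rows`) and the helper functions are hypothetical, it goes through ALTER TABLE rather than the JMX route mentioned in the reply, and the wait between the two strategy changes is a manual step. Dan's caveat about many tombstones in single wide rows on versions before 2.1.6 still applies.

```python
# Sketch: build the CQL statements for the LCS -> STCS -> LCS round trip.
# All names here (helpers, keyspace, table) are hypothetical placeholders.

def alter_compaction(keyspace, table, strategy, options=None):
    """Return an ALTER TABLE statement that sets the compaction strategy.

    `options` is an optional dict of compaction subproperties given as
    strings, e.g. {"tombstone_threshold": "0.01"}.
    """
    params = {"class": strategy}
    params.update(options or {})
    body = ", ".join("'%s': '%s'" % (k, v) for k, v in params.items())
    return "ALTER TABLE %s.%s WITH compaction = {%s};" % (keyspace, table, body)


def alter_gc_grace(keyspace, table, seconds):
    """Return an ALTER TABLE statement that sets gc_grace_seconds."""
    return "ALTER TABLE %s.%s WITH gc_grace_seconds = %d;" % (keyspace, table, seconds)


def stcs_round_trip(keyspace, table, gc_grace_seconds=86400):
    """Return the statements for the round trip, in the order to run them."""
    return [
        # 1 day, matching the gc_grace value mentioned in the thread.
        alter_gc_grace(keyspace, table, gc_grace_seconds),
        alter_compaction(keyspace, table, "SizeTieredCompactionStrategy"),
        # Manual step, emitted as a CQL comment rather than a statement.
        "-- wait for compactions to finish (e.g. watch nodetool compactionstats)",
        alter_compaction(keyspace, table, "LeveledCompactionStrategy"),
    ]


if __name__ == "__main__":
    for statement in stcs_round_trip("my_keyspace", "wide_rows"):
        print(statement)
```

The statements are only printed, not executed, so they can be reviewed and run by hand in cqlsh once there is enough disk and I/O headroom for the compaction.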