Hi Aleksey, ok we'll split the work and only deal with row level tombstones in CASSANDRA-8527 <https://issues.apache.org/jira/browse/CASSANDRA-8527> and create a follow up ticket to work on the separate counts you're suggesting. My understanding of what you say is that you would not include range tombstones in the warn/fail threshold, but row level tombstones have somewhat an impact that is similar to cell tombstones. They will be retained in memory and will be sent to replicas. If we don't count them in the thresholds (at least for warnings), people will miss the fact that they may be reading a lot of tombstones. Are you ok with including those row tombstones as part of the thresholds ? This was the original intent for creating this ticket, which was a follow-up to CASSANDRA-8477 <https://issues.apache.org/jira/browse/CASSANDRA-8477>.
For the follow up ticket, we may want to move the discussion in JIRA once we've create the ticket, but a merge listener seems like the right way to detect rows shadowed by range tombstones. That would force to change the UnfilteredRowIterator interface to include the tombstones/rows/cells stats as this is what is returned from the lower levels of the read path. Is there any easier way to achieve this that you can think of, as that interface is used in many parts of the code ? On Wed, Dec 27, 2017 at 1:35 PM Aleksey Yeshchenko <alek...@apple.com> wrote: > As Jeff says, the number of actual tombstones is no less relevant than the > number of > cells and rows shadowed by them. And the way row and cell tombstones > affect a read > query can be very different from the way a big range tombstone might: you > potentially > have to retain every (regular) tombstone in memory and have replicas ship > them to > a coordinator, while you can discard everything shadowed by a big RT and > only serialize > a tiny bit of the data between the replicas and the coordinator. > > So a mixed metric that just mixes up rows and cells shadowed by all three > kinds of tombstones > without any differentiation, while better than not having that visibility > at all, is worse than having > a detailed rundown. If possible, I’d love to see proper separate counts > tracked: range, row, and single-cell tombstones encountered, and # of rows > or cells obsoleted by each type. > > I know that this is a non-trivial change, however, but hey, it doesn’t > have to be a trivial patch if it’s going into 4.0. > > In the meantime I think it’d be helpful to report that single count. But I > don’t like the idea of redefining what > tombstone_warn_threshold and tombstone_failure_threshold mean, even in a > major release, as RTs are > qualitatively different from other tombstones, and have a much lower > impact per dead row. > > — > AY > > On 22 December 2017 at 03:53:47, kurt greaves (k...@instaclustr.com) > wrote: > > I think that's a good idea for 4.x, but not so for current branches. I > think as long as we document it well for 4.0 upgrades it's not so much of a > problem. Obviously there will be cases of queries failing that were > previously succeeding but we can already use > tombstone_failure|warn_threshold to tune around that already. I don't think > we need another yaml setting to enable/disable counting deleted rows for > these thresholds, especially because it's going into 4.0. It *might* be a > good idea to bump the tombstone failure threshold default as a safety net > though (as well as put something in the NEWS.txt). > > On 21 December 2017 at 20:11, Jon Haddad <j...@jonhaddad.com> wrote: > > > I had suggested to Alex we kick this discussion over to the ML because > the > > change will have a significant impact on the behavior of Cassandra when > > doing reads with range tombstones that cover a lot of rows. The behavior > > now is a little weird, a single tombstone could shadow hundreds of > > thousands or even millions of rows, and the query would probably just > time > > out. Personally, I’m in favor of the change in behavior of this patch but > > I wanted to get some more feedback before committing to it. Are there any > > objections to what Alex described? > > > > Regarding Mockito, I’ve been meaning to bring this up for a while, and > I’m > > a solid +1 on introducing it to help with testing. In an ideal world we’d > > have no singletons and could test everything in isolation, but > > realistically that’s a multi year process and we just aren’t there. > > > > > > > On Dec 19, 2017, at 11:07 PM, Alexander Dejanovski < > > a...@thelastpickle.com> wrote: > > > > > > Hi folks, > > > > > > I'm working on CASSANDRA-8527 > > > <https://issues.apache.org/jira/browse/CASSANDRA-8527> and would need > to > > > discuss a few things. > > > > > > The ticket makes it visible in tracing and metrics that rows shadowed > by > > > range tombstones were scanned during reads. > > > Currently, scanned range tombstones aren't reported anywhere which > hides > > > the cause of performance issues during reads when the users perform > > primary > > > key deletes. > > > As such, they do not count in the warn and failure thresholds. > > > > > > While the number of live rows and tombstone cells is counted in the > > > ReadCommand class, it is currently not possible to count the number of > > > range tombstones there as they are merged with the rows they shadow > > before > > > reaching the class. > > > Instead, we can count the number of deleted rows that were read , which > > > already improves diagnosis and show that range tombstones were scanned > : > > > > > > if (row.hasLiveData(ReadCommand.this.nowInSec(), > enforceStrictLiveness)) > > > ++liveRows; > > > else if (!row.primaryKeyLivenessInfo().isLive(ReadCommand.this. > > nowInSec())) > > > { > > > // We want to detect primary key deletions only. > > > // If all cells have expired they will count as tombstones. > > > ++deletedRows; > > > } > > > > > > Deleted rows would be part of the warning threshold so that we can spot > > the > > > range tombstone scans in the logs and tracing would look like this : > > > > > > WARN [ReadStage-2] 2017-12-18 18:22:31,352 ReadCommand.java:491 - > > > Read 2086 live rows, 2440 deleted rows and 0 tombstone cells for > > > query.. > > > > > > > > > Are there any caveats to that approach ? > > > Should we include the number of deleted rows in the failure threshold > or > > > make it optional, knowing that it could make some queries fail while > they > > > were passing before ? > > > > > > On a side note, is it ok to bring in Mockito in order to make it easier > > > writing tests ? I would like to use a Spy in order to write some tests > > > without starting the whole stack. > > > > > > Thanks, > > > > > > > > > -- > > > ----------------- > > > Alexander Dejanovski > > > France > > > @alexanderdeja > > > > > > Consultant > > > Apache Cassandra Consulting > > > http://www.thelastpickle.com > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > > For additional commands, e-mail: dev-h...@cassandra.apache.org > > > > > -- ----------------- Alexander Dejanovski France @alexanderdeja Consultant Apache Cassandra Consulting http://www.thelastpickle.com