Counting the number of range tombstones - sure, that is reasonable. Counting cells/rows shadowed by range tombstones toward the limit, as if they were regular tombstones - arguably not so much.
— AY On 5 January 2018 at 18:53:45, Jon Haddad (j...@jonhaddad.com) wrote: I think it’s reasonable to count the number of range tombstones towards the total tombstone count / threshold. I agree the number of rows shadowed by the tombstones should be tracked separately, and we should probably think a bit more about how to add configuration / limits around this without burdening the user with a bunch of new flags they have to think about. I would prefer to avoid any more configuration settings as complex as back_pressure_strategy going forward. > On Jan 5, 2018, at 9:36 AM, Alexander Dejanovski <a...@thelastpickle.com> > wrote: > > Hi Aleksey, > > ok we'll split the work and only deal with row level tombstones in > CASSANDRA-8527 <https://issues.apache.org/jira/browse/CASSANDRA-8527> and > create a follow up ticket to work on the separate counts you're suggesting. > My understanding of what you say is that you would not include range > tombstones in the warn/fail threshold, but row level tombstones have > somewhat an impact that is similar to cell tombstones. They will be > retained in memory and will be sent to replicas. > If we don't count them in the thresholds (at least for warnings), people > will miss the fact that they may be reading a lot of tombstones. > Are you ok with including those row tombstones as part of the thresholds ? > This was the original intent for creating this ticket, which was a > follow-up to CASSANDRA-8477 > <https://issues.apache.org/jira/browse/CASSANDRA-8477>. > > For the follow up ticket, we may want to move the discussion in JIRA once > we've create the ticket, but a merge listener seems like the right way to > detect rows shadowed by range tombstones. That would force to change the > UnfilteredRowIterator interface to include the tombstones/rows/cells stats > as this is what is returned from the lower levels of the read path. > Is there any easier way to achieve this that you can think of, as that > interface is used in many parts of the code ? > > On Wed, Dec 27, 2017 at 1:35 PM Aleksey Yeshchenko <alek...@apple.com> > wrote: > >> As Jeff says, the number of actual tombstones is no less relevant than the >> number of >> cells and rows shadowed by them. And the way row and cell tombstones >> affect a read >> query can be very different from the way a big range tombstone might: you >> potentially >> have to retain every (regular) tombstone in memory and have replicas ship >> them to >> a coordinator, while you can discard everything shadowed by a big RT and >> only serialize >> a tiny bit of the data between the replicas and the coordinator. >> >> So a mixed metric that just mixes up rows and cells shadowed by all three >> kinds of tombstones >> without any differentiation, while better than not having that visibility >> at all, is worse than having >> a detailed rundown. If possible, I’d love to see proper separate counts >> tracked: range, row, and single-cell tombstones encountered, and # of rows >> or cells obsoleted by each type. >> >> I know that this is a non-trivial change, however, but hey, it doesn’t >> have to be a trivial patch if it’s going into 4.0. >> >> In the meantime I think it’d be helpful to report that single count. But I >> don’t like the idea of redefining what >> tombstone_warn_threshold and tombstone_failure_threshold mean, even in a >> major release, as RTs are >> qualitatively different from other tombstones, and have a much lower >> impact per dead row. >> >> — >> AY >> >> On 22 December 2017 at 03:53:47, kurt greaves (k...@instaclustr.com) >> wrote: >> >> I think that's a good idea for 4.x, but not so for current branches. I >> think as long as we document it well for 4.0 upgrades it's not so much of a >> problem. Obviously there will be cases of queries failing that were >> previously succeeding but we can already use >> tombstone_failure|warn_threshold to tune around that already. I don't think >> we need another yaml setting to enable/disable counting deleted rows for >> these thresholds, especially because it's going into 4.0. It *might* be a >> good idea to bump the tombstone failure threshold default as a safety net >> though (as well as put something in the NEWS.txt). >> >> On 21 December 2017 at 20:11, Jon Haddad <j...@jonhaddad.com> wrote: >> >>> I had suggested to Alex we kick this discussion over to the ML because >> the >>> change will have a significant impact on the behavior of Cassandra when >>> doing reads with range tombstones that cover a lot of rows. The behavior >>> now is a little weird, a single tombstone could shadow hundreds of >>> thousands or even millions of rows, and the query would probably just >> time >>> out. Personally, I’m in favor of the change in behavior of this patch but >>> I wanted to get some more feedback before committing to it. Are there any >>> objections to what Alex described? >>> >>> Regarding Mockito, I’ve been meaning to bring this up for a while, and >> I’m >>> a solid +1 on introducing it to help with testing. In an ideal world we’d >>> have no singletons and could test everything in isolation, but >>> realistically that’s a multi year process and we just aren’t there. >>> >>> >>>> On Dec 19, 2017, at 11:07 PM, Alexander Dejanovski < >>> a...@thelastpickle.com> wrote: >>>> >>>> Hi folks, >>>> >>>> I'm working on CASSANDRA-8527 >>>> <https://issues.apache.org/jira/browse/CASSANDRA-8527> and would need >> to >>>> discuss a few things. >>>> >>>> The ticket makes it visible in tracing and metrics that rows shadowed >> by >>>> range tombstones were scanned during reads. >>>> Currently, scanned range tombstones aren't reported anywhere which >> hides >>>> the cause of performance issues during reads when the users perform >>> primary >>>> key deletes. >>>> As such, they do not count in the warn and failure thresholds. >>>> >>>> While the number of live rows and tombstone cells is counted in the >>>> ReadCommand class, it is currently not possible to count the number of >>>> range tombstones there as they are merged with the rows they shadow >>> before >>>> reaching the class. >>>> Instead, we can count the number of deleted rows that were read , which >>>> already improves diagnosis and show that range tombstones were scanned >> : >>>> >>>> if (row.hasLiveData(ReadCommand.this.nowInSec(), >> enforceStrictLiveness)) >>>> ++liveRows; >>>> else if (!row.primaryKeyLivenessInfo().isLive(ReadCommand.this. >>> nowInSec())) >>>> { >>>> // We want to detect primary key deletions only. >>>> // If all cells have expired they will count as tombstones. >>>> ++deletedRows; >>>> } >>>> >>>> Deleted rows would be part of the warning threshold so that we can spot >>> the >>>> range tombstone scans in the logs and tracing would look like this : >>>> >>>> WARN [ReadStage-2] 2017-12-18 18:22:31,352 ReadCommand.java:491 - >>>> Read 2086 live rows, 2440 deleted rows and 0 tombstone cells for >>>> query.. >>>> >>>> >>>> Are there any caveats to that approach ? >>>> Should we include the number of deleted rows in the failure threshold >> or >>>> make it optional, knowing that it could make some queries fail while >> they >>>> were passing before ? >>>> >>>> On a side note, is it ok to bring in Mockito in order to make it easier >>>> writing tests ? I would like to use a Spy in order to write some tests >>>> without starting the whole stack. >>>> >>>> Thanks, >>>> >>>> >>>> -- >>>> ----------------- >>>> Alexander Dejanovski >>>> France >>>> @alexanderdeja >>>> >>>> Consultant >>>> Apache Cassandra Consulting >>>> http://www.thelastpickle.com >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org >>> For additional commands, e-mail: dev-h...@cassandra.apache.org >>> >>> >> > > > -- > ----------------- > Alexander Dejanovski > France > @alexanderdeja > > Consultant > Apache Cassandra Consulting > http://www.thelastpickle.com --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For additional commands, e-mail: dev-h...@cassandra.apache.org