Re: CASSANDRA-8527

Jon Haddad Fri, 05 Jan 2018 10:54:01 -0800

I think it’s reasonable to count the number of range tombstones towards the 
total tombstone count / threshold.


I agree the number of rows shadowed by the tombstones should be tracked 
separately, and we should probably think a bit more about how to add 
configuration / limits around this without burdening the user with a bunch of 
new flags they have to think about.  I would prefer to avoid any more 
configuration settings as complex as back_pressure_strategy going forward.

> On Jan 5, 2018, at 9:36 AM, Alexander Dejanovski <[email protected]> 
> wrote:
> 
> Hi Aleksey,
> 
> ok we'll split the work and only deal with row level tombstones in
> CASSANDRA-8527 <https://issues.apache.org/jira/browse/CASSANDRA-8527> and
> create a follow up ticket to work on the separate counts you're suggesting.
> My understanding of what you say is that you would not include range
> tombstones in the warn/fail threshold, but row level tombstones have
> somewhat an impact that is similar to cell tombstones. They will be
> retained in memory and will be sent to replicas.
> If we don't count them in the thresholds (at least for warnings), people
> will miss the fact that they may be reading a lot of tombstones.
> Are you ok with including those row tombstones as part of the thresholds ?
> This was the original intent for creating this ticket, which was a
> follow-up to CASSANDRA-8477
> <https://issues.apache.org/jira/browse/CASSANDRA-8477>.
> 
> For the follow up ticket, we may want to move the discussion in JIRA once
> we've create the ticket, but a merge listener seems like the right way to
> detect rows shadowed by range tombstones. That would force to change the
> UnfilteredRowIterator interface to include the tombstones/rows/cells stats
> as this is what is returned from the lower levels of the read path.
> Is there any easier way to achieve this that you can think of, as that
> interface is used in many parts of the code ?
> 
> On Wed, Dec 27, 2017 at 1:35 PM Aleksey Yeshchenko <[email protected]>
> wrote:
> 
>> As Jeff says, the number of actual tombstones is no less relevant than the
>> number of
>> cells and rows shadowed by them. And the way row and cell tombstones
>> affect a read
>> query can be very different from the way a big range tombstone might: you
>> potentially
>> have to retain every (regular) tombstone in memory and have replicas ship
>> them to
>> a coordinator, while you can discard everything shadowed by a big RT and
>> only serialize
>> a tiny bit of the data between the replicas and the coordinator.
>> 
>> So a mixed metric that just mixes up rows and cells shadowed by all three
>> kinds of tombstones
>> without any differentiation, while better than not having that visibility
>> at all, is worse than having
>> a detailed rundown. If possible, I’d love to see proper separate counts
>> tracked: range, row, and single-cell tombstones encountered, and # of rows
>> or cells obsoleted by each type.
>> 
>> I know that this is a non-trivial change, however, but hey, it doesn’t
>> have to be a trivial patch if it’s going into 4.0.
>> 
>> In the meantime I think it’d be helpful to report that single count. But I
>> don’t like the idea of redefining what
>> tombstone_warn_threshold and tombstone_failure_threshold mean, even in a
>> major release, as RTs are
>> qualitatively different from other tombstones, and have a much lower
>> impact per dead row.
>> 
>> —
>> AY
>> 
>> On 22 December 2017 at 03:53:47, kurt greaves ([email protected])
>> wrote:
>> 
>> I think that's a good idea for 4.x, but not so for current branches. I
>> think as long as we document it well for 4.0 upgrades it's not so much of a
>> problem. Obviously there will be cases of queries failing that were
>> previously succeeding but we can already use
>> tombstone_failure|warn_threshold to tune around that already. I don't think
>> we need another yaml setting to enable/disable counting deleted rows for
>> these thresholds, especially because it's going into 4.0. It *might* be a
>> good idea to bump the tombstone failure threshold default as a safety net
>> though (as well as put something in the NEWS.txt).
>> 
>> On 21 December 2017 at 20:11, Jon Haddad <[email protected]> wrote:
>> 
>>> I had suggested to Alex we kick this discussion over to the ML because
>> the
>>> change will have a significant impact on the behavior of Cassandra when
>>> doing reads with range tombstones that cover a lot of rows. The behavior
>>> now is a little weird, a single tombstone could shadow hundreds of
>>> thousands or even millions of rows, and the query would probably just
>> time
>>> out. Personally, I’m in favor of the change in behavior of this patch but
>>> I wanted to get some more feedback before committing to it. Are there any
>>> objections to what Alex described?
>>> 
>>> Regarding Mockito, I’ve been meaning to bring this up for a while, and
>> I’m
>>> a solid +1 on introducing it to help with testing. In an ideal world we’d
>>> have no singletons and could test everything in isolation, but
>>> realistically that’s a multi year process and we just aren’t there.
>>> 
>>> 
>>>> On Dec 19, 2017, at 11:07 PM, Alexander Dejanovski <
>>> [email protected]> wrote:
>>>> 
>>>> Hi folks,
>>>> 
>>>> I'm working on CASSANDRA-8527
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8527> and would need
>> to
>>>> discuss a few things.
>>>> 
>>>> The ticket makes it visible in tracing and metrics that rows shadowed
>> by
>>>> range tombstones were scanned during reads.
>>>> Currently, scanned range tombstones aren't reported anywhere which
>> hides
>>>> the cause of performance issues during reads when the users perform
>>> primary
>>>> key deletes.
>>>> As such, they do not count in the warn and failure thresholds.
>>>> 
>>>> While the number of live rows and tombstone cells is counted in the
>>>> ReadCommand class, it is currently not possible to count the number of
>>>> range tombstones there as they are merged with the rows they shadow
>>> before
>>>> reaching the class.
>>>> Instead, we can count the number of deleted rows that were read , which
>>>> already improves diagnosis and show that range tombstones were scanned
>> :
>>>> 
>>>> if (row.hasLiveData(ReadCommand.this.nowInSec(),
>> enforceStrictLiveness))
>>>> ++liveRows;
>>>> else if (!row.primaryKeyLivenessInfo().isLive(ReadCommand.this.
>>> nowInSec()))
>>>> {
>>>> // We want to detect primary key deletions only.
>>>> // If all cells have expired they will count as tombstones.
>>>> ++deletedRows;
>>>> }
>>>> 
>>>> Deleted rows would be part of the warning threshold so that we can spot
>>> the
>>>> range tombstone scans in the logs and tracing would look like this :
>>>> 
>>>> WARN [ReadStage-2] 2017-12-18 18:22:31,352 ReadCommand.java:491 -
>>>> Read 2086 live rows, 2440 deleted rows and 0 tombstone cells for
>>>> query..
>>>> 
>>>> 
>>>> Are there any caveats to that approach ?
>>>> Should we include the number of deleted rows in the failure threshold
>> or
>>>> make it optional, knowing that it could make some queries fail while
>> they
>>>> were passing before ?
>>>> 
>>>> On a side note, is it ok to bring in Mockito in order to make it easier
>>>> writing tests ? I would like to use a Spy in order to write some tests
>>>> without starting the whole stack.
>>>> 
>>>> Thanks,
>>>> 
>>>> 
>>>> --
>>>> -----------------
>>>> Alexander Dejanovski
>>>> France
>>>> @alexanderdeja
>>>> 
>>>> Consultant
>>>> Apache Cassandra Consulting
>>>> http://www.thelastpickle.com
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>> 
>>> 
>> 
> 
> 
> -- 
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
> 
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: CASSANDRA-8527

Reply via email to