Re: Inclusive/exclusive endpoints when compacting token ranges

Benedict Elliott Smith Tue, 26 Jul 2022 06:28:00 -0700


I think a change like this could be dangerous for a lot of existing automation 
built atop nodetool.


I’m not sure this change is worthwhile. I think it would be better to introduce 
e.g. -ste and -ete for “start token exclusive” and “end token exclusive” so 
that users can opt-in to whichever scheme they prefer for their tooling, 
without breaking existing users.

> On 26 Jul 2022, at 14:22, Brandon Williams <[email protected]> wrote:
> 
> +1, I think that makes the most sense.
> 
> Kind Regards,
> Brandon
> 
> On Tue, Jul 26, 2022 at 8:19 AM J. D. Jordan <[email protected]> 
> wrote:
>> 
>> I like the third option, especially if it makes it consistent with repair, 
>> which has supported ranges longer and I would guess most people would think 
>> the compact ranges work the same as the repair ranges.
>> 
>> -Jeremiah Jordan
>> 
>>> On Jul 26, 2022, at 6:49 AM, Andrés de la Peña <[email protected]> wrote:
>>> 
>>> 
>>> Hi all,
>>> 
>>> CASSANDRA-17575 has detected that token ranges in nodetool compact are 
>>> interpreted as closed on both sides. For example, the command "nodetool 
>>> compact -st 10 -et 50" will compact the tokens in [10, 50]. This way of 
>>> interpreting token ranges is unusual since token ranges are usually 
>>> half-open, and I think that in the previous example one would expect that 
>>> the compacted tokens would be in (10, 50]. That's for example the way 
>>> nodetool repair works, and indeed the class org.apache.cassandra.dht.Range 
>>> is always half-open.
>>> 
>>> It's worth mentioning that, differently from nodetool repair, the help and 
>>> doc for nodetool compact doesn't specify whether the supplied start/end 
>>> tokens are inclusive or exclusive.
>>> 
>>> I think that ideally nodetool compact should interpret the provided token 
>>> ranges as half-open, to be consistent with how token ranges are usually 
>>> interpreted. However, this would change the way the tool has worked until 
>>> now. This change might be problematic for existing users relying on the old 
>>> behaviour. That would be especially severe for the case where the begin and 
>>> end token are the same, because interpreting [x, x] we would compact a 
>>> single token, whereas I think that interpreting (x, x] would compact all 
>>> the tokens. As for compacting ranges including multiple tokens, I think the 
>>> change wouldn't be so bad, since probably the supplied token ranges come 
>>> from tools that are already presenting the ranges as half-open. Also, if we 
>>> are splitting the full ring into smaller ranges, half-open intervals would 
>>> still work and would save us some repetitions.
>>> 
>>> So my question is: Should we change the behaviour of nodetool compact to 
>>> interpret the token ranges as half-opened, aligning it with the usual 
>>> interpretation of ranges? Or should we just document the current odd 
>>> behaviour to prevent compatibility issues?
>>> 
>>> A third option would be changing to half-opened ranges and also forbidding 
>>> ranges where the begin and end token are the same, to prevent the 
>>> accidental compaction of the entire ring. Note that nodetool repair also 
>>> forbids this type of token ranges.
>>> 
>>> What do you think?

Re: Inclusive/exclusive endpoints when compacting token ranges

Reply via email to