I think a change like this could be dangerous for a lot of existing automation built atop nodetool.
I’m not sure this change is worthwhile. I think it would be better to introduce e.g. -ste and -ete for “start token exclusive” and “end token exclusive” so that users can opt-in to whichever scheme they prefer for their tooling, without breaking existing users.

> On 26 Jul 2022, at 14:22, Brandon Williams <dri...@gmail.com> wrote:
>
> +1, I think that makes the most sense.
>
> Kind Regards,
> Brandon
>
> On Tue, Jul 26, 2022 at 8:19 AM J. D. Jordan <jeremiah.jor...@gmail.com>
> wrote:
>>
>> I like the third option, especially if it makes it consistent with repair,
>> which has supported ranges longer and I would guess most people would think
>> the compact ranges work the same as the repair ranges.
>>
>> -Jeremiah Jordan
>>
>>> On Jul 26, 2022, at 6:49 AM, Andrés de la Peña <adelap...@apache.org> wrote:
>>>
>>>
>>> Hi all,
>>>
>>> CASSANDRA-17575 has detected that token ranges in nodetool compact are
>>> interpreted as closed on both sides. For example, the command "nodetool
>>> compact -st 10 -et 50" will compact the tokens in [10, 50]. This way of
>>> interpreting token ranges is unusual since token ranges are usually
>>> half-open, and I think that in the previous example one would expect that
>>> the compacted tokens would be in (10, 50]. That's for example the way
>>> nodetool repair works, and indeed the class org.apache.cassandra.dht.Range
>>> is always half-open.
>>>
>>> It's worth mentioning that, differently from nodetool repair, the help and
>>> doc for nodetool compact doesn't specify whether the supplied start/end
>>> tokens are inclusive or exclusive.
>>>
>>> I think that ideally nodetool compact should interpret the provided token
>>> ranges as half-open, to be consistent with how token ranges are usually
>>> interpreted. However, this would change the way the tool has worked until
>>> now. This change might be problematic for existing users relying on the old
>>> behaviour. That would be especially severe for the case where the begin and
>>> end token are the same, because interpreting [x, x] we would compact a
>>> single token, whereas I think that interpreting (x, x] would compact all
>>> the tokens. As for compacting ranges including multiple tokens, I think the
>>> change wouldn't be so bad, since probably the supplied token ranges come
>>> from tools that are already presenting the ranges as half-open. Also, if we
>>> are splitting the full ring into smaller ranges, half-open intervals would
>>> still work and would save us some repetitions.
>>>
>>> So my question is: Should we change the behaviour of nodetool compact to
>>> interpret the token ranges as half-opened, aligning it with the usual
>>> interpretation of ranges? Or should we just document the current odd
>>> behaviour to prevent compatibility issues?
>>>
>>> A third option would be changing to half-opened ranges and also forbidding
>>> ranges where the begin and end token are the same, to prevent the
>>> accidental compaction of the entire ring. Note that nodetool repair also
>>> forbids this type of token ranges.
>>>
>>> What do you think?
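
For reference, the difference between the two interpretations discussed in this thread can be sketched as a toy model in Python. This is not Cassandra's code: the helper names `in_closed` and `in_half_open` and the sample token values are made up for illustration, and the wrap-around rule (a half-open range whose start is not below its end wraps around the ring) is an assumption taken from the description of (x, x] above.

```python
def in_closed(start: int, end: int, token: int) -> bool:
    """Closed interval [start, end], with no wrap-around."""
    return start <= token <= end


def in_half_open(start: int, end: int, token: int) -> bool:
    """Half-open interval (start, end] on a ring.

    Assumption for this sketch: when start >= end the range wraps
    around the ring, so (x, x] covers every token.
    """
    if start < end:
        return start < token <= end
    # Wrapping range: everything after start, or up to and including end.
    return token > start or token <= end


# Hypothetical token values, chosen only to make the output readable.
tokens = list(range(0, 101, 10))

closed = [t for t in tokens if in_closed(10, 50, t)]
half_open = [t for t in tokens if in_half_open(10, 50, t)]
same_closed = [t for t in tokens if in_closed(10, 10, t)]
same_half_open = [t for t in tokens if in_half_open(10, 10, t)]

print(closed)          # [10, 20, 30, 40, 50]
print(half_open)       # [20, 30, 40, 50]
print(same_closed)     # [10]
print(same_half_open)  # every token in the sample
```

In this model, start 10 and end 50 lose only token 10 when moving from the closed to the half-open reading, while start 10 and end 10 goes from a single token to the whole ring, which is the case that would hurt existing automation the most.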