I like the third option, especially since it makes compact consistent with repair. Repair has supported ranges for longer, and I would guess most people assume the compact ranges work the same way as the repair ranges.
-Jeremiah Jordan

> On Jul 26, 2022, at 6:49 AM, Andrés de la Peña <adelap...@apache.org> wrote:
>
> Hi all,
>
> CASSANDRA-17575 has detected that token ranges in nodetool compact are
> interpreted as closed on both sides. For example, the command "nodetool
> compact -st 10 -et 50" will compact the tokens in [10, 50]. This way of
> interpreting token ranges is unusual since token ranges are usually
> half-open, and I think that in the previous example one would expect that the
> compacted tokens would be in (10, 50]. That's for example the way nodetool
> repair works, and indeed the class org.apache.cassandra.dht.Range is always
> half-open.
>
> It's worth mentioning that, differently from nodetool repair, the help and
> doc for nodetool compact doesn't specify whether the supplied start/end
> tokens are inclusive or exclusive.
>
> I think that ideally nodetool compact should interpret the provided token
> ranges as half-open, to be consistent with how token ranges are usually
> interpreted. However, this would change the way the tool has worked until
> now. This change might be problematic for existing users relying on the old
> behaviour. That would be especially severe for the case where the begin and
> end token are the same, because interpreting [x, x] we would compact a single
> token, whereas I think that interpreting (x, x] would compact all the tokens.
> As for compacting ranges including multiple tokens, I think the change
> wouldn't be so bad, since probably the supplied token ranges come from tools
> that are already presenting the ranges as half-open. Also, if we are
> splitting the full ring into smaller ranges, half-open intervals would still
> work and would save us some repetitions.
>
> So my question is: Should we change the behaviour of nodetool compact to
> interpret the token ranges as half-opened, aligning it with the usual
> interpretation of ranges? Or should we just document the current odd
> behaviour to prevent compatibility issues?
>
> A third option would be changing to half-opened ranges and also forbidding
> ranges where the begin and end token are the same, to prevent the accidental
> compaction of the entire ring. Note that nodetool repair also forbids this
> type of token ranges.
>
> What do you think?
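
To make the three interpretations discussed above concrete, here is a minimal Python sketch. It is only an illustration: tokens are modeled as plain integers, the function names are hypothetical, and none of this is Cassandra's actual code. The wrap-around rule for (start, end] when start >= end is an assumption, chosen to match the observation in the thread that (x, x] would cover all tokens.

```python
def in_closed(token, start, end):
    """Current nodetool compact reading: closed interval [start, end]."""
    return start <= token <= end


def in_half_open(token, start, end):
    """Half-open reading (start, end] on a ring.

    Assumption: when start >= end the range wraps around the ring,
    so (x, x] contains every token.
    """
    if start >= end:
        return token > start or token <= end
    return start < token <= end


def in_half_open_strict(token, start, end):
    """Third option: half-open, but reject start == end outright."""
    if start == end:
        raise ValueError("start and end token must differ")
    return in_half_open(token, start, end)


# A toy ring with tokens 0, 10, ..., 100 (hypothetical values).
tokens = list(range(0, 101, 10))

closed = [t for t in tokens if in_closed(t, 10, 50)]        # includes 10
half_open = [t for t in tokens if in_half_open(t, 10, 50)]  # excludes 10

single = [t for t in tokens if in_closed(t, 30, 30)]        # one token
whole_ring = [t for t in tokens if in_half_open(t, 30, 30)] # every token
```

With "-st 10 -et 50", the two readings differ only in whether token 10 is included. With equal start and end tokens they differ between a single token and the entire ring, which is the case the third option rejects.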