I like the third option, especially if it makes compact consistent with repair, 
which has supported ranges for longer. I would guess most people assume the 
compact ranges work the same as the repair ranges.
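
For reference, the difference between the interpretations discussed in the quoted message can be sketched in a few lines. This is only an illustration, not Cassandra's code: the function names are made up, tokens are plain integers, and the wrap-around rule for start >= end is an assumption based on the behaviour the quoted message describes for (x, x].

```python
def in_closed(token, start, end):
    """Closed interpretation [start, end]: both bounds are included."""
    return start <= token <= end


def in_half_open(token, start, end):
    """Half-open interpretation (start, end]: start excluded, end included.

    Assumption: when start >= end the range wraps around the ring,
    so (x, x] matches every token.
    """
    if start < end:
        return start < token <= end
    return token > start or token <= end


def validate_range(start, end):
    """Hypothetical check for the third option: refuse start == end."""
    if start == end:
        raise ValueError("start and end token must differ")
    return (start, end)


tokens = [10, 11, 50, 51]

# Closed [10, 50] keeps token 10; half-open (10, 50] drops it.
closed = [t for t in tokens if in_closed(t, 10, 50)]
half_open = [t for t in tokens if in_half_open(t, 10, 50)]

# [10, 10] selects one token, while (10, 10] selects all of them.
single_closed = [t for t in tokens if in_closed(t, 10, 10)]
single_half_open = [t for t in tokens if in_half_open(t, 10, 10)]
```

With the third option, validate_range(10, 10) would raise before any compaction is attempted, so the accidental whole-ring case cannot be reached.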

-Jeremiah Jordan

> On Jul 26, 2022, at 6:49 AM, Andrés de la Peña <adelap...@apache.org> wrote:
> 
> 
> Hi all,
> 
> CASSANDRA-17575 has detected that token ranges in nodetool compact are 
> interpreted as closed on both sides. For example, the command "nodetool 
> compact -st 10 -et 50" will compact the tokens in [10, 50]. This way of 
> interpreting token ranges is unusual since token ranges are usually 
> half-open, and I think that in the previous example one would expect that the 
> compacted tokens would be in (10, 50]. That's for example the way nodetool 
> repair works, and indeed the class org.apache.cassandra.dht.Range is always 
> half-open.
> 
> It's worth mentioning that, unlike nodetool repair, the help and docs for 
> nodetool compact don't specify whether the supplied start/end tokens are 
> inclusive or exclusive.
> 
> I think that ideally nodetool compact should interpret the provided token 
> ranges as half-open, to be consistent with how token ranges are usually 
> interpreted. However, this would change the way the tool has worked until 
> now. This change might be problematic for existing users relying on the old 
> behaviour. That would be especially severe for the case where the begin and 
> end tokens are the same, because interpreted as [x, x] we would compact a 
> single token, whereas I think that interpreted as (x, x] we would compact all 
> the tokens. 
> As for compacting ranges including multiple tokens, I think the change 
> wouldn't be so bad, since probably the supplied token ranges come from tools 
> that are already presenting the ranges as half-open. Also, if we are 
> splitting the full ring into smaller ranges, half-open intervals would still 
> work and would save us some repetitions.
> 
> So my question is: Should we change the behaviour of nodetool compact to 
> interpret the token ranges as half-open, aligning it with the usual 
> interpretation of ranges? Or should we just document the current odd 
> behaviour to prevent compatibility issues?
> 
> A third option would be changing to half-open ranges and also forbidding 
> ranges where the begin and end tokens are the same, to prevent the accidental 
> compaction of the entire ring. Note that nodetool repair also forbids this 
> type of token range.
> 
> What do you think?
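
The point in the quoted message about splitting the ring into smaller ranges can also be sketched. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the "ring" is just the integers 0 to 100 with no wrap-around, and the split points are evenly spaced.

```python
def split(start, end, parts):
    """Split start..end into consecutive (start, end) pairs sharing boundaries."""
    step = (end - start) // parts
    bounds = [start + i * step for i in range(parts)] + [end]
    return list(zip(bounds, bounds[1:]))


ranges = split(0, 100, 4)


def hits_closed(token):
    """Number of subranges containing token when read as [start, end]."""
    return sum(1 for s, e in ranges if s <= token <= e)


def hits_half_open(token):
    """Number of subranges containing token when read as (start, end]."""
    return sum(1 for s, e in ranges if s < token <= e)


# A boundary token such as 50 falls into two closed subranges
# but only one half-open subrange.
boundary_closed = hits_closed(50)
boundary_half_open = hits_half_open(50)
```

Read as closed intervals, every shared boundary token would be processed twice; read as half-open intervals, each token from 1 to 100 belongs to exactly one subrange.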
