[ 
https://issues.apache.org/jira/browse/CASSANDRA-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17880836#comment-17880836
 ] 

Caleb Rackliffe edited comment on CASSANDRA-13704 at 9/12/24 5:36 AM:
----------------------------------------------------------------------

|4.0|[patch|https://github.com/apache/cassandra/pull/3526]|[^ci_summary.html]|n/a|
|4.1|[patch|https://github.com/apache/cassandra/pull/3539]|[^ci_summary-1.html]|n/a|
|5.0|[patch|https://github.com/apache/cassandra/pull/3544]|[^ci_summary-2.html]|[ASF CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch-5/61/]|

I've now posted PRs for 4.0, 4.1, and 5.0. CI is looking pretty clean on 4.0 
and 4.1, and 5.0 runs are in progress...

The patch introduces two new YAML options, {{log_out_of_token_range_requests}} 
and {{reject_out_of_token_range_requests}}, which both default to {{true}}. 
They determine how streaming, repair, hints, mutations, read repair, and 
point/range reads handle cases where they are being executed on nodes that do 
not own the range(s) for the data involved.

When enabled, {{log_out_of_token_range_requests}} logs at WARN level, 
indicating the kind of request, its source, the invalid ranges requested, and 
the ranges the node actually owns. When {{reject_out_of_token_range_requests}} is enabled, 
out-of-range operations are outright rejected, rather than being accepted by a 
node that may never own the relevant range(s) and cannot, for example, safely 
participate in a write quorum. (Writes are not considered out-of-range if the 
range is pending, but in the event the node itself isn't yet aware of the 
pending range, they will be rejected. This may cause a short window of degraded 
availability, but it is safer and more visible than silently and erroneously 
accepting them.)
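
As a rough illustration of the behavior described above (not the code in the PRs), the log/reject decision could be sketched as below. Every name here ({{OutOfRangeGuard}}, {{TokenRange}}, {{Decision}}, {{check}}) is hypothetical, the two constructor flags merely stand in for the two YAML options, and ranges are simplified to non-wrapping {{(left, right]}} intervals over {{long}} tokens. Passing pending ranges only for writes is likewise an assumption of the sketch.

```java
import java.util.List;
import java.util.logging.Logger;

// Illustrative only: every class, method, and field name here is hypothetical
// and is not taken from the Cassandra patch.
final class OutOfRangeGuard {
    /** A non-wrapping token range (left, right]; real ring ranges may wrap around. */
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            return token > left && token <= right;
        }

        @Override
        public String toString() {
            return "(" + left + "," + right + "]";
        }
    }

    enum Decision { IN_RANGE, OUT_OF_RANGE_ACCEPTED, REJECTED }

    private static final Logger LOG = Logger.getLogger(OutOfRangeGuard.class.getName());

    private final boolean logEnabled;    // stands in for log_out_of_token_range_requests
    private final boolean rejectEnabled; // stands in for reject_out_of_token_range_requests

    OutOfRangeGuard(boolean logEnabled, boolean rejectEnabled) {
        this.logEnabled = logEnabled;
        this.rejectEnabled = rejectEnabled;
    }

    /**
     * Classifies one operation. For writes, pass the pending ranges this node
     * knows about; for other operations, pass an empty list (an assumption of
     * this sketch).
     */
    Decision check(String kind, String source, long token,
                   List<TokenRange> owned, List<TokenRange> pending) {
        if (covers(owned, token) || covers(pending, token)) {
            return Decision.IN_RANGE;
        }
        if (logEnabled) {
            // WARN-level message naming the request kind, its source, the
            // offending token, and what this node actually owns.
            LOG.warning("Out-of-range " + kind + " from " + source
                        + " for token " + token + "; owned ranges: " + owned);
        }
        return rejectEnabled ? Decision.REJECTED : Decision.OUT_OF_RANGE_ACCEPTED;
    }

    private static boolean covers(List<TokenRange> ranges, long token) {
        for (TokenRange range : ranges) {
            if (range.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch, a write for a token in a pending range is treated as in-range only when the node has that pending range in its own view. If the {{pending}} list is empty because the node has not yet learned of the range, the same write comes back as {{REJECTED}}, which is the short availability window mentioned above.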

Once review settles, I'll likely add entries to {{NEWS.txt}} along with the 
CHANGES content, but given this is something we should probably never disable, 
I'm not too keen on adding it to the example {{cassandra.yaml}}.

Finally, {{nodetool info}} has a new option, {{--out-of-range-ops}}, that will 
display per-keyspace counts of operations for invalid tokens.
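
For readers wondering what "per-keyspace counts" might look like underneath, here is a minimal sketch of a thread-safe counter keyed by keyspace name. It is an assumption for illustration, not how the patch or {{nodetool}} gathers its numbers, and {{OutOfRangeOpCounts}}, {{record}}, and {{snapshot}} are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper; not part of Cassandra or nodetool.
final class OutOfRangeOpCounts {
    private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Records one out-of-range operation against the given keyspace. */
    void record(String keyspace) {
        counts.computeIfAbsent(keyspace, k -> new LongAdder()).increment();
    }

    /** Returns a point-in-time copy of the counts, sorted by keyspace name. */
    Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        counts.forEach((keyspace, adder) -> out.put(keyspace, adder.sum()));
        return out;
    }
}
```

{{LongAdder}} is chosen here because many request threads may increment the same keyspace concurrently, while reads of the totals are comparatively rare.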


was (Author: maedhroz):
|4.0|[patch|https://github.com/apache/cassandra/pull/3526]|[^ci_summary.html]|n/a|
|4.1|[patch|https://github.com/apache/cassandra/pull/3539]|[^ci_summary-1.html]|n/a|
|5.0|[patch|https://github.com/apache/cassandra/pull/3544]|...|[ASF CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch-5/61/]|

> Safer handling of out of range tokens
> -------------------------------------
>
>                 Key: CASSANDRA-13704
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13704
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Coordination, Legacy/Observability
>            Reporter: Sam Tunnicliffe
>            Assignee: Caleb Rackliffe
>            Priority: Urgent
>             Fix For: 4.0.x, 4.1.x, 5.0.x
>
>         Attachments: ci_summary-1.html, ci_summary-2.html, ci_summary.html, 
> result_details.tar-1.gz, result_details.tar-2.gz, result_details.tar.gz
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> It is possible for nodes to have a divergent view of the ring, which can 
> result in some operations being sent to the wrong nodes. This is an umbrella 
> ticket to mitigate such issues by adding logging when a node is asked to 
> perform an operation for tokens it does not own. This will be useful for 
> detecting when the nodes' views of the ring diverge, which is not highly 
> visible at the moment, and also for post-hoc analysis.
> It may also be beneficial to straight up reject certain operations, though 
> this will need to balance the risk of performing those ops against the 
> consequences rejecting them has on availability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
