[
https://issues.apache.org/jira/browse/CASSANDRA-18758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jaydeepkumar Chovatia updated CASSANDRA-18758:
----------------------------------------------
Description:
*Problem Statement*
Cassandra exchanges important topology and token-ownership details over
Gossip. Internally, it maintains token-ownership information in two separate
caches: 1) the Gossip cache and 2) the Storage Service cache. On a node, the
Gossip cache is updated first, followed by the Storage Service cache. In the
hot path, ownership is calculated from the Storage Service cache. Because two
separate caches maintain the same information, inconsistencies are bound to
happen. It is quite possible for the Gossip cache to have up-to-date ownership
of the Cassandra cluster while the Storage Service cache does not; in that
scenario, inconsistent data will be served to the user.
Currently, there is no mechanism in Cassandra that detects and fixes
inconsistencies between these two caches.
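To make the idea of a mismatch concrete, here is a minimal sketch that compares two views of token ownership. It assumes each cache can be reduced to a plain token-to-endpoint mapping; the name find_ownership_mismatches, the dictionary representation, and the sample addresses are hypothetical and are not Cassandra's internal API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: each cache is reduced to a mapping of token -> owning endpoint.
TokenMap = Dict[int, str]


def find_ownership_mismatches(
    gossip_view: TokenMap, storage_view: TokenMap
) -> Dict[int, Tuple[Optional[str], Optional[str]]]:
    """Return the tokens whose owner differs between the two views.

    Each value is (owner per Gossip, owner per Storage Service); None means
    the token is absent from that view.
    """
    mismatches = {}
    # Walk the union of tokens so that tokens missing from either view are reported too.
    for token in gossip_view.keys() | storage_view.keys():
        gossip_owner = gossip_view.get(token)
        storage_owner = storage_view.get(token)
        if gossip_owner != storage_owner:
            mismatches[token] = (gossip_owner, storage_owner)
    return mismatches


# Made-up sample data: the Storage Service view has a stale owner for token 200
# and has never learned about token 300.
gossip_view = {100: "10.0.0.1", 200: "10.0.0.2", 300: "10.0.0.3"}
storage_view = {100: "10.0.0.1", 200: "10.0.0.9"}

mismatches = find_ownership_mismatches(gossip_view, storage_view)
```

An empty result means the two views agree; any entry identifies a token for which the hot path could resolve a different owner than Gossip reports.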
*Long-term solution*
In the long term, we are going with transactional metadata
([https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21]) to handle such
inconsistencies, and that’s the right thing to do.
*Short-term solution*
CEP-21 might take some time, however, and until then there is a need to
*detect* such inconsistencies. Once an inconsistency is detected, there are two
options: 1) restart the node or 2) fix the inconsistency on the fly.
This JIRA provides a short-term solution.
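The two options above could be sketched as a small decision routine. This is only an illustration under stated assumptions: the Action enum, the react_to_mismatch function, and the repair_in_place flag are hypothetical names, and treating the Gossip view as the source of truth for the on-the-fly fix is an assumption based on the description that the Gossip cache is updated first.

```python
from enum import Enum
from typing import Dict


class Action(Enum):
    NONE = "none"                      # views agree, nothing to do
    RESTART_NODE = "restart_node"      # option 1: signal that a restart is needed
    FIX_ON_THE_FLY = "fix_on_the_fly"  # option 2: storage view was repaired in place


def react_to_mismatch(
    gossip_view: Dict[int, str],
    storage_view: Dict[int, str],
    repair_in_place: bool,
) -> Action:
    """Compare the two token -> endpoint views and apply one of the two options."""
    if gossip_view == storage_view:
        return Action.NONE
    if not repair_in_place:
        # Option 1: do not touch any state; the caller decides how to restart the node.
        return Action.RESTART_NODE
    # Option 2 (assumption): trust the Gossip view and overwrite the storage view with it.
    storage_view.clear()
    storage_view.update(gossip_view)
    return Action.FIX_ON_THE_FLY


# Made-up sample data with a stale owner for token 200 in the storage view.
gossip_view = {100: "10.0.0.1", 200: "10.0.0.2"}
stale_storage_view = {100: "10.0.0.1", 200: "10.0.0.9"}

restart_decision = react_to_mismatch(gossip_view, dict(stale_storage_view), repair_in_place=False)
repair_decision = react_to_mismatch(gossip_view, stale_storage_view, repair_in_place=True)
```

Restarting is the simpler option because it rebuilds both views from scratch, while the in-place fix avoids downtime but requires deciding which view to trust.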
was:
*Problem Statement*
As we know, Cassandra exchanges important topology and token-ownership-related
details over Gossip. Cassandra internally maintains the following two separate
caches that have the token-ownership information maintained: 1) Gossip cache
and 2) Storage Service cache. The first Gossip cache is updated on a node,
followed by the storage service cache. In the hot path, ownership is calculated
from the storage service cache. Since two separate caches are maintaining the
same information, then inconsistencies are bound to happen. It could be very
well feasible that the Gossip cache has up-to-date ownership of the Cassandra
cluster, but the service cache does not, and in that scenario, inconsistent
data will be served to the user.
Currently, there is no mechanism in Cassandra that detects and fixes these two
caches.
*Long-term solution*
We are going with the long-term transactional metadata
(https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21) to handle such
inconsistencies, and that’s the right thing to do.
*Short-term solution*
But CEP-21 might take some time, and until then, there is a need to detect such
inconsistencies. Once we detect inconsistencies, then we could have two
options: 1) Simply restart the node or 2) Fix the inconsistencies on-the-fly.
This JIRA is providing a short-term solution.
> Detect token-ownership mismatch
> -------------------------------
>
> Key: CASSANDRA-18758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18758
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Gossip
> Reporter: Jaydeepkumar Chovatia
> Assignee: Jaydeepkumar Chovatia
> Priority: Normal