[jira] [Updated] (IGNITE-24394) Fix flaky ItZoneDataReplicationTest#testDataRebalance

Aleksandr Polovtsev (Jira) Wed, 05 Feb 2025 00:22:10 -0800


     [ https://issues.apache.org/jira/browse/IGNITE-24394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Aleksandr Polovtsev updated IGNITE-24394:
-----------------------------------------
    Description: In the colocation track, if two tables belong to the same zone,
they share a Raft group. When a table is created in a zone, it registers a
"sub-listener" nested in the zone Raft group listener, and the zone listener
forwards table-level requests to it. Consider a node that already hosts a zone
and a table. A new node joins, creates the zone, starts the zone Raft group,
and only then creates the table and registers the "sub-listener". This leaves a
gap between the Raft node start and the "sub-listener" registration, during
which the new node can already receive updates from the other node. During
normal operation, this gap is closed because the zone and the table are created
in the Meta Storage thread, and an interceptor rejects Raft updates until the
target Catalog version is reached (which only happens once all actions in the
Meta Storage thread have been performed). During recovery (when a node is
restarted), however, this no longer holds: the zone and the table are created
independently, so the gap remains and can lead to missing updates and errors.
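
The interleaving described above can be replayed with a small single-threaded
model. This is only a sketch under stated assumptions, not Ignite code: the
class ZoneListenerGapSketch, the methods registerTableListener,
advanceCatalogVersion and onUpdate, the "row-1" payload, and the retry queue
are all hypothetical names chosen for illustration. It models a zone listener
that forwards updates to per-table sub-listeners, with an optional gate that
rejects updates until a required Catalog version is reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical model of a zone Raft listener with nested table sub-listeners. */
public class ZoneListenerGapSketch {
    static class ZoneListener {
        // Per-table "sub-listeners", modelled as the list of updates each one has applied.
        private final Map<Integer, List<String>> tableListeners = new HashMap<>();
        private final boolean gateEnabled;
        private int localCatalogVersion = 0;
        /** Updates that arrived for a table with no registered sub-listener. */
        final List<String> lost = new ArrayList<>();

        ZoneListener(boolean gateEnabled) {
            this.gateEnabled = gateEnabled;
        }

        void registerTableListener(int tableId) {
            tableListeners.put(tableId, new ArrayList<>());
        }

        void advanceCatalogVersion(int version) {
            localCatalogVersion = version;
        }

        /** Returns false if the gate rejected the update, so the sender should retry. */
        boolean onUpdate(int tableId, String update, int requiredCatalogVersion) {
            if (gateEnabled && localCatalogVersion < requiredCatalogVersion) {
                return false;
            }
            List<String> applied = tableListeners.get(tableId);
            if (applied == null) {
                lost.add(update); // No sub-listener yet: the update falls into the gap.
            } else {
                applied.add(update);
            }
            return true;
        }

        List<String> applied(int tableId) {
            return tableListeners.getOrDefault(tableId, List.of());
        }
    }

    /** Replays: group start, early update, table registration, then retries. */
    static ZoneListener replay(boolean gateEnabled) {
        int tableId = 1;
        int tableCatalogVersion = 1;
        ZoneListener zone = new ZoneListener(gateEnabled); // Zone Raft group is started.
        Queue<String> retries = new ArrayDeque<>();

        // An update from the other node arrives before the table is registered.
        if (!zone.onUpdate(tableId, "row-1", tableCatalogVersion)) {
            retries.add("row-1");
        }

        zone.registerTableListener(tableId);             // Table is created only now.
        zone.advanceCatalogVersion(tableCatalogVersion); // Target Catalog version reached.

        for (String update : retries) {
            zone.onUpdate(tableId, update, tableCatalogVersion);
        }
        return zone;
    }

    public static void main(String[] args) {
        System.out.println("gate on, lost: " + replay(true).lost);   // prints "gate on, lost: []"
        System.out.println("gate off, lost: " + replay(false).lost); // prints "gate off, lost: [row-1]"
    }
}
```

With the gate enabled (the normal-operation case), the early update is
rejected and applied on retry. With the gate disabled (standing in for the
recovery case, where nothing ties the update to table creation), the same
update is lost.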

> Fix flaky ItZoneDataReplicationTest#testDataRebalance
> -----------------------------------------------------
>
>                 Key: IGNITE-24394
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24394
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Aleksandr Polovtsev
>            Assignee: Aleksandr Polovtsev
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
