Rongtong Jin created COMDEV-513:
-----------------------------------

             Summary: RocketMQ TieredStore Integration with High Availability 
Architecture
                 Key: COMDEV-513
                 URL: https://issues.apache.org/jira/browse/COMDEV-513
             Project: Community Development
          Issue Type: Task
          Components: Comdev, GSoC/Mentoring ideas
            Reporter: Rongtong Jin


*Apache RocketMQ*

Apache RocketMQ is a distributed messaging and streaming platform with low 
latency, high performance and reliability, trillion-level capacity and flexible 
scalability.

Page: [https://rocketmq.apache.org|https://rocketmq.apache.org/]

 

*Background*

With the official release of RocketMQ 5.1.0, tiered storage has arrived as a 
new independent module in the Technical Preview milestone. This allows users to 
offload messages from local disks to cheaper storage, extending message 
retention time at a lower cost.

Reference RIP-57: 
[https://github.com/apache/rocketmq/wiki/RIP-57-Tiered-storage-for-RocketMQ]

In addition, RocketMQ introduced a new high availability architecture in 
version 5.0.

Reference RIP-44: 
[https://github.com/apache/rocketmq/wiki/RIP-44-Support-DLedger-Controller]

However, RocketMQ tiered storage currently supports only a single replica.

 

*Task*

Because tiered storage currently supports only a single replica, the following 
issues remain open in integrating it with the high availability architecture:
 * Metadata synchronization: how to reliably synchronize metadata between 
master and slave nodes.
 * Disallowing message uploads beyond the confirm offset: to avoid message 
rollback, the maximum uploaded offset cannot exceed the confirm offset.
 * Starting tiered storage upload when the slave becomes the master, and 
stopping it when the master becomes a slave: only the master node has write 
and delete permissions, and a newly promoted slave needs to quickly resume 
uploading from the point where the previous master stopped.
 * Design of the slave pull protocol: how a newly launched, empty slave can 
correctly synchronize data through the tiered storage architecture. (If 
synchronization starts from the first or the last file, it may not be possible 
to resume from the breakpoint after another role switch.)
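
As a rough illustration of the second and third points above, the sketch below 
caps each upload at the confirm offset and only allows uploads while the node 
is master. It is a minimal sketch under stated assumptions, not a proposed 
design: the class TieredUploadGate and all of its methods are hypothetical 
names and are not part of RocketMQ, the confirm offset is treated as an 
exclusive upper bound, and the uploaded offset is assumed to be recovered from 
tiered storage metadata at startup.

```java
/**
 * Hypothetical sketch: decides how far a node may upload to tiered storage.
 * None of these names are RocketMQ APIs.
 */
final class TieredUploadGate {
    // Assumption: true only while this node holds the master role.
    private boolean master = false;
    // Exclusive end offset of data already uploaded to tiered storage.
    private long uploadedOffset;

    TieredUploadGate(long recoveredUploadedOffset) {
        // Assumption: recovered from synchronized metadata, so a promoted
        // slave can resume from the previous master's breakpoint.
        this.uploadedOffset = recoveredUploadedOffset;
    }

    /** Called on every role switch; a slave must stop uploading. */
    synchronized void onRoleChange(boolean isMaster) {
        this.master = isMaster;
    }

    /**
     * Returns the exclusive end offset of the next upload batch. Returns the
     * current uploaded offset when nothing may be uploaded, either because
     * this node is a slave or because no confirmed data is pending.
     */
    synchronized long nextUploadEnd(long confirmOffset, long maxBatchBytes) {
        if (!master) {
            return uploadedOffset;
        }
        // Never go past the confirm offset, and never move backwards.
        long limit = Math.min(confirmOffset, uploadedOffset + maxBatchBytes);
        return Math.max(limit, uploadedOffset);
    }

    /** Records a completed upload; the offset only moves forward. */
    synchronized void markUploaded(long endOffset) {
        if (endOffset > uploadedOffset) {
            uploadedOffset = endOffset;
        }
    }

    synchronized long uploadedOffset() {
        return uploadedOffset;
    }
}
```

In this sketch a newly promoted slave calls onRoleChange(true) and then 
continues from the recovered uploaded offset, which only works if that offset 
is reliably synchronized between master and slave (the first point above).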

You need to provide a complete plan that solves the above issues and ultimately 
completes the integration of tiered storage with the high availability 
architecture, verifying it with the existing file-based version of tiered 
storage and with OpenChaos testing.

 

*Relevant Skills*
 * Interest in messaging middleware and distributed storage systems
 * Java development skills
 * A good understanding of RocketMQ tiered storage and high availability 
architecture



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
