Ivan Andika created HDDS-12578:
----------------------------------
Summary: Ozone on CRAQ
Key: HDDS-12578
URL: https://issues.apache.org/jira/browse/HDDS-12578
Project: Apache Ozone
Issue Type: Wish
Reporter: Ivan Andika
Assignee: Ivan Andika
This is a long-term wish to explore supporting Chain Replication, or CRAQ
(Chain Replication with Apportioned Queries), in Ozone.
Ozone currently supports two write pipelines: a Raft-based pipeline and EC. In
terms of the data replication design spectrum
([https://transactional.blog/blog/2024-data-replication-design-spectrum]),
these two pipelines cover the leader-based (Raft-based write pipeline) and
quorum-based (EC) replication algorithm types. CRAQ falls under
reconfiguration-based replication algorithms, a type neither pipeline covers.
We could consider supporting CRAQ pipelines in Ozone. As mentioned in the
discussion
[https://github.com/apache/ozone/discussions/6870#discussioncomment-9907706],
chain replication might be needed for rolling upgrade support. Although CRAQ
promises higher bandwidth and strong consistency, it has drawbacks, such as
higher write latency (every write must propagate to the tail before it
commits) and longer downtime after a node failure (the chain has to wait for
the control plane to reconfigure it).
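To make the write-latency trade-off concrete, here is a minimal sketch of a
CRAQ-style chain in Python. It is only an illustration under simplifying
assumptions: Node, Chain, write, read and the deliver_to parameter are
hypothetical names, not Ozone or 3FS APIs. Everything runs in a single process
with no networking, and failure handling and chain reconfiguration are not
modeled.

```python
class Node:
    """One replica in a chain (hypothetical model, not an Ozone class)."""

    def __init__(self, name):
        self.name = name
        self.versions = {}  # key -> {version: value}
        self.clean = {}     # key -> highest version this node knows is committed


class Chain:
    """Toy CRAQ chain: writes enter at the head and commit at the tail."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.next_version = 1

    @property
    def tail(self):
        return self.nodes[-1]

    def write(self, key, value, deliver_to=None):
        """Forward a write through the first `deliver_to` nodes (default: all).

        Returns (version, nodes_visited). The write commits only if it
        reaches the tail; a smaller `deliver_to` simulates an in-flight write.
        """
        if deliver_to is None:
            deliver_to = len(self.nodes)
        version = self.next_version
        self.next_version += 1
        for node in self.nodes[:deliver_to]:
            node.versions.setdefault(key, {})[version] = value  # stored dirty
        if deliver_to == len(self.nodes):
            # The tail is the commit point; the ack then flows back to the head.
            for node in reversed(self.nodes):
                node.clean[key] = version
                # Versions older than the committed one are no longer needed.
                node.versions[key] = {
                    v: x for v, x in node.versions[key].items() if v >= version
                }
        return version, deliver_to

    def read(self, index, key):
        """Read from any node; a dirty node asks the tail what has committed."""
        node = self.nodes[index]
        versions = node.versions.get(key)
        if not versions:
            return None
        latest = max(versions)
        if node.clean.get(key) == latest:
            return versions[latest]           # clean: answer locally
        committed = self.tail.clean.get(key)  # dirty: version query to the tail
        return versions.get(committed)


chain = Chain(["head", "mid", "tail"])
chain.write("k", "v1")                # visits all three nodes, then commits
chain.write("k", "v2", deliver_to=2)  # in flight: has not reached the tail yet
print(chain.read(0, "k"))             # head is dirty, so it serves "v1"
```

In this sketch a write touches every node in the chain before it commits,
which is where the extra write latency comes from, while a read can be served
by any node, falling back to a version query against the tail only when the
node holds an uncommitted version.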
The wish is motivated by DeepSeek's recent 3FS distributed file system, which
uses CRAQ as its main write pipeline
(https://github.com/deepseek-ai/3FS/blob/main/docs/design_notes.md). Other
systems, such as Meta's Delta
(https://engineering.fb.com/2022/05/04/data-infrastructure/delta/), also use
CRAQ.