Dear Kafka Community,
I am proposing a new KIP to introduce a unified shared storage solution for Kafka, aiming to enhance its scalability and flexibility. This KIP is inspired by the ongoing discussions around KIP-1150 and KIP-1176, which explore leveraging object storage to achieve cost and elasticity benefits. These efforts are commendable, but given the widespread adoption of Kafka's classic shared-nothing architecture, especially in on-premise environments, we need a unified approach that supports a smooth transition from shared-nothing to shared storage. This KIP proposes refactoring the log layer to support both architectures simultaneously, ensuring long-term compatibility and allowing Kafka to fully leverage shared storage services like S3, HDFS, and NFS.
The core of this proposal includes introducing abstract log and log segment classes and a new 'Stream' API to bridge the gap between shared storage services and Kafka's storage layer. This unified solution will enable Kafka to evolve while maintaining backward compatibility, supporting both on-premise and cloud deployments. I believe this approach is crucial for Kafka's continued success and look forward to your thoughts and feedback.
Link to the KIP for more details: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1183%3A+Unified+Shared+Storage
Best regards,
Xinyu