Hi Josep,
Thank you for the detailed response. I appreciate the clarification
regarding the distinction between the Inkless POC and the KIP design.
However, my objection is not based on temporary bugs in the fork, but *on
architectural gaps in the KIPs themselves* that these implementation issues
highlighted. If we are voting to approve the design, the design documents
must be structurally complete regarding data safety.
*1. Regarding Storage Leaks (The Missing Design)*
You mentioned that
cleanup logic "can be defined later." However, KIP-1163 explicitly
delegates this responsibility to a separate process, and KIP-1165 (Object
Compaction/GC) is currently marked as "Discarded" in the wiki.
We cannot vote to approve a storage engine that has no specified mechanism
for garbage collection. The "Upload-then-Commit" pattern described in
KIP-1163 structurally creates orphaned segments during broker failures.
Without an active KIP defining the reconciliation protocol (KIP-1165 having
been discarded), the proposal effectively describes a system with unbounded
storage growth during failure modes. This is a blocking design gap, not an
implementation detail.
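
To make the failure mode concrete, here is a minimal Python sketch of an
upload-then-commit write path with a crash between the two steps. All names
(ObjectStore, BatchCoordinator, produce, find_orphans) are hypothetical and
are not taken from the KIPs or from Inkless; the sketch only illustrates why
some reconciliation step has to be specified.

```python
class ObjectStore:
    """Stand-in for the remote object store (hypothetical)."""
    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data


class BatchCoordinator:
    """Stand-in for the component that records committed objects (hypothetical)."""
    def __init__(self):
        self.committed = set()

    def commit(self, key):
        self.committed.add(key)


class BrokerCrash(Exception):
    """Simulates a broker dying between upload and commit."""


def produce(store, coordinator, key, data, crash_before_commit=False):
    store.put(key, data)          # step 1: upload the object
    if crash_before_commit:
        raise BrokerCrash(key)    # broker fails; the commit never happens
    coordinator.commit(key)       # step 2: commit its metadata


def find_orphans(store, coordinator):
    # Objects present in storage that no commit refers to.
    # A real protocol would also have to tell orphans apart from uploads
    # that are still in flight; this sketch ignores that.
    return sorted(set(store.objects) - coordinator.committed)


store, coordinator = ObjectStore(), BatchCoordinator()
produce(store, coordinator, "obj-1", b"a")
try:
    produce(store, coordinator, "obj-2", b"b", crash_before_commit=True)
except BrokerCrash:
    pass

orphans = find_orphans(store, coordinator)
print(orphans)
```

Without something playing the role of find_orphans plus a deletion step,
"obj-2" stays in storage indefinitely, and every such crash adds to it.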
*2. Regarding EOS (The Coordinator Synchronization Gap)*
This is not a
misunderstanding of standard Kafka transactions; it is a critique of how
KIP-1150 changes them. Standard EOS relies on the Partition Leader to
sequence markers and calculate the LSO (Last Stable Offset) in memory.
KIP-1150 removes the Leader.
KIP-1164 (Batch Coordinator) must explicitly define the RPC flow between
the Transaction Coordinator and the Batch Coordinator to replace the
leader's role. Currently, the KIP does not specify how the system prevents
a "Split Brain" scenario where a consumer reads ahead of a transaction
marker that hasn't yet been sequenced by the Batch Coordinator. This is a
protocol-level correctness issue that must be resolved in the text before
adoption.
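
For reference, the invariant in question can be sketched in a few lines of
Python. This is a simplified model of how a partition leader derives the
LSO, not Kafka code and not anything specified in KIP-1164: the class and
method names are hypothetical, and the log end offset stands in for the high
watermark.

```python
class PartitionState:
    """Simplified, hypothetical model of per-partition transaction state."""
    def __init__(self):
        self.log_end_offset = 0
        self.open_txns = {}  # producer_id -> first offset of its open transaction

    def append(self, producer_id, count):
        # Append `count` transactional records for this producer.
        first = self.log_end_offset
        self.open_txns.setdefault(producer_id, first)
        self.log_end_offset += count
        return first

    def write_marker(self, producer_id):
        # A commit/abort marker closes the transaction and occupies one offset.
        self.open_txns.pop(producer_id, None)
        self.log_end_offset += 1

    def last_stable_offset(self):
        # LSO = first offset of the earliest open transaction,
        # or the end of the log if no transaction is open.
        if self.open_txns:
            return min(self.open_txns.values())
        return self.log_end_offset


p = PartitionState()
p.append(producer_id=1, count=3)   # offsets 0-2
p.append(producer_id=2, count=2)   # offsets 3-4
lso_both_open = p.last_stable_offset()
p.write_marker(producer_id=1)      # marker at offset 5
lso_one_open = p.last_stable_offset()
p.write_marker(producer_id=2)      # marker at offset 6
lso_none_open = p.last_stable_offset()
print(lso_both_open, lso_one_open, lso_none_open)
```

A read_committed consumer must never be served records at or beyond the
LSO. With a leader, the marker write and the LSO update happen in one place;
the KIP needs to state which component owns this state once the leader is
gone, and how marker sequencing and fetch responses are ordered against it.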
Please note that I am maintaining my objection on the basis of missing
specifications, not code bugs.
I respectfully request that we pause the vote until:
1. A valid design for Garbage Collection (replacing the discarded
   KIP-1165) is added to the proposal.
2. The Transaction/LSO synchronization protocol is explicitly documented
   in KIP-1164.
Regards,
Vaquar Khan
Sr Data Architect
https://www.linkedin.com/in/vaquar-khan-b695577/