[
https://issues.apache.org/jira/browse/KAFKA-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Colin McCabe resolved KAFKA-14312.
----------------------------------
Resolution: Won't Fix
Based on the discussion, this behavior isn't unique to KRaft, and isn't a bug.
If you want a different behavior for sequence numbers, consider filing a KIP.
> Kraft + ProducerStateManager: produce requests to new partitions with a
> non-zero sequence number should be rejected
> -------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-14312
> URL: https://issues.apache.org/jira/browse/KAFKA-14312
> Project: Kafka
> Issue Type: Bug
> Components: kraft, producer
> Reporter: Travis Bischel
> Priority: Major
>
> h1. Background
> In KRaft mode, if I create a topic, I occasionally see a MetadataResponse
> with a valid leader, but if I immediately produce to that topic, I get
> NOT_LEADER_FOR_PARTITION. There may be another bug causing KRaft to return a
> leader in metadata but reject requests to that leader, _but_ this exposes
> a bigger problem.
> Kafka currently accepts produce requests to new partitions with a non-zero
> sequence number. I have confirmed this locally by modifying my client to
> start producing with a sequence number of 10. Three records produced
> sequentially back to back (seq 10, 11, 12) all succeed. I _think_ this
> [comment|https://github.com/apache/kafka/blob/3e7eddecd6a63ea6a9793d3270bef6d0be5c9021/core/src/main/scala/kafka/log/ProducerStateManager.scala#L235-L236]
> in the Kafka source also indicates roughly the same thing.
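To make the reported behavior concrete, here is a minimal sketch of the sequence check as described above. It is a toy model, not Kafka's ProducerStateManager: the names PartitionProducerState and OutOfOrderSequence are hypothetical, and the only assumption is that a partition with no prior state for a producer accepts whatever sequence arrives first.

```python
class OutOfOrderSequence(Exception):
    """Stands in for the broker's out-of-order sequence error (hypothetical name)."""


class PartitionProducerState:
    """Toy model of per-partition idempotence state for a single producer ID."""

    def __init__(self):
        # None means the partition has no state for this producer yet.
        self.next_seq = None

    def append(self, seq):
        # With no prior state, any starting sequence is accepted
        # (the behavior reported in this issue).
        if self.next_seq is not None and seq != self.next_seq:
            raise OutOfOrderSequence(f"expected {self.next_seq}, got {seq}")
        self.next_seq = seq + 1


state = PartitionProducerState()
for seq in (10, 11, 12):
    state.append(seq)  # all three succeed even though the first sequence is 10
```

In this model the check only becomes strict after the first append, which matches the observation that seq 10, 11, 12 are all accepted on a new partition.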
> h1. Problem
> * Client initializes producer ID
> * Client creates topic "foo" (for the problem, we will ignore partitions –
> there is just one partition)
> * Client sends produce request A with 5 records
> * Client sends produce request B with 5 records before receiving a response
> for A
> * Broker returns NOT_LEADER_FOR_PARTITION to produce request A
> * Broker finally initializes, becomes leader before seeing request B
> * Broker accepts request B as the first request
> * Broker believes sequence number 5 is ok, and is expecting the next
> sequence to be 10
> * Client retries requests A and B, because A failed
> * Broker sees request A with sequence 0, returns OutOfOrderSequenceException
> * Client enters a fatal state, because OUT_OF_ORDER_SEQUENCE_NUMBER (OOOSN) is not retryable
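The timeline in the list above can be replayed with a small simulation. This is a sketch under stated assumptions, not broker code: ToyBroker, its is_leader flag, and the string results are all hypothetical, and each request is reduced to a base sequence plus a record count.

```python
class ToyBroker:
    """Hypothetical single-partition broker tracking one producer's sequence."""

    def __init__(self):
        self.is_leader = False  # the partition is not yet initialized on this broker
        self.next_seq = None    # no producer state until the first accepted request

    def produce(self, base_seq, count):
        if not self.is_leader:
            return "NOT_LEADER_FOR_PARTITION"
        # The first accepted request may start at any sequence (assumed, per the report).
        if self.next_seq is not None and base_seq != self.next_seq:
            return "OUT_OF_ORDER_SEQUENCE_NUMBER"
        self.next_seq = base_seq + count
        return "OK"


broker = ToyBroker()
results = []
results.append(("A", broker.produce(0, 5)))        # arrives before the broker is leader
broker.is_leader = True                            # broker finishes initializing
results.append(("B", broker.produce(5, 5)))        # accepted first; broker now expects 10
results.append(("A retry", broker.produce(0, 5)))  # sequence 0 does not match 10

for name, result in results:
    print(f"{name}: {result}")
# A: NOT_LEADER_FOR_PARTITION
# B: OK
# A retry: OUT_OF_ORDER_SEQUENCE_NUMBER
```

If the toy broker instead required the first sequence on a new partition to be 0, request B would be rejected as well and the client could retry both requests in order.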
> h1. Reproducing
> I can reliably reproduce this error using KRaft mode with 1 broker. I am
> using the following docker compose file:
> {{version: "3.7"}}
> {{services:}}
> {{  kafka:}}
> {{    image: bitnami/kafka:latest}}
> {{    network_mode: host}}
> {{    environment:}}
> {{      KAFKA_ENABLE_KRAFT: yes}}
> {{      KAFKA_CFG_PROCESS_ROLES: controller,broker}}
> {{      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER}}
> {{      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093}}
> {{      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT}}
> {{      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 1@127.0.0.1:9093}}
> {{      # Set this to "PLAINTEXT://127.0.0.1:9092" if you want to run this container on localhost via Docker}}
> {{      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://127.0.0.1:9092}}
> {{      KAFKA_CFG_BROKER_ID: 1}}
> {{      ALLOW_PLAINTEXT_LISTENER: yes}}
> {{      KAFKA_KRAFT_CLUSTER_ID: XkpGZQ27R3eTl3OdTm2LYA # 16 byte base64-encoded UUID}}
> {{      BITNAMI_DEBUG: true # Enable this to get more info on startup failures}}
>
> I am running the franz-go integration tests to trigger this (frequently, but
> not all of the time). However, these tests are not required: any client that
> follows the sequence described above can occasionally reproduce this.
> I have never experienced this against the zookeeper version. It seems that
> the zk version always fully initializes a topic immediately and does not
> return NOT_LEADER_FOR_PARTITION on the first produce request. This is a
> separate problem – but the main problem described above exists in all
> versions, and _can_ be experienced in zk in very strange circumstances.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)