Jeff Kim created KAFKA-14694:
--------------------------------
Summary: RPCProducerIdManager should not wait for a new block
Key: KAFKA-14694
URL: https://issues.apache.org/jira/browse/KAFKA-14694
Project: Kafka
Issue Type: Bug
Reporter: Jeff Kim
Assignee: Jeff Kim
RPCProducerIdManager initiates an async request to the controller to grab a
block of producer IDs and then blocks waiting for a response from the
controller.
The wait happens on the request handler threads while holding a global lock.
This means that if many producers are requesting producer IDs and the
controller is slow to respond, many threads can get stuck waiting for the lock.
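The effect of waiting while holding a lock can be sketched in isolation. The
class below is a hypothetical stand-in, not the actual RPCProducerIdManager
code: the names BlockingUnderLockSketch, generateBlocking and elapsedMillis are
invented for illustration, and the "controller" simply never responds. Because
each caller holds the monitor for its whole timed wait, the waits are
serialized rather than overlapping.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of "timed wait while holding a global lock".
class BlockingUnderLockSketch {
    // Stand-in for the queue a controller response would be placed on.
    private final BlockingQueue<Long> nextBlock = new ArrayBlockingQueue<>(1);
    private final long timeoutMillis;

    BlockingUnderLockSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // The monitor is held for the entire timed poll, so other callers
    // queue up behind this one instead of waiting concurrently.
    synchronized Long generateBlocking() throws InterruptedException {
        return nextBlock.poll(timeoutMillis, TimeUnit.MILLISECONDS); // null on timeout
    }

    // Runs `callers` threads against a controller that never responds and
    // returns the total elapsed time in milliseconds.
    static long elapsedMillis(int callers, long timeoutMillis) {
        BlockingUnderLockSketch manager = new BlockingUnderLockSketch(timeoutMillis);
        Thread[] threads = new Thread[callers];
        long start = System.nanoTime();
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    manager.generateBlocking();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

With three callers and a 100 ms timeout, the total time is roughly the sum of
the individual timeouts rather than a single timeout, which is the pile-up
described above.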
This may also be a deadlock concern. If the controller has 1 request handler
thread (1 chosen for simplicity) and receives an InitProducerId request, it may
deadlock.
More generally, any time the controller is handling N InitProducerId requests,
where N >= the number of request handler threads, there is the potential for
deadlock.
Consider this:
1. The request handler thread tries to handle an InitProducerId request sent to
the controller by forwarding an AllocateProducerIds request.
2. The request handler thread then waits on the controller response (a timed
poll on nextProducerIdBlock).
3. The controller's request handler threads need to pick up and handle this
request, but they are all blocked waiting for the forwarded
AllocateProducerIds response.
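The three steps above can be reproduced in miniature with a plain thread pool.
This is only an analogy under stated assumptions: a fixed-size ExecutorService
stands in for the controller's request handler threads, and the class and
method names (HandlerPoolStarvationSketch, outerTimesOut) are hypothetical.
Since the wait is timed, the outer task eventually gives up instead of hanging
forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical analogy: one pool both issues and must serve the request.
class HandlerPoolStarvationSketch {
    // Returns true if the outer task timed out waiting for the inner one.
    static boolean outerTimesOut(int handlerThreads, long waitMillis) {
        ExecutorService handlers = Executors.newFixedThreadPool(handlerThreads);
        try {
            Future<Boolean> outer = handlers.submit(() -> {
                // Step 1: "forward" a request that the same pool must handle.
                Future<Long> inner = handlers.submit(() -> 42L);
                try {
                    // Step 2: timed wait for the forwarded response.
                    inner.get(waitMillis, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    // Step 3: no free handler thread ever ran the inner task.
                    return true;
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            handlers.shutdownNow();
        }
    }
}
```

With a single handler thread the inner task can never start while the outer
one is waiting, so the wait times out. With two threads the inner task runs on
the spare thread and the wait completes.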
We should not block while waiting for a new block and instead return
immediately to free the request handler threads.
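A minimal sketch of the non-blocking approach, assuming a callback-style
transport. Everything here is hypothetical and for illustration only: the
BlockRequester interface, the tiny BLOCK_SIZE, and the choice to return an
empty result (which a caller could translate into a retriable error) are
assumptions, not the actual fix.

```java
import java.util.OptionalLong;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongConsumer;

// Hypothetical sketch; not the actual RPCProducerIdManager.
class NonBlockingIdManagerSketch {
    // Hypothetical transport: sends an async request and later invokes the
    // callback with the first ID of the newly allocated block.
    interface BlockRequester {
        void requestBlock(LongConsumer onBlockStart);
    }

    static final int BLOCK_SIZE = 3; // kept small for illustration

    private final BlockRequester requester;
    private final AtomicBoolean requestInFlight = new AtomicBoolean(false);
    private long nextId = 0;
    private long blockEnd = 0; // exclusive; the initial block is empty

    NonBlockingIdManagerSketch(BlockRequester requester) {
        this.requester = requester;
    }

    // Never waits on the controller: returns an ID if the current block has
    // one, otherwise triggers at most one async request and returns empty.
    synchronized OptionalLong tryGenerateProducerId() {
        if (nextId < blockEnd) {
            return OptionalLong.of(nextId++);
        }
        if (requestInFlight.compareAndSet(false, true)) {
            requester.requestBlock(this::onBlockReceived);
        }
        return OptionalLong.empty();
    }

    // Invoked by the transport when the controller responds.
    private synchronized void onBlockReceived(long first) {
        nextId = first;
        blockEnd = first + BLOCK_SIZE;
        requestInFlight.set(false);
    }
}
```

The lock is only held for a few field updates, so a slow controller delays
the arrival of new IDs but does not tie up the calling threads.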
--
This message was sent by Atlassian Jira
(v8.20.10#820010)