Patrick Allen created KAFKA-20650:
-------------------------------------
Summary: In combined mode, BrokerServer and ControllerServer each
independently called QuotaFactory.instantiate(), which created a separate
ClientQuotaCallback instance via reflection for each role.
Key: KAFKA-20650
URL: https://issues.apache.org/jira/browse/KAFKA-20650
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 4.1.2, 4.1.1, 4.0.2, 4.3.0, 4.2.0, 4.1.0, 4.0.1, 4.0.0,
3.9.2, 3.9.1, 3.9.0, 3.8.1, 3.8.0, 3.7.2, 3.7.1, 3.7.0, 3.6.2, 3.6.1, 3.6.0,
3.5.2, 3.5.1, 3.5.0, 3.4.1, 3.4.0, 3.3.2, 3.3.1, 3.3.0, 3.2.3, 3.2.2, 3.2.1,
3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.2, 3.0.1, 3.0.0, 2.8.2, 2.8.1, 2.8.0, 2.8.3,
3.5.3
Reporter: Patrick Allen
In combined mode,
[BrokerServer|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerServer.scala]
and
[ControllerServer|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ControllerServer.scala]
each independently call
[QuotaFactory.instantiate()|https://github.com/apache/kafka/blob/trunk/core/src/main/java/kafka/server/QuotaFactory.java],
which creates a separate ClientQuotaCallback instance via reflection for each
role.
{code:scala}
// ControllerServer.scala
quotaManagers = QuotaFactory.instantiate(config, metrics, time,
  s"controller-${config.nodeId}-", ProcessRole.ControllerRole.toString)
{code}
{code:scala}
// BrokerServer.scala
quotaManagers = QuotaFactory.instantiate(config, metrics, time,
  s"broker-${config.nodeId}-", ProcessRole.BrokerRole.toString)
{code}
{code:java}
// QuotaFactory.java
Optional<Plugin<ClientQuotaCallback>> clientQuotaCallbackPlugin =
    createClientQuotaCallback(cfg, metrics, role);
{code}
This means that two independent callback objects, whose state can diverge, are
created in the same JVM. That is a regression from the pre-KRaft single-server
architecture, where only one instance existed.
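The effect can be reproduced outside Kafka with a minimal sketch. Everything
in it is hypothetical and for illustration only: CountingQuotaCallback stands
in for a stateful ClientQuotaCallback implementation, and instantiateForRole()
mimics a per-role reflective instantiation; neither is Kafka code.

```java
import java.util.concurrent.atomic.AtomicLong;

class DuplicateCallbackDemo {

    /** Hypothetical stand-in for a stateful ClientQuotaCallback implementation. */
    public static class CountingQuotaCallback {
        private final AtomicLong updates = new AtomicLong();

        public CountingQuotaCallback() {}

        public void recordUpdate() {
            updates.incrementAndGet();
        }

        public long updates() {
            return updates.get();
        }
    }

    /** Mimics per-role instantiation: every call reflectively builds a new object. */
    static CountingQuotaCallback instantiateForRole(String className) {
        try {
            return (CountingQuotaCallback) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        String className = CountingQuotaCallback.class.getName();

        // One instance per role, as happens today in combined mode.
        CountingQuotaCallback brokerCallback = instantiateForRole(className);
        CountingQuotaCallback controllerCallback = instantiateForRole(className);

        // Only the broker-side instance observes this update.
        brokerCallback.recordUpdate();

        System.out.println(brokerCallback == controllerCallback); // false
        System.out.println(brokerCallback.updates());             // 1
        System.out.println(controllerCallback.updates());         // 0
    }
}
```

Any state the plugin keeps in instance fields (caches, counters, connections)
is duplicated in the same way.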
This only affects combined-mode instances, which Kafka recommends only for
development environments. However, at my company we use a custom quota plugin
for our Kafka instances, which often run in resource-constrained environments
where we unfortunately have to use combined mode. We only noticed this issue
when migrating to KRaft for version 4.0, but I believe it has been present
since KRaft's inception (I checked 2.8, which does have it).
I have written a simple fix in which the quota callback is created in the
shared server and then passed into QuotaFactory.instantiate() as an argument,
so that a single instance is used by both servers. In my opinion this is the
quickest fix, and I am happy to contribute it (I am waiting on developer
permissions before I can assign this issue to myself).
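The shape of that fix can be sketched roughly as follows. This is not the
actual patch: QuotaCallback, QuotaManagers and the two-argument instantiate()
are hypothetical stand-ins chosen for illustration, and the real signatures in
Kafka differ.

```java
import java.util.Optional;

class SharedCallbackSketch {

    /** Hypothetical stand-in for the ClientQuotaCallback plugin. */
    static class QuotaCallback {}

    /** Hypothetical stand-in for the per-role quota managers. */
    static class QuotaManagers {
        final String role;
        final Optional<QuotaCallback> callback;

        QuotaManagers(String role, Optional<QuotaCallback> callback) {
            this.role = role;
            this.callback = callback;
        }
    }

    /** Hypothetical variant of instantiate() that receives the callback instead of creating it. */
    static QuotaManagers instantiate(String role, Optional<QuotaCallback> sharedCallback) {
        return new QuotaManagers(role, sharedCallback);
    }

    public static void main(String[] args) {
        // Created once by the shared server, before either role starts.
        Optional<QuotaCallback> shared = Optional.of(new QuotaCallback());

        QuotaManagers broker = instantiate("broker", shared);
        QuotaManagers controller = instantiate("controller", shared);

        // Both roles now hold the same callback object.
        System.out.println(broker.callback.get() == controller.callback.get()); // true
    }
}
```

Passing the instance in, rather than creating it inside the factory, leaves
the lifecycle of the callback with the shared server that owns both roles.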
I also have a wider question: does Kafka want to support custom quota
management on the controller, and if so, should it be configurable separately
so as to prevent this issue in combined mode? For example:
{code:java}
controller.client.quota.callback.class: <package>{code}