Patrick Allen created KAFKA-20650:
-------------------------------------

             Summary: In combined mode, BrokerServer and ControllerServer each 
independently called QuotaFactory.instantiate(), which created a separate 
ClientQuotaCallback instance via reflection for each role.
                 Key: KAFKA-20650
                 URL: https://issues.apache.org/jira/browse/KAFKA-20650
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 4.1.2, 4.1.1, 4.0.2, 4.3.0, 4.2.0, 4.1.0, 4.0.1, 4.0.0, 
3.9.2, 3.9.1, 3.9.0, 3.8.1, 3.8.0, 3.7.2, 3.7.1, 3.7.0, 3.6.2, 3.6.1, 3.6.0, 
3.5.2, 3.5.1, 3.5.0, 3.4.1, 3.4.0, 3.3.2, 3.3.1, 3.3.0, 3.2.3, 3.2.2, 3.2.1, 
3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.2, 3.0.1, 3.0.0, 2.8.2, 2.8.1, 2.8.0, 2.8.3, 
3.5.3
            Reporter: Patrick Allen


In combined mode, 
[BrokerServer|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerServer.scala]
 and 
[ControllerServer|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ControllerServer.scala]
 each independently call 
[QuotaFactory.instantiate()|https://github.com/apache/kafka/blob/trunk/core/src/main/java/kafka/server/QuotaFactory.java],
 which creates a separate ClientQuotaCallback instance via reflection for each 
role.
{code:java}
// ControllerServer.scala
quotaManagers = QuotaFactory.instantiate(config,
    metrics,
    time,
    s"controller-${config.nodeId}-", ProcessRole.ControllerRole.toString)
{code}
{code:java}
// BrokerServer.scala
quotaManagers = QuotaFactory.instantiate(config, metrics, time,
    s"broker-${config.nodeId}-", ProcessRole.BrokerRole.toString)
{code}
{code:java}
// QuotaFactory.java
Optional<Plugin<ClientQuotaCallback>> clientQuotaCallbackPlugin =
    createClientQuotaCallback(cfg, metrics, role);
{code}
 

This means that two independent callback objects exist in the same JVM, and 
their state can diverge. This is a regression from the pre-KRaft single-server 
architecture, where only one instance existed.
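
To illustrate the divergence, here is a minimal, self-contained sketch. It does 
not use the real Kafka classes: CountingQuotaCallback and instantiateForRole 
are hypothetical stand-ins for a stateful ClientQuotaCallback implementation 
and for the per-role reflective instantiation described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DivergentCallbackDemo {

    /** Hypothetical stand-in for a stateful ClientQuotaCallback implementation. */
    public static class CountingQuotaCallback {
        private final Map<String, Double> quotas = new ConcurrentHashMap<>();

        public void updateQuota(String clientId, double limit) {
            quotas.put(clientId, limit);
        }

        /** Returns the stored limit, or null if this instance never saw an update. */
        public Double quotaLimit(String clientId) {
            return quotas.get(clientId);
        }
    }

    /** Mimics per-role instantiation: every call reflectively builds a new object. */
    public static CountingQuotaCallback instantiateForRole(String role) {
        try {
            return CountingQuotaCallback.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create callback for role " + role, e);
        }
    }

    public static void main(String[] args) {
        CountingQuotaCallback brokerCallback = instantiateForRole("broker");
        CountingQuotaCallback controllerCallback = instantiateForRole("controller");

        // Only the broker-side instance receives the update.
        brokerCallback.updateQuota("client-a", 1024.0);

        System.out.println(brokerCallback.quotaLimit("client-a"));     // prints 1024.0
        System.out.println(controllerCallback.quotaLimit("client-a")); // prints null
    }
}
```

Any plugin that keeps in-memory state, or that assumes it is a singleton within 
the process, would see this kind of split view.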

This only affects combined-mode instances, which Kafka recommends only for 
development environments. However, at my company we use a custom quota plugin 
for our Kafka instances, which often run in resource-constrained environments 
where we unfortunately have to run in combined mode. We only noticed this issue 
when migrating to KRaft for version 4.0, but I believe it has been present 
since KRaft's inception (I checked 2.8, which does have it).

I have written a simple fix in which the quota callback is created once in the 
shared server and then passed into QuotaFactory.instantiate() as an argument, 
so that a single instance is used across both servers. In my opinion this is 
the quickest fix, and I am happy to contribute it (I am waiting on developer 
permissions before I can assign this issue to myself).
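
The shape of that fix can be sketched roughly as follows. This is not the 
actual patch: QuotaCallback, QuotaManagers, createCallbackOnce and the 
two-argument instantiate are hypothetical simplifications, and the real 
QuotaFactory.instantiate() takes more parameters, as shown in the snippets 
above.

```java
import java.util.Optional;

public class SharedCallbackSketch {

    /** Hypothetical stand-in for the ClientQuotaCallback plugin interface. */
    public interface QuotaCallback {}

    /** Hypothetical stand-in for the per-role result of QuotaFactory.instantiate(). */
    public static final class QuotaManagers {
        public final String role;
        public final Optional<QuotaCallback> callback;

        QuotaManagers(String role, Optional<QuotaCallback> callback) {
            this.role = role;
            this.callback = callback;
        }
    }

    /**
     * Called once by the shared server. Assumption: the real code would load
     * the configured callback class here instead of using an anonymous class.
     */
    public static Optional<QuotaCallback> createCallbackOnce() {
        return Optional.of(new QuotaCallback() {});
    }

    /** The factory receives the callback rather than creating one per role. */
    public static QuotaManagers instantiate(String role, Optional<QuotaCallback> sharedCallback) {
        return new QuotaManagers(role, sharedCallback);
    }

    public static void main(String[] args) {
        Optional<QuotaCallback> shared = createCallbackOnce();
        QuotaManagers broker = instantiate("broker", shared);
        QuotaManagers controller = instantiate("controller", shared);

        // Both roles now hold the same callback object.
        System.out.println(broker.callback.get() == controller.callback.get()); // prints true
    }
}
```

One consequence of sharing the instance is that exactly one owner should close 
it at shutdown, which the shared server seems the natural place for.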

I also have a wider question: does Kafka want to enable custom quota 
management on the controller, and if so, should we allow the controller's 
callback to be set separately in config, so as to prevent this issue in 
combined mode? For example:
{code:java}
controller.client.quota.callback.class: <package>{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
