ppatierno commented on PR #20859:
URL: https://github.com/apache/kafka/pull/20859#issuecomment-3580795668

   @kevin-wu24 thank you very much for looking into this! So to answer your 
questions ...
   
   > Question for @ppatierno regarding their use case: Just to confirm my 
understanding on the issue: when you remove the controller role from these 
nodes, is the operator "watching" the config file, and then sends the 
RemoveRaftVoter based on that? And then you end up in this state where the 
combined "controllers" are continuously being removed + added back because of 
auto-join?
   
   I am not sure what you mean by "is the operator "watching" the config file" 
but let me explain how it works in practice.
   What the operator detects (because the user has updated the corresponding 
`KafkaNodePool` custom resource, which holds the info for the broker/controller 
nodes) is that it has to remove the controller role from the combined nodes. So 
it knows what the new configuration for these nodes should be (i.e. removing 
"controller" from the `process.roles` property) and it rolls them so that they 
restart as broker-only with the new configuration. Before rolling, it removes 
them from the voters list by using the RemoveRaftVoter Admin API. 
You can see here that, with auto-join enabled, these controllers re-join 
immediately after the RemoveRaftVoter Admin API call and before the rolling. So 
they restart as broker-only while still being in the voters list, and try to 
send the UpdateVoteRequest (see 
https://issues.apache.org/jira/browse/KAFKA-19867).
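
   To make the sequence above concrete, here is a toy model of it. It is not 
Kafka or operator code: `Node`, `Quorum`, `tick` and `scale_down` are names I 
made up for illustration, and `tick` only approximates auto-join as "a running 
controller with auto-join enabled gets added back to the voters".

```python
# Toy model, not Kafka code: every name here is an illustrative assumption.
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    roles: set        # stands in for process.roles
    auto_join: bool   # stands in for the controller auto-join setting


@dataclass
class Quorum:
    voters: set = field(default_factory=set)

    def remove_raft_voter(self, node_id):
        # Stands in for the RemoveRaftVoter Admin API call.
        self.voters.discard(node_id)

    def tick(self, nodes):
        # Any running controller with auto-join enabled (re)joins the voters.
        for n in nodes:
            if "controller" in n.roles and n.auto_join:
                self.voters.add(n.node_id)


def scale_down(quorum, nodes, node):
    """Operator flow: remove the voter, then roll the node as broker-only."""
    quorum.remove_raft_voter(node.node_id)
    quorum.tick(nodes)        # auto-join fires before the roll happens
    node.roles = {"broker"}   # restart with the new process.roles
    quorum.tick(nodes)
    return node.node_id in quorum.voters


nodes = [Node(i, {"broker", "controller"}, auto_join=True) for i in range(3)]
quorum = Quorum(voters={0, 1, 2})
still_voter = scale_down(quorum, nodes, nodes[2])
print(still_voter)  # True: node 2 is broker-only but still listed as a voter
```

   Running the same flow with `auto_join=False` leaves node 2 out of the voters, 
which is the state the operator expects after the single roll.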
   
   > I think what you want is actually to first disable auto-join on the 
combined controller, restart it, and then remove it from the voter set. Then 
you are free to make it broker-only.
   
   This would make scaling the controllers more complex and slower, because the 
operator would have to:
   
   * disable auto-join and restart the nodes, still as "controller" (as 
combined nodes)
   * remove them from the voters set
   * reconfigure them by removing "controller" (making them broker-only) and 
restart them.
   
   That is two rolling restarts where just one would be sufficient, which for a 
big cluster is really a lot imho, on top of the complexity added to the 
operator, which would not be necessary thanks to this fix.
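
   As a rough sketch of the cost difference, the two procedures can be written 
as step lists and the rolls counted. The step labels and the node count of 3 
are just examples I picked, not real operator or Kafka calls.

```python
# Illustrative only: step names are labels, not real operator or Kafka calls.
WORKAROUND = [
    ("disable auto-join", "roll"),                     # restart, still combined
    ("RemoveRaftVoter", None),                         # no restart needed
    ("remove controller from process.roles", "roll"),  # restart as broker-only
]
WITH_FIX = [
    ("RemoveRaftVoter", None),
    ("remove controller from process.roles", "roll"),
]


def rolling_restarts(plan, node_count):
    """Total node restarts needed to apply `plan` to `node_count` nodes."""
    rolls_per_node = sum(1 for _, action in plan if action == "roll")
    return rolls_per_node * node_count


print(rolling_restarts(WORKAROUND, 3), rolling_restarts(WITH_FIX, 3))  # 6 3
```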

