ppatierno commented on PR #20859: URL: https://github.com/apache/kafka/pull/20859#issuecomment-3580795668
@kevin-wu24 thank you very much for looking into this! So to answer your questions ...

> Question for @ppatierno regarding their use case: Just to confirm my understanding on the issue: when you remove the controller role from these nodes, is the operator "watching" the config file, and then sends the RemoveRaftVoter based on that? And then you end up in this state where the combined "controllers" are continuously being removed + added back because of auto-join?

I am not sure what you mean by 'is the operator "watching" the config file', but let me explain how it works in practice. What the operator detects (because the user has updated the corresponding `KafkaNodePool` custom resource, which hosts the info for the broker/controller nodes) is that it has to remove the controller role from the combined nodes. So it knows what the new configuration for these nodes should be (i.e. removing "controller" from the `process.roles` property), and it then rolls them so that they restart as broker-only with the new configuration. Before rolling, it removes them from the voters list by using the RemoveRaftVoter Admin API. You can see here that with auto-join enabled, these controllers re-join immediately after the RemoveRaftVoter Admin API call and before rolling. So they restart as broker-only but are still in the voters list, and they try to send the UpdateVoteRequest (see https://issues.apache.org/jira/browse/KAFKA-19867).

> I think what you want is actually to first disable auto-join on the combined controller, restart it, and then remove it from the voter set. Then you are free to make it broker-only.

This would make scaling the controllers more complex and slower, because the operator would have to:

* disable auto-join and restart the nodes, still as "controller" (as combined nodes)
* remove them from the voters set
* reconfigure them by removing the "controller" role (making them broker-only) and restart them
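
To make the difference between the two flows concrete, here is a toy model in plain Java. It is only a sketch of the behaviour described in this comment, not Kafka or operator code: the class `AutoJoinSketch`, the `Node` type and all method names are made up for illustration, and the voter set is just an in-memory `Set`. The call to `voters.remove(...)` stands in for the RemoveRaftVoter Admin API, and `autoJoinTick` stands in for a running controller with auto-join enabled adding itself back.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these names exist in Kafka or in the operator.
public class AutoJoinSketch {

    // A node with the role and auto-join setting from its current configuration.
    static final class Node {
        final int id;
        boolean controller; // "controller" is present in process.roles
        boolean autoJoin;   // auto-join is enabled in the node's configuration

        Node(int id, boolean controller, boolean autoJoin) {
            this.id = id;
            this.controller = controller;
            this.autoJoin = autoJoin;
        }
    }

    // Assumed behaviour: a running controller with auto-join enabled
    // re-adds itself to the voter set whenever it is missing from it.
    static void autoJoinTick(Set<Integer> voters, Node node) {
        if (node.controller && node.autoJoin) {
            voters.add(node.id);
        }
    }

    // Flow the operator uses today: remove the voter, then roll once as broker-only.
    // Returns the number of rolling restarts performed.
    static int singleRoll(Set<Integer> voters, Node node) {
        voters.remove(node.id);     // stands in for RemoveRaftVoter
        autoJoinTick(voters, node); // node is still a running controller here
        node.controller = false;    // roll: restart without the "controller" role
        return 1;
    }

    // Flow suggested in the review: disable auto-join first, then remove, then roll again.
    static int twoRolls(Set<Integer> voters, Node node) {
        node.autoJoin = false;      // roll 1: still a combined node, auto-join disabled
        voters.remove(node.id);     // stands in for RemoveRaftVoter
        autoJoinTick(voters, node); // no effect, auto-join is off
        node.controller = false;    // roll 2: restart without the "controller" role
        return 2;
    }

    public static void main(String[] args) {
        Set<Integer> votersA = new TreeSet<>(List.of(1, 2, 3));
        int rollsA = singleRoll(votersA, new Node(3, true, true));
        System.out.println("single roll: rolls=" + rollsA + ", voters=" + votersA);

        Set<Integer> votersB = new TreeSet<>(List.of(1, 2, 3));
        int rollsB = twoRolls(votersB, new Node(3, true, true));
        System.out.println("two rolls:   rolls=" + rollsB + ", voters=" + votersB);
    }
}
```

In this model the single-roll flow leaves node 3 in the voter set after it has become broker-only, which is the stale state described above, while the two-roll flow avoids it at the cost of an extra restart of every affected node.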
Two rolling restarts where just one would be sufficient is really a lot for a big cluster imho, on top of the complexity added to the operator, which would not be necessary thanks to this fix.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
