On Mon, Oct 14, 2024, at 15:04, Jakub Scholz wrote:
> Hi Colin,
>
> So, how exactly does this major misconfiguration that was not documented
> for over a year and nobody complained manifest itself? What should I look
> for in the logs? What are the problems it manifests itself through? There
> are plenty of users who went through the migration with this "major
> misconfiguration". So what should they be looking for? Did they lose
> messages? Does it have security consequences that require a CVE? How
> does one recover from the problems it caused?
>
> I do not think changing the controller name is as simple as you suggest.
> And it is especially not simple if you need to roll it out to thousands of
> users.
>
> I also feel that issues like this need to be taken more seriously than just
> shrugging it off and quickly closing JIRA someone else opened as invalid.
> This is not the first thing that was found as undocumented. It indicates
> significant quality issues, at least on the documentation side. How many
> other "major misconfigurations" are still left?
Hi Jakub,

I understand your frustration. Let me explain.

Kafka wants to divide up all listeners into either broker listeners or
controller listeners. The sets are disjoint: a listener can't be both.
BrokerServer will try to open the ports that belong to the broker;
ControllerServer will try to open the ports that belong to the
controller. You obviously can't open the same port twice (in standard
UNIX, at least), or if you somehow did, the result would be nonsense.
This isn't such a big deal on broker-only or controller-only nodes, but
it becomes a bigger deal in "combined" mode, where a process functions
as both broker and controller.

You are getting the error message "requirement failed:
control.plane.listener.name must be a listener name defined in
advertised.listeners" because the broker is looking for a listener named
CONTROLPLANE-9090, but it already knows that this is NOT an advertised
listener for the broker; it is an advertised listener for the
controller. You would get the same error if you tried to set
inter.broker.listener.name (the broker replication listener) to the
CONTROLLER listener.

The reason this wasn't an issue in 3.8.0 but is in 3.9-RC2 is that in
3.8.0 and earlier, advertised.listeners was just for broker listeners.
There was never a need to put a controller listener in there because
controller listeners were statically configured by
controller.quorum.voters. In fact, when in KRaft mode, it would be a
fatal configuration error to put a controller listener into
advertised.listeners. Unfortunately, we failed to enforce this when the
broker was in migration mode.

I guess for the purpose of being bug-compatible with 3.8, we could make
an exception here and force the listener specified by
control.plane.listener.name to appear in
effectiveAdvertisedBrokerListeners. Since control plane listeners are
going away anyway, we won't have to support this exception for very
long.
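
To make the rule above concrete, here is a minimal Python sketch of the
check. It is an illustrative model only, not Kafka's actual code: the
function names, the ValueError, and the REPLICATION-9091 listener are
assumptions made up for the example. Only the config keys, the
CONTROLPLANE-9090 listener name, and the error text come from this
thread.

```python
# Illustrative model of the broker-side check; NOT Kafka's implementation.
ERROR = ("requirement failed: control.plane.listener.name must be "
         "a listener name defined in advertised.listeners")


def listener_names(value):
    """Extract listener names from 'NAME://host:port,...' or 'NAME,...'."""
    return [entry.split("://", 1)[0].strip()
            for entry in value.split(",") if entry.strip()]


def effective_advertised_broker_listeners(config):
    """Advertised listeners minus the ones claimed by the controller.

    Broker and controller listener sets are disjoint, so any name listed
    in controller.listener.names is not a broker listener.
    """
    controller = set(listener_names(config.get("controller.listener.names", "")))
    advertised = listener_names(config.get("advertised.listeners", ""))
    return [name for name in advertised if name not in controller]


def validate(config):
    """Raise if the control plane listener is not a broker listener."""
    broker = effective_advertised_broker_listeners(config)
    control_plane = config.get("control.plane.listener.name")
    if control_plane is not None and control_plane not in broker:
        raise ValueError(ERROR)
    return broker
```

With controller.listener.names and control.plane.listener.name both set
to CONTROLPLANE-9090, validate() raises, because the controller has
already claimed that name. Giving the controller listener a different
name lets the same call pass.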

By the way, I wasn't trying to be dismissive of your bug report (in fact
I spent several hours on it). I felt (in fact I still feel) that it's a
configuration error. But your compatibility argument is a reasonable
one. So let's be compatible.

best,
Colin

>
> Jakub
>
> On Mon, Oct 14, 2024 at 11:33 PM Colin McCabe <cmcc...@apache.org> wrote:
>
>> Hi Jakub,
>>
>> It has always been required to separate control plane listeners and
>> controller listeners. Failing to do this is a major misconfiguration. It
>> may not have been caught sometimes, but that is a bug.
>>
>> It should be simple to fix the configuration you posted -- simply have a
>> different name for the controller listener than the control plane listener.
>>
>> best,
>> Colin
>>
>> On Mon, Oct 14, 2024, at 11:16, Jakub Scholz wrote:
>> > The different name of the controller listener for KRaft controllers and
>> > control plane listener in ZooKeeper-based cluster was not required before
>> > and it is not simple to change to handle now at the "last minute". So
>> > given that this is called production-ready already for some time, I
>> > think this is breaking API change and should be treated as such.
>> >
>> > Thanks & Regards
>> > Jakub
>> >
>> > On Mon, Oct 14, 2024 at 7:55 PM Colin McCabe <cmcc...@apache.org> wrote:
>> >
>> >> Hi Jakub,
>> >>
>> >> After looking through the attached file on the JIRA, I can say that
>> >> this is a misconfiguration. control.plane.listener is a totally
>> >> separate concept from control.plane.listener.name. They should never
>> >> be set to the same value. The controller listener must have a
>> >> different name and value than the control plane listener (if any).
>> >>
>> >> I also tested myself that KRaft migration works with
>> >> control.plane.listener configured. It works on both 3.8 and 3.9-RC2.
>> >>
>> >> My initial statement that control.plane.listener was not supported
>> >> during ZK migration was incorrect. As you said, it is supported during
>> >> migration up to the point that we are in KRaft mode. (Another reason
>> >> why having the control plane listener = controller listener would not
>> >> make sense.)
>> >>
>> >> Thanks for the bug report and discussion. I've closed this as invalid
>> >> now that I have tested migration using control.plane.listener for
>> >> myself and verified that it works.
>> >>
>> >> best,
>> >> Colin
>> >>
>> >> On Mon, Oct 14, 2024, at 08:31, Jakub Scholz wrote:
>> >> >> control.plane.listener is not (and never has been) supported in
>> >> >> KRaft mode.
>> >> >
>> >> > You mean control.plane.listener.name is not supported in KRaft I
>> >> > guess? Well, this is not KRaft, this is migration, so it uses the
>> >> > settings that it used before for the Zoo-based cluster and that
>> >> > includes using dedicated control plane listener. I don't think I can
>> >> > "just remove it" because the other nodes will use it during the
>> >> > rolling update.
>> >> >
>> >> > This also worked fine with 3.8 (and 3.7, etc.) -> so if it is not
>> >> > supported now, it is a breaking API change I guess which should be a
>> >> > blocker.
>> >> >
>> >> > Thanks & Regards
>> >> > Jakub
>> >> >
>> >> > On Mon, Oct 14, 2024 at 5:12 PM Colin McCabe <cmcc...@apache.org> wrote:
>> >> >
>> >> >> Hi Jakub,
>> >> >>
>> >> >> Thanks for testing. control.plane.listener is not (and never has
>> >> >> been) supported in KRaft mode. You have to remove
>> >> >> control.plane.listener configurations before migrating. I filed
>> >> >> KAFKA-17790 to document this in the migration instructions. (This
>> >> >> is not a blocker for the release, though.)
>> >> >>
>> >> >> best,
>> >> >> Colin
>> >> >>
>> >> >> On Mon, Oct 14, 2024, at 02:52, Jakub Scholz wrote:
>> >> >> > Hi Colin,
>> >> >> >
>> >> >> > Thanks for the RC. I did some testing of it and run into
>> >> >> > https://issues.apache.org/jira/browse/KAFKA-17788 which seems to
>> >> >> > be a regression in the migration to KRaft process.
>> >> >> >
>> >> >> > Can someone who understands this part of the codebase look into
>> >> >> > it please?
>> >> >> >
>> >> >> > Thanks & Regards
>> >> >> > Jakub
>> >> >> >
>> >> >> > On Thu, Oct 10, 2024 at 11:16 PM Colin McCabe <cmcc...@apache.org> wrote:
>> >> >> >
>> >> >> >> This is the second candidate for the release of Apache Kafka
>> >> >> >> 3.9.0. I have titled it rc2 since I had an rc1 which got very
>> >> >> >> far, even to the point of pushing tags and docker images, before
>> >> >> >> I spotted an issue. So rather than mutate the tags, I decided to
>> >> >> >> skip over rc1.
>> >> >> >>
>> >> >> >> - This is a major release, the final one in the 3.x line. (There
>> >> >> >>   may of course be other minor releases in this line, such as
>> >> >> >>   3.9.1.)
>> >> >> >> - Tiered storage will be considered production-ready in this
>> >> >> >>   release.
>> >> >> >> - This will be the final major release to feature the deprecated
>> >> >> >>   ZooKeeper mode.
>> >> >> >>
>> >> >> >> This release includes the following KIPs:
>> >> >> >> - KIP-853: Support dynamically changing KRaft controller
>> >> >> >>   membership
>> >> >> >> - KIP-1057: Add remote log metadata flag to the dump log tool
>> >> >> >> - KIP-1049: Add config log.summary.interval.ms to Kafka Streams
>> >> >> >> - KIP-1040: Improve handling of nullable values in InsertField,
>> >> >> >>   ExtractField, and other transformations
>> >> >> >> - KIP-1031: Control offset translation in MirrorSourceConnector
>> >> >> >> - KIP-1033: Add Kafka Streams exception handler for exceptions
>> >> >> >>   occurring during processing
>> >> >> >> - KIP-1017: Health check endpoint for Kafka Connect
>> >> >> >> - KIP-1025: Optionally URL-encode clientID and clientSecret in
>> >> >> >>   authorization header
>> >> >> >> - KIP-1005: Expose EarliestLocalOffset and TieredOffset
>> >> >> >> - KIP-950: Tiered Storage Disablement
>> >> >> >> - KIP-956: Tiered Storage Quotas
>> >> >> >>
>> >> >> >> Release notes for the 3.9.0 release:
>> >> >> >> https://dist.apache.org/repos/dist/dev/kafka/3.9.0-rc2/RELEASE_NOTES.html
>> >> >> >>
>> >> >> >> *** Please download, test and vote by October 16, 2024.
>> >> >> >>
>> >> >> >> Kafka's KEYS file containing PGP keys we use to sign the release:
>> >> >> >> https://kafka.apache.org/KEYS
>> >> >> >>
>> >> >> >> * Release artifacts to be voted upon (source and binary):
>> >> >> >> https://dist.apache.org/repos/dist/dev/kafka/3.9.0-rc2/
>> >> >> >>
>> >> >> >> * Docker release artifacts to be voted upon:
>> >> >> >> apache/kafka:3.9.0-rc2
>> >> >> >> apache/kafka-native:3.9.0-rc2
>> >> >> >>
>> >> >> >> * Maven artifacts to be voted upon:
>> >> >> >> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>> >> >> >>
>> >> >> >> * Javadoc:
>> >> >> >> https://dist.apache.org/repos/dist/dev/kafka/3.9.0-rc2/javadoc/
>> >> >> >>
>> >> >> >> * Documentation:
>> >> >> >> https://kafka.apache.org/39/documentation.html
>> >> >> >>
>> >> >> >> * Protocol:
>> >> >> >> https://kafka.apache.org/39/protocol.html
>> >> >> >>
>> >> >> >> * Tag to be voted upon (off 3.9 branch) is the 3.9.0-rc2 tag:
>> >> >> >> https://github.com/apache/kafka/releases/tag/3.9.0-rc2
>> >> >> >>
>> >> >> >> * Successful Docker Image Github Actions Pipeline for 3.9 branch:
>> >> >> >> Docker Build Test Pipeline (JVM):
>> >> >> >> https://github.com/apache/kafka/actions/runs/11281563007
>> >> >> >> Docker Build Test Pipeline (Native):
>> >> >> >> https://github.com/apache/kafka/actions/runs/11281608809
>> >> >> >>
>> >> >> >> Thanks to everyone who helped with this release candidate,
>> >> >> >> either by contributing code, testing, or documentation.
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >> Colin