Hi Stan, I wanted to share some updates about the bugs you shared earlier.
- KAFKA-14616: I've reviewed and tested the PR from Colin and have observed the fix works as intended. - KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed fix. I've therefore raised https://github.com/apache/kafka/pull/15270 following a discussion with Luke in JIRA. - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting feedback/reviews at https://github.com/apache/kafka/pull/15136 In addition to the above, there are 2 JIRAs I'd like to bring everyone's attention to: - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. I've raised https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it. - KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. This should hopefully get merged soon. Regards, Gaurav > On 24 Jan 2024, at 11:51, ka...@gnarula.com wrote: > > Hi Stanislav, > > Thanks for bringing these JIRAs/PRs up. > > I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I > hope to have some feedback > by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's > away. I'll try to build on his work in the meantime. > > As for KAFKA-16082, we haven't been able to deduce a data loss scenario. > There's a PR open > by me for promoting an abandoned future replica with approvals from Omnia and > Proven, > so I'd appreciate a committer reviewing it. > > Regards, > Gaurav > > On 23 Jan 2024, at 20:17, Stanislav Kozlovski > <stanis...@confluent.io.INVALID> wrote: >> >> Hey all, I figured I'd give an update about what known blockers we have >> right now: >> >> - KAFKA-16101: KRaft migration rollback documentation is incorrect - >> https://github.com/apache/kafka/pull/15193; This need not block RC >> creation, but we need the docs updated so that people can test properly >> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs - >> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that >> this is blocking JBOD for 3.7 >> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 - >> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232 >> - although I understand Proveen is out of office >> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am >> hearing mixed opinions on whether this is a blocker ( >> https://github.com/apache/kafka/pull/15136) >> >> Given that there are 3 JBOD blocker bugs, and I am not confident they will >> all be merged this week - I am on the edge of voting to revert JBOD from >> this release, or mark it early access. >> >> By all accounts, it seems that if we keep with JBOD the release will have >> to spill into February, which is a month extra from the time-based release >> plan we had of start of January. >> >> Can I ask others for an opinion? >> >> Best, >> Stan >> >> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen <show...@gmail.com> wrote: >> >>> Hi all, >>> >>> I think I've found another blocker issue: KAFKA-16162 >>> <https://issues.apache.org/jira/browse/KAFKA-16162> . >>> The impact is after upgrading to 3.7.0, any new created topics/partitions >>> will be unavailable. >>> I've put my findings in the JIRA. >>> >>> Thanks. >>> Luke >>> >>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax <mj...@apache.org> wrote: >>> >>>> Stan, thanks for driving this all forward! Excellent job. >>>> >>>> About >>>> >>>>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141 >>>>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139 >>>> >>>> For `StreamsUpgradeTest` it was a test setup issue and should be fixed >>>> now in trunk and 3.7 (and actually also in 3.6...) >>>> >>>> For `StreamsStandbyTask` the failing test exposes a regression bug, so >>>> it's a blocker. I updated the ticket accordingly. We already have an >>>> open PR that reverts the code introducing the regression. >>>> >>>> >>>> -Matthias >>>> >>>> On 1/17/24 9:44 AM, Proven Provenzano wrote: >>>>> We have another blocking issue for the RC : >>>>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar >>>> to >>>>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue >>> however >>>>> can lead to the new topic having partitions that a producer cannot >>> write >>>> to. >>>>> >>>>> --Proven >>>>> >>>>> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano < >>>> pprovenz...@confluent.io> >>>>> wrote: >>>>> >>>>>> >>>>>> I have a PR https://github.com/apache/kafka/pull/15197 for >>>>>> https://issues.apache.org/jira/browse/KAFKA-16131 that is building >>> now. >>>>>> --Proven >>>>>> >>>>>> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz <ja...@scholz.cz> wrote: >>>>>> >>>>>>> *> Hi Jakub,> > Thanks for trying the RC. I think what you found is a >>>>>>> blocker bug because it * >>>>>>> *> will generate huge amount of logspam. I guess we didn't find it in >>>>>>> junit >>>>>>> tests * >>>>>>> *> since logspam doesn't fail the automated tests. But certainly it's >>>> not >>>>>>> suitable * >>>>>>> *> for production. Did you file a JIRA yet?* >>>>>>> >>>>>>> Hi Colin, >>>>>>> >>>>>>> I opened https://issues.apache.org/jira/browse/KAFKA-16131. >>>>>>> >>>>>>> Thanks & Regards >>>>>>> Jakub >>>>>>> >>>>>>> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe <cmcc...@apache.org> >>>> wrote: >>>>>>> >>>>>>>> Hi Stanislav, >>>>>>>> >>>>>>>> Thanks for making the first RC. The fact that it's titled RC2 is >>>> messing >>>>>>>> with my mind a bit. I hope this doesn't make people think that we're >>>>>>>> farther along than we are, heh. >>>>>>>> >>>>>>>> On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote: >>>>>>>>> *> Nice catch! It does seem like we should have gated this behind >>> the >>>>>>>>> metadata> version as KIP-858 implies. Is the cluster configured >>> with >>>>>>>>> multiple log> dirs? What is the impact of the error messages?* >>>>>>>>> >>>>>>>>> I did not observe any obvious impact. I was able to send and >>> receive >>>>>>>>> messages as normally. But to be honest, I have no idea what else >>>>>>>>> this might impact, so I did not try anything special. >>>>>>>>> >>>>>>>>> I think everyone upgrading an existing KRaft cluster will go >>> through >>>>>>> this >>>>>>>>> stage (running Kafka 3.7 with an older metadata version for at >>> least >>>> a >>>>>>>>> while). So even if it is just a logged exception without any other >>>>>>>> impact I >>>>>>>>> wonder if it might scare users from upgrading. But I leave it to >>>>>>> others >>>>>>>> to >>>>>>>>> decide if this is a blocker or not. >>>>>>>>> >>>>>>>> >>>>>>>> Hi Jakub, >>>>>>>> >>>>>>>> Thanks for trying the RC. I think what you found is a blocker bug >>>>>>> because >>>>>>>> it will generate huge amount of logspam. I guess we didn't find it >>> in >>>>>>> junit >>>>>>>> tests since logspam doesn't fail the automated tests. But certainly >>>> it's >>>>>>>> not suitable for production. Did you file a JIRA yet? >>>>>>>> >>>>>>>>> On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski >>>>>>>>> <stanis...@confluent.io.invalid> wrote: >>>>>>>>> >>>>>>>>>> Hey Luke, >>>>>>>>>> >>>>>>>>>> This is an interesting problem. Given the fact that the KIP for >>>>>>> having a >>>>>>>>>> 3.8 release passed, I think it weights the scale towards not >>> calling >>>>>>>> this a >>>>>>>>>> blocker and expecting it to be solved in 3.7.1. >>>>>>>>>> >>>>>>>>>> It is unfortunate that it would not seem safe to migrate to KRaft >>> in >>>>>>>> 3.7.0 >>>>>>>>>> (given the inability to rollback safely), but if that's true - the >>>>>>> same >>>>>>>>>> case would apply for 3.6.0. So in any case users w\ould be >>> expected >>>>>>> to >>>>>>>> use a >>>>>>>>>> patch release for this. >>>>>>>> >>>>>>>> Hi Luke, >>>>>>>> >>>>>>>> Thanks for testing rollback. I think this is a case where the >>>>>>>> documentation is wrong. The intention was to for the steps to >>>> basically >>>>>>> be: >>>>>>>> >>>>>>>> 1. roll all the brokers into zk mode, but with migration enabled >>>>>>>> 2. take down the kraft quorum >>>>>>>> 3. rmr /controller, allowing a hybrid broker to take over. >>>>>>>> 4. roll all the brokers into zk mode without migration enabled (if >>>>>>> desired) >>>>>>>> >>>>>>>> With these steps, there isn't really unavailability since a ZK >>>>>>> controller >>>>>>>> can be elected quickly after the kraft quorum is gone. >>>>>>>> >>>>>>>>>> Further, since we will have a 3.8 release - it is >>>>>>>>>> likely we will ultimately recommend users upgrade from that >>> version >>>>>>>> given >>>>>>>>>> its aim is to have strategic KRaft feature parity with ZK. >>>>>>>>>> That being said, I am not 100% on this. Let me know whether you >>>> think >>>>>>>> this >>>>>>>>>> should block the release, Luke. I am also tagging Colin and David >>> to >>>>>>>> weigh >>>>>>>>>> in with their opinions, as they worked on the migration logic. >>>>>>>> >>>>>>>> The rollback docs are new in 3.7 so the fact that they're wrong is a >>>>>>> clear >>>>>>>> blocker, I think. But easy to fix, I believe. I will create a PR. >>>>>>>> >>>>>>>> best, >>>>>>>> Colin >>>>>>>> >>>>>>>>>> >>>>>>>>>> Hey Kirk and Chris, >>>>>>>>>> >>>>>>>>>> Unless I'm missing something - KAFKALESS-16029 is simply a bad log >>>>>>> due >>>>>>>> to >>>>>>>>>> improper closing. And the PR description implies this has been >>>>>>> present >>>>>>>>>> since 3.5. While annoying, I don't see a strong reason for this to >>>>>>> block >>>>>>>>>> the release. >>>>>>>>>> >>>>>>>>>> Hey Jakub, >>>>>>>>>> >>>>>>>>>> Nice catch! It does seem like we should have gated this behind the >>>>>>>> metadata >>>>>>>>>> version as KIP-858 implies. Is the cluster configured with >>> multiple >>>>>>> log >>>>>>>>>> dirs? What is the impact of the error messages? >>>>>>>>>> >>>>>>>>>> Tagging Igor (the author of the KIP) to weigh in. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Stanislav >>>>>>>>>> >>>>>>>>>> On Sat, Jan 13, 2024 at 7:22 PM Jakub Scholz <ja...@scholz.cz> >>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I was trying the RC2 and run into the following issue ... when I >>>>>>> run >>>>>>>>>>> 3.7.0-RC2 KRaft cluster with metadata version set to 3.6-IV2 >>>>>>> metadata >>>>>>>>>>> version, I seem to be getting repeated errors like this in the >>>>>>>> controller >>>>>>>>>>> logs: >>>>>>>>>>> >>>>>>>>>>> 2024-01-13 16:58:01,197 INFO [QuorumController id=0] >>>>>>>>>> assignReplicasToDirs: >>>>>>>>>>> event failed with UnsupportedVersionException in 15 microseconds. >>>>>>>>>>> (org.apache.kafka.controller.QuorumController) >>>>>>>>>>> [quorum-controller-0-event-handler] >>>>>>>>>>> 2024-01-13 16:58:01,197 ERROR [ControllerApis nodeId=0] >>> Unexpected >>>>>>>> error >>>>>>>>>>> handling request RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS, >>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14, headerVersion=2) >>> -- >>>>>>>>>>> AssignReplicasToDirsRequestData(brokerId=1000, brokerEpoch=5, >>>>>>>>>>> directories=[DirectoryData(id=w_uxN7pwQ6eXSMrOKceYIQ, >>>>>>>>>>> topics=[TopicData(topicId=bvAKLSwmR7iJoKv2yZgygQ, >>>>>>>>>>> partitions=[PartitionData(partitionIndex=2), >>>>>>>>>>> PartitionData(partitionIndex=1)]), >>>>>>>>>>> TopicData(topicId=uNe7f5VrQgO0zST6yH1jDQ, >>>>>>>>>>> partitions=[PartitionData(partitionIndex=0)])])]) with context >>>>>>>>>>> >>> RequestContext(header=RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS, >>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14, headerVersion=2), >>>>>>>>>>> connectionId='172.16.14.219:9090-172.16.14.217:53590-7', >>>>>>>> clientAddress=/ >>>>>>>>>>> 172.16.14.217, principal=User:CN=my-cluster-kafka,O=io.strimzi, >>>>>>>>>>> listenerName=ListenerName(CONTROLPLANE-9090), >>> securityProtocol=SSL, >>>>>>>>>>> >>> clientInformation=ClientInformation(softwareName=apache-kafka-java, >>>>>>>>>>> softwareVersion=3.7.0), fromPrivilegedListener=false, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@71004ad2 >>>>>>>>>>> ]) >>>>>>>>>>> (kafka.server.ControllerApis) [quorum-controller-0-event-handler] >>>>>>>>>>> java.util.concurrent.CompletionException: >>>>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException: >>>>>>> Directory >>>>>>>>>>> assignment is not supported yet. >>>>>>>>>>> >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) >>>>>>>>>>> at >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) >>>>>>>>>>> at java.base/java.lang.Thread.run(Thread.java:840) >>>>>>>>>>> >>>>>>>>>>> Caused by: >>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException: >>>>>>>>>>> Directory assignment is not supported yet. >>>>>>>>>>> >>>>>>>>>>> Is that expected? I guess with the metadata version set to >>>>>>> 3.6-IV2, it >>>>>>>>>>> makes sense that the request is not supported. But shouldn't then >>>>>>> the >>>>>>>>>>> request not be sent at all by the brokers? (I did not opened a >>> JIRA >>>>>>>> for >>>>>>>>>> it, >>>>>>>>>>> but I can open one if you agree this is not expected) >>>>>>>>>>> >>>>>>>>>>> Thanks & Regards >>>>>>>>>>> Jakub >>>>>>>>>>> >>>>>>>>>>> On Sat, Jan 13, 2024 at 8:03 AM Luke Chen <show...@gmail.com> >>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Stanislav, >>>>>>>>>>>> >>>>>>>>>>>> I commented in the "Apache Kafka 3.7.0 Release" thread, but >>> maybe >>>>>>>> you >>>>>>>>>>>> missed it. >>>>>>>>>>>> cross-posting here: >>>>>>>>>>>> >>>>>>>>>>>> There is a bug KAFKA-16101 >>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-16101> reporting >>>>>>> that >>>>>>>>>>> "Kafka >>>>>>>>>>>> cluster will be unavailable during KRaft migration rollback". >>>>>>>>>>>> The impact for this issue is that if brokers try to rollback to >>>>>>> ZK >>>>>>>> mode >>>>>>>>>>>> during KRaft migration process, there will be a period of time >>>>>>> the >>>>>>>>>>> cluster >>>>>>>>>>>> is unavailable. >>>>>>>>>>>> Since ZK migrating to KRaft feature is a production ready >>>>>>> feature, I >>>>>>>>>>> think >>>>>>>>>>>> this should be addressed soon. >>>>>>>>>>>> Do you think this is a blocker for v3.7.0? >>>>>>>>>>>> >>>>>>>>>>>> Thanks. >>>>>>>>>>>> Luke >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:36 AM Chris Egerton < >>>>>>>> fearthecel...@gmail.com >>>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, Kirk! >>>>>>>>>>>>> >>>>>>>>>>>>> @Stanislav--do you believe that this warrants a new RC? >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jan 12, 2024, 19:08 Kirk True <k...@kirktrue.pro> >>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Chris/Stanislav, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm working on the 'Unable to find FetchSessionHandler' log >>>>>>>> problem >>>>>>>>>>>>>> (KAFKA-16029) and have put out a draft PR ( >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/15186). I will use the >>>>>>>>>>> quickstart >>>>>>>>>>>>>> approach as a second means to reproduce/verify while I wait >>>>>>> for >>>>>>>> the >>>>>>>>>>>> PR's >>>>>>>>>>>>>> Jenkins job to finish. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Kirk >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jan 12, 2024, at 11:31 AM, Chris Egerton wrote: >>>>>>>>>>>>>>> Hi Stanislav, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for running this release! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> To verify, I: >>>>>>>>>>>>>>> - Built from source using Java 11 with both: >>>>>>>>>>>>>>> - - the 3.7.0-rc2 tag on GitHub >>>>>>>>>>>>>>> - - the kafka-3.7.0-src.tgz artifact from >>>>>>>>>>>>>>> >>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/ >>>>>>>>>>>>>>> - Checked signatures and checksums >>>>>>>>>>>>>>> - Ran the quickstart using both: >>>>>>>>>>>>>>> - - The kafka_2.13-3.7.0.tgz artifact from >>>>>>>>>>>>>>> >>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/ >>>>>>>>>> with >>>>>>>>>>>> Java >>>>>>>>>>>>>> 11 >>>>>>>>>>>>>>> and Scala 13 in KRaft mode >>>>>>>>>>>>>>> - - Our shiny new broker Docker image, >>>>>>> apache/kafka:3.7.0-rc2 >>>>>>>>>>>>>>> - Ran all unit tests >>>>>>>>>>>>>>> - Ran all integration tests for Connect and MM2 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I found two minor areas for concern: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. (Possibly a blocker) >>>>>>>>>>>>>>> When running the quickstart, I noticed this ERROR-level log >>>>>>>>>> message >>>>>>>>>>>>> being >>>>>>>>>>>>>>> emitted frequently (not not every time) when I killed my >>>>>>>> console >>>>>>>>>>>>> consumer >>>>>>>>>>>>>>> via ctrl-C: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> [2024-01-12 11:00:31,088] ERROR [Consumer >>>>>>>>>>>> clientId=console-consumer, >>>>>>>>>>>>>>> groupId=console-consumer-74388] Unable to find >>>>>>>>>> FetchSessionHandler >>>>>>>>>>>> for >>>>>>>>>>>>>> node >>>>>>>>>>>>>>> 1. Ignoring fetch response >>>>>>>>>>>>>>> (org.apache.kafka.clients.consumer.internals.AbstractFetch) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I see that this error message is already reported in >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16029. I >>>>>>> think we >>>>>>>>>>> should >>>>>>>>>>>>>>> prioritize fixing it for this release. I know it's probably >>>>>>>>>> benign >>>>>>>>>>>> but >>>>>>>>>>>>>> it's >>>>>>>>>>>>>>> really not a good look for us when basic operations log >>>>>>> error >>>>>>>>>>>> messages, >>>>>>>>>>>>>> and >>>>>>>>>>>>>>> it may give new users some headaches. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. (Probably not a blocker) >>>>>>>>>>>>>>> The following unit tests failed the first time around, and >>>>>>>> all of >>>>>>>>>>>> them >>>>>>>>>>>>>>> passed the second time I ran them: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> - (clients) >>>>>>>>>>>>>> >>>>>>> ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup() >>>>>>>>>>>>>>> - (clients) SelectorTest.testConnectionsByClientMetric() >>>>>>>>>>>>>>> - (clients) >>>>>>> Tls13SelectorTest.testConnectionsByClientMetric() >>>>>>>>>>>>>>> - (connect) >>>>>>>>>>>> TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound >>>>>>>>>>>>> (I >>>>>>>>>>>>>>> thought I fixed this one! 🤬🤬) >>>>>>>>>>>>>>> - (core) >>>>>>>> ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks again for your work on this release, and >>>>>>>> congratulations >>>>>>>>>> to >>>>>>>>>>>>> Kafka >>>>>>>>>>>>>>> Streams for having zero flaky unit tests during my >>>>>>>>>>>> highly-experimental >>>>>>>>>>>>>>> single laptop run! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski >>>>>>>>>>>>>>> <stanis...@confluent.io.invalid> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hello Kafka users, developers, and client-developers, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This is the first candidate for release of Apache Kafka >>>>>>>> 3.7.0. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Note it's named "RC2" because I had a few "failed" RCs >>>>>>> that >>>>>>>> I >>>>>>>>>> had >>>>>>>>>>>>>>>> cut/uploaded but ultimately had to scrap prior to >>>>>>> announcing >>>>>>>>>> due >>>>>>>>>>> to >>>>>>>>>>>>> new >>>>>>>>>>>>>>>> blockers arriving before I could even announce them. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Further - I haven't yet been able to set up the system >>>>>>> tests >>>>>>>>>>>>>> successfully. >>>>>>>>>>>>>>>> And the integration/unit tests do have a few failures >>>>>>> that I >>>>>>>>>> have >>>>>>>>>>>> to >>>>>>>>>>>>>> spend >>>>>>>>>>>>>>>> time triaging. I would appreciate any help in case anyone >>>>>>>>>> notices >>>>>>>>>>>> any >>>>>>>>>>>>>> tests >>>>>>>>>>>>>>>> failing that they're subject matters experts in. Expect >>>>>>> me >>>>>>>> to >>>>>>>>>>>> follow >>>>>>>>>>>>>> up in >>>>>>>>>>>>>>>> a day or two with more detailed analysis. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Major changes include: >>>>>>>>>>>>>>>> - Early Access to KIP-848 - the next generation of the >>>>>>>> consumer >>>>>>>>>>>>>> rebalance >>>>>>>>>>>>>>>> protocol >>>>>>>>>>>>>>>> - KIP-858: Adding JBOD support to KRaft >>>>>>>>>>>>>>>> - KIP-714: Observability into Client metrics via a >>>>>>>> standardized >>>>>>>>>>>>>> interface >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Check more information in the WIP blog post: >>>>>>>>>>>>>>>> https://github.com/apache/kafka-site/pull/578 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Release notes for the 3.7.0 release: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *** Please download, test and vote by Thursday, January >>>>>>> 18, >>>>>>>> 9am >>>>>>>>>>> PT >>>>>>>>>>>>> *** >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Usually these deadlines tend to be 2-3 days, but due to >>>>>>> this >>>>>>>>>>> being >>>>>>>>>>>>> the >>>>>>>>>>>>>>>> first RC and the tests not having ran yet, I am giving >>>>>>> it a >>>>>>>> bit >>>>>>>>>>>> more >>>>>>>>>>>>>> time. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Kafka's KEYS file containing PGP keys we use to sign the >>>>>>>>>> release: >>>>>>>>>>>>>>>> https://kafka.apache.org/KEYS >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Release artifacts to be voted upon (source and binary): >>>>>>>>>>>>>>>> >>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Docker release artifact to be voted upon: >>>>>>>>>>>>>>>> apache/kafka:3.7.0-rc2 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Maven artifacts to be voted upon: >>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>> >>> https://repository.apache.org/content/groups/staging/org/apache/kafka/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Javadoc: >>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>> >>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Tag to be voted upon (off 3.7 branch) is the 3.7.0 tag: >>>>>>>>>>>>>>>> https://github.com/apache/kafka/releases/tag/3.7.0-rc2 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Documentation: >>>>>>>>>>>>>>>> https://kafka.apache.org/37/documentation.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Protocol: >>>>>>>>>>>>>>>> https://kafka.apache.org/37/protocol.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Successful Jenkins builds for the 3.7 branch: >>>>>>>>>>>>>>>> Unit/integration tests: >>>>>>>>>>>>>>>> >>>>>>>> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/ >>>>>>>>>>>>>>>> There are failing tests here. I have to follow up with >>>>>>>> triaging >>>>>>>>>>>> some >>>>>>>>>>>>> of >>>>>>>>>>>>>>>> the failures and figuring out if they're actual problems >>>>>>> or >>>>>>>>>>> simply >>>>>>>>>>>>>> flakes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> System tests: >>>>>>>>>>>>>> https://jenkins.confluent.io/job/system-test-kafka/job/3.7/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No successful system test runs yet. I am working on >>>>>>> getting >>>>>>>> the >>>>>>>>>>> job >>>>>>>>>>>>> to >>>>>>>>>>>>>> run. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * Successful Docker Image Github Actions Pipeline for 3.7 >>>>>>>>>> branch: >>>>>>>>>>>>>>>> Attached are the scan_report and report_jvm output files >>>>>>>> from >>>>>>>>>> the >>>>>>>>>>>>>> Docker >>>>>>>>>>>>>>>> Build run: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>> >>>> https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And the final docker image build job - Docker Build Test >>>>>>>>>>> Pipeline: >>>>>>>>>>>>>>>> https://github.com/apache/kafka/actions/runs/7486178277 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The image is apache/kafka:3.7.0-rc2 - >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>> >>> https://hub.docker.com/layers/apache/kafka/3.7.0-rc2/images/sha256-5b4707c08170d39549fbb6e2a3dbb83936a50f987c0c097f23cb26b4c210c226?context=explore >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /************************************** >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Stanislav Kozlovski >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best, >>>>>>>>>> Stanislav >>>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >> >> -- >> Best, >> Stanislav >