Hi all,

Thanks for reporting this. The issue is fixed in KAFKA-16583 <https://issues.apache.org/jira/browse/KAFKA-16583>, and the fix will be included in v3.7.1 and v3.8.0, which should be released in June or July.
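Until the fix lands, you can check whether a cluster is in the affected state by looking at its finalized metadata.version with kafka-features.sh. A minimal sketch (KAFKA_HOME and the bootstrap address are placeholder values; the describe command needs a running cluster, so the sketch only prints the command to run):

```shell
# Placeholders: point KAFKA_HOME at your Kafka install and BOOTSTRAP at a broker.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# "kafka-features.sh describe" reports the cluster's finalized feature levels,
# including metadata.version. A cluster still finalized at 3.4-IV0 while running
# 3.7 brokers matches the situation in this thread: 3.4-IV0 cannot represent the
# directory assignment state that 3.7 brokers may record.
echo "$KAFKA_HOME/bin/kafka-features.sh --bootstrap-server $BOOTSTRAP describe"
```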
Thanks.
Luke

On Tue, Jun 4, 2024 at 11:29 PM Sejal Patel <se...@playerzero.ai.invalid> wrote:

> Thank you, Tobias.
>
> If the Jira ticket is correct, that explains a lot. This was not something
> we ever noticed in lower environments, but we did only recently upgrade to
> 3.7, about a month ago. We've never had problems with upgrades before, and
> this one seemed to go smoothly as well, until this happened.
>
> I've never downgraded before. How hard is it? Can I simply replace the app
> with the older version, or are there additional steps that need to be done
> as well?
>
> Thanks again for your help.
>
> On Tue, Jun 4, 2024, 12:46 PM GMT, tobias.b...@peripetie.de <mailto:
> tobias.b...@peripetie.de> wrote:
> > Had the same issue some weeks ago; we also did an upgrade to 3.7.
> > Did not receive any hints on this topic back then.
> >
> > There is an open StackOverflow question:
> > https://stackoverflow.com/questions/78334479/unwritablemetadataexception-on-startup-in-apache-kafka
> > There is also an open Kafka Jira ticket:
> > https://issues.apache.org/jira/browse/KAFKA-16662
> > We fixed it by downgrading to 3.6.
> >
> > Hope this helps.
> >
> > -----Original message-----
> > From: Sejal Patel <se...@playerzero.ai.INVALID>
> > Sent: Monday, June 3, 2024 08:26
> > To: users@kafka.apache.org
> > Subject: How do I resolve an UnwritableMetadataException
> >
> > I was expanding my Kafka cluster from 16 to 24 nodes and rebalancing the
> > topics. One partition of a topic did not get rebalanced as expected (it
> > was taking forever, so I decided to look at what was happening). It turns
> > out that the script for mounting the second disk partition for use as
> > /data did not kick in, so there simply wasn't enough disk space available
> > at the time of the rebalance. The system was left with about 5 MB of free
> > disk space, and the Kafka brokers were essentially borked at that point.
> >
> > So I had to kill the Kafka processes, move the original Kafka data folder
> > to a /tmp location, mount the data partition, and move the /tmp Kafka
> > folder back to the original spot. But when I went to start up the Kafka
> > instance, I got this message over and over again, every few milliseconds:
> >
> > [2024-06-03 06:14:01,503] ERROR Encountered metadata loading fault:
> > Unhandled error initializing new publishers
> > (org.apache.kafka.server.fault.LoggingFaultHandler)
> > org.apache.kafka.image.writer.UnwritableMetadataException: Metadata has
> > been lost because the following could not be represented in metadata
> > version 3.4-IV0: the directory assignment state of one or more replicas
> >   at org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94)
> >   at org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391)
> >   at org.apache.kafka.image.TopicImage.write(TopicImage.java:71)
> >   at org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84)
> >   at org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155)
> >   at org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295)
> >   at org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:266)
> >   at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127)
> >   at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
> >   at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
> >   at java.base/java.lang.Thread.run(Thread.java:1583)
> > [2024-06-03 06:14:01,556] INFO [BrokerLifecycleManager id=28] The broker
> > is in RECOVERY. (kafka.server.BrokerLifecycleManager)
> >
> > Scarier is that if any node that is still working gets restarted, it too
> > starts emitting that message.
> >
> > I am using a KRaft setup and have within the past month upgraded to
> > Kafka 3.7 (the original setup, over a year ago, was Kafka 3.4). How do I
> > resolve this issue? I'm not sure what the problem is or how to fix it.
> >
> > If I restart a KRaft server, it dies with the same error message and can
> > never be spun up again.
> >
> > Is it possible to recover from this, or do I need to start from scratch?
> > If I start from scratch, how do I keep the topics? What is the best way
> > to proceed from here? I'm unable to find anything related to this problem
> > via a Google search.
> >
> > I'm at a loss and would appreciate any help you can provide.
> >
> > Thank you.
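On the downgrade question above: since the cluster in this thread was created on 3.4 and its metadata.version was apparently never finalized beyond 3.4-IV0, a software-only downgrade is typically just a rolling binary swap, one broker at a time, with no extra metadata steps. A hedged sketch under those assumptions (the directory layout and version numbers are hypothetical examples, and the sandbox below only demonstrates the symlink swap; the real stop/start would be your service manager, e.g. systemctl stop/start kafka):

```shell
# Hypothetical layout: a "kafka-current" symlink names the active release.
# Swapping the link back to the 3.6 directory is the binary-downgrade step;
# in production you would stop the broker first and restart it afterwards.
BASE="$(mktemp -d)"
mkdir -p "$BASE/kafka_2.13-3.7.0" "$BASE/kafka_2.13-3.6.2"
ln -sfn "$BASE/kafka_2.13-3.7.0" "$BASE/kafka-current"   # active release: 3.7
ln -sfn "$BASE/kafka_2.13-3.6.2" "$BASE/kafka-current"   # downgrade: point back at 3.6
readlink "$BASE/kafka-current"                           # now resolves to the 3.6 directory
```

Do one broker at a time and wait for it to rejoin the cluster and catch up before moving to the next.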