[ https://issues.apache.org/jira/browse/KAFKA-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218969#comment-16218969 ]
Adrian McCague commented on KAFKA-6075: --------------------------------------- We have witnessed this issue as well, we develop on windows machines and our acceptance tests use the embedded kafka instances to run against. We are running 0.10.2.1 and do not see this issue, if we attempt to upgrade to 0.11.0.0 then we see this for any and all tests that use the Embedded Kafka instance, currently blocking our upgrade path until we figure out a workaround. > Kafka cannot recover after an unclean shutdown on Windows > --------------------------------------------------------- > > Key: KAFKA-6075 > URL: https://issues.apache.org/jira/browse/KAFKA-6075 > Project: Kafka > Issue Type: Bug > Affects Versions: 0.11.0.1 > Reporter: Vahid Hashemian > > An unclean shutdown of broker on Windows cannot be recovered by Kafka. Steps > to reproduce from a fresh build: > # Start zookeeper > # Start a broker > # Create a topic {{test}} > # Do an unclean shutdown of broker (find the process id by {{wmic process > where "caption = 'java.exe' and commandline like '%server.properties%'" get > processid}}), then kill the process by {{taskkill /pid #### /f}} > # Start the broker again > This leads to the following errors: > {code} > [2017-10-17 17:13:24,819] ERROR Error while loading log dir C:\tmp\kafka-logs > (kafka.log.LogManager) > java.nio.file.FileSystemException: > C:\tmp\kafka-logs\test-0\00000000000000000000.timeindex: The process cannot > access the file because it is being used by another process. > at > sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) > at > sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) > at > sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108) > at java.nio.file.Files.deleteIfExists(Files.java:1165) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) > at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at kafka.log.Log.loadSegmentFiles(Log.scala:295) > at kafka.log.Log.loadSegments(Log.scala:404) > at kafka.log.Log.<init>(Log.scala:201) > at kafka.log.Log$.apply(Log.scala:1729) > at > kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221) > at > kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > [2017-10-17 17:13:24,819] ERROR Error while deleting the clean shutdown file > in dir C:\tmp\kafka-logs (kafka.server.LogDirFailureChannel) > java.nio.file.FileSystemException: > C:\tmp\kafka-logs\test-0\00000000000000000000.timeindex: The process cannot > access the file because it is being used by another process. > at > sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) > at > sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) > at > sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) > at > sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108) > at java.nio.file.Files.deleteIfExists(Files.java:1165) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333) > at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) > at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at kafka.log.Log.loadSegmentFiles(Log.scala:295) > at kafka.log.Log.loadSegments(Log.scala:404) > at kafka.log.Log.<init>(Log.scala:201) > at kafka.log.Log$.apply(Log.scala:1729) > at > kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221) > at > kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > [2017-10-17 17:13:24,819] INFO Logs loading complete in 47 ms. > (kafka.log.LogManager) > [2017-10-17 17:13:24,865] WARN Error processing > kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\tmp\kafka-logs > (com.yammer.metrics.reporting.JmxReporter) > javax.management.MalformedObjectNameException: Invalid character ':' in value > part of property > at javax.management.ObjectName.construct(ObjectName.java:618) > at javax.management.ObjectName.<init>(ObjectName.java:1382) > at > com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395) > at > com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516) > at > com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491) > at > com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79) > at > kafka.metrics.KafkaMetricsGroup$class.newGauge(KafkaMetricsGroup.scala:80) > at kafka.log.LogManager.newGauge(LogManager.scala:50) > at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:117) > at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:116) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at kafka.log.LogManager.<init>(LogManager.scala:116) > at kafka.log.LogManager$.apply(LogManager.scala:799) > at kafka.server.KafkaServer.startup(KafkaServer.scala:222) > at > kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38) > at kafka.Kafka$.main(Kafka.scala:92) > at kafka.Kafka.main(Kafka.scala) > [2017-10-17 17:13:24,865] INFO Starting log cleanup with a period of 300000 > ms. (kafka.log.LogManager) > [2017-10-17 17:13:24,881] INFO Starting log flusher with a default period of > 9223372036854775807 ms. (kafka.log.LogManager) > [2017-10-17 17:13:25,131] INFO Awaiting socket connections on 0.0.0.0:9092. > (kafka.network.Acceptor) > [2017-10-17 17:13:25,147] INFO [SocketServer brokerId=0] Started 1 acceptor > threads (kafka.network.SocketServer) > [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-Produce]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-DeleteRecords]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-Fetch]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2017-10-17 17:13:25,162] INFO [LogDirFailureHandler]: Starting > (kafka.server.ReplicaManager$LogDirFailureHandler) > [2017-10-17 17:13:25,162] INFO [ReplicaManager broker=0] Stopping serving > replicas in dir C:\tmp\kafka-logs (kafka.server.ReplicaManager) > [2017-10-17 17:13:25,162] INFO [ReplicaManager broker=0] Partitions are > offline due to failure on log directory C:\tmp\kafka-logs > (kafka.server.ReplicaManager) > [2017-10-17 17:13:25,162] INFO [ReplicaFetcherManager on broker 0] Removed > fetcher for partitions (kafka.server.ReplicaFetcherManager) > [2017-10-17 17:13:25,178] INFO [ReplicaManager broker=0] Broker 0 stopped > fetcher for partitions because they are in the failed log dir > C:\tmp\kafka-logs (kafka.server.ReplicaManager) > [2017-10-17 17:13:25,178] INFO Stopping serving logs in dir C:\tmp\kafka-logs > (kafka.log.LogManager) > [2017-10-17 17:13:25,178] FATAL Shutdown broker because all log dirs in > C:\tmp\kafka-logs have failed (kafka.log.LogManager) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)