Hello,

We are still stuck on this issue: the 2.11-1.1.0 distribution (Kafka 1.1.0
built for Scala 2.11) fails to clean up log segments on Windows and brings
the cluster down, broker by broker. Extending the retention hours and sizes
does not help either; it only delays the failure and puts more load on the
hard drive.
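
For reference, these are the retention-related broker settings we have been
adjusting in server.properties (the values below are illustrative, not our
exact production values):

    # server.properties -- retention knobs we tried raising (illustrative values)
    log.retention.hours=168
    log.retention.bytes=1073741824
    log.retention.check.interval.ms=300000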

Here is the broker log:

[2018-05-12 21:36:57,673] INFO [Log partition=test-0, dir=C:\kafka1] Rolled new log segment at offset 45 in 105 ms. (kafka.log.Log)
[2018-05-12 21:36:57,673] INFO [Log partition=test-0, dir=C:\kafka1] Scheduling log segment [baseOffset 0, size 2290] for deletion. (kafka.log.Log)
[2018-05-12 21:36:57,673] ERROR Error while deleting segments for test-0 in dir C:\kafka1 (kafka.server.LogDirFailureChannel)
java.nio.file.FileSystemException: C:\kafka1\test-0\00000000000000000000.log -> C:\kafka1\test-0\00000000000000000000.log.deleted: The process cannot access the file because it is being used by another process.
        at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
        at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
        at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387)
        at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
        at java.nio.file.Files.move(Files.java:1395)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:697)
        at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:212)
        at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:415)
        at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:1601)
        at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:1588)
        at kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170)
        at kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply$mcI$sp(Log.scala:1170)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161)
        at kafka.log.Log.maybeHandleIOException(Log.scala:1678)
        at kafka.log.Log.deleteSegments(Log.scala:1161)
        at kafka.log.Log.deleteOldSegments(Log.scala:1156)
        at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1228)
        at kafka.log.Log.deleteOldSegments(Log.scala:1222)
        at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:854)
        at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:852)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at kafka.log.LogManager.cleanupLogs(LogManager.scala:852)
        at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:385)
        at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        Suppressed: java.nio.file.FileSystemException: C:\kafka1\test-0\00000000000000000000.log -> C:\kafka1\test-0\00000000000000000000.log.deleted: The process cannot access the file because it is being used by another process.
                at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
                at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
                at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
                at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
                at java.nio.file.Files.move(Files.java:1395)
                at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694)
                ... 32 more
[2018-05-12 21:36:57,689] INFO [ReplicaManager broker=1] Stopping serving replicas in dir C:\kafka1 (kafka.server.ReplicaManager)
[2018-05-12 21:36:57,689] ERROR Uncaught exception in scheduled task 'kafka-log-retention' (kafka.utils.KafkaScheduler)
org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for test-0 in dir C:\kafka1
Caused by: java.nio.file.FileSystemException: C:\kafka1\test-0\00000000000000000000.log -> C:\kafka1\test-0\00000000000000000000.log.deleted: The process cannot access the file because it is being used by another process.
        at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
        at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
        at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387)
        at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
        at java.nio.file.Files.move(Files.java:1395)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:697)
        at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:212)
        at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:415)
        at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:1601)
        at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:1588)
        at kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170)
        at kafka.log.Log$$anonfun$deleteSegments$1$$anonfun$apply$mcI$sp$1.apply(Log.scala:1170)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply$mcI$sp(Log.scala:1170)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:1161)
        at kafka.log.Log.maybeHandleIOException(Log.scala:1678)
        at kafka.log.Log.deleteSegments(Log.scala:1161)
        at kafka.log.Log.deleteOldSegments(Log.scala:1156)
        at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1228)
        at kafka.log.Log.deleteOldSegments(Log.scala:1222)
        at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:854)
        at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:852)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at kafka.log.LogManager.cleanupLogs(LogManager.scala:852)
        at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:385)
        at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        Suppressed: java.nio.file.FileSystemException: C:\kafka1\test-0\00000000000000000000.log -> C:\kafka1\test-0\00000000000000000000.log.deleted: The process cannot access the file because it is being used by another process.
                at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
                at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
                at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
                at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
                at java.nio.file.Files.move(Files.java:1395)
                at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:694)
                ... 32 more
[2018-05-12 21:36:57,689] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions __consumer_offsets-22,__consumer_offsets-30,__consumer_offsets-8,__consumer_offsets-21,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-9,__consumer_offsets-46,__consumer_offsets-25,__consumer_offsets-35,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,__consumer_offsets-16,test-0,__consumer_offsets-28,__consumer_offsets-31,__consumer_offsets-36,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,test1-0,__consumer_offsets-37,__consumer_offsets-15,__consumer_offsets-24,__consumer_offsets-38,__consumer_offsets-17,__consumer_offsets-48,__consumer_offsets-19,__consumer_offsets-11,__consumer_offsets-13,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,__consumer_offsets-20,__consumer_offsets-0,__consumer_offsets-44,__consumer_offsets-39,__consumer_offsets-12,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-32,__consumer_offsets-40 (kafka.server.ReplicaFetcherManager)
[2018-05-12 21:36:57,689] INFO [ReplicaAlterLogDirsManager on broker 1] Removed fetcher for partitions __consumer_offsets-22,__consumer_offsets-30,__consumer_offsets-8,__consumer_offsets-21,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-9,__consumer_offsets-46,__consumer_offsets-25,__consumer_offsets-35,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,__consumer_offsets-16,test-0,__consumer_offsets-28,__consumer_offsets-31,__consumer_offsets-36,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,test1-0,__consumer_offsets-37,__consumer_offsets-15,__consumer_offsets-24,__consumer_offsets-38,__consumer_offsets-17,__consumer_offsets-48,__consumer_offsets-19,__consumer_offsets-11,__consumer_offsets-13,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,__consumer_offsets-20,__consumer_offsets-0,__consumer_offsets-44,__consumer_offsets-39,__consumer_offsets-12,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-32,__consumer_offsets-40 (kafka.server.ReplicaAlterLogDirsManager)
[2018-05-12 21:36:57,751] INFO [ReplicaManager broker=1] Broker 1 stopped fetcher for partitions __consumer_offsets-22,__consumer_offsets-30,__consumer_offsets-8,__consumer_offsets-21,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-9,__consumer_offsets-46,__consumer_offsets-25,__consumer_offsets-35,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,__consumer_offsets-16,test-0,__consumer_offsets-28,__consumer_offsets-31,__consumer_offsets-36,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,test1-0,__consumer_offsets-37,__consumer_offsets-15,__consumer_offsets-24,__consumer_offsets-38,__consumer_offsets-17,__consumer_offsets-48,__consumer_offsets-19,__consumer_offsets-11,__consumer_offsets-13,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,__consumer_offsets-20,__consumer_offsets-0,__consumer_offsets-44,__consumer_offsets-39,__consumer_offsets-12,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-32,__consumer_offsets-40 and stopped moving logs for partitions  because they are in the failed log directory C:\kafka1. (kafka.server.ReplicaManager)
[2018-05-12 21:36:57,751] INFO Stopping serving logs in dir C:\kafka1 (kafka.log.LogManager)
[2018-05-12 21:36:57,767] ERROR Shutdown broker because all log dirs in C:\kafka1 have failed (kafka.log.LogManager)
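
Our current understanding (an assumption on our part, not something we have
seen confirmed in the Kafka docs) is that the rename from .log to
.log.deleted fails because Windows refuses to rename a file that still has
an open handle or active memory mapping, and the non-atomic fallback inside
Utils.atomicMoveWithFallback then fails the same way, which would explain
the Suppressed entry in the trace. The standalone sketch below (plain JDK,
nothing Kafka-specific; the class name MappedRenameRepro is our own)
reproduces the same FileSystemException on Windows by renaming a file that
is still memory-mapped:

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;
    import java.nio.file.StandardOpenOption;

    public class MappedRenameRepro {
        public static void main(String[] args) throws IOException {
            // Scratch file standing in for a Kafka segment file.
            Path log = Files.createTempFile("00000000000000000000", ".log");
            Files.write(log, new byte[] {1, 2, 3});

            try (FileChannel ch = FileChannel.open(log, StandardOpenOption.READ)) {
                // Map the file into memory, as Kafka does for its index files.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                System.out.println("first byte: " + buf.get(0));

                // On Windows the live mapping keeps the file "in use", so this
                // rename typically fails with java.nio.file.FileSystemException:
                // "The process cannot access the file because it is being used
                // by another process." The same call succeeds on Linux/macOS,
                // which would explain why only Windows deployments hit this.
                Files.move(log, log.resolveSibling(log.getFileName() + ".deleted"),
                           StandardCopyOption.ATOMIC_MOVE);
            }
        }
    }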

Could someone please share some ideas on how to rectify this on Windows?
And if Kafka will never be supported on Windows, could we get some official
communication to that effect?

Regards,
