[ https://issues.apache.org/jira/browse/KAFKA-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xin updated KAFKA-9185:
-----------------------
Summary: Schedule "kafka-log-retention" doesn't work; Kafka can't delete logs that exceeded retention time
(was: schedule "kafka-log-retention" dosen't work ,kafka can't delete logs that exceeded retention time)

Schedule "kafka-log-retention" doesn't work; Kafka can't delete logs that exceeded retention time
--------------------------------------------------------------------------------------------------

                 Key: KAFKA-9185
                 URL: https://issues.apache.org/jira/browse/KAFKA-9185
             Project: Kafka
          Issue Type: Bug
          Components: log
    Affects Versions: 0.10.0.1
            Reporter: Xin
            Priority: Major

The relevant broker logs:

2019-11-06 17:26:09,426 WARN kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-8-10007], Replica 10002 for partition [XXXXXX,5] reset its fetch offset from 9138650272 to current leader 10007's start offset 9139468487
2019-11-06 17:26:09,426 WARN kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-8-10007], Replica 10002 for partition [XXXXXX,5] reset its fetch offset from 9138650272 to current leader 10007's start offset 9139468487
2019-11-06 17:26:36,954 ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-8-10007], Current offset 9138650272 for partition [XXXXXX,5] out of range; reset offset to 9139468487
2019-11-06 18:20:56,533 INFO kafka.log.Log: Deleting segment 9136990019 from log XXXXXX-5.
2019-11-06 18:20:56,666 INFO kafka.log.OffsetIndex: Deleting index /data4/zdh/kafka/data/XXXXXX-5/00000000009136990019.index.deleted
2019-11-12 16:40:53,147 INFO kafka.log.Log: Scheduling log segment 9139468487 for log XXXXXX-5 for deletion.
2019-11-12 16:40:53,153 ERROR kafka.utils.KafkaScheduler: Uncaught exception in scheduled task 'kafka-log-retention'
kafka.common.KafkaStorageException: Failed to change the log file suffix from  to .deleted for log segment 9139468487
        at kafka.log.LogSegment.kafkaStorageException$1(LogSegment.scala:263)
        at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:265)
        at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:832)
        at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:823)
        at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:579)
        at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:579)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at kafka.log.Log.deleteOldSegments(Log.scala:579)
        at kafka.log.LogManager.kafka$log$LogManager$$cleanupExpiredSegments(LogManager.scala:427)
        at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:458)
        at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:456)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at kafka.log.LogManager.cleanupLogs(LogManager.scala:456)
        at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:192)
        at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:56)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.file.NoSuchFileException: /data4/zdh/kafka/data/XXXXXX-5/00000000009139468487.log
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:403)
        at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
        at java.nio.file.Files.move(Files.java:1345)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:670)
        at kafka.log.FileMessageSet.renameTo(FileMessageSet.scala:370)
        ... 27 more
        Suppressed: java.nio.file.NoSuchFileException: /data4/zdh/kafka/data/XXXXXX-5/00000000009139468487.log -> /data4/zdh/kafka/data/XXXXXX-5/00000000009139468487.log.deleted
                at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
                at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
                at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:390)
                at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
                at java.nio.file.Files.move(Files.java:1345)
                at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:667)
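For reference, the rename that fails here is a plain java.nio move with a non-atomic fallback: the thrown NoSuchFileException at Utils.java:670 together with the Suppressed one at Utils.java:667 implies an "atomic move first, plain move second, suppress the first failure" shape. The sketch below is a reconstruction for illustration only, not the verbatim Kafka source, and the directory and file names in main are hypothetical; it reproduces the same NoSuchFileException once the .log file has been removed out from under the rename:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class SegmentRenameDemo {

    // Reconstruction of the rename pattern implied by the stack trace:
    // try an atomic move first; on failure, retry with a plain move and
    // attach the first failure as a suppressed exception.
    static void atomicMoveWithFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException outer) {
            try {
                Files.move(source, target); // non-atomic fallback
            } catch (IOException inner) {
                inner.addSuppressed(outer); // produces the "Suppressed:" block seen above
                throw inner;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical directory; the segment file is deliberately NOT created,
        // mimicking a .log file deleted by hand while the broker still tracks it.
        Path dir = Paths.get("/tmp/segment-rename-demo");
        Files.createDirectories(dir);
        Path log = dir.resolve("00000000009139468487.log");
        Path deleted = dir.resolve("00000000009139468487.log.deleted");

        // Throws java.nio.file.NoSuchFileException, like the trace above.
        atomicMoveWithFallback(log, deleted);
    }
}
{code}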
00000000009139468487.log is the oldest log segment.

Deletion appears to have stopped after 2019-11-06 18:20:56.

At about 16:40 on 11-12 the disk was 98% full, so I manually deleted some old log files, including 00000000009139468487.log and its index. The deletion schedule then started working again, but it threw the exception above (file does not exist), and then the Kafka process exited.

Can someone tell me why the deletion schedule didn't run from 11-06 18:20 to 11-12 16:40? Is it because of a lock?
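On that last question: the bottom of the stack trace shows the retention task running in a plain JDK ScheduledThreadPoolExecutor, and two of that executor's documented behaviors are relevant. First, a repeating task whose body throws an uncaught exception is silently cancelled and never runs again. Second, executions of the same task never overlap, so a single run that blocks (on a lock, or on slow I/O) delays every later run without logging anything. A minimal, purely illustrative sketch; the class name and timings are invented and this is not Kafka's code:

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RetentionTaskDemo {
    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        // Variant 1: an unwrapped repeating task. The first throw silently
        // cancels all future executions -- nothing is ever logged.
        scheduler.scheduleAtFixedRate(() -> {
            System.out.println("unwrapped run");
            throw new RuntimeException("boom"); // prints once, then goes quiet
        }, 0, 1, TimeUnit.SECONDS);

        // Variant 2: a task with its own catch-all keeps being rescheduled.
        scheduler.scheduleAtFixedRate(() -> {
            try {
                System.out.println("wrapped run");
                throw new RuntimeException("boom");
            } catch (Throwable t) {
                System.out.println("caught, schedule survives: " + t.getMessage());
            }
        }, 0, 1, TimeUnit.SECONDS);

        Thread.sleep(3500); // variant 2 fires several times; variant 1 only once
        scheduler.shutdownNow();
    }
}
{code}

Note that the 2019-11-12 ERROR line is emitted by KafkaScheduler itself, i.e. the retention task is wrapped with a catch-all like the second variant, so the schedule was still alive at that point; a schedule killed by an uncaught exception would have gone quiet with no ERROR at all. That is consistent with the lock hypothesis in the question, since a cleanup run stuck waiting on a lock would produce exactly this kind of silence, but the excerpt alone cannot confirm the cause.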