Hawking Du created KAFKA-9877:
---------------------------------
Summary: ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
Key: KAFKA-9877
URL: https://issues.apache.org/jira/browse/KAFKA-9877
Project: Kafka
Issue Type: Bug
Components: log cleaner
Affects Versions: 2.1.1
Environment: Redhat
Reporter: Hawking Du
Attachments: server-125.log
I have been puzzled by this problem for a long time.
The Kafka server often stops unexpectedly, apparently because of the log cleaning process. Here
are some logs from the server. Can anyone give me some ideas for fixing it?
{code:java}
[2020-04-04 02:07:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2020-04-04 02:07:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2020-04-04 02:17:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2020-04-04 02:27:57,410] INFO [GroupMetadataManager brokerId=5] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2020-04-04 02:30:22,272] INFO [ProducerStateManager partition=__consumer_offsets-35] Writing producer snapshot at offset 741037 (kafka.log.ProducerStateManager)
[2020-04-04 02:30:22,274] INFO [Log partition=__consumer_offsets-35, dir=/tmp/kafka-logs] Rolled new log segment at offset 741037 in 3 ms. (kafka.log.Log)
[2020-04-04 02:30:26,289] ERROR Failed to clean up log for __consumer_offsets-35 in dir /tmp/kafka-logs due to IOException (kafka.server.LogDirFailureChannel)
java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-35/00000000000000000000.log
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
    at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
    at java.nio.file.Files.move(Files.java:1395)
    at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:815)
    at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:224)
    at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:508)
    at kafka.log.Log.asyncDeleteSegment(Log.scala:1962)
    at kafka.log.Log.$anonfun$replaceSegments$6(Log.scala:2025)
    at kafka.log.Log.$anonfun$replaceSegments$6$adapted(Log.scala:2020)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at kafka.log.Log.replaceSegments(Log.scala:2020)
    at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:602)
    at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:528)
    at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:527)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at kafka.log.Cleaner.doClean(LogCleaner.scala:527)
    at kafka.log.Cleaner.clean(LogCleaner.scala:501)
    at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:359)
    at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:328)
    at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:307)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
    Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-35/00000000000000000000.log -> /tmp/kafka-logs/__consumer_offsets-35/00000000000000000000.log.deleted
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
        at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
        at java.nio.file.Files.move(Files.java:1395)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:812)
        ... 17 more
[2020-04-04 02:30:26,296] INFO [ReplicaManager broker=5] Stopping serving replicas in dir /tmp/kafka-logs (kafka.server.ReplicaManager)
[2020-04-04 02:30:26,302] INFO [ReplicaFetcherManager on broker 5] Removed fetcher for partitions Set(fitment-deduct-0, __consumer_offsets-22, __consumer_offsets-30, __consumer_offsets-4, __consumer_offsets-27, __consumer_offsets-7, __consumer_offsets-9, __consumer_offsets-46, __consumer_offsets-35, __consumer_offsets-23, __consumer_offsets-49, __consumer_offsets-47, test-0, __consumer_offsets-31, __consumer_offsets-42, __consumer_offsets-3, __consumer_offsets-18, __consumer_offsets-15, __consumer_offsets-24, ajhz-log-0, __consumer_offsets-38, __consumer_offsets-19, __consumer_offsets-11, bpinfo-sync-0, spinfo-sync-backup-0, __consumer_offsets-2, __consumer_offsets-43, __consumer_offsets-6, __consumer_offsets-14, __consumer_offsets-44, __consumer_offsets-39, __consumer_offsets-26, __consumer_offsets-29, __consumer_offsets-34, __consumer_offsets-10, video-log-0) (kafka.server.ReplicaFetcherManager)
[2020-04-04 02:30:26,303] INFO [ReplicaAlterLogDirsManager on broker 5] Removed fetcher for partitions Set(fitment-deduct-0, __consumer_offsets-22, __consumer_offsets-30, __consumer_offsets-4, __consumer_offsets-27, __consumer_offsets-7, __consumer_offsets-9, __consumer_offsets-46, __consumer_offsets-35, __consumer_offsets-23, __consumer_offsets-49, __consumer_offsets-47, test-0, __consumer_offsets-31, __consumer_offsets-42, __consumer_offsets-3, __consumer_offsets-18, __consumer_offsets-15, __consumer_offsets-24, ajhz-log-0, __consumer_offsets-38, __consumer_offsets-19, __consumer_offsets-11, bpinfo-sync-0, spinfo-sync-backup-0, __consumer_offsets-2, __consumer_offsets-43, __consumer_offsets-6, __consumer_offsets-14, __consumer_offsets-44, __consumer_offsets-39, __consumer_offsets-26, __consumer_offsets-29, __consumer_offsets-34, __consumer_offsets-10, video-log-0) (kafka.server.ReplicaAlterLogDirsManager)
[2020-04-04 02:30:26,330] INFO [ReplicaManager broker=5] Broker 5 stopped fetcher for partitions fitment-deduct-0,__consumer_offsets-22,__consumer_offsets-30,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-9,__consumer_offsets-46,__consumer_offsets-35,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,test-0,__consumer_offsets-31,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,__consumer_offsets-15,__consumer_offsets-24,ajhz-log-0,__consumer_offsets-38,__consumer_offsets-19,__consumer_offsets-11,bpinfo-sync-0,spinfo-sync-backup-0,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,__consumer_offsets-44,__consumer_offsets-39,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,video-log-0 and stopped moving logs for partitions because they are in the failed log directory /tmp/kafka-logs. (kafka.server.ReplicaManager)
[2020-04-04 02:30:26,330] INFO Stopping serving logs in dir /tmp/kafka-logs (kafka.log.LogManager)
[2020-04-04 02:30:26,347] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)
{code}