Hi all,

We use Kafka version 2.11-1.1.1 (Kafka 1.1.1, Scala 2.11 build). We produce and consume transactional messages, and recently we noticed that 2 partitions of the __consumer_offsets topic have very high disk usage (256GB).

When we looked at the log segments for these 2 partitions, there were files that were 6 months old. By dumping the content of an old log segment using the following command

    kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --print-data-log --files 00000000003949894887.log | less

we found that all the records were COMMIT transaction markers:

    offset: 1924582627 position: 183 CreateTime: 1548972578376 isvalid: true keysize: 4 valuesize: 6 magic: 2 compresscodec: NONE producerId: 126015 producerEpoch: 0 sequence: -1 isTransactional: true headerKeys: [] endTxnMarker: COMMIT coordinatorEpoch: 28

Why are the commit transaction markers not compacted and deleted?

Log cleaner config:

    max.message.bytes          10000120
    min.cleanable.dirty.ratio  0.1
    compression.type           uncompressed
    cleanup.policy             compact
    retention.ms               2160000000
    segment.bytes              104857600

    # By default the log cleaner is disabled and the log retention policy will default to just delete segments after their retention expires.
    # If log.cleaner.enable=true is set the cleaner will be enabled and individual logs can then be marked for log compaction.
    log.cleaner.enable=true

    # give larger heap space to log cleaner
    log.cleaner.dedupe.buffer.size=1342177280
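
For anyone reading the raw numbers above, here is a minimal Python sketch that converts them into human-readable values. The helper names (`epoch_ms_to_utc`, `ms_to_days`, `bytes_to_mib`) are made up for illustration and are not part of Kafka; the input values are copied from the record dump and config in this message.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helpers for illustration only; not part of any Kafka tooling.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_ms_to_utc(ms: int) -> datetime:
    """Convert a Kafka CreateTime (milliseconds since the Unix epoch) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)

def ms_to_days(ms: int) -> float:
    """Convert a millisecond duration (e.g. retention.ms) to days."""
    return ms / 86_400_000

def bytes_to_mib(n: int) -> float:
    """Convert a byte count (e.g. segment.bytes) to MiB."""
    return n / (1024 * 1024)

if __name__ == "__main__":
    # Values copied from the dump and config quoted in this message.
    print("CreateTime:", epoch_ms_to_utc(1548972578376).isoformat())
    print("retention.ms in days:", ms_to_days(2160000000))
    print("segment.bytes in MiB:", bytes_to_mib(104857600))
    print("dedupe buffer in MiB:", bytes_to_mib(1342177280))
```

With these inputs, the marker's CreateTime falls on 2019-01-31 (UTC), retention.ms works out to 25 days, segment.bytes to 100 MiB, and the dedupe buffer to 1280 MiB.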