Evan Huus created KAFKA-2871:
--------------------------------

             Summary: Newly replicated brokers don't expire log segments properly
                 Key: KAFKA-2871
                 URL: https://issues.apache.org/jira/browse/KAFKA-2871
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8.2.1
            Reporter: Evan Huus
            Assignee: Neha Narkhede
            Priority: Minor
We recently brought up a few brokers to replace some existing nodes, and used the provided script to reassign partitions from the retired nodes to the new ones, one at a time. A little while later, we noticed extreme disk usage on the new nodes. We tracked this down to the fact that the replicated segments are all timestamped from the moment of replication rather than carrying whatever timestamp was set on the original node. Since this is the timestamp the log roller uses, it takes a full week (the rollover time) before any data is purged from the new brokers.

In the short term, what is the safest workaround? Can we just `rm` these old segments, or should we be adjusting the filesystem metadata so Kafka removes them itself?

In the longer term, the partition mover should set timestamps appropriately on the segments it moves.
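For anyone considering the second workaround, here is a minimal sketch of what adjusting the filesystem metadata could look like, assuming (as described above) that time-based retention compares the current time against each segment file's last-modified time. The log directory path and the back-dated timestamp are made-up values for illustration; in practice you would want to mirror the mtimes from the original broker and avoid touching the active segment while the broker has it open.

{code:java}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: back-date the mtime of replicated .log segments so the
// broker's time-based retention can expire them on its normal schedule instead
// of waiting a full retention period from the moment of replication.
public class BackdateSegments {
    public static void main(String[] args) throws IOException {
        // Assumed partition directory; substitute your own log.dirs location.
        Path partitionDir = Paths.get("/var/kafka-logs/my-topic-0");
        // Assumed "original" age; ideally copied from the source broker's files.
        FileTime original = FileTime.from(Instant.now().minus(10, ChronoUnit.DAYS));

        try (DirectoryStream<Path> segments = Files.newDirectoryStream(partitionDir, "*.log")) {
            for (Path segment : segments) {
                // Safest with the broker stopped, since Kafka holds these files open.
                Files.setLastModifiedTime(segment, original);
                System.out.println("Back-dated " + segment.getFileName() + " to " + original);
            }
        }
    }
}
{code}

The intent of this approach is to let the broker's own retention check delete the segments on its next pass, rather than removing the files by hand.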