Evan Huus created KAFKA-2871:
--------------------------------

             Summary: Newly replicated brokers don't expire log segments properly
                 Key: KAFKA-2871
                 URL: https://issues.apache.org/jira/browse/KAFKA-2871
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8.2.1
            Reporter: Evan Huus
            Assignee: Neha Narkhede
            Priority: Minor


We recently brought up a few brokers to replace some existing nodes, and used 
the provided script to reassign partitions from the retired nodes to the new 
ones, one at a time.

A little while after the fact, we noticed extreme disk usage on the new nodes. 
We tracked this down to the replicated segments all being timestamped from the 
moment of replication rather than carrying whatever timestamp was set on the 
original node. Since this is the timestamp the log roller uses, it takes a 
full week (rollover time) before any data is purged from the new brokers.
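
To make the effect concrete, here is a minimal sketch of the check as we understand it. It assumes expiry is decided purely by comparing each segment file's modification time against a fixed window; the function name `expired_segments` and its parameters are hypothetical, not part of Kafka.

```python
import os
import time


def expired_segments(log_dir, retention_seconds, now=None):
    """Return the .log segment files in log_dir whose mtime is older
    than the retention window. Hypothetical helper, not a Kafka API."""
    now = time.time() if now is None else now
    cutoff = now - retention_seconds
    expired = []
    for name in sorted(os.listdir(log_dir)):
        # Only consider segment data files; skip .index and other files.
        if not name.endswith(".log"):
            continue
        path = os.path.join(log_dir, name)
        if os.path.getmtime(path) < cutoff:
            expired.append(name)
    return expired
```

Run against a partition directory on one of the new brokers with a one-week window, a check like this would be expected to return nothing until a week after the reassignment, which matches what we saw.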

In the short term, what is the safest workaround? Can we just `rm` these old 
segments, or should we adjust the filesystem metadata so that Kafka removes 
them itself?
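
For reference, the second option could look roughly like the sketch below. It is untested against a live broker and assumes the modification time is the only metadata that matters; `backdate_segment` and `original_mtime` are hypothetical names, and the original timestamp would have to come from somewhere else (for example, the retired node).

```python
import os


def backdate_segment(path, original_mtime):
    """Set a segment file's mtime to original_mtime (seconds since the
    epoch), leaving its access time unchanged. Hypothetical helper."""
    atime = os.stat(path).st_atime
    os.utime(path, (atime, original_mtime))
```

Whether it is safe to do this while the broker is running is part of the question.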

In the longer term, the partition mover should be setting timestamps 
appropriately on the segments it moves.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
