Nick Howard created KAFKA-1394:
----------------------------------

             Summary: Ensure last segment isn't deleted on expiration when there are unflushed messages
                 Key: KAFKA-1394
                 URL: https://issues.apache.org/jira/browse/KAFKA-1394
             Project: Kafka
          Issue Type: Improvement
          Components: log
    Affects Versions: 0.8.0, 0.7
            Reporter: Nick Howard
            Assignee: Jay Kreps
            Priority: Minor


We have observed that Kafka will sometimes flush messages to a file that is 
immediately deleted due to expiration. This happens because the LogManager's 
predicate for deleting expired segments is based on the file system modified 
time. The modified time only reflects the last time messages were flushed to 
disk, so messages still waiting to be flushed are invisible to the current 
cleanup strategy. When the last segment is expired but has unflushed messages, 
the deleteOldSegments method first does a roll and then deletes all the 
segments. A roll begins by flushing to the last segment, so the unflushed 
messages are flushed and then immediately deleted.
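
To make the failure mode concrete, here is a minimal sketch of a 
modified-time-based expiration check (illustrative Scala with hypothetical 
names, not Kafka's actual source). Nothing in this check can see messages 
that are still buffered in memory:

{code}
import java.io.File

// Minimal sketch (illustrative, not Kafka's actual source) of an
// expiration predicate keyed on file system modified time. mtime only
// advances when messages are flushed, so anything still buffered in
// memory is invisible to this check.
def isExpired(segmentFile: File, retentionMs: Long, nowMs: Long): Boolean =
  nowMs - segmentFile.lastModified > retentionMs
{code}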

It looks like this:

* messages appended, but not enough to trigger a flush
* LogManager begins cleaning expired logs
* predicate checks modified time of last segment -- it's too old
* since all segments are old, it does a roll
** messages flushed to last segment
* last segment deleted

If this happens between consumer reads, the messages will never be seen 
downstream.
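
Sketched in the same hypothetical terms (the Log/LogSegment stubs are 
assumptions, not Kafka's real types), the buggy path looks roughly like:

{code}
// Hypothetical stubs standing in for Kafka's log types (names assumed).
trait LogSegment
trait Log {
  def segments: Seq[LogSegment]
  def roll(): Unit                        // flushes pending messages, opens a new segment
  def deleteSegment(s: LogSegment): Unit
}

// Shape of the buggy path: when every segment is expired, roll first --
// which flushes the in-memory messages into the old last segment -- and
// then delete all expired segments, including that freshly flushed one.
def deleteOldSegments(log: Log, isExpired: LogSegment => Boolean): Int = {
  val deletable = log.segments.filter(isExpired)
  if (log.segments.nonEmpty && deletable.size == log.segments.size)
    log.roll()
  deletable.foreach(log.deleteSegment)    // the flushed messages go with it
  deletable.size
}
{code}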

Patch:

The patch changes the deletion logic so that if the log has unflushed 
messages, the last segment will not be deleted. It also widens the lock 
synchronization back to where it was earlier, to prevent a race condition in 
which an append arrives during expired segment cleanup, between the decision 
to delete the last segment and the deletion itself, creating new unflushed 
messages that hit the same issue.
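
A sketch of the patched shape, reusing the hypothetical stubs above and 
assuming an unflushedMessages counter on the log; it shows the intent of the 
change, not the literal diff:

{code}
// Assumed extension of the stub above: a counter of unflushed messages.
trait FlushAwareLog extends Log { def unflushedMessages: Long }

def deleteOldSegmentsPatched(log: FlushAwareLog,
                             isExpired: LogSegment => Boolean): Int =
  log.synchronized {
    // Widened lock: the delete decision and the deletion itself are now
    // atomic with respect to appends, so no new unflushed messages can
    // appear between the two steps.
    val expired = log.segments.filter(isExpired)
    val deletable =
      if (log.unflushedMessages > 0 && log.segments.nonEmpty)
        expired.filterNot(_ eq log.segments.last)  // keep the last segment
      else
        expired
    deletable.foreach(log.deleteSegment)
    deletable.size
  }
{code}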

I've also got a backport for 0.7.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
