[ https://issues.apache.org/jira/browse/KAFKA-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Kreps updated KAFKA-636:
----------------------------

    Attachment: KAFKA-636-v1.patch

This patch implements asynchronous delete in the log. To do this, Log.scala now requires a scheduler, which it uses to schedule the deletions. The deletion works as described in the issue description. The locking for segment deletion can now be more aggressive: the file renames are assumed to be fast, so they can happen inside the lock.

As part of testing this I also found a problem with MockScheduler, namely that it is not reentrant. That is, if scheduled tasks themselves create scheduled tasks, it misbehaves. To fix this I rewrote MockScheduler to use a priority queue. The code is simpler and more correct, since it now also performs all executions in the correct order.

> Make log segment delete asynchronous
> ------------------------------------
>
>                 Key: KAFKA-636
>                 URL: https://issues.apache.org/jira/browse/KAFKA-636
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>         Attachments: KAFKA-636-v1.patch
>
>
> We have a few corner-case bugs around delete of segment files:
> 1. It is possible for delete and truncate to cross streams and end up
> in a state where the log has no segments.
> 2. Reads on the log have no locking (which is good), but as a result deleting
> a segment that is being read will result in some kind of I/O exception.
> 3. We can't easily fix the synchronization problems without deleting files
> inside the log's write lock. This can be a problem, as deleting a 2GB segment
> can take a couple of seconds even on an unloaded system.
> The proposed fix for these problems is to make file removal asynchronous,
> using the following as the new delete scheme:
> 1. Immediately remove the file from the segment map and rename the file from X
> to X.deleted (e.g. 0000000.log to 0000000.log.deleted). We think renaming a
> file will not impact reads, since the file is already open and hence the name
> is irrelevant. This will always be O(1) and can be done inside the write lock.
> 2. Schedule a future operation to delete the file. The time to wait would be
> configurable, but we would just default it to 60 seconds and probably no one
> would ever change it.
> 3. On startup we would delete any files with the .deleted suffix, as they
> would have been pending deletes that didn't take place.
> I plan to do this soon, working against the refactored log (KAFKA-521). We can
> opt to backport the patch to 0.8 if we are feeling daring.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
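
For readers who want to see why a priority queue makes a mock scheduler reentrant, here is a minimal sketch in Python. It is not the MockScheduler from the patch (which is Scala); the class shape, the `schedule` and `advance` method names, and the millisecond clock are all assumptions made for illustration. The point it shows is that the queue is re-examined after every task runs, so a task scheduled from inside another task is picked up in the same pass and still runs in time order.

```python
import heapq
import itertools


class MockScheduler:
    """Hypothetical mock scheduler driven by a manually advanced clock."""

    def __init__(self):
        self.now_ms = 0
        self._queue = []               # min-heap of (due_ms, seq, task)
        self._seq = itertools.count()  # tie-breaker: equal due times run FIFO

    def schedule(self, task, delay_ms=0):
        # Safe to call from inside a running task: it only pushes onto the heap.
        heapq.heappush(self._queue, (self.now_ms + delay_ms, next(self._seq), task))

    def advance(self, ms):
        """Move the clock forward by ms and run every task that becomes due."""
        deadline = self.now_ms + ms
        # The heap head is re-checked on each iteration, so tasks added by a
        # running task are executed too if they fall within the deadline.
        while self._queue and self._queue[0][0] <= deadline:
            due_ms, _, task = heapq.heappop(self._queue)
            self.now_ms = due_ms       # the task observes its own due time
            task()
        self.now_ms = deadline


scheduler = MockScheduler()
events = []


def outer():
    events.append(("outer", scheduler.now_ms))
    # A scheduled task that itself schedules another task (the reentrant case).
    scheduler.schedule(lambda: events.append(("inner", scheduler.now_ms)), delay_ms=10)


scheduler.schedule(outer, delay_ms=5)
scheduler.schedule(lambda: events.append(("late", scheduler.now_ms)), delay_ms=20)
scheduler.advance(30)
```

Here the inner task is created while `advance` is already running, yet it lands between the two tasks that were scheduled up front, because ordering comes from the heap rather than from insertion order.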
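
The three-step delete scheme in the quoted description can be sketched as follows. This is an illustration in Python under stated assumptions, not the code in KAFKA-636-v1.patch: the `Log` class, the `schedule(task, delay_ms)` callable, the `DELETED_SUFFIX` constant and the `delete_delay_ms` parameter are hypothetical names, and segment files are assumed to be named `<base offset>.log`. Only the 60 second default and the `.deleted` suffix come from the description.

```python
import os
import tempfile
import threading

DELETED_SUFFIX = ".deleted"


class Log:
    """Hypothetical log that deletes segment files asynchronously."""

    def __init__(self, directory, schedule, delete_delay_ms=60_000):
        self.directory = directory
        self.schedule = schedule            # assumed interface: schedule(task, delay_ms)
        self.delete_delay_ms = delete_delay_ms
        self.lock = threading.Lock()        # stands in for the log's write lock
        self.segments = {}                  # base offset -> file path
        # Step 3: on startup, finish any deletes that were still pending.
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if name.endswith(DELETED_SUFFIX):
                os.remove(path)
            elif name.endswith(".log"):
                self.segments[int(name[:-len(".log")])] = path

    def delete_segment(self, base_offset):
        with self.lock:
            # Step 1: drop the segment from the map and rename the file.
            # Both are cheap, so they can happen while the lock is held.
            path = self.segments.pop(base_offset)
            renamed = path + DELETED_SUFFIX
            os.rename(path, renamed)
        # Step 2: the slow file removal happens later, outside the lock.
        self.schedule(lambda: os.remove(renamed), self.delete_delay_ms)


def demo():
    pending = []  # stand-in scheduler: collect tasks and run them by hand
    with tempfile.TemporaryDirectory() as directory:
        for name in ("0000000.log", "0000100.log", "0000050.log.deleted"):
            open(os.path.join(directory, name), "w").close()
        log = Log(directory, lambda task, delay_ms: pending.append((delay_ms, task)))
        after_startup = sorted(os.listdir(directory))
        log.delete_segment(0)
        after_rename = sorted(os.listdir(directory))
        for _, task in pending:             # pretend the delay has elapsed
            task()
        after_delete = sorted(os.listdir(directory))
    return after_startup, after_rename, after_delete, sorted(log.segments)
```

Calling `demo()` returns the directory listing at each stage: after startup cleanup, after the rename, and after the scheduled delete has run. Passing the scheduler in as a plain callable keeps the sketch deterministic; a real log would hand it a background scheduler instead.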