[ https://issues.apache.org/jira/browse/KAFKA-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neha Narkhede updated KAFKA-636:
--------------------------------
    Description:
We have a few corner-case bugs around deletion of segment files:
1. It is possible for delete and truncate to cross paths and leave the log with no segments.
2. Reads on the log have no locking (which is good), but as a result deleting a segment that is being read will result in an I/O exception of some kind.
3. We can't easily fix the synchronization problems without deleting files inside the log's write lock. This can be a problem, as deleting a 2GB segment can take a couple of seconds even on an unloaded system.

The proposed fix for these problems is to make file removal asynchronous, using the following as the new delete scheme:
1. Immediately remove the file from the segment map and rename it from X to X.deleted (e.g. 0000000.log to 0000000.log.deleted). We think renaming a file will not impact reads, since the file is already open and hence the name is irrelevant. This will always be O(1) and can be done inside the write lock.
2. Schedule a future operation to delete the file. The time to wait would be configurable, but we would default it to 60 seconds and probably no one would ever change it.
3. On startup, delete any files with the .deleted suffix, as they would be pending deletes that didn't take place.

I plan to do this soon, working against the refactored log (KAFKA-521). We can opt to backport the patch to 0.8 if we are feeling daring.

> Make log segment delete asynchronous
> ------------------------------------
>
>                 Key: KAFKA-636
>                 URL: https://issues.apache.org/jira/browse/KAFKA-636
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>         Attachments: KAFKA-636-v1.patch
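
For illustration, the three-step delete scheme described in this issue could look roughly like the Java sketch below. It is not taken from KAFKA-636-v1.patch or from Kafka's Log code: the class name AsyncSegmentDeleter, its methods, the plain object used as the write lock, and the map keyed by base offset are all assumptions made for the example. Only the .deleted suffix, the delayed delete, and the startup cleanup come from the description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposed scheme; not Kafka's actual Log class. */
class AsyncSegmentDeleter {
    static final String DELETED_SUFFIX = ".deleted";

    // Stand-in for the log's write lock (assumption for this example).
    private final Object writeLock = new Object();
    // Stand-in for the segment map, keyed by a segment's base offset (assumption).
    private final ConcurrentSkipListMap<Long, Path> segments = new ConcurrentSkipListMap<>();
    private final ScheduledExecutorService scheduler;
    private final long deleteDelayMs;

    AsyncSegmentDeleter(ScheduledExecutorService scheduler, long deleteDelayMs) {
        this.scheduler = scheduler;
        this.deleteDelayMs = deleteDelayMs;
    }

    void addSegment(long baseOffset, Path file) {
        segments.put(baseOffset, file);
    }

    int segmentCount() {
        return segments.size();
    }

    /**
     * Steps 1 and 2: unmap and rename under the lock, then schedule the
     * physical delete. Returns the renamed path, or null if no such segment.
     */
    Path deleteSegment(long baseOffset) throws IOException {
        Path renamed;
        synchronized (writeLock) {
            Path file = segments.remove(baseOffset);
            if (file == null) {
                return null;
            }
            // X -> X.deleted; only the rename happens while the lock is held.
            renamed = file.resolveSibling(file.getFileName() + DELETED_SUFFIX);
            Files.move(file, renamed, StandardCopyOption.ATOMIC_MOVE);
        }
        final Path toDelete = renamed;
        // The slow file removal runs later, outside the lock.
        scheduler.schedule(() -> {
            try {
                Files.deleteIfExists(toDelete);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, deleteDelayMs, TimeUnit.MILLISECONDS);
        return renamed;
    }

    /** Step 3: on startup, remove leftovers from deletes that never ran. */
    static int cleanupOnStartup(Path logDir) throws IOException {
        int removed = 0;
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(logDir, "*" + DELETED_SUFFIX)) {
            for (Path p : stream) {
                Files.delete(p);
                removed++;
            }
        }
        return removed;
    }
}
```

A caller would construct this with a ScheduledExecutorService and a delay (60000 ms would match the proposed default), call cleanupOnStartup on the log directory before loading segments, and call deleteSegment wherever a segment is retired. Whether an already-open reader keeps working after the rename depends on the platform, as the description itself hedges.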