[ https://issues.apache.org/jira/browse/KAFKA-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392910#comment-15392910 ]
ASF GitHub Bot commented on KAFKA-1911:
---------------------------------------

GitHub user sutambe opened a pull request:

    https://github.com/apache/kafka/pull/1664

    KAFKA-1911: Async delete topic

    The last patch submitted by @MayureshGharat (back in Dec 15) has been rebased to the latest trunk. I took care of a couple of test failures (MetricsTest) along the way. @jjkoshy , @granders , @avianey , you may be interested in this PR.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sutambe/kafka async-delete-topic

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1664.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1664

----
commit dbc54e6bcd5001c478028f7032f9ff0a59f53f89
Author: Mayuresh Gharat <mgha...@mgharat-ld1.linkedin.biz>
Date:   2015-12-07T22:01:22Z

    Made Delete topic on the brokers Async

    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit 5bfb31b070502ae06a65ec9cb5fcd4c8d6693278
Author: Mayuresh Gharat <mgha...@mgharat-ld1.linkedin.biz>
Date:   2015-12-17T23:13:07Z

    Change the file pointers for log and index to point to the renamed directory

    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit 692f8768f0903a60274f317f6a238b5a2c621c1f
Author: Mayuresh Gharat <mgha...@mgharat-ld1.linkedin.biz>
Date:   2015-12-21T23:54:15Z

    Added a check to not recoverLogs for directories marked for delete. This is to speedup startup process. Also added check that Log directories ending with .delete be added to a separate set of logs that the are to be deleted asynchronously.
    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit b6920ffaef5d086a691eba1ac4a66d429d1c5fcf
Author: Mayuresh Gharat <mgha...@mgharat-ld1.linkedin.biz>
Date:   2015-12-22T00:00:50Z

    Removed a bug from earlier commit

    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit 161cb8c669298744d60eec75ca5072d3b0f0045f
Author: Mayuresh Gharat <mgha...@mgharat-ld1.linkedin.biz>
Date:   2015-12-22T00:03:44Z

    Removed the extra ';'

    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit 161c14c5486cbe1fc0411fe6fc5745c78e36bedc
Author: MayureshGharat <gharatmayures...@gmail.com>
Date:   2016-03-10T21:25:38Z

    Addressed Joel's comments on the patch

    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit 859dbb7a1d3f2b7b9ddc5d8b9edc128b1903f28c
Author: MayureshGharat <gharatmayures...@gmail.com>
Date:   2016-04-01T20:26:11Z

    Addressed the NPE issue with race condition and also issues related to loading segments on a crash

    Signed-off-by: Sumant Tambe <suta...@linkedin.com>

commit b93328cd446d9e0c753a966217b8b58fc6150ec6
Author: Sumant Tambe <suta...@linkedin.com>
Date:   2016-07-26T00:11:25Z

    Async log deletion rebase and successful testing

----

> Log deletion on stopping replicas should be async
> -------------------------------------------------
>
>                 Key: KAFKA-1911
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1911
>             Project: Kafka
>          Issue Type: Bug
>          Components: log, replication
>            Reporter: Joel Koshy
>            Assignee: Mayuresh Gharat
>              Labels: newbie++, newbiee
>
> If a StopReplicaRequest sets delete=true then we do a file.delete on the file
> message sets. I was under the impression that this is fast but it does not
> seem to be the case.
> On a partition reassignment in our cluster the local time for stop replica
> took nearly 30 seconds.
> {noformat}
> Completed request:Name: StopReplicaRequest; Version: 0; CorrelationId: 467;
> ClientId: ; DeletePartitions: true; ControllerId: 1212; ControllerEpoch:
> 53 from
> client/...:45964;totalTime:29191,requestQueueTime:1,localTime:29190,remoteTime:0,responseQueueTime:0,sendTime:0
> {noformat}
> This ties up one API thread for the duration of the request.
> Specifically in our case, the queue times for other requests also went up and
> producers to the partition that was just deleted on the old leader took a
> while to refresh their metadata (see KAFKA-1303) and eventually ran out of
> retries on some messages leading to data loss.
> I think the log deletion in this case should be fully asynchronous although
> we need to handle the case when a broker may respond immediately to the
> stop-replica-request but then go down after deleting only some of the log
> segments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
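
For readers skimming the thread, here is a minimal Python sketch of the pattern the commit messages and the issue description point at: rename the log directory with a `.delete` suffix, return at once, remove the files in the background, and on startup skip recovery for any directory still carrying the suffix. This is an illustration only, not the code in the pull request. Only the `.delete` suffix comes from the commit messages; `AsyncLogDeleter`, `stop_replica`, and `startup_scan` are hypothetical names.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor

# Suffix taken from the commit messages; all other names are hypothetical.
DELETE_SUFFIX = ".delete"


class AsyncLogDeleter:
    """Hypothetical helper: rename now, delete later on a background thread."""

    def __init__(self, log_root):
        self.log_root = log_root
        self._pool = ThreadPoolExecutor(max_workers=1)

    def stop_replica(self, partition_dir):
        """Mark a partition directory for deletion and return without waiting."""
        src = os.path.join(self.log_root, partition_dir)
        dst = src + DELETE_SUFFIX
        # A rename within one filesystem does not touch the segment files,
        # so the request thread is not held up by the actual deletion.
        os.rename(src, dst)
        return self._pool.submit(shutil.rmtree, dst)

    def startup_scan(self):
        """Split directories into logs to recover and leftovers to delete."""
        to_recover, pending = [], []
        for name in sorted(os.listdir(self.log_root)):
            path = os.path.join(self.log_root, name)
            if not os.path.isdir(path):
                continue
            if name.endswith(DELETE_SUFFIX):
                # Left behind by a crash mid-deletion: finish the job, skip recovery.
                pending.append(self._pool.submit(shutil.rmtree, path))
            else:
                to_recover.append(name)
        return to_recover, pending

    def close(self):
        """Wait for outstanding deletions and stop the worker thread."""
        self._pool.shutdown(wait=True)
```

Under these assumptions, a broker that goes down after deleting only some segments still has the directory marked with the suffix, so the next `startup_scan` finishes the deletion instead of trying to recover a partial log.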