[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655527#comment-17655527 ]
ASF subversion and git services commented on KUDU-3367:
-------------------------------------------------------

Commit 27072d3382889b1852f4fef58010115585685bd3 in kudu's branch refs/heads/master from Yingchun Lai
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=27072d338 ]

[tools] Add 'kudu local_replica tmeta delete_rowsets' to delete rowsets from tablet

There are some use cases where we need to delete rowsets from a tablet.
For example:
1. Some blocks are corrupted in a single-node cluster and the server
   cannot be started. Note: some data will be lost in this case.
2. Some rowsets are fully deleted but the blocks cannot be GCed
   (KUDU-3367). Note: no data will be lost in this case.

The 'kudu pbc edit' CLI tool can already achieve that, but it is
error-prone and hard to operate when working with a large amount of data.

This patch introduces a new CLI tool, 'kudu local_replica tmeta
delete_rowsets', which makes removing rowsets from a tablet much easier.

Change-Id: If2cf9035babf4c3af4c238cebe8dcecd2c65848f
Reviewed-on: http://gerrit.cloudera.org:8080/19357
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <ale...@apache.org>

> Delta file with full of delete op can not be schedule to compact
> ----------------------------------------------------------------
>
>                 Key: KUDU-3367
>                 URL: https://issues.apache.org/jira/browse/KUDU-3367
>             Project: Kudu
>          Issue Type: New Feature
>          Components: compaction
>            Reporter: dengke
>            Assignee: dengke
>            Priority: Major
>         Attachments: image-2022-05-09-14-13-16-525.png,
> image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png,
> image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png,
> image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png,
> image-2022-05-09-14-32-51-573.png, image-2022-11-14-11-02-33-685.png
>
>
> If we get a REDO delta file that is full of delete ops, which means there
> are no update ops in the file, the current compaction algorithm will not
> schedule the file for compaction. If such files exist, then after
> accumulating for a period of time they will greatly affect our scan speed.
> However, processing such files on every compaction reduces compaction
> performance.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)