[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884686#comment-17884686 ]
ASF subversion and git services commented on KUDU-3367:
-------------------------------------------------------

Commit 3666d2026d48adb5ff636321ef22320a8af5facb in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=3666d2026 ]

KUDU-3619 disable KUDU-3367 behavior by default

As it turned out, KUDU-3367 introduced a regression due to a deficiency in its implementation: when it kicked in, major compactions would fail with errors like

  Corruption: Failed major delta compaction on RowSet(1): No min key found: CFile base data in RowSet(1)

Since KUDU-3367 isn't quite relevant in Kudu versions 1.12.0 and newer when working with data that supports live row count (see KUDU-1625), a quick-and-dirty fix is to set the default value of the corresponding flag --all_delete_op_delta_file_cnt_for_compaction to a value that effectively disables the KUDU-3367 behavior. This patch does exactly that.

Change-Id: Iec0719462e379b7a0fb05ca011bb9cdd991a58ef
Reviewed-on: http://gerrit.cloudera.org:8080/21848
Reviewed-by: KeDeng <kdeng...@gmail.com>
Tested-by: Alexey Serbin <ale...@apache.org>

> Delta file with full of delete op can not be schedule to compact
> ----------------------------------------------------------------
>
>             Key: KUDU-3367
>             URL: https://issues.apache.org/jira/browse/KUDU-3367
>         Project: Kudu
>      Issue Type: New Feature
>      Components: compaction
>        Reporter: dengke
>        Assignee: dengke
>        Priority: Major
>         Fix For: 1.17.0
>
>     Attachments: image-2022-05-09-14-13-16-525.png, image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png, image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png, image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png, image-2022-05-09-14-32-51-573.png, image-2022-11-14-11-02-33-685.png
>
> If we get a REDO delta file full of delete ops, there are no update ops in the file at all.
> The current compaction algorithm will not schedule such a file for compaction. If such files accumulate over a period of time, they greatly slow our scans. However, processing such files on every compaction pass would reduce compaction performance.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
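The trade-off described in the issue can be sketched as a toy model. This is an illustrative sketch, not Kudu's actual C++ implementation: it assumes the --all_delete_op_delta_file_cnt_for_compaction flag acts as a count threshold (as its name suggests), where a low threshold schedules all-delete delta files for compaction (the KUDU-3367 behavior) and an effectively infinite threshold disables it (the KUDU-3619 default). The `DeltaFile` class and function names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DeltaFile:
    """Hypothetical stand-in for a REDO delta file's op counts."""
    update_ops: int
    delete_ops: int

    @property
    def all_deletes(self) -> bool:
        # A REDO delta "full of delete ops": no update ops at all.
        return self.update_ops == 0 and self.delete_ops > 0

def should_schedule_all_delete_compaction(files, threshold):
    # Count delta files that contain only delete ops; schedule a
    # compaction once the count reaches the (assumed) flag threshold.
    all_delete_cnt = sum(1 for f in files if f.all_deletes)
    return all_delete_cnt >= threshold

files = [DeltaFile(0, 100), DeltaFile(0, 50), DeltaFile(3, 10)]

# Low threshold: the two all-delete files trigger scheduling.
print(should_schedule_all_delete_compaction(files, threshold=2))      # True

# Effectively infinite threshold: the behavior is disabled.
print(should_schedule_all_delete_compaction(files, threshold=10**18)) # False
```

The sketch shows why a very large default disables the feature without removing the code path: the count of all-delete files can never realistically reach the threshold.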