[ https://issues.apache.org/jira/browse/KUDU-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884781#comment-17884781 ]
ASF subversion and git services commented on KUDU-1625:
-------------------------------------------------------

Commit 05043e6aba6ab45c1b77de9f0762de3dfc5a54c0 in kudu's branch refs/heads/branch-1.17.x from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=05043e6ab ]

KUDU-3619 disable KUDU-3367 behavior by default

As it turned out, KUDU-3367 has introduced a regression due to a
deficiency in its implementation, where major compactions would fail
with errors like below if it had kicked in:

  Corruption: Failed major delta compaction on RowSet(1): No min key found: CFile base data in RowSet(1)

Since KUDU-3367 isn't quite relevant in Kudu versions of 1.12.0 and
newer when working with data that supports live row count
(see KUDU-1625), a quick-and-dirty fix is to set the default value for
the corresponding flag --all_delete_op_delta_file_cnt_for_compaction
to a value that effectively disables KUDU-3367 behavior.

This patch does exactly so.

Change-Id: Iec0719462e379b7a0fb05ca011bb9cdd991a58ef
Reviewed-on: http://gerrit.cloudera.org:8080/21848
Reviewed-by: KeDeng <kdeng...@gmail.com>
Tested-by: Alexey Serbin <ale...@apache.org>
(cherry picked from commit 3666d2026d48adb5ff636321ef22320a8af5facb)
  Conflicts:
    src/kudu/tablet/delta_tracker.cc
Reviewed-on: http://gerrit.cloudera.org:8080/21855
Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>

> Schedule compaction on rowsets with high percentage of deleted data
> -------------------------------------------------------------------
>
>                 Key: KUDU-1625
>                 URL: https://issues.apache.org/jira/browse/KUDU-1625
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet
>   Affects Versions: 1.0.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> Although with KUDU-236 we can now remove rows that were deleted prior to the
> ancient history mark, we don't actively schedule compactions based on deleted
> rows. So, if for example we have a fully compacted table and issue a DELETE
> for every row, the data size actually does not change, because no compactions
> are triggered.
> We need some way to notice the fact that the ratio of deletes to rows is high
> and decide to compact those rowsets.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)