[ https://issues.apache.org/jira/browse/FLINK-34050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819579#comment-17819579 ]
Jinzhong Li edited comment on FLINK-34050 at 2/22/24 10:28 AM:
---------------------------------------------------------------

Here are the benchmark results before and after applying this proposal.
||rescaleType||Base||Applying this proposal||Diff||
|RESCALE_IN|42536.540 ms|42219.484 ms|-0.3s|
|RESCALE_OUT|595.798 ms|619.703 ms|+0.02s|

From this result, it can be seen that this change has no significant impact on rescaling performance.

More detailed data for reference (ms/op, iterations 1 to 10 of each fork):

RESCALE_IN, Base
Fork 1 of 3: 44214.024, 44473.591, 41143.378, 45364.796, 45955.292, 41078.509, 45984.066, 41000.731, 45595.620, 46044.924
Fork 2 of 3: 34444.761, 43152.346, 43060.378, 44337.494, 34670.528, 42514.179, 34496.979, 41989.620, 44067.735, 44704.516
Fork 3 of 3: 43385.168, 43096.595, 43370.825, 45175.983, 34956.635, 43147.011, 42810.926, 44908.913, 44195.383, 42755.294
Result: 42536.540 ms

RESCALE_IN, Applying this proposal
Fork 1 of 3: 44987.651, 45913.319, 34740.433, 43833.981, 44912.708, 45030.893, 44079.639, 34754.194, 42423.861, 42765.109
Fork 2 of 3: 44712.705, 44599.266, 45105.132, 42825.562, 45664.281, 34835.676, 43294.868, 43319.576, 44627.813, 41309.822
Fork 3 of 3: 41423.187, 42499.661, 42638.880, 43574.138, 34969.848, 43349.239, 41596.289, 42500.620, 40192.633, 40103.528
Result: 42219.484 ms

RESCALE_OUT, Base
Fork 1 of 3: 648.341, 588.388, 598.590, 585.059, 585.281, 585.623, 587.027, 584.607, 586.894, 588.937
Fork 2 of 3: 572.042, 684.044, 578.065, 569.977, 684.787, 628.206, 636.114, 581.125, 565.372, 573.666
Fork 3 of 3: 579.150, 570.446, 572.630, 637.539, 578.346, 572.454, 570.402, 645.658, 566.496, 568.685
Result: 595.798 ms

RESCALE_OUT, Applying this proposal
Fork 1 of 3: 628.622, 665.121, 572.333, 569.295, 576.826, 578.466, 604.477, 628.138, 588.552, 667.854
Fork 2 of 3: 702.258, 625.250, 587.017, 600.382, 701.370, 617.869, 615.285, 657.985, 641.899, 595.750
Fork 3 of 3: 586.578, 634.318, 717.040, 708.608, 573.883, 589.254, 572.359, 622.834, 590.137, 571.335
Result: 619.703 ms

> Rocksdb state has space amplification after rescaling with DeleteRange
> ----------------------------------------------------------------------
>
>                 Key: FLINK-34050
>                 URL: https://issues.apache.org/jira/browse/FLINK-34050
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>            Reporter: Jinzhong Li
>            Assignee: Jinzhong Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2024-01-10-21-23-48-134.png, image-2024-01-10-21-24-10-983.png, image-2024-01-10-21-28-24-312.png
>
>
> FLINK-21321 uses deleteRange to speed up rocksdb rescaling; however, it causes space amplification in some cases.
> We can reproduce this problem using a wordCount job:
> 1) before rescaling, the state operator in the wordCount job has a parallelism of 2 and a 4G+ full checkpoint size;
> !image-2024-01-10-21-24-10-983.png|width=266,height=130!
> 2) then restart the job with a parallelism of 4 (for the state operator); the full checkpoint size of the new job will be 8G+;
> 3) after many successful checkpoints, the full checkpoint size is still 8G+;
> !image-2024-01-10-21-28-24-312.png|width=454,height=111!
>
> The root cause of this issue is that the deleted keyGroupRange does not overlap with the current DB keyGroupRange, so new data written into rocksdb after rescaling almost never goes through LSM compaction with the deleted data (which belongs to other keyGroupRanges).
>
> And the space amplification may affect Rocksdb read performance and disk space usage after rescaling.
> It looks like a regression due to the introduction of deleteRange for rescaling optimization.
>
> To solve this problem, I think maybe we can invoke Rocksdb.deleteFilesInRanges after deleteRange?
> {code:java}
> public static void clipDBWithKeyGroupRange() {
>     //.......
>     List<byte[]> ranges = new ArrayList<>();
>     //.......
>     // Write range tombstones for the key groups that no longer belong to this DB.
>     deleteRange(db, columnFamilyHandles, beginKeyGroupBytes, endKeyGroupBytes);
>     // Remember the same range as a (begin, end) pair for the file-level cleanup.
>     ranges.add(beginKeyGroupBytes);
>     ranges.add(endKeyGroupBytes);
>     //....
>     // Drop SST files that lie entirely inside the deleted ranges (end key excluded).
>     for (ColumnFamilyHandle columnFamilyHandle : columnFamilyHandles) {
>         db.deleteFilesInRanges(columnFamilyHandle, ranges, false);
>     }
> }
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)