github-actions[bot] commented on code in PR #63295:
URL: https://github.com/apache/doris/pull/63295#discussion_r3315949501
##########
cloud/src/recycler/recycler.cpp:
##########
@@ -5067,78 +5172,29 @@ int InstanceRecycler::recycle_rowsets() {
if (rowset.type() == RecycleRowsetPB::PREPARE) {
// unable to calculate file path, can only be deleted by rowset id
prefix
num_prepare += 1;
- prepare_rowset_keys_to_delete.emplace_back(k);
+ if (delete_rowset_data_by_prefix(std::string(k),
rowset_meta->resource_id(),
+ rowset_meta->tablet_id(),
Review Comment:
This deletes PREPARE rowset data before the related txn/job abort is
guaranteed to finish. Above, `rowset_keys_to_abort` is only queued, and
`loop_done()` now submits the abort work to `mark_abort_worker_pool`
independently from the delete work. For a prepared load rowset, this branch can
call `delete_rowset_data_by_prefix()` while the transaction is still
commit-able; if the commit wins that race, the committed rowset will point at
files the recycler just removed. The old code executed
`batch_abort_txn_or_job_for_recycle()` in the same worker before
collecting/deleting prepare rowsets, so please preserve that ordering, for
example by keeping prepare deletion in the same job after abort completes.
##########
cloud/src/recycler/recycler.cpp:
##########
@@ -4892,20 +4942,69 @@ int InstanceRecycler::recycle_rowsets() {
.tag("expired_rowset_meta_size", expired_rowset_size);
};
- std::vector<std::string> rowset_keys;
+ struct RecycleRowsetEntry {
+ std::string key;
+ doris::RowsetMetaCloudPB meta;
+ };
+ struct RecycleRowsetDeleteJob {
+ std::vector<std::string> keys;
+ std::map<std::string, doris::RowsetMetaCloudPB> rowsets;
+ };
+ // Store the scanned recycle key with rowset meta. The scanned key is the
actual KV key to delete.
+ std::vector<RecycleRowsetEntry> rowsets;
+ int64_t current_tablet_id = -1;
+ int64_t recycled_rowset_count_for_current_tablet = 0;
+ bool current_tablet_skip_logged = false;
+ std::string next_scan_begin;
+ const int64_t rowset_batch_size_per_tablet =
+ std::max(1, config::recycle_rowsets_per_tablet_batch_size);
+ const int64_t delete_rowset_batch_size =
+ std::min(500000, config::recycle_rowsets_delete_batch_size);
+ auto try_reserve_tablet_recycle_slot = [&](int64_t tablet_id) -> bool {
Review Comment:
Please clamp this mutable config to at least 1, similar to
`rowset_batch_size_per_tablet`. If an operator sets
`recycle_rowsets_delete_batch_size` to 0 or a negative value at runtime,
`rowsets.size() < delete_rowset_batch_size` either flushes on every iterator
for 0 or, for negative values converted in the `size_t` comparison, stays true
until the full scan finishes. In the latter case the recycler can retain every
scanned `RowsetMetaCloudPB` in `rowsets`, which defeats the memory cap and can
OOM on a large recycle backlog.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]