github-actions[bot] commented on code in PR #61666:
URL: https://github.com/apache/doris/pull/61666#discussion_r3008304627
##########
cloud/src/recycler/util.cpp:
##########
@@ -237,6 +250,13 @@ int lease_instance_recycle_job(TxnKv* txn_kv,
std::string_view key, const std::s
// ret: 0: success, 1: tablet not found, -1: failed
int get_tablet_idx(TxnKv* txn_kv, const std::string& instance_id, int64_t
tablet_id,
TabletIndexPB& tablet_idx) {
Review Comment:
**[Performance]** Same `LOG(INFO)` on every cache hit issue as in
`meta_service.cpp`. Should be `VLOG` or removed.
##########
cloud/src/meta-service/meta_service.cpp:
##########
@@ -90,6 +93,14 @@ MetaServiceImpl::MetaServiceImpl(std::shared_ptr<TxnKv>
txn_kv,
snapshot_manager_(std::move(snapshot_manager)) {
rate_limiter_->init(this);
delete_bitmap_lock_white_list_->init();
+ CacheConfig config;
+ config.tablet_index_capacity = config::ms_tablet_index_cache_capacity;
+ config.tablet_index_ttl_seconds = config::tablet_index_cache_ttl_seconds;
+ if (config::enable_tablet_index_cache) {
+ g_ms_cache_manager =
Review Comment:
**[Performance - Critical]** `LOG(INFO)` on every cache hit will flood logs
and negate the performance benefit of caching. This function is called from hot
paths like `commit_rowset`, `finish_tablet_job`, and `commit_txn`. Consider
using `VLOG(3)` or `LOG_EVERY_N(INFO, 10000)` at most, or removing this log
entirely. The same applies to the recycler's `get_tablet_idx` in `util.cpp`.
##########
cloud/src/meta-service/meta_service.cpp:
##########
@@ -78,6 +79,8 @@ using namespace std::chrono;
namespace doris::cloud {
+static TabletIndexCache* g_ms_cache_manager = nullptr;
+
MetaServiceImpl::MetaServiceImpl(std::shared_ptr<TxnKv> txn_kv,
Review Comment:
**[Correctness - High]** `g_ms_cache_manager` is allocated with `new` in the
constructor but never cleaned up. Two problems:
1. **Memory leak on multiple construction:** If `MetaServiceImpl` is
constructed more than once (e.g., in tests like `meta_server_test.cpp`), the
previous `g_ms_cache_manager` pointer is overwritten without being deleted.
2. **Missing cache invalidation:** The MS-side code modifies `TabletIndexPB`
in at least two places without invalidating this cache:
- `repair_tablet_index` in `meta_service_txn.cpp` (~line 2150): sets
`db_id` on existing `TabletIndexPB` and writes it back via `txn->put()`
- `fix_tablet_db_id` in `meta_service.cpp` (~line 5719): same pattern
With TTL=0 (default), stale entries persist indefinitely until
LRU-evicted. Either add invalidation at those sites or set a reasonable default
TTL.
##########
cloud/src/recycler/util.cpp:
##########
@@ -21,13 +21,26 @@
#include <cstdint>
+#include "common/config.h"
+#include "common/kv_cache.h"
#include "common/util.h"
#include "meta-service/meta_service_schema.h"
#include "meta-store/keys.h"
#include "meta-store/txn_kv.h"
#include "meta-store/txn_kv_error.h"
namespace doris::cloud {
+
+TabletIndexCache* g_recycler_cache_manager = nullptr;
+
+void init_recycler_cache() {
+ CacheConfig config;
+ config.tablet_index_capacity =
config::recycler_tablet_index_cache_capacity;
+ config.tablet_index_ttl_seconds = config::tablet_index_cache_ttl_seconds;
+ g_recycler_cache_manager =
+ new TabletIndexCache(config.tablet_index_capacity,
config.tablet_index_ttl_seconds,
+ "recycler_tablet_index_cache");
Review Comment:
**[Style]** Missing blank line between the closing brace of
`init_recycler_cache()` and `namespace config {`. This makes the code harder to
read.
##########
cloud/src/common/config.h:
##########
@@ -158,6 +158,12 @@ CONF_mBool(enable_mvcc_meta_check, "false");
CONF_mInt64(mow_job_key_check_expiration_diff_seconds, "600"); // 10min
+// KV cache config
+CONF_mBool(enable_tablet_index_cache, "true");
Review Comment:
**[Configuration]** These configs are declared as `CONF_mBool`/`CONF_mInt64`
(mutable/dynamic), but the cache is only created once at startup in the
`MetaServiceImpl` constructor and `init_recycler_cache()`. Changing these
values at runtime will have no effect. Either:
- Make them non-mutable (`CONF_Bool`/`CONF_Int64`), or
- Add runtime reconfiguration support that recreates/resizes the cache when
these values change.
##########
cloud/src/recycler/util.cpp:
##########
@@ -21,13 +21,26 @@
#include <cstdint>
+#include "common/config.h"
+#include "common/kv_cache.h"
#include "common/util.h"
#include "meta-service/meta_service_schema.h"
#include "meta-store/keys.h"
#include "meta-store/txn_kv.h"
#include "meta-store/txn_kv_error.h"
namespace doris::cloud {
Review Comment:
**[Lifecycle]** Same concern as the MS cache: `g_recycler_cache_manager` is
allocated with `new` and never freed. If `init_recycler_cache()` is called
multiple times (e.g., in test scenarios), the old pointer leaks. Consider using
`std::unique_ptr` or at minimum `delete`-ing the old value before reassignment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]