cyningsun commented on code in PR #3194:
URL: https://github.com/apache/kvrocks/pull/3194#discussion_r2365617026
##########
src/storage/redis_db.cc:
##########
@@ -208,12 +206,37 @@ rocksdb::Status Database::MDel(engine::Context &ctx,
const std::vector<Slice> &k
}
rocksdb::Status Database::Exists(engine::Context &ctx, const
std::vector<Slice> &keys, int *ret) {
+ *ret = 0;
+
+ if (keys.empty()) {
+ return rocksdb::Status::OK();
+ }
+
std::vector<std::string> ns_keys;
+ std::vector<Slice> slice_keys;
ns_keys.reserve(keys.size());
+ slice_keys.reserve(keys.size());
+
for (const auto &key : keys) {
ns_keys.emplace_back(AppendNamespacePrefix(key));
+ slice_keys.emplace_back(ns_keys.back());
+ }
+
+ std::vector<rocksdb::Status> statuses(slice_keys.size());
+ std::vector<rocksdb::PinnableSlice> pin_values(slice_keys.size());
+ storage_->MultiGet(ctx, ctx.DefaultMultiGetOptions(), metadata_cf_handle_,
slice_keys.size(), slice_keys.data(),
+ pin_values.data(), statuses.data());
+
+ for (size_t i = 0; i < slice_keys.size(); i++) {
+ if (!statuses[i].ok() && !statuses[i].IsNotFound()) return statuses[i];
+ if (statuses[i].ok()) {
+ Metadata metadata(kRedisNone, false);
+ auto s = metadata.Decode(&rocksdb::Slice(pin_values[i].data(),
pin_values[i].size()));
Review Comment:
@mapleFU Thanks so much for commenting—please don't be sorry! I really
welcome the discussion, and I'm glad we're digging into the details.
You're absolutely right, and in fact, I completely agree with your thinking.
The implementation you're proposing is a more conservative and robust approach.
If I were working on this within a company codebase, I would likely advocate
for the same solution you've outlined, for a few key reasons:
* In real-world usage, the number of keys for `EXISTS` maybe smaller than
for `MGET` or `MDEL`.
* With a smaller number of keys and a higher in-memory hit rate, the
potential performance difference between a single `Get` and a `MultiGet`
becomes more significant.
The main reason I went with the current implementation are:
* In the benchmark testing(#3169), the performance regression for smaller
key lists was measured to be quite small—within 5%. (`the memory to disk ratio
is indeed not very high`)
* `MGET` and `MDEL` also shares the same problem for handling small lists
of keys
* Using the same `MultiGet` path kept the implementation simpler and more
elegant.
However, you've made a very valid point that deserves more thorough testing.
**Before we merge this, let me run some more benchmarks.** I'll specifically
focus on scenarios with a higher cache hit rate and very small key lists to
better quantify the impact. Once I have those results, we can make a more
data-driven decision on whether to keep the current unified implementation or
adopt the optimized approach you suggested for `existsInternal`.
I really appreciate you pushing for the best solution! Let's sync up again
after the next round of tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]