cyningsun commented on code in PR #3194:
URL: https://github.com/apache/kvrocks/pull/3194#discussion_r2365617026


##########
src/storage/redis_db.cc:
##########
@@ -208,12 +206,37 @@ rocksdb::Status Database::MDel(engine::Context &ctx, 
const std::vector<Slice> &k
 }
 
 rocksdb::Status Database::Exists(engine::Context &ctx, const 
std::vector<Slice> &keys, int *ret) {
+  *ret = 0;
+
+  if (keys.empty()) {
+    return rocksdb::Status::OK();
+  }
+
   std::vector<std::string> ns_keys;
+  std::vector<Slice> slice_keys;
   ns_keys.reserve(keys.size());
+  slice_keys.reserve(keys.size());
+
   for (const auto &key : keys) {
     ns_keys.emplace_back(AppendNamespacePrefix(key));
+    slice_keys.emplace_back(ns_keys.back());
+  }
+
+  std::vector<rocksdb::Status> statuses(slice_keys.size());
+  std::vector<rocksdb::PinnableSlice> pin_values(slice_keys.size());
+  storage_->MultiGet(ctx, ctx.DefaultMultiGetOptions(), metadata_cf_handle_, 
slice_keys.size(), slice_keys.data(),
+                     pin_values.data(), statuses.data());
+
+  for (size_t i = 0; i < slice_keys.size(); i++) {
+    if (!statuses[i].ok() && !statuses[i].IsNotFound()) return statuses[i];
+    if (statuses[i].ok()) {
+      Metadata metadata(kRedisNone, false);
+      auto s = metadata.Decode(&rocksdb::Slice(pin_values[i].data(), 
pin_values[i].size()));

Review Comment:
   @mapleFU Thanks so much for commenting—please don't be sorry! I really 
welcome the discussion, and I'm glad we're digging into the details.
   
   You're absolutely right, and in fact, I completely agree with your thinking. 
The implementation you're proposing is a more conservative and robust approach. 
If I were working on this within a company codebase, I would likely advocate 
for the same solution you've outlined, for a few key reasons:
   
   *   In real-world usage, the number of keys for `EXISTS` maybe smaller than 
for `MGET` or `MDEL`.
   *   With a smaller number of keys and a higher in-memory hit rate, the 
potential performance difference between a single `Get` and a `MultiGet` 
becomes more significant.
   
   The main reason I went with the current implementation are:
   
   *   In the benchmark testing(#3169), the performance regression for smaller 
key lists was measured to be quite small—within 5%. (`the memory to disk ratio 
is indeed not very high`)
   *   `MGET` and `MDEL` also shares the same problem for handling small lists 
of keys
   *   Using the same `MultiGet` path kept the implementation simpler and more 
elegant.
   
   However, you've made a very valid point that deserves more thorough testing. 
**Before we merge this, let me run some more benchmarks.** I'll specifically 
focus on scenarios with a higher cache hit rate and very small key lists to 
better quantify the impact. Once I have those results, we can make a more 
data-driven decision on whether to keep the current unified implementation or 
adopt the optimized approach you suggested for `existsInternal`.
   
   I really appreciate you pushing for the best solution! Let's sync up again 
after the next round of tests.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to