GitHub user Ryan4253 edited a discussion: TTL bottlenecking performance when 
blob files are enabled

I was benchmarking Kvrocks' write performance with and without blob files by 
timing how long it takes 10 clients to write 20 GB of 4 MB strings. Even 
though all of the strings are large enough to be stored in blob files, which 
should improve performance, enabling blob files actually made writes about 3x 
slower (194 seconds without blob files vs. 600 seconds with them for 20 GB of 
writes). 

I tried profiling the code with perf and found this:
![image](https://github.com/user-attachments/assets/acf5a545-96bf-4ae5-b9dd-e2b23cfe7d9a)

It turns out that expiration timestamps are stored in the value, which in this 
case lives in the blob files. This means that engine::MetadataFilter has to do 
a disk read for every key, which is a lot of overhead. I tried disabling the 
TTL check with the snippet below, and the write time improved to 78 seconds:
```cpp
rocksdb::CompactionFilter::Decision MetadataFilter::FilterBlobByKey(
    [[maybe_unused]] int level, [[maybe_unused]] const Slice &key,
    [[maybe_unused]] std::string *new_value,
    [[maybe_unused]] std::string *skip_until) const {
  return rocksdb::CompactionFilter::Decision::kKeep;
}
```
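
To make the cost concrete, here is a toy model of an expiry check when the timestamp is embedded in the value. The layout (1 flag byte, then an 8-byte big-endian expire time in milliseconds, then the payload) and the names `EncodeValue`, `DecodeExpire`, and `IsExpired` are assumptions for illustration, not Kvrocks' actual encoding or API. The point is only that the decision cannot be made without the value bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical value layout, for illustration only:
//   [1 byte flags][8 bytes big-endian expire time in ms, 0 = never][payload...]
constexpr std::size_t kHeaderSize = 1 + 8;

std::string EncodeValue(std::uint64_t expire_ms, std::string_view payload) {
  std::string out;
  out.push_back('\0');  // flags byte, unused in this sketch
  for (int shift = 56; shift >= 0; shift -= 8) {
    out.push_back(static_cast<char>((expire_ms >> shift) & 0xFF));
  }
  out.append(payload);
  return out;
}

// Returns std::nullopt if the value is too short to contain the header.
std::optional<std::uint64_t> DecodeExpire(std::string_view value) {
  if (value.size() < kHeaderSize) return std::nullopt;
  std::uint64_t expire_ms = 0;
  for (std::size_t i = 1; i < kHeaderSize; ++i) {
    expire_ms = (expire_ms << 8) | static_cast<std::uint8_t>(value[i]);
  }
  return expire_ms;
}

// The expiry decision needs the value bytes. When the value lives in a blob
// file, a filter that wants to call this has to fetch the blob first.
bool IsExpired(std::string_view value, std::uint64_t now_ms) {
  const auto expire_ms = DecodeExpire(value);
  if (!expire_ms.has_value()) return false;  // malformed: keep rather than drop
  return *expire_ms != 0 && *expire_ms <= now_ms;
}
```

With a 4 MB payload, only the first 9 bytes of this hypothetical layout matter for the check, yet the whole value sits behind a blob read.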

I think this is a nice improvement to have, since it increases write throughput 
with blob files by roughly 7.7x (600 seconds down to 78 seconds). I thought of 
a couple of potential implementation methods and wanted to get your thoughts.

My initial idea was to store the expiration timestamp with the key. This is 
pretty unrealistic, since every database created before the change would no 
longer be compatible. A second approach could be to add a configuration option 
to disable TTL. This would involve disabling the metadata filter and rejecting 
Redis 'SET' requests that include expiration options.
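
A rough sketch of what the second approach might look like on the command side. The `disable_ttl` flag, the `Config` struct, and the helper names are all hypothetical; no such option exists in Kvrocks as far as this discussion goes. The option names checked (`EX`, `PX`, `EXAT`, `PXAT`) are the expiration options of the Redis `SET` command.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical config flag, for illustration only.
struct Config {
  bool disable_ttl = false;
};

std::string ToUpper(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return s;
}

// args is the full command: {"SET", key, value, options...}.
// Options start at index 3, so a value that happens to be "EX" is not misread.
bool SetHasExpireOption(const std::vector<std::string>& args) {
  for (std::size_t i = 3; i < args.size(); ++i) {
    const std::string opt = ToUpper(args[i]);
    if (opt == "EX" || opt == "PX" || opt == "EXAT" || opt == "PXAT") {
      return true;
    }
  }
  return false;
}

// Reject only when TTL is disabled and the request asks for an expiration.
bool ShouldRejectSet(const Config& config,
                     const std::vector<std::string>& args) {
  return config.disable_ttl && SetHasExpireOption(args);
}
```

Other commands that set an expiration (for example `EXPIRE`, `SETEX`, `PSETEX`, `GETEX`) would presumably need the same treatment, which is part of why this is hard to integrate cleanly.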

TTL is pretty fundamental to Redis, which makes this challenging to integrate. 
I'd like to hear your opinions on these ideas or any other potential paths 
forward, and whether you feel this is something we should incorporate.


GitHub link: https://github.com/apache/kvrocks/discussions/3036
