GitHub user Ryan4253 edited a discussion: TTL bottlenecking performance when blob files are enabled
I was benchmarking Kvrocks' write performance with and without blob files by timing how long it takes 10 clients to write 20 GB of 4 MB strings. All of the strings are large enough to be stored in blob files, which should improve performance, but enabling blob files actually made the writes about 3x slower (194 seconds without blob files vs. 600 seconds with them, for 20 GB of writes).

I profiled the code with perf and found the cause: expiration timestamps are stored in the value, which in this case lives in the blob files. This means that `engine::MetadataFilter` needs to do a disk read for every key, which is a lot of overhead.

I tried disabling TTL with the snippet below and was able to bring the write time down to 78 seconds:

```cpp
// Experiment only: always keep the entry based on the key alone, so the
// compaction filter never has to fetch the value from the blob file.
// Note that this also means expired keys are no longer dropped here.
rocksdb::CompactionFilter::Decision MetadataFilter::FilterBlobByKey([[maybe_unused]] int level,
                                                                    [[maybe_unused]] const Slice &key,
                                                                    [[maybe_unused]] std::string *new_value,
                                                                    [[maybe_unused]] std::string *skip_until) const {
  return rocksdb::CompactionFilter::Decision::kKeep;
}
```

I think this is a nice improvement to have, since it increases write throughput with blob files by almost 8x (600 seconds down to 78 seconds). I thought of a couple of potential implementation methods and wanted to get your thoughts.

My initial idea is to store the expiration timestamp with the key. This is pretty unrealistic, since every database created before the change would no longer be compatible.

A second approach could be to add a configuration option to disable TTL. This would involve disabling the metadata filter and rejecting Redis 'SET' requests that include expiration options. TTL is pretty fundamental to Redis, which makes this pretty challenging to integrate.

I'd like to hear your opinions on these ideas or any other potential paths forward, and whether you feel this is something we should incorporate.

Edit: I can speak Chinese too if that's more convenient.

GitHub link: https://github.com/apache/kvrocks/discussions/3036
