Yeah, I can shed some light here: I used Universal originally because, early in the Kafka Streams journey, there were user reports complaining about its storage amplification. But soon enough (around 2019) I realized that, as an OOTB config, level compaction may be preferable.
I had a PR dating back to that time where I suggested changing a bunch of OOTB configs of RocksDB, including the compaction config: https://github.com/apache/kafka/pull/6406/files. Unfortunately it was not merged, since I wanted to run some benchmarks to make sure it had no gotchas but never got the time to do so. I would in fact be very happy if someone could pick it up, re-examine whether those changes still make sense, and if so drive it through and merge it.

Guozhang

On Sun, Jul 23, 2023 at 10:29 AM Matthias J. Sax <mj...@apache.org> wrote:
>
> Do you happen to know?
>
>
> -------- Forwarded Message --------
> Subject: Streams/RocksDB: Why Universal Compaction?
> Date: Fri, 23 Jun 2023 13:19:36 -0700
> From: Colt McNealy <c...@littlehorse.io>
> Reply-To: users@kafka.apache.org
> To: users@kafka.apache.org
>
> Hello there!
>
> I was wondering if anyone (perhaps an early developer or power-user of
> Kafka Streams) knows why the Streams developers made the default setting
> for RocksDB compaction "Universal" compaction rather than "Level"
> compaction?
>
> My understanding (in which I am extremely UNconfident) is as follows:
>
> Supposedly Universal compaction leads to lower write amplification after
> compaction finishes. In a run of Universal compaction, all data is
> compacted; as per the RocksDB documentation it is possible for temporary
> write amplification of up to 2x during this process. There have also been
> reports of "write stalls" during this process [1].
>
> In Level compaction, only certain levels (tiers of SST files) are compacted
> at once, meaning that the compaction process is shorter and less intensive,
> but that write amplification after compaction finishes is higher than with
> universal compaction.
>
> Can anyone confirm/deny/correct this?
>
> [1] https://github.com/solana-labs/solana/issues/14586 (not
> Streams-related, but it is RocksDB)
>
> Thanks in advance,
> Colt McNealy
>
> *Founder, LittleHorse.dev*
>