Yeah, I can shed some light here: I used Universal originally because, at
the beginning of the Kafka Streams journey, there were user reports
complaining about storage amplification. But soon enough (around
2019) I realized that, as an OOTB config, level compaction may be
preferable.

I had a PR dating back to that time where I suggested changing a bunch
of OOTB configs of RocksDB, including the compaction config:
https://github.com/apache/kafka/pull/6406/files. Unfortunately it was
not merged, since I wanted to run some benchmarks to make sure it does
not have any gotchas but never got the time to do so. I would in fact be
very happy if someone could pick that up, re-examine whether the changes
still make sense, and if so drive it through and merge.
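
For anyone who wants to try level compaction before any default changes,
a custom RocksDBConfigSetter can override the compaction style per
application. A minimal sketch, assuming kafka-streams and its bundled
rocksdbjni are on the classpath; the class name
LevelCompactionConfigSetter is made up for illustration, and this only
flips the compaction style rather than reproducing the other settings
proposed in the PR above:

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.CompactionStyle;
import org.rocksdb.Options;

// Hypothetical class name; overrides the compaction style for every
// RocksDB-backed state store in the application.
public class LevelCompactionConfigSetter implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Switch from the Streams default (Universal) to Level compaction.
        options.setCompactionStyle(CompactionStyle.LEVEL);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // No RocksDB objects were allocated in setConfig, so there is
        // nothing to release here.
    }
}
```

The class would be registered through the rocksdb.config.setter property
(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG). Whether this is
actually a win for a given workload is exactly what the missing
benchmarks would need to show.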

Guozhang


On Sun, Jul 23, 2023 at 10:29 AM Matthias J. Sax <mj...@apache.org> wrote:
>
> Do you happen to know?
>
>
> -------- Forwarded Message --------
> Subject: Streams/RocksDB: Why Universal Compaction?
> Date: Fri, 23 Jun 2023 13:19:36 -0700
> From: Colt McNealy <c...@littlehorse.io>
> Reply-To: users@kafka.apache.org
> To: users@kafka.apache.org
>
> Hello there!
>
> I was wondering if anyone (perhaps an early developer or power-user of
> Kafka Streams) knows why the Streams developers made the default setting
> for RocksDB compaction "Universal" compaction rather than "Level"
> compaction?
>
> My understanding (in which I am extremely UNconfident) is as follows—
>
> Supposedly Universal compaction leads to lower write amplification after
> compaction finishes. In a run of Universal compaction, all data is
> compacted; as per the RocksDB documentation it is possible for temporary
> write amplification of up to 2x during this process. There have also been
> reports of "write stalls" during this process [1].
>
> In Level compaction, only certain levels (tiers of SST files) are compacted
> at once, meaning that the compaction process is shorter and less intensive,
> but that write amplification after compaction finishes is higher than with
> universal compaction.
>
> Can anyone confirm/deny/correct this?
>
> [1] https://github.com/solana-labs/solana/issues/14586 (not
> Streams-related, but it is RocksDB)
>
> Thanks in advance,
> Colt McNealy
>
> *Founder, LittleHorse.dev*
>
