Thanks, Liam!

I have a mix of Kafka record sizes.  10% are large (>100 KB) and 90% of
the records are smaller than 1 KB.  I'm working on a streaming analytics
solution that streams impressions, user actions and serving info and
combines them.  End-to-end latency is more important than storage
size.
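
Following the suggestion below to compare compressed vs. uncompressed sizes,
here is a rough sketch of how that could be measured offline.  Everything in
it is an assumption for illustration: the record shape, the make_record and
compression_ratio helpers, and the use of Python's stdlib gzip as a stand-in
for whichever codec the producer would really use.  Kafka compresses whole
record batches and adds its own framing, which this does not model.

```python
import gzip
import json
import random


def make_record(approx_bytes, rng):
    """Build a synthetic JSON payload of at least approx_bytes (hypothetical shape)."""
    events = []
    size = 0
    while size < approx_bytes:
        event = {
            "type": rng.choice(["impression", "action", "serve"]),
            "ts": rng.randrange(10**9),
        }
        events.append(event)
        size += len(json.dumps(event))
    return json.dumps({"events": events}).encode("utf-8")


def compression_ratio(payload):
    """Compressed size divided by original size; above 1.0 means it grew."""
    return len(gzip.compress(payload)) / len(payload)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    samples = {
        "tiny": b'{"id":1}',
        "small": make_record(500, rng),
        "large": make_record(100_000, rng),
    }
    for name, payload in samples.items():
        print(f"{name}: {len(payload)} bytes, ratio {compression_ratio(payload):.2f}")
```

A ratio above 1.0 for the tiny payload would match the point about
compression overhead outweighing the savings on very small batches.  Real
numbers depend on the actual data and codec, so this is only a starting point.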


On Mon, Mar 14, 2022 at 3:27 PM Liam Clarke-Hutchinson <lclar...@redhat.com>
wrote:

> Hi Dan,
>
> Decompression generally only happens in the broker if the topic has a
> particular compression algorithm set, and the producer is using a different
> one - then the broker will decompress records from the producer, then
> recompress them using the topic's configured algorithm. (The LogCleaner will
> also decompress then recompress records when compacting compressed topics).
>
> The consumer decompresses compressed record batches it receives.
>
> In my opinion, using topic compression instead of producer compression
> would only make sense if the producing app could not tolerate the few extra
> CPU cycles that compression uses. In all of my use cases,
> network throughput becomes a bottleneck long before producer compression
> CPU cost does.
>
> For your "if X, do Y" formulation I'd say - if your producer is sending
> tiny batches, do some analysis of compressed vs. uncompressed size for your
> given compression algorithm - you may find that compression overhead
> increases batch size for tiny batches.
>
> If you're sending a large amount of data, do tune your batching and use
> compression to reduce data being sent over the wire.
>
> If you can tell us more about your problem domain, there might be more
> advice that's applicable :)
>
> Cheers,
>
> Liam Clarke-Hutchinson
>
> On Tue, 15 Mar 2022 at 10:05, Dan Hill <quietgol...@gmail.com> wrote:
>
> > Hi.  I looked around for advice about Kafka compression.  I've seen mixed
> > and conflicting advice.
> >
> > Is there any sorta "if X, do Y" type of documentation around Kafka
> > compression?
> >
> > Any advice?  Any good posts to read that talk about this trade off?
> >
> > *Detailed comments*
> > I tried looking for producer vs topic compression.  I didn't find much.
> > Some of the information I see is back from 2011 (which I'm guessing is
> > pretty stale).
> >
> > I can guess some potential benefits but I don't know if they are actually
> > real.  I've also seen some sites claim certain trade offs but it's unclear
> > if they're true.
> >
> > It looks like I can modify an existing topic's compression.  I don't know
> > if that actually works.  I'd assume it'd just impact data going forward.
> >
> > I've seen multiple sites say that decompression happens in the broker and
> > multiple that say it happens in the consumer.
> >
>
