Thank you for the KIP, Sergio.

High level thoughts:
1. I understand that the idea here is to give admins better visibility into
potential improvements from enabling compression and changing the batch
size. I would take it a step further and say that we should provide this
visibility in a programmatic, push-based manner and make the system
generic enough that adding new "optimization rules" in the future is
seamless. Perhaps we could have a "diagnostic" mode in the cluster that can
be enabled dynamically. In this mode, the cluster would run a set of
"optimization" rules (at the cost of additional CPU cycles). One such rule
would be the compression rule you mention in your KIP. At the end of the
diagnostic run, the generated report would contain a set of
recommendations. To begin with, we can introduce this "diagnostic" as a
one-time run triggered by an admin and later enhance it to be triggered
periodically and automatically in the cluster (with results published via
existing metric libraries). Even further down the line, this could lead to
producer libraries that "auto-tune" based on recommendations from the
server.

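To make the "generic rules" idea a bit more concrete, here is a rough sketch
of what a pluggable rule set could look like. It is only an illustration of
the shape, not a proposed design: every name in it (TopicStats,
Recommendation, DiagnosticRun, the two rule functions) is hypothetical, and
the conditions inside the rules are placeholders rather than the algorithm
the KIP would need to define.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class TopicStats:
    """Hypothetical per-topic sample collected while diagnostic mode is on."""
    topic: str
    avg_records_per_batch: float
    compressed: bool


@dataclass(frozen=True)
class Recommendation:
    rule: str
    topic: str
    message: str


# A rule inspects one topic's stats and returns a recommendation or None.
Rule = Callable[[TopicStats], Optional[Recommendation]]


class DiagnosticRun:
    """Runs every registered rule against every sample and builds a report."""

    def __init__(self) -> None:
        self._rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        # Adding a new optimization rule is just one more register() call.
        self._rules.append(rule)

    def run(self, samples: List[TopicStats]) -> List[Recommendation]:
        report: List[Recommendation] = []
        for stats in samples:
            for rule in self._rules:
                rec = rule(stats)
                if rec is not None:
                    report.append(rec)
        return report


def compression_rule(stats: TopicStats) -> Optional[Recommendation]:
    # Placeholder condition (assumption): flags any uncompressed topic.
    if not stats.compressed:
        return Recommendation("compression", stats.topic,
                              "evaluate enabling compression")
    return None


def batching_rule(stats: TopicStats) -> Optional[Recommendation]:
    # Placeholder threshold (assumption): fewer than 2 records per batch.
    if stats.avg_records_per_batch < 2:
        return Recommendation("batching", stats.topic,
                              "evaluate a larger batch size")
    return None
```

An admin-triggered run would then register the rules it wants and call
run() on the collected samples; a periodic variant could publish the same
report through the metrics path instead of returning it.
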
KIP implementation specific comments/questions:
2. Can you please add the algorithm that would be used to determine whether
compression is recommended? I am assuming that the algorithm would take
into account the factors that affect the compression trade-off, such as CPU
utilization, network bandwidth, and decompression cost for consumers.
3. Can you please add the algorithm that would be used to determine whether
batching is recommended?


Divij Vaidya



On Mon, May 16, 2022 at 8:42 AM Sergio Daniel Troiano
<sergio.troi...@adevinta.com.invalid> wrote:

> Hey guys!
>
> I would like to start an early discussion on this:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-838+Simulate+batching+and+compression
>
>
> Thanks!
>
