Thank you for the KIP, Sergio.

High-level thoughts:

1. I understand that the idea here is to give admins better visibility into potential improvements from enabling compression and changing the batch size. I would take it a step further: we should provide this visibility in a programmatic, push-based manner, and make the system generic enough that adding new "optimization rules" in the future is seamless. Perhaps the cluster could have a "diagnostic" mode that can be enabled dynamically. In this mode, the cluster would run a set of "optimization" rules (at the cost of additional CPU cycles). One such rule would be the compression rule you mentioned in your KIP. At the end of the diagnostic run, the generated report would contain a set of recommendations. To begin with, we can introduce this "diagnostic" as a one-time run triggered by an admin, and later enhance it to be triggered periodically and automatically in the cluster (with results published via the existing metric libraries). Even further down the line, this could lead to producer libraries "auto-tuning" themselves based on recommendations from the server.
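To make the "generic rules" idea in point 1 a bit more concrete, here is a minimal sketch of how a pluggable rule set and a one-time diagnostic run might fit together. It is illustrative only: `OptimizationRule`, `CompressionRule`, `Recommendation`, `run_diagnostics`, and the `max_ratio` threshold are all hypothetical names and values, not part of Kafka or of the KIP, and `zlib` merely stands in for whichever codec the broker would actually simulate.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass
class Recommendation:
    rule: str
    message: str


class OptimizationRule(Protocol):
    """Hypothetical interface: a rule inspects sampled batches and may recommend a change."""

    name: str

    def evaluate(self, batches: List[bytes]) -> Optional[Recommendation]: ...


class CompressionRule:
    """Illustrative rule: recommend compression if the sampled data shrinks enough."""

    name = "compression"

    def __init__(self, max_ratio: float = 0.8):
        # max_ratio is an arbitrary threshold picked for this sketch.
        self.max_ratio = max_ratio

    def evaluate(self, batches: List[bytes]) -> Optional[Recommendation]:
        raw = sum(len(b) for b in batches)
        if raw == 0:
            return None  # nothing sampled, nothing to recommend
        # Assumption: zlib approximates the codec being simulated.
        compressed = sum(len(zlib.compress(b)) for b in batches)
        ratio = compressed / raw
        if ratio <= self.max_ratio:
            return Recommendation(self.name, f"compressed/raw size ratio {ratio:.2f}")
        return None


def run_diagnostics(rules: List[OptimizationRule], batches: List[bytes]) -> List[Recommendation]:
    """One-time diagnostic run: apply every rule and collect its recommendation, if any."""
    report = []
    for rule in rules:
        rec = rule.evaluate(batches)
        if rec is not None:
            report.append(rec)
    return report
```

Under this shape, a batching rule would be one more class with an `evaluate` method, and `run_diagnostics` would not need to change. A real rule would presumably also weigh CPU and network cost, which this sketch ignores.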
KIP implementation-specific comments/questions:

2. Can you please add the algorithm that would be used to determine whether compression is recommended or not? I am assuming that the algorithm would take into account the factors that affect the compression trade-off, such as CPU utilization, network bandwidth, decompression cost on the consumers, etc.

3. Can you please add the algorithm that would be used to determine whether batching is recommended?

Divij Vaidya

On Mon, May 16, 2022 at 8:42 AM Sergio Daniel Troiano <sergio.troi...@adevinta.com.invalid> wrote:

> Hey guys!
>
> I would like to start an early discussion on this:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-838+Simulate+batching+and+compression
>
> Thanks!