Hi Alex, thanks for the question! In the simplest sense, the tool doesn't know anything about the messages in the log or any particular batch. The tool would compress the encrypted data and measure the resulting size, but because well-encrypted data is close to random, the results would likely show little to no reduction in size. Effectively, the tool would just burn CPU cycles without producing any interesting results.
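
As a rough illustration of that point (this is not the KIP-640 tool, just a minimal sketch), the Python snippet below compares how well repetitive plaintext and ciphertext-like data compress. It rests on two assumptions: random bytes from os.urandom stand in for encrypted record payloads, and zlib (DEFLATE, the algorithm behind gzip) stands in for the compression types Kafka supports. The sample record and the compression_ratio helper are made up for the example.

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload, 6)) / len(payload)


# Repetitive plaintext, loosely resembling JSON log records (hypothetical data).
plaintext = b'{"user": "alice", "action": "login", "status": "ok"}\n' * 2000

# Assumption: random bytes approximate what encrypted payloads look like
# to a compressor. No real encryption is performed here.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = compression_ratio(plaintext)
cipher_ratio = compression_ratio(ciphertext_like)

print(f"plaintext ratio:       {plain_ratio:.3f}")
print(f"ciphertext-like ratio: {cipher_ratio:.3f}")
```

The plaintext ratio should come out as a small fraction, while the ciphertext-like ratio should sit at or slightly above 1.0, since the compressor finds no redundancy to exploit and still adds a little framing overhead.
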
It looks like concerns around compression were raised in the KIP-317 discussion, with the possibility of compression being disabled when encryption is used due to concerns about security (which I think are quite valid). My general take on the issue in the context of this KIP is that this tool is relatively simple in nature and could be extended if needed. If KIP-317 were to change the semantics of how compression is applied to encrypted messages, or whether compression is allowed at all, this tool can match those semantics, whatever they may be.

Chris

On 2020/08/24 21:49:29, Alex Wang <alew...@linkedin.com.INVALID> wrote:
> Hi, how will this work with encrypted data in logs if/when KIP-317 gets
> merged? Encrypted data will be hard to compress, so the analyzer tool might
> need to acquire the decryption key somewhere measure the compression stats.
>
> On 2020/08/17 20:23:51, "Christopher Beard (BLOOMBERG/ 919 3RD A)"
> <cbea...@bloomberg.net> wrote:
> > Hi everyone,
> >
> > I would like to start a discussion on KIP-640:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-640%3A+Add+log+compression+analysis+tool
> >
> > This KIP outlines a new CLI tool which helps compare how the various
> > compression types supported by Kafka reduce the size of a log (and
> > therefore more broadly, of a topic).
> >
> > I've put together a PR that might help serve as a starting point for
> > comments and suggestions.
> > [WIP] PR: https://github.com/apache/kafka/pull/9193
> >
> > Thanks,
> > Chris Beard