I'm reporting back my observations after enabling compression. It looks like compression is not having any effect: I'm still seeing "compression-rate-avg=1.0" and the same "record-size-avg" in the JMX "kafka.producer" metrics.
I did set the following:

systems.kafka.producer.compression.type=snappy

Am I missing anything?

Thanks,
David

On Wed, Aug 3, 2016 at 1:48 PM David Yu <david...@optimizely.com> wrote:

> Great. Thx.
>
> On Wed, Aug 3, 2016 at 1:42 PM Jacob Maes <jacob.m...@gmail.com> wrote:
>
>> Hey David,
>>
>> > what gets written to the changelog topic
>>
>> The changelog gets the same value as the store, which is the serialized
>> form of the key and value. The serdes for the store are configured with
>> the properties:
>> stores.store-name.key.serde
>> stores.store-name.msg.serde
>>
>> > If I want to compress the changelog topic, do I enable that from the
>> > producer?
>>
>> Yes. When you specify the changelog for your store, you specify it in
>> terms of a SystemStream (typically a Kafka topic). In the part of the
>> config where you define the Kafka system, you can pass any Kafka
>> producer config
>> <http://kafka.apache.org/documentation.html#newproducerconfigs>. So to
>> configure compression you should configure the following property.
>> systems.system-name.producer.compression.type
>>
>> Hope this helps.
>> -Jake
>>
>> On Wed, Aug 3, 2016 at 11:16 AM, David Yu <david...@optimizely.com>
>> wrote:
>>
>> > I'm trying to understand what gets written to the changelog topic.
>> > Is it just the serialized value of the particular state store entry?
>> > If I want to compress the changelog topic, do I enable that from the
>> > producer?
>> >
>> > The reason I'm asking is that we are seeing producer throughput
>> > issues and suspected that writing to the changelog takes up most of
>> > the network bandwidth.
>> >
>> > Thanks,
>> > David
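
For reference, the pieces of configuration discussed in this thread might fit together roughly as below. This is only a sketch: the store name "my-store", the changelog topic name "my-store-changelog", and the "string" serde name are hypothetical placeholders, and "kafka" is assumed to be the system name because that is what the property above uses.

```
# Kafka system; the name "kafka" must match the prefix used in the
# producer property and in the changelog setting below.
systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory

# Any Kafka producer config can be passed through with this prefix.
systems.kafka.producer.compression.type=snappy

# Hypothetical store "my-store"; the serde name "string" is assumed to be
# registered elsewhere in the job config.
stores.my-store.key.serde=string
stores.my-store.msg.serde=string

# The changelog is given as system-name.stream-name, so it is written
# through the "kafka" system's producer configured above.
stores.my-store.changelog=kafka.my-store-changelog
```

If the changelog pointed at a differently named system, the compression property would presumably need to be set under that system's name instead.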