Ismael, I don't think KIP-98 is related. Shallow iteration was removed in KAFKA-732, which predates KIP-98 by a few years.
Ryanne

On Sun, Mar 28, 2021, 11:25 PM Ismael Juma <ism...@juma.me.uk> wrote:

> Thanks for the KIP. I have a few high level comments:
>
> 1. Like Tom, I'm not convinced about the proposal to make this change to
> MirrorMaker 1 if we intend to deprecate it and remove it. I would rather us
> focus our efforts on the implementation we intend to support going forward.
> 2. The producer/consumer configs seem pretty dangerous for general usage,
> but the KIP doesn't address the potential downsides.
> 3. How does the ProducerRequest change impact exactly-once (if at all)? The
> change we are reverting was done as part of KIP-98. Have we considered the
> original reasons for the change?
>
> Thanks,
> Ismael
>
> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian
> <vahid.hashem...@gmail.com> wrote:
>
> > Retitled the thread to conform to the common format.
> >
> > On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ning2008w...@gmail.com> wrote:
> >
> > > Hello Henry,
> > >
> > > This is a very interesting proposal.
> > > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the similar
> > > concern of re-compressing data in mirror maker.
> > >
> > > Probably one thing may need to clarify is: how "shallow" mirroring is
> > > only applied to mirrormaker use case, if the changes need to be made on
> > > generic consumer and producer (e.g. by adding `fetch.raw.bytes` and
> > > `send.raw.bytes` to producer and consumer config)
> > >
> > > On 2021/02/05 00:59:57, Henry Cai <h...@pinterest.com.INVALID> wrote:
> > > > Dear Community members,
> > > >
> > > > We are proposing a new feature to improve the performance of Kafka
> > > > mirror maker:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> > > >
> > > > The current Kafka MirrorMaker process (with the underlying Consumer
> > > > and Producer library) uses significant CPU cycles and memory to
> > > > decompress/recompress, deserialize/re-serialize messages and copy
> > > > multiple times of messages bytes along the mirroring/replicating
> > > > stages.
> > > >
> > > > The KIP proposes a *shallow mirror* feature which brings back the
> > > > shallow iterator concept to the mirror process and also proposes to
> > > > skip the unnecessary message decompression and recompression steps.
> > > > We argue in many cases users just want a simple replication pipeline
> > > > to replicate the message as it is from the source cluster to the
> > > > destination cluster. In many cases the messages in the source cluster
> > > > are already compressed and properly batched, users just need an
> > > > identical copy of the message bytes through the mirroring without any
> > > > transformation or repartitioning.
> > > >
> > > > We have a prototype implementation in house with MirrorMaker v1 and
> > > > observed *CPU usage dropped from 50% to 15%* for some mirror
> > > > pipelines.
> > > >
> > > > We name this feature: *shallow mirroring* since it has some
> > > > resemblance to the old Kafka 0.7 namesake feature but the
> > > > implementations are not quite the same. ‘*Shallow*’ means 1. we
> > > > *shallowly* iterate RecordBatches inside MemoryRecords structure
> > > > instead of deep iterating records inside RecordBatch; 2. We
> > > > *shallowly* copy (share) pointers inside ByteBuffer instead of deep
> > > > copying and deserializing bytes into objects.
> > > >
> > > > Please share discussions/feedback along this email thread.
> >
> > --
> > Thanks!
> > --Vahid
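
For readers new to the terminology, the two senses of "shallow" in Henry's original message (iterating whole batches rather than the records inside them, and sharing buffer pointers rather than copying bytes) can be sketched roughly as below. This is only an illustration under stated assumptions, not Kafka's wire format or the KIP-712 implementation: the length-prefixed framing, the use of zlib, and the names `make_batch`, `shallow_iter` and `deep_iter` are all made up for the example, and Python's `memoryview` stands in for a shared `ByteBuffer` slice.

```python
import struct
import zlib


def make_batch(records):
    """Build one toy batch: 4-byte length prefix + zlib-compressed records.

    Hypothetical framing for illustration only; not Kafka's batch format.
    """
    inner = b"".join(struct.pack(">I", len(r)) + r for r in records)
    body = zlib.compress(inner)
    return struct.pack(">I", len(body)) + body


def shallow_iter(buf):
    """Yield each batch as a zero-copy slice of buf, without decompressing."""
    view = memoryview(buf)
    pos = 0
    while pos < len(view):
        (size,) = struct.unpack_from(">I", view, pos)
        end = pos + 4 + size
        # Slicing a memoryview shares the underlying buffer (no byte copy).
        yield view[pos:end]
        pos = end


def deep_iter(buf):
    """Yield every record of every batch: decompress, parse, copy bytes."""
    for batch in shallow_iter(buf):
        inner = zlib.decompress(batch[4:])
        pos = 0
        while pos < len(inner):
            (size,) = struct.unpack_from(">I", inner, pos)
            yield inner[pos + 4 : pos + 4 + size]
            pos += 4 + size


# A toy "log" holding two batches with three records in total.
log = make_batch([b"a", b"bb"]) + make_batch([b"ccc"])
batches = list(shallow_iter(log))   # two opaque, still-compressed batches
records = list(deep_iter(log))      # three decompressed, copied records
```

In this toy model a mirror that only needs a byte-identical copy can forward the slices from `shallow_iter` as they are, and the decompress and re-compress cost is paid only on the `deep_iter` path.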