Guozhang Wang created KAFKA-1011:
------------------------------------

             Summary: Decompression and re-compression on MirrorMaker could 
result in messages being dropped in the pipeline
                 Key: KAFKA-1011
                 URL: https://issues.apache.org/jira/browse/KAFKA-1011
             Project: Kafka
          Issue Type: Bug
            Reporter: Guozhang Wang
             Fix For: 0.8.1


Today, MirrorMaker's consumers can use a deep iterator to decompress messages 
received from the source brokers, and its producers can re-compress the 
messages when sending them to the target brokers. MirrorMaker uses a 
centralized data channel to pipe messages from its consumers to its producers, 
and each producer compresses all messages of the same topic within a batch 
into a single produce request. As a result, if one batch in MirrorMaker's 
producer happens to contain too many messages of the same topic, the 
re-compressed message can exceed the size limit, and messages that were 
accepted at the front end of the pipeline are dropped at the target brokers 
with a MessageSizeTooLargeException. If MirrorMaker's consumers use a shallow 
iterator and pipe the compressed messages through directly, this issue can be 
fixed. 
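
To illustrate the failure mode described above, here is a minimal sketch in 
Python. It does not use any Kafka API: zlib stands in for Kafka's compression 
codec, and MAX_MESSAGE_BYTES, the framing format and all function names are 
hypothetical, chosen only for this example.

```python
import hashlib
import zlib

MAX_MESSAGE_BYTES = 1000  # hypothetical broker-side limit on one message


def payload(i, size=200):
    """Deterministic, poorly compressible payload (stand-in for real data)."""
    out = b""
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{i}-{counter}".encode()).digest()
        counter += 1
    return out[:size]


def compress_batch(messages):
    """Wrap a list of messages into one compressed wrapper message."""
    framed = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
    return zlib.compress(framed)


def decompress_batch(wrapper):
    """Recover the individual messages from a compressed wrapper."""
    data = zlib.decompress(wrapper)
    messages, pos = [], 0
    while pos < len(data):
        n = int.from_bytes(data[pos:pos + 4], "big")
        messages.append(data[pos + 4:pos + 4 + n])
        pos += 4 + n
    return messages


def broker_accepts(wrapper):
    """Assumed size check applied by the target broker."""
    return len(wrapper) <= MAX_MESSAGE_BYTES


def mirror_deep(wrappers):
    """Deep iteration: decompress everything, then re-compress all
    same-topic messages of the batch into a single wrapper."""
    messages = [m for w in wrappers for m in decompress_batch(w)]
    return [compress_batch(messages)]


def mirror_shallow(wrappers):
    """Shallow iteration: forward the compressed wrappers untouched."""
    return list(wrappers)


# Five wrappers of three messages each, all for the same topic.
source_wrappers = [
    compress_batch([payload(b * 3 + k) for k in range(3)]) for b in range(5)
]

deep_out = mirror_deep(source_wrappers)
shallow_out = mirror_shallow(source_wrappers)

print("source accepted: ", all(broker_accepts(w) for w in source_wrappers))
print("deep accepted:   ", all(broker_accepts(w) for w in deep_out))
print("shallow accepted:", all(broker_accepts(w) for w in shallow_out))
```

Each source wrapper fits under the limit on its own, but the single 
re-compressed wrapper produced by the deep path does not, while the shallow 
path forwards wrappers whose sizes the source side already accepted.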

Also, as Swapnil pointed out, if the MirrorMaker lags and there are large 
messages in the MirrorMaker queue (large after decompression), it can 
currently run into an OutOfMemoryError. Shallow iteration would be very 
helpful in avoiding this error.
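
The memory concern above can be sketched in the same way. This is only an 
illustration under stated assumptions: zlib again stands in for the real 
codec, and the backlog size and message sizes are made-up numbers, not 
measurements from MirrorMaker.

```python
import zlib


def make_wrapper(n_messages=50, message_size=10_000):
    """One compressed wrapper holding highly compressible messages."""
    return zlib.compress(b"x" * (n_messages * message_size))


# Hypothetical backlog that builds up in the queue while the mirror lags.
backlog = [make_wrapper() for _ in range(20)]

# Shallow iteration: the queue holds the compressed wrappers as received.
shallow_bytes = sum(len(w) for w in backlog)

# Deep iteration: the queue holds the decompressed message bytes.
deep_bytes = sum(len(zlib.decompress(w)) for w in backlog)

print("bytes queued (shallow):", shallow_bytes)
print("bytes queued (deep):   ", deep_bytes)
```

For the same backlog, the decompressed form occupies far more heap than the 
compressed wrappers, which is why a lagging mirror is more exposed to running 
out of memory with deep iteration.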

The proposed solution to this issue is also related to KAFKA-527.
