Dustin Cote created KAFKA-4169:
----------------------------------

             Summary: Calculation of message size is too conservative for 
compressed messages
                 Key: KAFKA-4169
                 URL: https://issues.apache.org/jira/browse/KAFKA-4169
             Project: Kafka
          Issue Type: Bug
          Components: producer 
    Affects Versions: 0.10.0.1
            Reporter: Dustin Cote


Currently the producer uses the uncompressed message size to check against 
{{max.request.size}} even if a {{compression.type}} is defined.  This can be 
reproduced as follows:

{code}
# dd if=/dev/zero of=/tmp/out.dat bs=1024 count=1024

# cat /tmp/out.dat | bin/kafka-console-producer --broker-list localhost:9092 --topic tester --producer-property compression.type=gzip
{code}

The above commands create a file that is the same size as the default {{max.request.size}} (1048576 bytes), and the per-message overhead pushes the uncompressed size over the limit.  Compressing the message ahead of time allows the message to go through.  When the message is blocked, the following exception is produced:
{code}
[2016-09-14 08:56:19,558] ERROR Error when sending message to topic tester with 
key: null, value: 1048576 bytes with error: 
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.RecordTooLargeException: The message is 1048610 
bytes when serialized which is larger than the maximum request size you have 
configured with the max.request.size configuration.
{code}
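
The same behaviour can be shown programmatically with the Java producer; the sketch below is illustrative only (the class name is made up, and the broker address, topic, and payload size simply mirror the console example above):
{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class CompressedSizeRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        // Compression is enabled, but the pre-send size check still uses the
        // uncompressed (serialized) size of the record.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            // 1 MiB of zeros, the same payload the dd command produces; gzip
            // would shrink this to a few KB, yet the send is still rejected.
            byte[] value = new byte[1024 * 1024];
            producer.send(new ProducerRecord<>("tester", value), (metadata, exception) -> {
                if (exception != null) {
                    // Expect org.apache.kafka.common.errors.RecordTooLargeException
                    exception.printStackTrace();
                }
            });
            producer.flush();
        }
    }
}
{code}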

For completeness, I have confirmed by enabling DEBUG logging that the console producer is setting {{compression.type}} properly, so this appears to be a problem in the size estimate of the message itself.  I would suggest we compress before we serialize, instead of the other way around, to avoid this.
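
In the meantime, compressing the message ahead of time (as noted above) is a possible workaround; a minimal sketch using {{java.util.zip}} (the helper name is hypothetical):
{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public final class PreCompress {
    // Gzip the payload client-side so the producer's size check sees the
    // compressed bytes instead of the raw 1 MiB value.
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        }
        return buffer.toByteArray();
    }
    // Usage: producer.send(new ProducerRecord<>("tester", PreCompress.gzip(rawValue)));
}
{code}
The downside is that the consumer then has to gunzip the value itself, since the payload is no longer compressed at the Kafka level.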



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
