I am using gzip compression.  "Too big" is really difficult to define
because it always depends (for example, on what your hardware can handle),
but I would say no more than a few megabytes.  Having said that, we are
still successfully using 50MB messages in production for some things, but
it comes at a cost: it requires us to tune each consumer individually and
keep those consumers separated (not within the same JVM) for SLA reasons.
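For reference, the three settings discussed further down in this thread
need to agree across broker, replica, and consumer.  A minimal sketch
(the setting names are from this thread; the values are illustrative,
not recommendations):

```properties
# broker config -- largest message the broker will accept
message.max.bytes=52428800

# broker config -- must be >= message.max.bytes, or an oversized
# message can silently stall replication for a partition
replica.fetch.max.bytes=52428800

# consumer config -- must be >= message.max.bytes, or the consumer
# cannot fetch the large message at all
fetch.message.max.bytes=52428800

# producer config -- gzip compression (0.8-era producer setting)
compression.codec=gzip
```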

-Luke
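
The chunking approach described in the quoted message below (splitting a
large XML document into one small message per child, each re-wrapped in a
shallow clone of the root element) could be sketched with StAX.  This is
an illustrative sketch only; the class name `XmlChunker` and the sample
document are mine, not from the thread:

```java
import javax.xml.stream.*;
import javax.xml.stream.events.*;
import java.io.*;
import java.util.*;

public class XmlChunker {

    // Split <root>...<child/>...</root> into one small XML document per
    // child element, each wrapped in a shallow clone of the root element
    // (same name and attributes, none of the other children).
    public static List<String> chunk(String xml) throws XMLStreamException {
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();
        XMLEventReader reader = inFactory.createXMLEventReader(new StringReader(xml));

        List<String> chunks = new ArrayList<>();
        StartElement root = null;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (!event.isStartElement()) {
                continue;  // skip whitespace, the document events, etc.
            }
            if (root == null) {
                // Remember the root start element so we can re-wrap each child.
                root = event.asStartElement();
                continue;
            }
            // This start element opens a child subtree: serialize the whole
            // subtree into its own small document.
            StringWriter out = new StringWriter();
            XMLEventWriter writer = outFactory.createXMLEventWriter(out);
            writer.add(root);   // shallow clone of the root (name + attributes)
            writer.add(event);  // the child's start element
            int depth = 1;
            while (depth > 0) {
                XMLEvent inner = reader.nextEvent();
                if (inner.isStartElement()) depth++;
                if (inner.isEndElement()) depth--;
                writer.add(inner);  // includes the child's own end element
            }
            writer.add(eventFactory.createEndElement(
                    root.getName().getPrefix(),
                    root.getName().getNamespaceURI(),
                    root.getName().getLocalPart()));
            writer.close();
            chunks.add(out.toString());
        }
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<batch id=\"1\"><rec>a</rec><rec>b</rec></batch>";
        for (String chunk : chunk(xml)) {
            System.out.println(chunk);
        }
    }
}
```

Each resulting chunk is small enough to publish as an ordinary message,
avoiding the replication and garbage-collection problems described below.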

On 6/26/14, 6:47 PM, "Bert Corderman" <bertc...@gmail.com> wrote:

>Thanks for the details Luke.
>
>At what point would you consider a message too big?
>
>Are you using compression?
>
>Bert
>
>On Thursday, June 26, 2014, Luke Forehand <
>luke.foreh...@networkedinsights.com> wrote:
>
>> I have used 50MB message size and it is not a great idea.  First of all
>> you need to make sure you have these settings in sync:
>> message.max.bytes
>> replica.fetch.max.bytes
>> fetch.message.max.bytes
>>
>> I had not set the replica fetch setting and didn't realize one of my
>> partitions was not replicating after a large message was produced.  I
>> also ran into heap issues when fetching such a large message, with lots
>> of unnecessary garbage collection.  I suggest breaking your message down
>> into smaller chunks.  In my case, I decided to break an XML input stream
>> (which had a root element wrapping a ridiculously large number of
>> children) into smaller messages, which meant parsing the large XML root
>> document and re-wrapping each child element with a shallow clone of its
>> parent as I iterated the stream.
>>
>> -Luke
>>
>> ________________________________________
>> From: Denny Lee <denny.g....@gmail.com>
>> Sent: Tuesday, June 24, 2014 10:35 AM
>> To: users@kafka.apache.org
>> Subject: Experiences with larger message sizes
>>
>> By any chance has anyone worked with Kafka using message sizes of
>> approximately 50MB?  Based on some of the previous threads, there appear
>> to be concerns about memory pressure due to compression on the broker
>> and decompression on the consumer, and best practices around batch size
>> (to ensure the compressed message does not exceed the message size
>> limit).
>>
>> Any other best practices or thoughts concerning this scenario?
>>
>> Thanks!
>> Denny
>>
