I am using gzip compression. "Too big" is hard to define because it always depends (for example, on what your hardware can handle), but I would say no more than a few megabytes. That said, we are still successfully using 50MB messages in production for some things, but it comes at a cost: it requires us to tune each consumer individually and to keep those consumers separated (not within the same JVM) for SLA reasons.
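For reference, a sketch of the three size settings the thread below says must be kept in sync. The 50MB value is illustrative only (matching the message size discussed here), not a recommendation; these are the 0.8-era property names from the thread:

```properties
# broker (server.properties): largest message the broker will accept
message.max.bytes=52428800
# broker: replicas must also be able to fetch the largest message,
# or a partition can silently stop replicating
replica.fetch.max.bytes=52428800
# consumer: must be >= the largest message, or the consumer stalls
fetch.message.max.bytes=52428800
```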
-Luke

On 6/26/14, 6:47 PM, "Bert Corderman" <bertc...@gmail.com> wrote:

>Thanks for the details Luke.
>
>At what point would you consider a message too big?
>
>Are you using compression?
>
>Bert
>
>On Thursday, June 26, 2014, Luke Forehand <luke.foreh...@networkedinsights.com> wrote:
>
>> I have used 50MB message size and it is not a great idea. First of all
>> you need to make sure you have these settings in sync:
>> message.max.bytes
>> replica.fetch.max.bytes
>> fetch.message.max.bytes
>>
>> I had not set the replica fetch setting and didn't realize one of my
>> partitions was not replicating after a large message was produced. I also
>> ran into heap issues when fetching such a large message: lots of
>> unnecessary garbage collection. I suggest breaking your message down into
>> smaller chunks. In my case, I decided to break an XML input stream (which
>> had a root element wrapping a ridiculously large number of children) into
>> smaller messages, parsing the large XML root document and re-wrapping
>> each child element with a shallow clone of its parent as I iterated the
>> stream.
>>
>> -Luke
>>
>> ________________________________________
>> From: Denny Lee <denny.g....@gmail.com>
>> Sent: Tuesday, June 24, 2014 10:35 AM
>> To: users@kafka.apache.org
>> Subject: Experiences with larger message sizes
>>
>> By any chance has anyone worked with Kafka at message sizes that are
>> approximately 50MB? Based on some of the previous threads, there are
>> concerns about memory pressure due to compression on the broker and
>> decompression on the consumer, and about best practices for ensuring
>> batch size (so that the compressed message does not exceed the message
>> size limit).
>>
>> Any other best practices or thoughts concerning this scenario?
>>
>> Thanks!
>> Denny
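The re-wrapping approach Luke describes can be sketched roughly as follows. This is a minimal illustration, not his actual code: it streams a large XML document, emits one small message per direct child of the root, and wraps each child in a shallow clone of the root (tag and attributes only). The element names are hypothetical:

```python
# Sketch: split one huge XML document (a root wrapping many children)
# into one small message per child, each re-wrapped in a shallow clone
# of the root element. Element/tag names here are made up for the demo.
import xml.etree.ElementTree as ET
from io import BytesIO

def split_xml_stream(stream):
    """Yield one serialized XML snippet per direct child of the root."""
    root = None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start" and root is None:
            root = elem  # remember the root so we can clone its tag/attrs
        elif event == "end" and root is not None and elem in list(root):
            wrapper = ET.Element(root.tag, root.attrib)  # shallow clone
            wrapper.append(elem)
            yield ET.tostring(wrapper)
            root.remove(elem)  # drop the child to keep memory flat

doc = b'<records batch="1"><rec id="a"/><rec id="b"/></records>'
chunks = list(split_xml_stream(BytesIO(doc)))
# each chunk is a self-contained <records>...</records> document small
# enough to produce as an ordinary Kafka message
```

Because `iterparse` streams events and each completed child is removed from the root immediately, memory stays proportional to one child rather than the whole document, which is the point of chunking instead of raising the broker's message size limit.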