Re: Compression and batching

2015-05-13 Thread Jiangjie Qin
Yes, in old producer we don't control the compressed message size. In new producer, we estimate the compressed size heuristically and decide whether to close the batch or not. It is not perfect but at least better than the old one. Jiangjie (Becket) Qin On 5/13/15, 4:00 PM, "Jamie X" wrote: >Ji

Re: Compression and batching

2015-05-13 Thread Jamie X
Jiangjie, I changed my code to group by partition, then for each partition to group messages into up to 900kb of uncompressed data, and then sent those batches out. That worked fine and didn't cause any MessageTooLarge errors. So it looks like the issue is that the producer batches all the messages
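
The workaround described above (group by partition, then cap each batch at roughly 900kb of uncompressed data) might look something like the sketch below. This is not Jamie's actual code and does not use any Kafka API: the function name `batch_by_partition`, the `(partition, payload)` tuple shape, and the `MAX_BATCH_BYTES` constant are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed cap on uncompressed bytes per batch (the "900kb" from the message above).
MAX_BATCH_BYTES = 900 * 1024


def batch_by_partition(messages, max_bytes=MAX_BATCH_BYTES):
    """Group (partition, payload) pairs into per-partition batches.

    Each batch holds at most max_bytes of uncompressed payload, except
    that a single payload larger than max_bytes ends up alone in its
    own batch. Returns a dict mapping partition -> list of batches.
    """
    batches = defaultdict(list)
    current_size = {}
    for partition, payload in messages:
        partition_batches = batches[partition]
        # Start a new batch when none exists yet or the next payload would overflow it.
        if not partition_batches or current_size[partition] + len(payload) > max_bytes:
            partition_batches.append([])
            current_size[partition] = 0
        partition_batches[-1].append(payload)
        current_size[partition] += len(payload)
    return dict(batches)
```

Each resulting batch would then be handed to a separate send() call, so that no single request carries more than the cap for one partition. Note that the cap is on uncompressed size, so it only helps if compression does not inflate the data.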

Re: Compression and batching

2015-05-13 Thread Jiangjie Qin
If you are sending in sync mode, producer will just group by partition the list of messages you provided as argument of send() and send them out. You don't need to worry about batch.num.messages. There is a potential that compressed message is even bigger than uncompressed message, though. I'm not
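
The point that a compressed message can end up bigger than the uncompressed one can be illustrated with a small, hedged sketch. It uses zlib from the Python standard library as a stand-in, since snappy is not in the standard library; the exact overhead differs between codecs, and the helper name `compression_overhead` is an assumption for illustration only.

```python
import os
import zlib


def compression_overhead(payload: bytes) -> int:
    """Return compressed size minus original size, in bytes.

    A positive result means compression made the payload larger.
    zlib is used here only as a stand-in codec, not as what Kafka does.
    """
    return len(zlib.compress(payload)) - len(payload)


# Random bytes are essentially incompressible, so framing overhead makes them grow.
random_payload = os.urandom(70 * 1024)
# Highly repetitive bytes compress very well.
repetitive_payload = b"a" * (70 * 1024)

print("random:", compression_overhead(random_payload))
print("repetitive:", compression_overhead(repetitive_payload))
```

Data that is already compressed or encrypted tends to behave like the random payload here, so a size limit applied to uncompressed bytes does not strictly guarantee that the compressed result stays under it.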

Re: Compression and batching

2015-05-13 Thread Jamie X
(sorry if this messes up the mailing list, I didn't seem to get replies in my inbox) Jiangjie, I am indeed using the old producer, and on sync mode. > Notice that the old producer uses number of messages as batch limitation instead of number of bytes. Can you clarify this? I see a setting batch.

Re: Compression and batching

2015-05-12 Thread Jiangjie Qin
Mayuresh, this is about the old producer instead of the new Java producer. Jamie, In the old producer, if you use sync mode, the list of messages will be sent as a batch. On the other hand, if you are using async mode, the messages are just put into the queue and batched with other messages. Notice

Re: Compression and batching

2015-05-12 Thread Mayuresh Gharat
Well, the batch size is decided by the value set for the property : "batch.size"; "The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration

Compression and batching

2015-05-12 Thread Jamie X
Hi, I'm wondering when you call kafka.javaapi.Producer.send() with a list of messages, and also have compression on (snappy in this case), how does it decide how many messages to put together as one? The reason I'm asking is that even though my messages are only 70kb uncompressed, the broker comp