> Could you explain a bit more what you want to achieve through batching?
> Better throughput or atomicity?
Sure!

I've assumed that there's per-message atomicity and a per-partition ordering guarantee with KafkaProducer.send(), but nothing beyond that.

My hope is to reduce the latency from when my subsystem is handed a message to when I receive my ack back from Kafka (acks=-1).

My subsystem will receive multiple messages for a Node (msg1, msg2, msg3, msg4), but I need to acquire a semaphore for that Node before sending msg1, and the callback passed to send() has to release it before I can send msg2. Since Kafka can accept messages much bigger than the ones my subsystem receives, I can do application-layer batching: build msgA, which aggregates msg2, msg3 and msg4, and send(msgA) to do a "bulk" send.

This is the only way I can ensure that msg1 is safely written before msg2 is sent.

To me, the semaphore is unfortunate since it artificially slows the message rate down just in case the send() fails. I appreciate it'd be awesome if Kafka could track this, but if it's failing to write my message, it's likely having issues that would impede its ability to track the failure state too.

JAmes

On Mon, Feb 23, 2015 at 4:36 PM, Jun Rao <j...@confluent.io> wrote:

> Could you explain a bit more what you want to achieve through batching?
> Better throughput or atomicity?
>
> Thanks,
>
> Jun
>
> On Thu, Feb 19, 2015 at 4:09 PM, JAmes Atwill <jatw...@linuxstuff.org>
> wrote:
>
> > Hey Jun,
> >
> > That's what I've got right now, semaphore before send() and release in
> > the callback. Am I correct in understanding that there's no way to do any
> > batching with KafkaProducer itself (other than have a "bulk" message
> > which would just be a single message with multiple messages for a
> > particular Node)?
> >
> > JAmes
> >
> > On Thu, Feb 19, 2015 at 2:50 PM, Jun Rao <j...@confluent.io> wrote:
> >
> > > You can register a callback for each message sent. The callback will be
> > > called when the message is sent successfully or failed.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Feb 17, 2015 at 4:11 PM, JAmes Atwill <jatw...@linuxstuff.org>
> > > wrote:
> > >
> > > > Hi!
> > > >
> > > > I'm using the new KafkaProducer in 0.8.2.0.
> > > >
> > > > I have thousands of "Nodes" which receive messages. Each message
> > > > idempotently mutates the state of the Node, so while duplicate
> > > > messages are fine, missed messages are not.
> > > >
> > > > I'm writing these messages into a topic with dozens of partitions.
> > > >
> > > > Am I correct in believing that I'll have to manually manage having one
> > > > message "in flight" per "node" at a time? Or is there a mechanism to
> > > > say "This message and all messages after it for this partition were
> > > > rejected"? (or something similar)
> > > >
> > > > Thanks!
> > > >
> > > > JAmes
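
For anyone reading this thread later, here is a rough sketch of the scheme described in the reply at the top: one send in flight per Node, with everything that arrives in the meantime aggregated into a single "bulk" message once the ack comes back. It is only an illustration under some assumptions. `AsyncSender` and `PerNodeBatcher` are hypothetical names, not Kafka classes; `AsyncSender` stands in for a call to KafkaProducer.send() with a callback, and the bulk payload is modelled as a list of strings rather than a real serialized message. An in-flight flag held under a lock plays the role of the per-Node semaphore, so callers queue instead of blocking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical stand-in for KafkaProducer.send(record, callback). */
interface AsyncSender {
    // onDone must be called exactly once: with null on success, or the error on failure.
    void send(String nodeId, List<String> payload, Consumer<Exception> onDone);
}

/** Allows one send in flight per Node and bulks up whatever arrives meanwhile. */
final class PerNodeBatcher {
    private final AsyncSender sender;
    // Messages waiting for the current in-flight send of their Node to finish.
    private final Map<String, Deque<String>> pending = new HashMap<>();
    // Payload currently in flight per Node; presence of a key means "permit taken".
    private final Map<String, List<String>> inFlight = new HashMap<>();

    PerNodeBatcher(AsyncSender sender) {
        this.sender = sender;
    }

    /** Queue a message; send immediately if nothing is in flight for this Node. */
    synchronized void submit(String nodeId, String msg) {
        pending.computeIfAbsent(nodeId, k -> new ArrayDeque<>()).addLast(msg);
        if (!inFlight.containsKey(nodeId)) {
            flush(nodeId);
        }
    }

    // Caller must hold the lock. Sends everything pending for the Node as one bulk payload.
    private void flush(String nodeId) {
        Deque<String> queue = pending.get(nodeId);
        if (queue == null || queue.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(queue);
        queue.clear();
        inFlight.put(nodeId, batch);
        // Note: the lock is held during send(); a real send() that can block
        // would need more care here.
        sender.send(nodeId, batch, err -> onDone(nodeId, err));
    }

    // Completion callback: release the Node, requeue on failure, send the next bulk.
    private synchronized void onDone(String nodeId, Exception err) {
        List<String> finished = inFlight.remove(nodeId);
        if (err != null && finished != null) {
            // Duplicates are fine but misses are not, so put the failed payload
            // back ahead of newer messages. No backoff or retry limit in this sketch.
            Deque<String> queue = pending.computeIfAbsent(nodeId, k -> new ArrayDeque<>());
            for (int i = finished.size() - 1; i >= 0; i--) {
                queue.addFirst(finished.get(i));
            }
        }
        flush(nodeId);
    }
}
```

With a real producer, the `AsyncSender` implementation would build one record from the payload, key it so that a Node always maps to the same partition, and invoke `onDone` from the producer callback. Since that callback runs on the producer's I/O thread, the work done in `onDone` should stay short. The retry-on-failure path is the part most likely to need changes in practice, since this sketch retries immediately and indefinitely.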