> Could you explain a bit more what you want to achieve through batching?
> Better throughput or atomicity?

Sure! I've assumed that there's per-message atomicity and a per-partition
ordering guarantee with KafkaProducer.send(), but nothing beyond that.

My hope is to reduce the latency between my subsystem being handed a message
and my receiving the ack back from Kafka (acks=-1).

My subsystem will receive multiple messages for a Node
(msg1, msg2, msg3, msg4), but I need to acquire a semaphore for that Node
before sending msg1, and the callback passed to send() has to release it
before I can send msg2. Since Kafka accepts messages much bigger than the
ones my subsystem receives, I can do application-layer batching: build msgA,
which aggregates msg2, msg3 and msg4, and send(msgA) as a "bulk" send. The
semaphore is the only way I can ensure that msg1 is safely written before
msg2 is sent.
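
To make the above concrete, here is a rough sketch of the
semaphore-plus-batching idea. It is not tied to the real client: AsyncSender
is a hypothetical stand-in for KafkaProducer.send(record, callback), and
NodeSender, submit() and flush() are names made up for illustration. It
assumes one NodeSender per Node, and that on failure the batch is put back
at the front of the queue (duplicates are fine for me, gaps are not).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

// Hypothetical stand-in for KafkaProducer.send(record, callback).
// onDone receives null on success, or the exception on failure.
interface AsyncSender {
    void send(String nodeId, List<String> batch, Consumer<Exception> onDone);
}

// One instance per Node: at most one send in flight, everything else queued.
final class NodeSender {
    private final String nodeId;
    private final AsyncSender sender;
    private final Semaphore inFlight = new Semaphore(1);
    private final Deque<String> pending = new ArrayDeque<>();

    NodeSender(String nodeId, AsyncSender sender) {
        this.nodeId = nodeId;
        this.sender = sender;
    }

    // Queue a message; it goes out immediately only if nothing is in flight.
    void submit(String msg) {
        synchronized (pending) {
            pending.addLast(msg);
        }
        flush();
    }

    // Send everything queued so far as one "bulk" message, if the permit is free.
    void flush() {
        final List<String> batch = takeBatch();
        if (batch == null) {
            return; // nothing queued, or a send is still in flight
        }
        sender.send(nodeId, batch, error -> {
            synchronized (pending) {
                if (error != null) {
                    // Put the failed batch back at the front, in original order.
                    for (int i = batch.size() - 1; i >= 0; i--) {
                        pending.addFirst(batch.get(i));
                    }
                }
                inFlight.release();
            }
            if (error == null) {
                flush(); // send whatever accumulated while we were waiting
            }
        });
    }

    // Drain the queue into a batch, but only if the permit can be acquired.
    private List<String> takeBatch() {
        synchronized (pending) {
            if (pending.isEmpty() || !inFlight.tryAcquire()) {
                return null;
            }
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            return batch;
        }
    }
}
```

With this, msg1 goes out alone, msg2..msg4 pile up while it is in flight, and
the success callback sends them as a single batch. After a failure nothing is
retried automatically; the next submit() or an explicit flush() resends the
failed batch first. In a real setup the AsyncSender would presumably wrap
producer.send() with the Node as the record key, but that wiring is left out
here.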

To me, the semaphore is unfortunate since it artificially slows the message
rate down just in case the send() fails. It'd be awesome if Kafka could
track this, but I appreciate that if it's failing to write my message, it's
likely having issues that would impede its ability to track the failure
state too.

  JAmes


On Mon, Feb 23, 2015 at 4:36 PM, Jun Rao <j...@confluent.io> wrote:

> Could you explain a bit more what you want to achieve through batching?
> Better throughput or atomicity?
>
> Thanks,
>
> Jun
>
> On Thu, Feb 19, 2015 at 4:09 PM, JAmes Atwill <jatw...@linuxstuff.org>
> wrote:
>
> > Hey Jun,
> >
> > That's what I've got right now, semaphore before send() and release in
> > the callback. Am I correct in understanding that there's no way to do any
> > batching with KafkaProducer itself (other than have a "bulk" message
> > which would just be a single message with multiple messages for a
> > particular Node)?
> >
> >   JAmes
> >
> > On Thu, Feb 19, 2015 at 2:50 PM, Jun Rao <j...@confluent.io> wrote:
> >
> > > You can register a callback for each message sent. The callback will be
> > > called when the send succeeds or fails.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Feb 17, 2015 at 4:11 PM, JAmes Atwill <jatw...@linuxstuff.org>
> > > wrote:
> > >
> > > > Hi!
> > > >
> > > > I'm using the new KafkaProducer in 0.8.2.0.
> > > >
> > > > I have thousands of "Nodes" which receive messages. Each message
> > > > idempotently mutates the state of the Node, so while duplicate
> > > > messages are fine, missed messages are not.
> > > >
> > > > I'm writing these messages into a topic with dozens of partitions.
> > > >
> > > > Am I correct in believing that I'll have to manually manage having
> > > > one message "in flight" per "node" at a time? Or is there a mechanism
> > > > to say "This message and all messages after it for this partition
> > > > were rejected"? (or something similar)
> > > >
> > > > Thanks!
> > > >
> > > >   JAmes
> > > >
> > >
> >
>
