Re: New Producer Public API

Jay Kreps Fri, 31 Jan 2014 11:08:19 -0800

Oliver,

Yeah that was my original plan--allow the registration of multiple
callbacks on the future. But there is some additional implementation
complexity because then you need more synchronization variables to ensure
the callback gets executed even if the request has completed at the time
the callback is registered. This also makes it unpredictable the order of
callback execution--I want to be able to guarantee that for a particular
partition callbacks for lower offset messages happen before callbacks for
higher offset messages so that if you set a highwater mark or something it
is easy to reason about. This has the added benefit that callbacks execute
in the I/O thread ALWAYS instead of it being non-deterministic which is a
little confusing.

I thought a single callback is sufficient since you can always include
multiple actions in that callback, and I think that case is rare anyway.

I did think about the possibility of adding a thread pool for handling the
callbacks. But there are a lot of possible configurations for such a thread
pool and a simplistic approach would no longer guarantee in-order
processing of callbacks (you would need to hash execution over threads by
partition id). I think by just exposing the simple method that executes in
the I/O thread you can easily implement the pooled execution using the
therad pooling mechanism of your choice by just having the callback use an
executor to run the action (i.e. make an AsyncCallback that takes a
threadpool and a Runnable or something like that). This gives the user full
control over the executor (there are lots of details around thread re-use
in executors, thread factories, etc and trying to expose configs for every
variation will be a pain). This also makes it totally transparent how it
works; that is if we did expose all kinds of thread pool configs you would
still probably end up reading our code to figure out exactly what they all
did.

-Jay

On Fri, Jan 31, 2014 at 9:39 AM, Oliver Dain <od...@3cinteractive.com>wrote:

> Hmmm.. I should read the docs more carefully before I open my big mouth: I
> just noticed the KafkaProducer#send overload that takes a callback. That
> definitely helps address my concern though I think the API would be
> cleaner if there was only one variant that returned a future and you could
> register the callback with the future. This is not nearly as important as
> I'd thought given the ability to register a callback - just a preference.
>
>
>
>
>
> On 1/31/14, 9:33 AM, "Oliver Dain" <od...@3cinteractive.com> wrote:
>
> >Hey all,
> >
> >I¹m excited about having a new Producer API, and I really like the idea of
> >removing the distinction between a synchronous and asynchronous producer.
> >The one comment I have about the current API is that it¹s hard to write
> >truly asynchronous code with the type of future returned by the send
> >method. The issue is that send returns a RecordSend and there¹s no way to
> >register a callback with that object. It is therefore necessary to poll
> >the object periodically to see if the send has completed. So if you have n
> >send calls outstanding you have to check n RecordSend objects which is
> >slow. In general this tends to lead to people using one thread per send
> >call and then calling RecordSend#await which removes much of the benefit
> >of an async API.
> >
> >I think it¹s much easier to write truly asynchronous code if the returned
> >future allows you to register a callback. That way, instead of polling you
> >can simply wait for the callback to be called. A good example of the kind
> >of thing I¹m thinking is the ListenableFuture class in the Guava
> >libraries:
> >
> >https://code.google.com/p/guava-libraries/wiki/ListenableFutureExplained
> >
> >
> >HTH,
> >Oliver
> >
>
>

Re: New Producer Public API

Reply via email to