Hi Steven,
  Speaking only for myself, I agree with you. I think these settings/tweaks
are the easiest short-term way to get some proper non-blocking behavior.
Long term, it seems like the better fix is a secondary queue in the client
that holds raw messages until metadata is available, and then starts
blocking or dropping messages once too many are queued.
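
To make the secondary-queue idea concrete, here is a rough sketch of what I
mean. This is not Kafka code: PendingQueue, onMetadataAvailable() and the
"sent" list (a stand-in for the real send path) are all hypothetical names,
and it uses a drop policy where a real client might block instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch: holds raw messages until metadata arrives. Not Kafka code. */
class PendingQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> sent = new ArrayList<>(); // stand-in for the real send path
    private final int capacity;
    private boolean metadataAvailable = false;

    PendingQueue(int capacity) {
        this.capacity = capacity;
    }

    /** Never blocks. Returns false if the message was dropped because the queue is full. */
    synchronized boolean send(String message) {
        if (metadataAvailable) {
            sent.add(message);
            return true;
        }
        if (pending.size() >= capacity) {
            return false; // drop policy; a blocking policy would wait here instead
        }
        pending.add(message);
        return true;
    }

    /** Called (for example by a background thread) once metadata has been fetched. */
    synchronized void onMetadataAvailable() {
        metadataAvailable = true;
        while (!pending.isEmpty()) {
            sent.add(pending.poll()); // drain in arrival order
        }
    }

    synchronized List<String> sent() {
        return new ArrayList<>(sent);
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

The point of the sketch is only that send() returns immediately whether or
not metadata is there, and that the queue bound decides when messages start
getting dropped.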

For those interested, I submitted a patch to add the following options:
pre.initialize.topics
pre.initialize.timeout.ms

The patch also adds a new public method, isInitialized(), that the caller
can check to decide whether to blow up or accept the failure and continue.
While isInitialized() returns false, any sends will fail fast until the
initialization completes.
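
Roughly how I picture a caller using it. Treat this as a sketch only:
InitAwareProducer is a hypothetical stand-in interface (not the real
producer class), StartupCheck and the "strict" flag are made up for
illustration, and the option values, including the comma-separated topic
list, are assumptions. Only the two option names and isInitialized() come
from the patch.

```java
import java.util.Properties;

/** Hypothetical stand-in for a producer built with the patch; not the real producer class. */
interface InitAwareProducer {
    boolean isInitialized();

    void send(String topic, String value); // assumed to fail fast while uninitialized
}

class StartupCheck {
    /** Option names are from the patch; the values (and the list format) are assumptions. */
    static Properties config() {
        Properties props = new Properties();
        props.put("pre.initialize.topics", "events,metrics");
        props.put("pre.initialize.timeout.ms", "5000");
        return props;
    }

    /**
     * Returns true if the producer is ready. If it is not ready, either blows up
     * (strict) or returns false so the caller can accept the failure and continue.
     */
    static boolean checkReady(InitAwareProducer producer, boolean strict) {
        if (producer.isInitialized()) {
            return true;
        }
        if (strict) {
            throw new IllegalStateException("producer metadata not initialized");
        }
        return false; // continue; sends fail fast until initialization completes
    }
}
```

An application that cannot run without Kafka would pass strict=true at
startup; a best-effort logger would pass false and carry on.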

Patch is attached here:
https://issues.apache.org/jira/browse/KAFKA-1835

Not familiar with Kafka's processes, so any feedback welcome.

Thanks,
Paul

On Mon, Jan 5, 2015 at 1:47 PM, Steven Wu <stevenz...@gmail.com> wrote:

> "preinitialize.metadata=true/false" can help to a certain extent. If the
> Kafka cluster is down, then metadata won't be available for a long time
> (not just the first msg). So to be safe, we have to set
> "metadata.fetch.timeout.ms=1" to fail fast, as Paul mentioned. I can also
> echo Jay's comment that on-demand fetch of metadata might be more
> efficient, since a cluster may have many topics that a particular producer
> does not care about.
>
> So I plan to do something similar to what Paul described:
> - metadata.fetch.timeout.ms=1
> - enqueue msg to a pending queue when topic metadata is not available.
> - have a background thread check when metadata becomes available and drain
> the pending queue
> - optionally, prime topic metadata asynchronously during init (if
> configured)
>
> Just wondering whether the above should be the default behavior for
> best-effort non-blocking delivery in Kafka clients. Then we don't have to
> reinvent the wheel.
>
> Thanks,
> Steven
>
>
>
> On Mon, Dec 29, 2014 at 11:48 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
> > I don't think a separate queue will be a very simple solution to implement.
> >
> > Could you describe your use case a little bit more? It does seem to me that
> > as long as the metadata fetch happens only once and the blocking has a
> > tight time bound this should be okay in any use case I can imagine. And, of
> > course, by default the client blocks anyway whenever you exhaust the memory
> > buffer space. But it sounds like you feel it isn't. Maybe you could
> > describe the scenario a bit?
> >
> > I think one thing we could do is what was discussed in another thread,
> > namely add an option like
> >   preinitialize.metadata=true/false
> > which would default to false. When true this would cause the producer to
> > just initialize metadata for all topics when it is created. Note that this
> > then brings back the opposite problem--doing remote communication during
> > initialization which tends to bite a lot of people. But since this would be
> > an option that would default to false perhaps it would be less likely to
> > come as a surprise.
> >
> > -Jay
> >
> > On Mon, Dec 29, 2014 at 8:38 AM, Steven Wu <stevenz...@gmail.com> wrote:
> >
> > > +1. It should be truly async in all cases.
> > >
> > > I understand some challenges that Jay listed in the other thread. But we
> > > need a solution nonetheless. E.g., can we maintain a separate
> > > list/queue/buffer for pending messages without metadata?
> > >
> > > On Tue, Dec 23, 2014 at 12:57 PM, John Boardman <boardmanjo...@gmail.com>
> > > wrote:
> > >
> > > > I was just fighting this same situation. I never expected the new producer
> > > > send() method to block as it returns a Future and accepts a Callback.
> > > > However, when I tried my unit test, just replacing the old producer with
> > > > the new, I immediately started getting timeouts waiting for metadata. I
> > > > struggled with this until I went into the source code and found the wait()
> > > > that waits for the metadata.
> > > >
> > > > At that point I realized that this new "async" producer would have to be
> > > > executed on its own thread, unlike the old producer, which complicates my
> > > > code unnecessarily. I totally agree with Paul that the contract of send()
> > > > is being completely violated with internal code that can block.
> > > >
> > > > I did try fetching the metadata first, but that only worked for a few calls
> > > > before the producer decided it was time to update the metadata again.
> > > >
> > > > Again, I agree with Paul that this API should be fixed so that it is truly
> > > > asynchronous in all cases. Otherwise, it cannot be used on the main thread
> > > > of an application as it will block and fail.
> > > >
> > >
> >
>
