Hey Paul,

Here are the constraints:

1. We wanted the storage of messages to be in their compact binary form so we could bound memory usage. This implies partitioning prior to enqueue, and as you note, partitioning requires having metadata (even stale metadata) about topics.

2. We wanted to avoid prefetching metadata for all topics, since there may be quite a lot of topics.

3. We wanted to make metadata fetching lazy so that it would be possible to create a client without having an active network connection. This tends to be important when services are brought up in development or test environments, where it is annoying to have to control the dependency graph when starting things.

This blocking isn't too bad, as it only occurs on the first request for each topic. Our feeling was that many things tend to get set up on a first request (DB connections are established, caches populated, etc.), so this was not unreasonable.

If you want to pre-initialize the metadata to avoid blocking on the first request, you can do so by fetching the metadata with the producer.partitionsFor(topic) API at start-up.
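A rough sketch of that warm-up (just a sketch: the broker address, serializer settings, topic name, and class name below are placeholders, and the exact configuration differs a bit between the 0.8.2 beta and final client APIs):

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.PartitionInfo;

public class MetadataWarmup {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker list
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);

        // Fetch metadata up front for each topic the app will use, so the
        // first send() does not block in waitOnMetadata.
        for (String topic : new String[] {"my-topic"}) {     // placeholder topic name
            List<PartitionInfo> partitions = producer.partitionsFor(topic);
            System.out.println(topic + ": " + partitions.size() + " partitions");
        }
    }
}

Each partitionsFor call blocks for at most metadata.fetch.timeout.ms, so doing this once at start-up keeps the blocking out of the send path.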
-Jay

On Thu, Dec 18, 2014 at 9:07 AM, Paul Pearcy <ppea...@gmail.com> wrote:
>
> Hello,
>
> Playing around with the 0.8.2-beta producer client. One of my test cases
> is to ensure producers can deal with Kafka being down when the producer is
> created. My tests failed miserably because of the default blocking in the
> producer with regard to metadata.fetch.timeout.ms. The first line of the
> new producer's send is waitOnMetadata, which is blocking.
>
> I can handle this case by loading topic metadata on init, setting
> metadata.fetch.timeout.ms to a very low value, and either throwing away
> messages or keeping my own internal queue to buffer them.
>
> I'm surprised the metadata sync isn't done async. If it fails, return that
> in the future/callback. This way the API could actually be considered
> safely async, and the producer buffer could try to hold on to things until
> block.on.buffer.full kicks in to either drop messages or block. You'd
> probably need a partition callback, since numPartitions wouldn't be
> available.
>
> The implication is that people's apps will work fine if the first messages
> are sent while the Kafka server is up; however, if Kafka is down when they
> restart their app, the new producer will block all sends and blow things up
> if the app hasn't been written to be aware of this edge case.
>
> Thanks,
>
> Paul
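Paul's fail-fast workaround might look roughly like the following sketch. It is only a sketch: the broker address, topic, and class name are placeholders, and depending on the client version the metadata timeout surfaces either as an exception thrown from send() or through the callback/returned future, so both paths are handled.

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class FailFastSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker list
        props.put("metadata.fetch.timeout.ms", "100");       // fail fast rather than block for the full default timeout
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        ProducerRecord<byte[], byte[]> record =
                new ProducerRecord<>("my-topic", "hello".getBytes()); // placeholder topic

        try {
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata metadata, Exception e) {
                    if (e != null) {
                        // Metadata was unavailable (e.g. broker down) or the send failed:
                        // drop the message or re-queue it in an application-level buffer.
                        System.err.println("Send failed: " + e);
                    }
                }
            });
        } catch (Exception e) {
            // Some client versions throw the metadata timeout synchronously from send().
            System.err.println("Send failed synchronously: " + e);
        } finally {
            producer.close();
        }
    }
}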