Hey guys,

The locking argument is correct for very small records (< 50 bytes). Batching will help here because for small records locking becomes the big bottleneck. I think these use cases are rare but not unreasonable.

Overall I'd emphasize that the new producer is way faster at virtually all use cases. If there is a use case where that isn't true, let's look at it in a data-driven way by comparing the old producer to the new producer and looking for any areas where things got worse.

I suspect the "reducing allocations" argument is not a big deal. We do a number of small per-message allocations and it didn't seem to have much impact. I do think there are a couple of big producer memory optimizations we could do by reusing the arrays in the accumulator in the serialization of the request, but I don't think this is one of them.

I'd be skeptical of any API that was too weird--e.g. one that introduces a new way of partitioning, gives back errors on a per-partition rather than per-message basis (given that partitioning is transparent this is really hard to think about), etc. Bad APIs end up causing a ton of churn and just don't end up being a good long-term commitment as we change how the underlying code works over time (i.e. we hyper-optimize for something, then have to maintain some super weird API as it becomes hyper-unoptimized for the client over time).

Roshan--Flush works as you would hope: it blocks on the completion of all outstanding requests. Calling get on the future for the request gives you the associated error code back. Flush doesn't throw any exceptions because waiting for requests to complete doesn't error; the individual requests fail or succeed, which is always reported with each request.

Ivan--The batches you send in the Scala producer today actually aren't truly atomic, they just get sent in a single request.

One tricky problem to solve when users do batching is size limits on requests. This can be very hard to manage since predicting the serialized size of a bunch of Java objects is not always obvious. This was repeatedly a problem before.
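
To make the size-limit point concrete, here is a rough sketch in plain Java (no Kafka dependency) of the bookkeeping a client ends up doing when it builds batches by hand. `SizeBoundedBatcher`, `split`, `utf8`, and `maxBatchBytes` are made-up names for illustration, not part of any producer API. It assumes the records are already serialized and counts only payload bytes; it ignores per-message and per-request wire overhead and compression, which is exactly the part that is hard to predict.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of any Kafka client. */
final class SizeBoundedBatcher {
    private SizeBoundedBatcher() {}

    /**
     * Greedily groups already-serialized records, in order, into batches whose
     * summed payload size stays at or below maxBatchBytes.
     * Assumption: protocol overhead and compression are not accounted for.
     */
    static List<List<byte[]>> split(List<byte[]> records, int maxBatchBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0; // long so the running sum cannot overflow

        for (byte[] record : records) {
            // A single record larger than the limit can never fit in any batch.
            if (record.length > maxBatchBytes) {
                throw new IllegalArgumentException(
                    "record of " + record.length + " bytes exceeds limit of " + maxBatchBytes);
            }
            // Close the current batch if adding this record would exceed the limit.
            if (!current.isEmpty() && currentBytes + record.length > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(record);
            currentBytes += record.length;
        }
        // Flush the trailing partial batch, if any.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    /** Convenience serializer for the example: UTF-8 bytes of a string. */
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Each inner list could then be handed to whatever send call you use. In practice you would want to leave headroom below the real request limit, since the bytes on the wire will not match this simple count.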

-Jay

On Tue, Apr 28, 2015 at 4:51 PM, Ivan Balashov <ibalas...@gmail.com> wrote:

> I must agree with @Roshan – it's hard to imagine anything more intuitive
> and easy to use for atomic batching as old sync batch api. Also, it's fast.
> Coupled with a separate instance of producer per
> broker:port:topic:partition it works very well. I would be glad if it finds
> its way into new producer api.
>
> On a side-side-side note, could anyone confirm/deny if SimpleConsumer's
> fetchSize must be set at least as batch bytes (before or after
> compression), otherwise client risks not getting any messages?