+1 for checking this in as is. For a from-scratch rewrite like this, I
prefer to do incremental reviews on a standalone subproject until it is
complete and stable enough to be merged into the main codebase. Looking forward to
the patch!

Thanks,
Neha


On Thu, Jan 23, 2014 at 10:23 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey all,
>
> I have been working on a rewrite of the producer as described in the wiki
> below and discussed in a few previous threads:
> https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite
>
> My code still has some bugs and is a bit rough in parts, but it
> functions in the basic cases. I did some basic performance tests over
> localhost, and the new approach has paid off quite significantly--for small
> (10 byte) messages a single thread on my laptop can send over 1m
> messages/second, and with larger messages easily maxes out the server.
>
> The difference between the "sync" and "async" producer largely disappears--all
> requests immediately return a future response which can be used to get the
> behavior of either sync or async usage and we batch whenever the producer
> is under load using a "group commit"-like approach. You can encourage
> additional batching by incurring a small amount of latency (as before).
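
To make the future-based behavior described above concrete, here is a rough
sketch in Java of how one send call could serve both sync and async callers.
It uses only the JDK's CompletableFuture. SketchProducer, send, and the Long
"offset" result are hypothetical names made up for illustration and are not
taken from the patch; the background send is simulated.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names come from the actual producer patch.
class FutureSendSketch {

    // Stand-in for a producer whose send() never blocks the caller.
    static class SketchProducer {
        private final AtomicLong nextOffset = new AtomicLong(0);

        // Returns immediately; the future completes with an assumed "offset"
        // once the simulated background send finishes.
        CompletableFuture<Long> send(String message) {
            long offset = nextOffset.getAndIncrement();
            return CompletableFuture.supplyAsync(() -> offset);
        }
    }

    public static void main(String[] args) {
        SketchProducer producer = new SketchProducer();

        // "Sync" usage: block on the future right away.
        long offset = producer.send("hello").join();
        System.out.println("sync send acknowledged at offset " + offset);

        // "Async" usage: attach a callback and carry on with other work.
        CompletableFuture<Void> done = producer.send("world")
            .thenAccept(o -> System.out.println("async send acknowledged at offset " + o));

        // Wait here only so the demo does not exit before the callback runs.
        done.join();
    }
}
```

The point of the sketch is that the caller, not the producer, picks the
mode: blocking on the returned future gives sync behavior, and attaching a
callback gives async behavior.
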
>
> Let's talk about how to integrate this code.
>
> This is a from-scratch rewrite of the producer code. As such it is a pretty
> major change. So far I have mostly been working on my own. I'd like to
> start getting feedback before I get too far along--no point in my polishing
> things that are going to be significantly revised in review, after all.
>
> As such here is what I would propose:
>
> 1. I'll put up a preliminary patch. Since this code is a completely
> standalone module it will not destabilize the existing server or existing
> producer (in fact there is no change to those). I will avoid including
> build support in this patch until we get the gradle stuff worked out so as
> to not break that patch (hopefully that moves along). Let's take this patch
> "as is" but with no expectation that the code is complete or that checkin
> implies everyone agrees with every design decision. I will follow up with
> subsequent patches as we do reviews and discussions.
>
> 2. I'll send out a few higher-level topics for discussion threads. Let's
> get to consensus on these. I think micro-reviewing minor correctness issues
> won't be productive until we make higher level decisions. The topics I'd
> like to discuss include:
> a. The producer code:
>      - The public API
>      - The configurations: their names, and the general knobs we are
>      - Client message serialization
>      - The instrumentation to have
>      - The blocking and batching behavior
> b. The common code and a few other cross-cutting policy things:
>      - The approach to protocol definition and request serialization
>      - The config definition helper code
>      - The metrics package
>      - The project layout
>      - The Java coding style and the use of Java
>      - The approach to logging
>
> This is somewhat backwards, but I think it will be easier to handle changes
> that fall out of these discussions against an existing code base that is
> checked in; otherwise each revision will be a brand new very large patch.
>
> If there are no objections I will toss up this code and kick off some of these
> discussions.
>
> -Jay
>
