Hey Guozhang,

I just took a quick look at the KIP. Is it very similar to mirror maker with a message handler?

Thanks,

Jiangjie (Becket) Qin

On Thu, Jul 23, 2015 at 10:25 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote:

> Just some notes on the KIP doc itself:
>
> * It'd be useful to clarify at what point the plain consumer + custom code + producer breaks down. I think trivial filtering and aggregation on a single stream usually work fine with this model. Anything where you need more complex joins, windowing, etc. is where it breaks down. I think most interesting applications require that functionality, but it's helpful to make this really clear in the motivation -- right now, Kafka only provides the lowest level plumbing for stream processing applications, so most interesting apps require very heavyweight frameworks.
> * I think the feature comparison of plain producer/consumer, stream processing frameworks, and this new library is a good start, but we might want something more thorough and structured, like a feature matrix. Right now it's hard to figure out exactly how they relate to each other.
> * I'd personally push the library vs. framework story very strongly -- the total buy-in and weak integration story of stream processing frameworks is a big downside and makes a library a really compelling (and currently unavailable, as far as I am aware) alternative.
> * Comment about in-memory storage of other frameworks is interesting -- it is specific to the framework, but is supposed to also give performance benefits. The high-level functional processing interface would allow for combining multiple operations when there's no shuffle, but when there is a shuffle, we'll always be writing to Kafka, right? Spark (and presumably Spark Streaming) is supposed to get a big win by handling shuffles such that the data just stays in cache and never actually hits disk, or at least hits disk in the background. Will we take a hit because we always write to Kafka?
> * I really struggled with the structure of the KIP template with Copycat because the flow doesn't work well for proposals like this. They aren't as concrete changes as the KIP template was designed for. I'd completely ignore that template in favor of optimizing for clarity if I were you.
>
> -Ewen
>
> On Thu, Jul 23, 2015 at 5:59 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hi all,
> >
> > I just posted KIP-28: Add a transform client for data processing
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+transform+client+for+data+processing>.
> >
> > The wiki page does not yet have the full design / implementation details, and this email is to kick-off the conversation on whether we should add this new client with the described motivations, and if yes what features / functionalities should be included.
> >
> > Looking forward to your feedback!
> >
> > -- Guozhang
>
> --
> Thanks,
> Ewen