Re: [DISCUSS] KIP-28 - Add a transform client for data processing

Guozhang Wang Mon, 10 Aug 2015 18:50:08 -0700

Hi Jiangjie,

I'm not sure I understand the question "What If user have interleaved groups of
messages, each group makes a complete logic?" Could you elaborate a bit?


About the committing functionality: it currently commits only up to the
offset of the last processed message. The commit() call itself actually does
more than commit the consumer's offsets; it also flushes the local state and
the producer.
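
To make the commit behavior above concrete, here is a minimal sketch in plain
Python with no Kafka dependency. Everything in it is an assumption for
illustration: the class SketchProcessorContext, its method names, and the
order of the three steps are hypothetical and are not taken from the KIP-28
wiki page. The only Kafka convention relied on is that a committed offset
names the next message to read.

```python
class SketchProcessorContext:
    """Hypothetical stand-in for the processor's commit path (not the KIP-28 API)."""

    def __init__(self):
        self.log = []        # records the order of side effects
        self.processed = {}  # partition -> offset of the last processed message
        self.committed = {}  # partition -> next offset to read after a restart

    def mark_processed(self, partition, offset):
        # Called once a message has been fully handled by the processor.
        self.processed[partition] = offset

    def commit(self):
        # Assumed ordering: flush state and producer before committing offsets,
        # so a committed offset never gets ahead of flushed results.
        self.log.append("flush local state")
        self.log.append("flush producer")
        for partition, offset in self.processed.items():
            # Commit only up to the processed message, never beyond it.
            self.committed[partition] = offset + 1
        self.log.append("commit consumer offsets")


ctx = SketchProcessorContext()
for offset in range(3):  # process offsets 0, 1, 2 of one partition
    ctx.mark_processed("topic-0", offset)
ctx.commit()
print(ctx.committed)
```

With this ordering, a crash between the flushes and the offset commit would
cause some messages to be reprocessed rather than skipped.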

Guozhang

On Fri, Jul 31, 2015 at 9:20 PM, Jiangjie Qin <[email protected]>
wrote:

> I think the abstraction of processor would be useful. It is not quite clear
> to me yet though which grid in the following API analysis chart this
> processor is trying to satisfy.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/New+consumer+API+change+proposal
>
> For example, in the current proposal, it looks like the user will only be able to commit
> offsets for the last seen message. What If user have interleaved groups of
> messages, each group makes a complete logic? In that case, user will not
> have a safe boundary to commit offset.
>
>
> Is the processor client only intended to address the static topic data
> stream with semi-auto offset commit (which means user can only commit the
> last seen message)?
>
> Jiangjie (Becket) Qin
>
> On Thu, Jul 30, 2015 at 2:32 PM, James Cheng <[email protected]> wrote:
>
> > I agree with Sriram and Martin. Kafka is already about providing streams
> > of data, and so Kafka Streams or anything like that is confusing to me.
> >
> > This new library is about making it easier to process the data.
> >
> > -James
> >
> > On Jul 30, 2015, at 9:38 AM, Aditya Auradkar
> > <[email protected]> wrote:
> >
> > > Personally, I prefer KafkaStreams just because it sounds nicer. For the
> > > reasons identified above, KafkaProcessor or KProcessor is more apt but
> > > sounds less catchy (IMO). I also think we should prefix with Kafka
> > > (rather than K) because we will then have 3 clients: KafkaProducer,
> > > KafkaConsumer and KafkaProcessor which is very nice and consistent.
> > >
> > > Aditya
> > >
> > > On Thu, Jul 30, 2015 at 9:17 AM, Gwen Shapira <[email protected]> wrote:
> > >
> > >> I think it's also a matter of intent. If we see it as "yet another
> > >> client library", then Processor (to match Producer and Consumer) will
> > >> work great.
> > >> If we see it as a stream processing framework, the name has to start
> > >> with S to follow existing convention.
> > >>
> > >> Speaking of naming conventions:
> > >> You know how people have stack names for technologies that are usually
> > >> used in tandem? ELK, LAMP, etc.
> > >> The pattern of Kafka -> Stream Processor -> NoSQL Store is super
> > >> common. KSN stack doesn't sound right, though. Maybe while we are
> > >> bikeshedding, someone has ideas in that direction :)
> > >>
> > >> On Thu, Jul 30, 2015 at 2:01 AM, Sriram Subramanian
> > >> <[email protected]> wrote:
> > >>> I had the same thought. Kafka processor, KProcessor or even Kafka
> > >>> stream processor is more relevant.
> > >>>
> > >>>
> > >>>
> > >>>> On Jul 30, 2015, at 2:09 PM, Martin Kleppmann <[email protected]> wrote:
> > >>>>
> > >>>> I'm with Sriram -- Kafka is all about streams already (or topics, to be
> > >>>> precise, but we're calling it "stream processing" not "topic processing"),
> > >>>> so I find "Kafka Streams", "KStream" and "Kafka Streaming" all confusing,
> > >>>> since they seem to imply that other bits of Kafka are not about streams.
> > >>>>
> > >>>> I would prefer "The Processor API" or "Kafka Processors" or "Kafka
> > >>>> Processing Client" or "KProcessor", or something along those lines.
> > >>>>
> > >>>>> On 30 Jul 2015, at 15:07, Guozhang Wang <[email protected]> wrote:
> > >>>>>
> > >>>>> I would vote for KStream as it sounds sexier (is it only me??), second to
> > >>>>> that would be Kafka Streaming.
> > >>>>>
> > >>>>>> On Wed, Jul 29, 2015 at 6:08 PM, Jay Kreps <[email protected]> wrote:
> > >>>>>>
> > >>>>>> Also, the most important part of any prototype, we should have a name for
> > >>>>>> this producing-consumer-thingamgigy:
> > >>>>>>
> > >>>>>> Various ideas:
> > >>>>>> - Kafka Streams
> > >>>>>> - KStream
> > >>>>>> - Kafka Streaming
> > >>>>>> - The Processor API
> > >>>>>> - Metamorphosis
> > >>>>>> - Transformer API
> > >>>>>> - Verwandlung
> > >>>>>>
> > >>>>>> For my part I think what people are trying to do is stream processing with
> > >>>>>> Kafka so I think something that evokes Kafka and stream processing is
> > >>>>>> preferable. I like Kafka Streams or Kafka Streaming followed by KStream.
> > >>>>>>
> > >>>>>> Transformer kind of makes me think of the shape-shifting cars.
> > >>>>>>
> > >>>>>> Metamorphosis is cool and hilarious but since we are kind of envisioning
> > >>>>>> this as a more limited-scope thing rather than a massive framework in its own
> > >>>>>> right I actually think it should have a descriptive name rather than a
> > >>>>>> personality of its own.
> > >>>>>>
> > >>>>>> Anyhow let the bikeshedding commence.
> > >>>>>>
> > >>>>>> -Jay
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Thu, Jul 23, 2015 at 5:59 PM, Guozhang Wang <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I just posted KIP-28: Add a transform client for data processing
> > >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+transform+client+for+data+processing>.
> > >>>>>>>
> > >>>>>>> The wiki page does not yet have the full design / implementation details,
> > >>>>>>> and this email is to kick off the conversation on whether we should add
> > >>>>>>> this new client with the described motivations, and if yes what features /
> > >>>>>>> functionalities should be included.
> > >>>>>>>
> > >>>>>>> Looking forward to your feedback!
> > >>>>>>>
> > >>>>>>> -- Guozhang
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> -- Guozhang
> > >>>>
> > >>
> >
> >
>



-- 
-- Guozhang
