So hold the stream for 15 minutes wouldn't cause too much performance problems?
On Wed, Apr 20, 2016 at 3:16 PM, Guozhang Wang <wangg...@gmail.com> wrote: > Consumer' buffer does not depend on offset committing, once it is given > from the poll() call it is out of the buffer. If offsets are not committed, > then upon failover it will simply re-consumer these records again from > Kafka. > > Guozhang > > On Tue, Apr 19, 2016 at 11:34 PM, Henry Cai <h...@pinterest.com.invalid> > wrote: > > > For the technique of custom Processor of holding call to > context.forward(), > > if I hold it for 10 minutes, what does that mean for the consumer > > acknowledgement on source node? > > > > I guess if I hold it for 10 minutes, the consumer is not going to ack to > > the upstream queue, will that impact the consumer performance, will > > consumer's kafka client message buffer overflow when there is no ack in > 10 > > minutes? > > > > > > On Tue, Apr 19, 2016 at 6:10 PM, Guozhang Wang <wangg...@gmail.com> > wrote: > > > > > Yes we are aware of this behavior and are working on optimizing it: > > > > > > https://issues.apache.org/jira/browse/KAFKA-3101 > > > > > > More generally, we are considering to add a "trigger" interface similar > > to > > > the Millwheel model where users can customize when they want to emit > > > outputs to the downstream operators. Unfortunately for now there will > no > > > easy workaround for buffering, and you may want to do this in app code > > (for > > > example, in a customized Processor where you can control when to call > > > context.forward() ). > > > > > > Guozhang > > > > > > > > > On Tue, Apr 19, 2016 at 1:40 PM, Jeff Klukas <jklu...@simple.com> > wrote: > > > > > > > Is it true that the aggregation and reduction methods of KStream will > > > emit > > > > a new output message for each incoming message? > > > > > > > > I have an application that's copying a Postgres replication stream > to a > > > > Kafka topic, and activity tends to be clustered, with many updates > to a > > > > given primary key happening in quick succession. I'd like to smooth > > that > > > > out by buffering the messages in tumbling windows, allowing the > updates > > > to > > > > overwrite one another, and emitting output messages only at the end > of > > > the > > > > window. > > > > > > > > Does the Kafka Streams API provide any hooks that I could use to > > achieve > > > > this kind of windowed "buffering" or "deduplication" of a stream? > > > > > > > > > > > > > > > > -- > > > -- Guozhang > > > > > > > > > -- > -- Guozhang >