Hi Mathias
Thank you for your feedback.
I'm still a bit confused about which approach to take. My application is a
pretty standard Kafka Streams topology: it reads a few table-like topics,
then groups and aggregates some of them so we can join them with others.
Something like this:

KTable left = builder.table(...);
KTable right = builder.table(...);
KTable grouped = right.groupBy(/* new key/value */).aggregate(...);
left.leftJoin(grouped, /* myFunction */).toStream(...);

The input and output topics are all table-like, so I understand an "at
least once" guarantee is enough, but I also need an ordering guarantee, at
least per key: if two updates are sent for the same key, I need a guarantee
that the output topic ends up with the latest value for that key. Is there
a recommended configuration for this?
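For reference, this is the kind of override I had in mind. It's only a sketch of my assumption (that enabling idempotence on the internal producers, via the "producer." prefix that Kafka Streams uses to forward settings, preserves per-key ordering under retries), not a verified recommendation:

```java
import java.util.Properties;

public class OrderingConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // "at least once" is the Kafka Streams default processing guarantee.
        props.setProperty("processing.guarantee", "at_least_once");
        // Kafka Streams forwards "producer."-prefixed settings to its
        // internal producers. With idempotence enabled, the broker
        // de-duplicates retries, and ordering is preserved with up to
        // 5 in-flight requests, so max.in.flight=1 should not be needed.
        props.setProperty("producer.enable.idempotence", "true");
        props.setProperty("producer.max.in.flight.requests.per.connection", "5");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

(If that reading of the idempotent-producer guarantees is wrong, please correct me.)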
Thanks again
Murilo

On Tue, 3 Dec 2019 at 04:29, Matthias J. Sax <matth...@confluent.io> wrote:

> That is correct. It depends on what guarantees you need, though. Also
> note that producers often write into repartition topics to re-key data,
> and in this case no ordering guarantee can be provided anyway, as the
> single-writer principle is "violated".
>
> Also note that Kafka Streams can handle out-of-order data correctly in
> most cases, so it should be OK to leave the default config values.
>
> But as always: it depends on your application and your requirements. As
> a rule of thumb: as long as you don't experience any issue, I would just
> go with default configs.
>
>
> -Matthias
>
>
> On 12/2/19 12:02 PM, Murilo Tavares wrote:
> > Hi everyone
> > In light of the discussions about ordering guarantees in Kafka, I am
> > struggling to understand how that affects KafkaStreams' internal
> > *KafkaProducer*.
> > In the official documentation, this section (
> > https://docs.confluent.io/current/streams/concepts.html#out-of-order-handling)
> > enumerates two causes "that could potentially result in out-of-order
> > data *arrivals* with respect to their timestamps".
> > But I haven't found anything that mentions how KafkaStreams *producers*
> > handle errors, and how that could lead to out-of-order messages being
> > produced in output topics.
> > When I start my KafkaStreams application, I see the internal producers
> > use the following in their default configuration:
> >         enable.idempotence = false
> >         max.in.flight.requests.per.connection = 5
> >         retries = 2147483647
> >
> > So I guess this could mean that, at the end of my topology, KafkaStreams
> > could potentially send out-of-order messages to an output topic if for
> > some reason a message fails to be delivered to the broker, as the
> > internal producer would retry it.
> >
> > I've read that to guarantee ordering in the producer, one needs to set
> > "max.in.flight.requests.per.connection=1". But should one override this
> > configuration for KafkaStreams applications?
> >
> > Thanks
> > Murilo
> >
>
>
