Thanks very much for taking the time to answer, Matthias! Very much
appreciated.
All the best,
Marcus
On Wed, Apr 7, 2021 at 10:22 PM Matthias J. Sax wrote:
> Sorry for the late reply...
>
>
> > I only see issues of out of order data in my re-partitioned topic as a
> > result of a rebalance happenin
Sorry for the late reply...
> I only see issues of out of order data in my re-partitioned topic as a result
> of a rebalance happening.
If you re-partition, you may actually see out-of-order data even if
there is no rebalance. In the end, during repartitioning you have
multiple upstream writers for
Thanks Matthias - that's great to know.
> Increasing the grace period should not really affect throughput, but
> latency.
Yes, a slip of the tongue on my part, you’re right :-)
One last question if I may? I only see issues of out of order data in my
re-partitioned topic as a result of a rebalan
> will it consider a timestamp in the body of the message, if we have
> implemented a custom TimestampExtractor?
Yes.
> Or, which I feel is more likely - does TimestampExtractor stream time only
> apply later on, once deserialisation has happened?
Well, the extractor does apply after deserialization, bu
Thanks for your reply Matthias, and really great talks :-)
You’re right that I only have one input topic - though it does have 20
partitions.
The pointer to max.task.idle.ms cleared something up for me; I read the
following line from the Kafka docs but couldn’t find what configuration they were
r
In general, Kafka Streams tries to process messages in timestamp order,
i.e., oldest message first. However, Kafka Streams always needs to process
messages in offset order per partition, and thus, the timestamp
synchronization is applied to records from different topics (e.g., if you
join two topics).
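
To make the point above concrete, here is a minimal, self-contained sketch. It is plain Java and does not use the Kafka Streams API or its internals. It assumes two already-buffered input partitions and always picks the partition whose head record has the smallest timestamp, while never reordering records inside a partition. The names `TimestampSyncSketch`, `Rec` and `mergeByHeadTimestamp` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class TimestampSyncSketch {

    // Hypothetical record type for the illustration: partition name, offset, timestamp.
    static final class Rec {
        final String partition;
        final long offset;
        final long timestamp;

        Rec(String partition, long offset, long timestamp) {
            this.partition = partition;
            this.offset = offset;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return partition + "@" + offset + " ts=" + timestamp;
        }
    }

    // Repeatedly take the head record with the smallest timestamp.
    // Only the HEAD of each partition is compared, so offset order
    // within a partition is always preserved.
    static List<Rec> mergeByHeadTimestamp(List<Deque<Rec>> partitions) {
        List<Rec> out = new ArrayList<>();
        while (true) {
            Deque<Rec> best = null;
            for (Deque<Rec> p : partitions) {
                if (p.isEmpty()) {
                    continue;
                }
                if (best == null || p.peekFirst().timestamp < best.peekFirst().timestamp) {
                    best = p;
                }
            }
            if (best == null) {
                return out; // all partitions drained
            }
            out.add(best.pollFirst());
        }
    }

    public static void main(String[] args) {
        // Partition A contains an out-of-order record (ts=5 after ts=10).
        Deque<Rec> a = new ArrayDeque<>(Arrays.asList(
                new Rec("A", 0, 10), new Rec("A", 1, 5), new Rec("A", 2, 30)));
        Deque<Rec> b = new ArrayDeque<>(Arrays.asList(
                new Rec("B", 0, 7), new Rec("B", 1, 20)));

        for (Rec r : mergeByHeadTimestamp(Arrays.asList(a, b))) {
            System.out.println(r);
        }
    }
}
```

With this input the records come out with timestamps 7, 10, 5, 20, 30: the two partitions are interleaved by timestamp, but the out-of-order record inside partition A (ts=5) is still processed after ts=10, because offset order within a partition wins.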
Ther