If you have out-of-order data, there is no guarantee that the current record has larger timestamp than previous record. Data is still processed in offset order.
Also, you use selectKey() and groupByKey() thus triggering a repartioning that may introduce out-of-order data downstream even if all your original input data is ordered by timestamp. -Matthias On 12/13/18 10:53 PM, Dmitry Minkovsky wrote: > I just discovered that 2.1.0 included MAX_TASK_IDLE_MS_CONFIG and dropped > everything to play with it! Exciting! > > I built the following topology, attempting to synchronize two > topic-partitions: > https://gist.github.com/dminkovsky/45aa29aefefad564f9663cd36ad21ce1. > > However, I keep failing at > https://gist.github.com/dminkovsky/45aa29aefefad564f9663cd36ad21ce1#file-gistfile1-txt-L42. > Am I misunderstanding how this feature works? The two topic-partitions > appear to be grouped together in a single task: > > [2018-12-13 16:38:29,082] DEBUG > (org.apache.kafka.streams.processor.internals.TaskManager:120) > stream-thread > [migration-fed2f8fa-150e-494c-b1c8-0594f6d52a29-StreamThread-1] Adding > assigned tasks as active: {0_0=[join-requests-0, > settings-update-requests-0], > 1_0=[migration-KSTREAM-AGGREGATE-STATE-STORE-0000000008-repartition-0]} > > > So shouldn't this new config enable effectively synchronizing them? > > > Thank you, > > Dmitry >
signature.asc
Description: OpenPGP digital signature