Hi everybody! I'm so excited to be here asking my first question about the Flink DataStream API.
I have a basic enrichment pipeline running in event time. A main stream A (Kafka source) is enriched with information from two other streams, B and C (also Kafka sources).

The enrichment graph consists of 2 stages:
1. Stream A is enriched with stream B, resulting in stream (A, B).
2. Stream (A, B) is enriched with stream C, resulting in stream (A, B, C).

I've created a side output for late A events. The flow for these events is slightly different:
1. Stream A is enriched by fetching the B info from an external service using Flink Async I/O, resulting in stream (A, B).
2. Stream (A, B) is then enriched with stream C, resulting in stream (A, B, C) (same as before).

Note that late A events can arrive far out of order (by weeks in some cases).

I was wondering where the late-events flow should be defined: one graph/job for both flows, or 2 graphs/jobs with different time semantics? What's the general pattern for cases like this?

Many thanks,
Jose Velasco
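
P.S. In case it helps to make the routing concrete, here is a minimal plain-Java sketch of how I picture the split between on-time and late A events. It is only a model and does not use the Flink API: the names `EventA`, `Router`, and `maxOutOfOrdernessMs` are hypothetical, and the rule "an event is late when its timestamp is below (max timestamp seen minus the allowed out-of-orderness)" is my simplifying assumption about how the watermark behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class LateEventRouting {

    // Hypothetical event type: only the event-time timestamp matters here.
    static final class EventA {
        final String id;
        final long timestampMs;

        EventA(String id, long timestampMs) {
            this.id = id;
            this.timestampMs = timestampMs;
        }
    }

    // Simplified model of a bounded-out-of-orderness watermark plus a late "side output".
    static final class Router {
        private final long maxOutOfOrdernessMs;
        private long maxTimestampSeen = Long.MIN_VALUE;

        final List<EventA> onTime = new ArrayList<>(); // would go to the stream-B enrichment
        final List<EventA> late = new ArrayList<>();   // would go to the Async I/O lookup

        Router(long maxOutOfOrdernessMs) {
            this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        }

        // Watermark = highest timestamp seen so far minus the allowed out-of-orderness.
        long watermark() {
            if (maxTimestampSeen == Long.MIN_VALUE) {
                return Long.MIN_VALUE; // nothing seen yet, so nothing can be late
            }
            return maxTimestampSeen - maxOutOfOrdernessMs;
        }

        // Events behind the watermark take the late path; all others advance the watermark.
        void route(EventA event) {
            if (event.timestampMs < watermark()) {
                late.add(event);
            } else {
                onTime.add(event);
                maxTimestampSeen = Math.max(maxTimestampSeen, event.timestampMs);
            }
        }
    }

    public static void main(String[] args) {
        Router router = new Router(5_000L);
        router.route(new EventA("a1", 10_000L));
        router.route(new EventA("a2", 20_000L));
        router.route(new EventA("a3", 16_000L)); // out of order, but within the bound
        router.route(new EventA("a4", 1_000L));  // far behind the watermark
        System.out.println("on-time=" + router.onTime.size() + " late=" + router.late.size());
    }
}
```

In this model both paths would later meet again at the enrichment with C, which is why I am unsure whether they belong in one job or two.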