I'll attempt to answer your questions. If we have allowed lateness to be greater than 0 (say 5), then if an event > which arrives at window end + 3 (within allowed lateness), > (a) it is considered late and included in the window function as a > late firing ? >
An event with a timestamp that falls within the window's boundaries that arrives when the current watermark is at window end + 3 will be included as a late event that has arrived within the allowed lateness. > (b) Are the late firings under the control of the trigger ? > Yes, the trigger is involved in all firings, late or not. > (c) If there are may events like this - are there multiple window > function invocations ? With the default event time trigger, each late event causes a late firing. You could use a custom trigger to implement other behaviors. > (d) Are these events (still within window end + allowed lateness) also > emitted via the side output late data ? > No. The side output for late events is only used to collect events that fall outside the allowed lateness. > 2. If an event arrives after the window end + allowed lateness - > (a) Is it excluded from the window function but still emitted from the > side output late data ? > Yes. > (b) And if it is emitted is there any attribute which indicates for > which window it was a late event ? > No, the event is emitted without any additional information. (c) Is there any time limit while the late side output remains active > for a particular window or all late events channeled to it ? There is no time limit; the late side output remains operative indefinitely. Hope that helps, David On Wed, Dec 11, 2019 at 1:40 PM M Singh <mans2si...@yahoo.com> wrote: > Thanks Timo for your answer. I will try the prototype but was wondering > if I can find some theoretical documentation to give me a sound > understanding. > > Mans > > On Wednesday, December 11, 2019, 05:44:15 AM EST, Timo Walther < > twal...@apache.org> wrote: > > > Little mistake: The key must be any constant instead of `e`. > > > On 11.12.19 11:42, Timo Walther wrote: > > Hi Mans, > > > > I would recommend to create a little prototype to answer most of your > > questions in action. > > > > You can simple do: > > > > stream = env.fromElements(1L, 2L, 3L, 4L) > > .assignTimestampsAndWatermarks( > > new AssignerWithPunctuatedWatermarks{ > > extractTimestamp(e) = e, > > checkAndGetNextWatermark(e, ts) = new Watermark(e) > > }) > > > > stream.keyBy(e -> e).window(...).print() > > env.execute() > > > > This allows to quickly create a stream of event time for testing the > > semantics. > > > > I hope this helps. Otherwise of course we can help you in finding the > > answers to the remaining questions. > > > > Regards, > > Timo > > > > > > > > On 10.12.19 20:32, M Singh wrote: > >> Hi: > >> > >> I have a few questions about the side output late data. > >> > >> Here is the API > >> > >> |stream .keyBy(...) <- keyed versus non-keyed windows .window(...) <- > >> required: "assigner" [.trigger(...)] <- optional: "trigger" (else > >> default trigger) [.evictor(...)] <- optional: "evictor" (else no > >> evictor) [.allowedLateness(...)] <- optional: "lateness" (else zero) > >> [.sideOutputLateData(...)] <- optional: "output tag" (else no side > >> output for late data) .reduce/aggregate/fold/apply() <- required: > >> "function" [.getSideOutput(...)] <- optional: "output tag"| > >> > >> > >> > >> Apache Flink 1.9 Documentation: Windows > >> < > https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#late-elements-considerations> > > >> > >> > >> > >> > >> > >> Apache Flink 1.9 Documentation: Windows > >> > >> < > https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#late-elements-considerations> > > >> > >> > >> > >> Here is the documentation: > >> > >> > >> Late elements > >> > >> considerations< > https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#late-elements-considerations> > > >> > >> > >> When specifying an allowed lateness greater than 0, the window along > >> with its content is kept after the watermark passes the end of the > >> window. In these cases, when a late but not dropped element arrives, > >> it could trigger another firing for the window. These firings are > >> called |late firings|, as they are triggered by late events and in > >> contrast to the |main firing| which is the first firing of the window. > >> In case of session windows, late firings can further lead to merging > >> of windows, as they may “bridge” the gap between two pre-existing, > >> unmerged windows. > >> > >> Attention You should be aware that the elements emitted by a late > >> firing should be treated as updated results of a previous computation, > >> i.e., your data stream will contain multiple results for the same > >> computation. Depending on your application, you need to take these > >> duplicated results into account or deduplicate them. > >> > >> > >> Questions: > >> > >> 1. If we have allowed lateness to be greater than 0 (say 5), then if > >> an event which arrives at window end + 3 (within allowed lateness), > >> (a) it is considered late and included in the window function as > >> a late firing ? > >> (b) Are the late firings under the control of the trigger ? > >> (c) If there are may events like this - are there multiple window > >> function invocations ? > >> (d) Are these events (still within window end + allowed lateness) > >> also emitted via the side output late data ? > >> 2. If an event arrives after the window end + allowed lateness - > >> (a) Is it excluded from the window function but still emitted > >> from the side output late data ? > >> (b) And if it is emitted is there any attribute which indicates > >> for which window it was a late event ? > >> (c) Is there any time limit while the late side output remains > >> active for a particular window or all late events channeled to it ? > >> > >> Thanks > >> > >> Thanks > >> > >> Mans > >> > >> > >> > >> > >> > >