Each event in the stream has an associated timestamp as metadata. A timestamp assigner simply extracts the timestamp from the user object for that purpose.
There is no per-operator watermark, but with `assignTimestampsAndWatermarks` you may insert an operator that overrides upstream watermarks. Watermarks are markers that flow through the stream to inform operators about the progression of time. Time-sensitive operators (e.g. window operators) use the watermark to take action at specific points in time, such as at the time boundary of a window. When a window function is invoked for a given window, any elements emitted by the function take the 'max timestamp' of the window. For example, if a given window spans 11:00 to 12:00, any elements produced by the window function will have a timestamp of 12:00. The window function is fired when the current time (as indicated by the watermark) reaches 12:00. Now, imagine that an event arrives after 12:00 that has a timestamp of, say 11:55. That record would be considered late, but logically still belongs in the 11:00-12:00 window. Assuming the window operator was configured with allowed lateness of at least 5 minutes, the window would be re-evaluated, and any elements produced would have a timestamp of 12:00. You can chain together window operators. For example, we described an "hourly" window above. A subsequent "daily" window could further aggregate the results of each hour. If a late firing were to occur in the "hourly" window operator, the subsequent "daily" window operator would observe a late element and apply its own lateness logic. Hope this helps! Eron On Wed, Dec 20, 2017 at 11:10 PM, Jagadish Venkatraman < jagadish1...@gmail.com> wrote: > Hey Flink developers, > > I'm trying to understand the behavior of timestamp assignments in Apache > Flink. > > Here's my current understanding: > > 1. A *TimestampAssigner* returns the timestamp of each incoming message, > while a *WatermarkAssigner* emits the watermark for a source. (If > per-operator watermarks are defined, they over-ride the watermarks > generated from the source) > 2. For each operator, its watermark is determined as the minimum of all the > incoming watermarks from its prior operators. > > I was not sure on how the *timestamp* of the output from an operator is > determined? For instance, let's say, for a pane emitted from an > *EventTimeTumblingWindow*, what is its "timestamp"? > > What is the timestamp of a late-arrival? > > If I perform another EventTime window, on the output from a previous > window, is it defined? > > If there are pointers to the source code I need to look at, I'd appreciate > that as well. Thank you for the help. > > -- > Jagadish V, > Graduate Student, > Department of Computer Science, > Stanford University >