Your understanding seems roughly correct. When the watermark is described as a timestamp, or as a "one-dimensional" concept, we're implicitly talking about the watermark *at the current processing time*. As the current processing time moves forward, the value of the watermark changes too.
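To make that concrete, here is a minimal sketch in Python. It is not Beam code: the curve samples and the names `WATERMARK_CURVE`, `watermark_at`, and `is_expected` are made up for illustration. It treats the curve from the slide as a list of (processing time, watermark) points, so that fixing a processing time gives back a single timestamp.

```python
from bisect import bisect_right

# Hypothetical samples of the watermark curve as (processing_time, watermark)
# pairs, both in seconds. The values are invented for illustration only.
WATERMARK_CURVE = [(0, 0), (10, 4), (20, 15), (30, 22)]


def watermark_at(processing_time):
    """Return the watermark in effect at the given processing time.

    Uses the most recent sample at or before `processing_time`.
    """
    times = [p for p, _ in WATERMARK_CURVE]
    index = bisect_right(times, processing_time) - 1
    if index < 0:
        raise ValueError("no watermark recorded yet at this processing time")
    return WATERMARK_CURVE[index][1]


def is_expected(event_time, processing_time):
    """True if an element with this event time is still expected to arrive,
    i.e. its event time is not earlier than the current watermark."""
    return event_time >= watermark_at(processing_time)
```

For example, `watermark_at(25)` looks up the point (20, 15), so at processing time 25 the system does not expect to see anything with an event time earlier than 15.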
There is also a requirement that the watermark only moves forward.

On Thu, Jun 15, 2017 at 2:39 PM Haibo Chen <haiboc...@cloudera.com> wrote:

> Hi all,
>
> While I was going over The Beam Model [model evolution]
> <https://docs.google.com/presentation/d/1SHie3nwe-pqmjGum_QDznPr-B_zXCjJ2VBDGdafZme8/edit#slide=id.g12846a6162_0_5>
> to learn the basics of the model, I found the explanation of watermark (in
> slide 27), "No timestamp earlier than the watermark will be seen" and "It
> declares that no event times earlier than this point are expected to appear
> in the future", hard to understand.
>
> From the graph in the slide, the watermark seems to be a two-dimensional
> concept, whereas timestamp (regardless of event or processing time) is
> one-dimensional. Hence, my confusion around the explanation. It seems to me
> that we can only talk about watermark in the context of processing time.
> My own interpretation of watermark, based on the graph, is
>
> Given a point (e, p) on the watermark curve, at processing time p, the
> system is confident (since watermark is just heuristics) that no events
> that happened earlier than e are expected to be seen.
>
> Is the understanding roughly correct? I plan to read the Dataflow paper to
> get a more precise understanding, but would also like to hear explanations
> in less formal terms. Any help is greatly appreciated.
>
> Best,
> Haibo Chen