Hi Arvid, > Yes, I think it would be sufficient to say that the relative time doesn't > need to move at the same speed as wall clock without going into details > such as backpressure (that should be moved to the doc of the WMContext).
Makes sense. I've updated the FLIP. In the final PR I will try to phrase that better with a bit more detail. > I understand your concern and we can keep the original semantics. I'd still > like to see the name of the accessor to be a bit more specific for the task > at hand. RelativeClock is a very broad name because it's supposed to be > generic. Then we should have a specific name for the clock in the context, > such as getInputActivityClock or so. +1 I like this name and I've updated the FLIP. Best, Piotrek pt., 26 lip 2024 o 14:47 Arvid Heise <ahe...@confluent.io.invalid> napisał(a): > Hi Piotr, > > > > 1. `RelativeClock` looks like a super-interface of > > > `org.apache.flink.util.clock.Clock`. Should we also reflect that > > > accordingly by extending? This should not break anything. > > > > It should be fine to do so, but to me the question is if we should do so? > > It doesn't give any benefit right now, > > if necessary either we or someone else can do that in the future. While > in > > the meantime it makes the APIs > > somehow a bit more strict/more defined, so a bit higher chance of causing > > some problems if we decide to > > change something. > One obvious advantage is that we could use ManualClock for testing. Other > than that, I'd be fine to leave as-is and adjust later. > > > > > > 2. Since `RelativeClock` is rather general, I'd also propose to change > > the > > > javadoc to not include the pausing part. > > > > My intention was not to enforce in the contract that progress of the > timer > > must be paused, but the interface > > doesn't guarantee that the returned relativeTimeNanos value will follow > the > > wall clock 1 to 1. That concrete > > classes can pause progress of that time if needed. I wanted to state that > > clearly, to better highlight the > > difference against the `Clock`. > > > > Maybe the java doc should be clarified? > Yes, I think it would be sufficient to say that the relative time doesn't > need to move at the same speed as wall clock without going into details > such as backpressure (that should be moved to the doc of the WMContext). > > > > > > 4. I now realized that I changed semantics compared to the proposal: > this > > > idle clock would already calculate the time difference (now - last > > event). > > > That may narrow down the usefulness but makes the only known use case > > more > > > explicit. > > > > > > Do we have other (future) use cases that could profit from having > access > > to > > > the relative time of the last event occurring? Basically would we ever > > need > > > (now - some event) or (now - some periodic time)? > > > > I'm not sure. Idle clock seems a bit too specific to me. Also, at least > > currently, idle time at the watermark > > generator level is not calculated accurately. Exposing an approximate > value > > via API might make some > > false impression that it's accurate? > I understand your concern and we can keep the original semantics. I'd still > like to see the name of the accessor to be a bit more specific for the task > at hand. RelativeClock is a very broad name because it's supposed to be > generic. Then we should have a specific name for the clock in the context, > such as getInputActivityClock or so. > > Best, > > Arvid > > > On Fri, Jul 26, 2024 at 12:21 PM Piotr Nowojski <pnowoj...@apache.org> > wrote: > > > Hi Zakelly, > > > > > Could we just use `org.apache.flink.util.clock.Clock` instead of > > introducing a > > > new one? > > > > We can not use it as is, because of this java doc in it: > > > > * <h3>Relative Time</h3> > > * > > * <p>This time advances at the same speed as the <i>absolute time</i>, > but > > the timestamps can only > > * be referred to relative to each other. (...). Relative time behaves > > similar to {@link System#nanoTime()}. > > > > Also I don't feel like squeezing some artificial clocks, that blocks out > > part of the elapsed time, should be > > squeezed together with the regular clock that returns the absolute time. > It > > feels to me like it would be > > confusing to potential users. From this point of view, I like adding a > new > > dedicated interface that actually > > highlights this possibility. > > > > > How to generate time or whether to pause is an implementation-related > > thing (e.g. > > > org.apache.flink.util.clock.ManualClock) > > > > Actually it looks like that someone has thought it through when making > > ManualClock `@PublicEvolving`. > > I guess it exists because of unit tests and when `Clock` was being marked > > as `@PublicEvolving` someone > > either got carried away, or wanted to expose it to Flink users also for > the > > purpose of unit testing. > > > > Either way, I don't have a strong feeling. I would prefer to keep the new > > clock under a new interface. But > > we can also change the Java docs of the pre-existing `Clock`. > > > > Best, > > Piotrek > > > > pt., 26 lip 2024 o 10:51 Zakelly Lan <zakelly....@gmail.com> napisał(a): > > > > > Hi Piotr, > > > > > > I'd +1 for this idea. I partially share the same concern with Arvid. > > Could > > > we just use `org.apache.flink.util.clock.Clock` instead of introducing > a > > > new one? Thus only a new implementation is required IIUC. > > > My thinking is that clock is a general need for timing. How to generate > > > time or whether to pause is an implementation-related thing (e.g. > > > org.apache.flink.util.clock.ManualClock). WDYT? > > > > > > > > > Best, > > > Zakelly > > > > > > > > > On Thu, Jul 25, 2024 at 9:02 PM Piotr Nowojski < > piotr.nowoj...@gmail.com > > > > > > wrote: > > > > > > > Hi Arvid, > > > > > > > > > 1. `RelativeClock` looks like a super-interface of > > > > > `org.apache.flink.util.clock.Clock`. Should we also reflect that > > > > > accordingly by extending? This should not break anything. > > > > > > > > It should be fine to do so, but to me the question is if we should do > > so? > > > > It doesn't give any benefit right now, > > > > if necessary either we or someone else can do that in the future. > While > > > in > > > > the meantime it makes the APIs > > > > somehow a bit more strict/more defined, so a bit higher chance of > > causing > > > > some problems if we decide to > > > > change something. > > > > > > > > > 2. Since `RelativeClock` is rather general, I'd also propose to > > change > > > > the > > > > > javadoc to not include the pausing part. > > > > > > > > My intention was not to enforce in the contract that progress of the > > > timer > > > > must be paused, but the interface > > > > doesn't guarantee that the returned relativeTimeNanos value will > follow > > > the > > > > wall clock 1 to 1. That concrete > > > > classes can pause progress of that time if needed. I wanted to state > > that > > > > clearly, to better highlight the > > > > difference against the `Clock`. > > > > > > > > Maybe the java doc should be clarified? > > > > > > > > > 4. I now realized that I changed semantics compared to the > proposal: > > > this > > > > > idle clock would already calculate the time difference (now - last > > > > event). > > > > > That may narrow down the usefulness but makes the only known use > case > > > > more > > > > > explicit. > > > > > > > > > > Do we have other (future) use cases that could profit from having > > > access > > > > to > > > > > the relative time of the last event occurring? Basically would we > > ever > > > > need > > > > > (now - some event) or (now - some periodic time)? > > > > > > > > I'm not sure. Idle clock seems a bit too specific to me. Also, at > least > > > > currently, idle time at the watermark > > > > generator level is not calculated accurately. Exposing an approximate > > > value > > > > via API might make some > > > > false impression that it's accurate? > > > > > > > > Best, > > > > Piotrek > > > > > > > > śr., 24 lip 2024 o 15:56 Arvid Heise <ar...@apache.org> napisał(a): > > > > > > > > > Hi Piotr, > > > > > > > > > > thank you very much for addressing this issue. I'm convinced that > the > > > > > approach is the right solution also in contrast to the > alternatives. > > > > > Ultimately, only WatermarkGenratorWithIdleness needs to be adjusted > > > with > > > > > this change. > > > > > > > > > > My only concerns are regarding the actual code. > > > > > 1. `RelativeClock` looks like a super-interface of > > > > > `org.apache.flink.util.clock.Clock`. Should we also reflect that > > > > > accordingly by extending? This should not break anything. > > > > > 2. Since `RelativeClock` is rather general, I'd also propose to > > change > > > > the > > > > > javadoc to not include the pausing part. > > > > > 3. Instead I'd reflect the pausing part in the accessor similarly > to > > > your > > > > > proposal. Note the naming that is more specific > > > > > > > > > > /** > > > > > * Returns a RelativeClock that represents the time that the > > > > > respective input of this > > > > > * WatermarkGenerator has been idle. This clock does not advance if > > > > > the input is blocked by the > > > > > * runtime (e.g., backpressure or watermark alignment cause the > whole > > > > > subtask to be blocked). > > > > > * > > > > > * @see RelativeClock > > > > > */ > > > > > RelativeClock getIdleClock(); > > > > > > > > > > 4. I now realized that I changed semantics compared to the > proposal: > > > this > > > > > idle clock would already calculate the time difference (now - last > > > > event). > > > > > That may narrow down the usefulness but makes the only known use > case > > > > more > > > > > explicit. > > > > > > > > > > Do we have other (future) use cases that could profit from having > > > access > > > > to > > > > > the relative time of the last event occurring? Basically would we > > ever > > > > need > > > > > (now - some event) or (now - some periodic time)? > > > > > > > > > > Best, > > > > > > > > > > Arvid > > > > > > > > > > On Wed, Jul 24, 2024 at 3:01 PM Martijn Visser < > > > martijnvis...@apache.org > > > > > > > > > > wrote: > > > > > > > > > > > Hi Piotr, > > > > > > > > > > > > We've talked offline about this proposal and I think it would be > > > > > beneficial > > > > > > for users to get this fixed. +1 overall, and thanks for writing > it > > > > down. > > > > > > > > > > > > Best regards, > > > > > > > > > > > > Martijn > > > > > > > > > > > > On Tue, Jul 23, 2024 at 5:45 PM Piotr Nowojski < > > pnowoj...@apache.org > > > > > > > > > > wrote: > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > A bit unusual FLIP [1], as this is a bug fix for a problem > that I > > > > have > > > > > > > recently discovered [2]. However I think FLIP is required, as > > > > properly > > > > > > > fixing the issue requires changes to the public API. > > > > > > > > > > > > > > As this is a bug fix, I would propose to back-port this change > to > > > > > > previous > > > > > > > releases (1.19 and 1.20), despite this changing Public API. > > > > > > > > > > > > > > Best, > > > > > > > Piotrek > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/x/oQvOEg > > > > > > > [2] https://issues.apache.org/jira/browse/FLINK-35886 > > > > > > > > > > > > > > > > > > > > > > > > > > > >