Hey,
Sure, I have create something that can be called a minimal reproducible
example. It's not the prettiest since it uses a lot of *Thread.sleep* but
it allows to be sure that the input is exactly what you want.
https://github.com/DomWos/FlinkTTF/tree/long-vs-timestamp

In the long-vs-timestamp branch, there is a bunch of classes including two
actual tests *TimestampPassingTTFLongSelect* &
*TimestampPassingTTFTimestampSelect
*both presents the same select and all operations are generally the same
apart from the selected field.
The output should be printed at the end, I didn't want to play much with
Sinks since I think printing shows the issue better.


Best Regards,
Dom.


śr., 18 mar 2020 o 08:13 Kurt Young <ykt...@gmail.com> napisał(a):

> Hi,
>
> AFAIK there is no special watermark generation logic for temporal table
> join operator. Could you share your example's codes then I can help to
> analyze and debug?
>
> Best,
> Kurt
>
>
> On Tue, Mar 17, 2020 at 9:53 PM Dominik Wosiński <wos...@gmail.com> wrote:
>
> > Hey Guys,
> > I have observed a weird behavior on using the Temporal Table Join and the
> > way it pushes the Watermark forward. Generally, I think the question is
> > *When
> > is the Watermark pushed forward by the Temporal Table Join?*
> >
> > The issue I have noticed is that Watermark seems to be pushed forward
> even
> > if elements are not generated, is that the expected behavior?
> >
> > I have created a simple test that takes two streams rates & ccyCodes.
> Now,
> > I do temporal table Join of rates & ccyCodes in two ways :
> > 1) SELECT ccyIsoCode, ccyIsoName, rate, rates_ts as ratesLong
> >
> > |  FROM RatesTable, LATERAL TABLE(ccyTable(rates_rowtime))
> > |  WHERE ccyIsoCode = ratesCcyIsoCode
> >
> >
> >
> > 2) SELECT ccyIsoCode, ccyIsoName, rate, rates_rowtime as ratesTs
> >
> > |  FROM RatesTable, LATERAL TABLE(ccyTable(rates_rowtime))
> > |  WHERE ccyIsoCode = ratesCcyIsoCode
> >
> > So, as You can see the only difference is the fact the in the first one I
> > am selecting the field that is the timestamp but wasn't marked with
> > *rowtime* in the second one I am selecting the actual *rowtime.*
> >
> > Now, I for the first method I need to reassign timestamps and watermarks,
> > which I do.  Finally, I create time windows of size let's say 7000
> > miliseconds and print the results. Now the unexpected behaviour I am
> facing
> > is the fact that, say I create four artificial records for rates that are
> > joined correctly with ccy with timestamps (1000, 5000, 8000, 20000)
> >
> >    - For the first method there is only one window generated with
> elements
> >    [1000, 5000]
> >    - But for the second method with *rowtime* there are two different
> >    windows [1000, 5000] and [8000], this basically means that the
> watermark
> >    for 20000 was generated.
> >
> > Is that the expected behavior ?? I was quite surprised that selecting
> > different field can actually yied different results in terms of
> > Watermarking.
> >
> > Best Regards,
> > Dom.
> >
>

Reply via email to