Hello folks,

I am looking to enrich rows from an unbounded streaming table by joining it
with a bounded static table while preserving rowtime for the streaming
table. For example, let's consider table two tables F and D, where F is
unbounded and D is bounded. The schema for the two tables is the following -

F:
 |-- C0: BIGINT
 |-- C1: STRING
 |-- R: TIMESTAMP(3) **rowtime**
 |-- WATERMARK FOR R: TIMESTAMP(3) AS `R` - INTERVAL '0' SECONDS

D:
 |-- C0: BIGINT
 |-- C1: STRING NOT NULL

I'd like to run the following query on this schema -

select sum(F.C0), D.C1, tumble_start(F.R, interval '1' second)
    from F join D ON F.C1 = D.C1
    group by D.C1, tumble(F.R, interval '1' second)

However, I run into the following error while running the above query -

"Rowtime attributes must not be in the input rows of a regular join. As a
workaround you can cast the time attributes of input tables to TIMESTAMP
before."

My understanding reading the docs is that Time Temporal Join is meant to
solve this problem. So I model table D as the following -

D:
 |-- C0: BIGINT
 |-- C1: STRING NOT NULL
 |-- R: TIMESTAMP(3)
 |-- WATERMARK FOR R: TIMESTAMP(3) AS `R` - INTERVAL '0' SECONDS
 |-- CONSTRAINT 2da2dd2e-9937-48cb-9dec-4f6055713004 PRIMARY KEY (C1)

With column D.R always set to 0 and modify the query as follows -

select sum(F.C0), D.C1, tumble_start(F.R, interval '1' second)
    from F join D FOR SYSTEM_TIME AS OF F.R ON F.C1 = D.C1
    group by D.C1, tumble(F.R, interval '1' second)

The above query runs but does not return any result. I have the following
data in D initially -
Emit D row=+I(0,"0",1970-01-01T00:00)@time=0
Emit D row=+I(1,"1",1970-01-01T00:00)@time=0
Emit D row=+I(2,"2",1970-01-01T00:00)@time=0
Emit D watermark=0

And F streams the following rows -
Emit F row=+I(0,"0",1970-01-01T00:00)@time=0
Emit F row=+I(1,"1",1970-01-01T00:00:10)@time=1000
Emit F watermark=1000

I expect that two rows in F will join with matching rows (on C1) in D and
produce some output. But I do not see anything in the output.

So I have the following questions -

1. Is time temporal join the correct tool to solve this problem?
2. What could be the reason for not getting any output rows in the result?

Thanks,
Satyam

Reply via email to