We can get a stream from a DataStream api by SideOutput. But it's hard to do
the same thing with Flink SQL.

I have an idea about how to get the late records while using Flink SQL.

Assuming we have a source table for the late records, then we can query late
records on it. Obviously, it's not a real dynamic source table, it can be a
virtual source.

After optimizing, we can get a graph with some window aggregate nodes, which
can produced late records. And another graph for handling late records with
a virtual source node.

[scan] --> [group] --> [sink]

[virtual scan] --> [sink]

Then we can just "connect" these window nodes into the virtual source node.

The "connect" can be done by the following:

1. A side output node from each window node;
2. A mapper node may needed to encoding the record from the window node to
match the row type of virtual source;

[scan] --> [group] --> [sink]
                 \
                   --> [side output] --> [mapper] --> [sink]


Does it make sense? Or is there another way in progress for the similar
purpose?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply via email to