Hi Team,

I'm trying Flink for the first time and have run into an issue. I would
like to discuss it and understand whether there is a way to achieve my use
case with Flink.

*Use case:* I need to perform unbounded stream joins on multiple data
streams read from different Kafka topics. Specifically, I need to join one
column in a table against multiple columns in another table while avoiding
duplicate joins. My main concern is that I have not been able to avoid
these duplicate joins.

*Issue:* Given the nature of the data, records can be updated over time.
Because messages in Kafka are immutable, each update is sent as a new
message. For a given key I would like to join only against the latest
message, whereas Flink currently joins against all messages with that key
(this is what I'm calling the duplicate-joins issue).

Example: Say I have two Kafka streams, "User" and "Task", and I want to
join "User" with multiple columns in "Task": "UserID" in "User" should be
joined with "PrimaryAssignee", "SecondaryAssignee" and "Manager" in "Task".

Assuming both DataStreams have been created and registered, my query is:

  SELECT * FROM Task t
   LEFT JOIN User ua ON t.PrimaryAssignee = ua.UserID
   LEFT JOIN User ub ON t.SecondaryAssignee = ub.UserID
   LEFT JOIN User uc ON t.Manager = uc.UserID

Say I have 5 different messages in Kafka with UserID=1000. I don't want to
perform 5 joins; instead, I want to join only with the latest message for
UserID=1000. Is there any way to achieve this without using Temporal Table
Functions?
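
To make the result I'm after concrete, here is a minimal plain-Python
sketch. It does not use any Flink API; the helper names (latest_by_key,
join_task), the "Name" field and all sample values are made up purely for
illustration. It only shows the intended semantics: keep the latest "User"
message per UserID, then left-join each "Task" against it.

```python
# Plain-Python illustration of the desired result; no Flink APIs are used.
# All records, field names other than the join columns, and values are made up.

def latest_by_key(messages, key):
    """Keep only the most recent message per key (later messages win)."""
    latest = {}
    for msg in messages:  # messages are assumed to be in arrival order
        latest[msg[key]] = msg
    return latest

def join_task(task, users):
    """LEFT JOIN one task against the latest user record for each role column."""
    row = dict(task)
    for column in ("PrimaryAssignee", "SecondaryAssignee", "Manager"):
        user = users.get(task[column])  # None when no user matches (left join)
        row[column + "Name"] = user["Name"] if user else None
    return row

user_messages = [
    {"UserID": 1000, "Name": "Alice v1"},
    {"UserID": 1000, "Name": "Alice v2"},
    {"UserID": 2000, "Name": "Bob"},
    {"UserID": 1000, "Name": "Alice v3"},  # latest message for UserID=1000
]
task = {"TaskID": 1, "PrimaryAssignee": 1000,
        "SecondaryAssignee": 2000, "Manager": 3000}

users = latest_by_key(user_messages, "UserID")
result = join_task(task, users)
print(result)
```

In this sketch the three messages for UserID=1000 contribute a single
joined row that uses only the latest one, and the unmatched "Manager" is
kept with an empty value rather than dropping the task.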

*I cannot use Temporal Table Functions for the following reasons:*
1. I need to trigger the JOIN operation for every new message in Kafka,
whereas new messages in the Temporal Table don't trigger a JOIN operation.
2. I need to perform LEFT OUTER JOINs, whereas a Temporal Table can only be
used for INNER JOINs.
3. From what I understand, a JOIN against a Temporal Table can only be
performed on its primary key, so I won't be able to join on more than one
key.
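
Put together, the behaviour I'm looking for could be sketched roughly as
below. Again, this is plain Python and not Flink code; the LatestJoin
class, its method names and the sample data are hypothetical and only
illustrate the semantics: a new message on either stream triggers the join,
unmatched columns are kept (outer join), and the same UserID can match more
than one column of a task.

```python
# Plain-Python illustration only; no Flink APIs are used and all data is made up.
ROLE_COLUMNS = ("PrimaryAssignee", "SecondaryAssignee", "Manager")

class LatestJoin:
    def __init__(self):
        self.users = {}    # UserID -> latest User message
        self.tasks = {}    # TaskID -> latest Task message
        self.emitted = []  # every joined row produced so far

    def _emit(self, task):
        row = dict(task)
        for column in ROLE_COLUMNS:
            user = self.users.get(task[column])
            # Outer join: the task row is kept even when no user matches.
            row[column + "Name"] = user["Name"] if user else None
        self.emitted.append(row)

    def on_task(self, task):
        """A new Task message triggers a join against the latest users."""
        self.tasks[task["TaskID"]] = task
        self._emit(task)

    def on_user(self, user):
        """A new User message replaces the old one and re-joins affected tasks."""
        self.users[user["UserID"]] = user
        for task in self.tasks.values():
            if user["UserID"] in (task[c] for c in ROLE_COLUMNS):
                self._emit(task)

join = LatestJoin()
join.on_task({"TaskID": 1, "PrimaryAssignee": 1000,
              "SecondaryAssignee": 1000, "Manager": 2000})
join.on_user({"UserID": 1000, "Name": "Alice v1"})
join.on_user({"UserID": 1000, "Name": "Alice v2"})
for row in join.emitted:
    print(row)
```

Each incoming message produces exactly one updated row per affected task,
always built from the latest "User" message, instead of one row per
historical message.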


Could someone please help me with this? Please let me know if any of this
is unclear or if you need more details.

If this is not the correct email address for this question, could you
please point me to the correct one?


Thanks in advance!
