SteveStevenpoor commented on PR #26810:
URL: https://github.com/apache/flink/pull/26810#issuecomment-3142813617

   Let's start from the example:
   
   SELECT u.user_id_0, u.name, o.order_id, p.payment_id, s.location 
       FROM Users u 
       RIGHT JOIN Orders o ON u.user_id_0 = o.user_id_1 
       RIGHT JOIN Payments p ON u.user_id_0 = p.user_id_2 
       INNER JOIN Shipments s ON p.user_id_2 = s.user_id_3
   
   The original AST looks like this:
   <img width="424" height="325" alt="image" src="https://github.com/user-attachments/assets/4a75d400-f516-4705-abb4-cedc0f1f27d8" />
   
   
   After the right join to left join conversion:
   <img width="424" height="325" alt="image" src="https://github.com/user-attachments/assets/f69e85ec-efae-4d10-91a9-35d29a73d7a9" />
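
   To make the conversion concrete, here is a minimal sketch of the rewrite on a toy plan tree. The `Node` class and the method names are hypothetical and exist only for illustration; they are not Flink or Calcite classes, and the real FlinkRightToLeftJoinRule works on RelNodes. The sketch swaps the inputs of every RIGHT join and then lists the leaves left to right, which I assume corresponds to the input order of the flattened multi join.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified plan nodes; these are NOT Flink or Calcite classes.
class RightToLeftSketch {
    static class Node {
        final String joinType; // null for a table leaf
        final String table;    // null for a join
        final Node left;
        final Node right;

        Node(String table) {
            this.joinType = null;
            this.table = table;
            this.left = null;
            this.right = null;
        }

        Node(String joinType, Node left, Node right) {
            this.joinType = joinType;
            this.table = null;
            this.left = left;
            this.right = right;
        }
    }

    // Rewrites every RIGHT join as a LEFT join with its inputs swapped.
    static Node rightToLeft(Node n) {
        if (n.joinType == null) {
            return n;
        }
        Node l = rightToLeft(n.left);
        Node r = rightToLeft(n.right);
        return n.joinType.equals("RIGHT") ? new Node("LEFT", r, l) : new Node(n.joinType, l, r);
    }

    // Collects table names left to right (assumed multi join input order).
    static List<String> inputOrder(Node n) {
        List<String> out = new ArrayList<>();
        collect(n, out);
        return out;
    }

    private static void collect(Node n, List<String> out) {
        if (n.joinType == null) {
            out.add(n.table);
            return;
        }
        collect(n.left, out);
        collect(n.right, out);
    }

    static String render(Node n) {
        if (n.joinType == null) {
            return n.table;
        }
        return "(" + render(n.left) + " " + n.joinType + " " + render(n.right) + ")";
    }

    // ((U RIGHT JOIN O) RIGHT JOIN P) INNER JOIN S, as in the example query.
    static Node exampleQuery() {
        Node uo = new Node("RIGHT", new Node("U"), new Node("O"));
        Node uop = new Node("RIGHT", uo, new Node("P"));
        return new Node("INNER", uop, new Node("S"));
    }

    public static void main(String[] args) {
        Node converted = rightToLeft(exampleQuery());
        System.out.println(render(converted));     // ((P LEFT (O LEFT U)) INNER S)
        System.out.println(inputOrder(converted)); // [P, O, U, S]
    }
}
```

   Note that after the swap the (O LEFT U) join sits on the right side of P, so the tree is no longer left-deep.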
   
   
   In this MR I get rid of projections when converting to a multi join (no 
INNER or LEFT joins, but there may be projections). So it will look like this:
   <img width="424" height="325" alt="image" src="https://github.com/user-attachments/assets/84a34557-2f74-4543-9558-654b8b2e9594" />
   
   
   So what's the problem with the tests?
   StreamExecMultiJoin and StreamingMultiJoinOperator rely heavily on the 
original AST having joins only on the left side. However, in the example above, 
MultiJoin(P, O, U, S) needs to first merge O and U, then P with the JoinedRow 
from O and U, and then the JoinedRow from P, O, U with S. But now it merges P 
and O first, as if the query had a different structure.
   
   Why can't we construct MJ(O, U, P, S)?
   Because P must be on the right side of O and U for the join types to be 
applied correctly.
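
   A small simulation shows why the input order cannot simply be changed. This is only a sketch under my own assumptions: the rows are made-up user ids, `table` and `leftJoin` are hypothetical helpers, and I assume a reordered MJ(O, U, P, ...) would treat P as the nullable input of a LEFT join. With a payment that has no matching user, the two orders give different results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy nested-loop joins over made-up data; not how the Flink operator works.
class JoinOrderSketch {
    // Turns user ids into single-column rows keyed by a table alias.
    static List<Map<String, Integer>> table(String alias, Integer... userIds) {
        List<Map<String, Integer>> rows = new ArrayList<>();
        for (Integer id : userIds) {
            Map<String, Integer> row = new TreeMap<>();
            row.put(alias, id);
            rows.add(row);
        }
        return rows;
    }

    // LEFT JOIN on left.leftCol = right.rightCol; unmatched left rows are kept.
    static List<Map<String, Integer>> leftJoin(
            List<Map<String, Integer>> left,
            List<Map<String, Integer>> right,
            String leftCol,
            String rightCol) {
        List<Map<String, Integer>> out = new ArrayList<>();
        for (Map<String, Integer> l : left) {
            boolean matched = false;
            Integer key = l.get(leftCol);
            for (Map<String, Integer> r : right) {
                if (key != null && key.equals(r.get(rightCol))) {
                    Map<String, Integer> merged = new TreeMap<>(l);
                    merged.putAll(r);
                    out.add(merged);
                    matched = true;
                }
            }
            if (!matched) {
                out.add(new TreeMap<>(l)); // right-side columns stay absent (NULL)
            }
        }
        return out;
    }

    // P LEFT JOIN (O LEFT JOIN U): every payment survives.
    static List<Map<String, Integer>> paymentsFirst() {
        List<Map<String, Integer>> ou = leftJoin(table("O", 1), table("U", 1), "O", "U");
        return leftJoin(table("P", 1, 2), ou, "P", "U");
    }

    // (O LEFT JOIN U) LEFT JOIN P: the payment without a user is lost.
    static List<Map<String, Integer>> ordersFirst() {
        List<Map<String, Integer>> ou = leftJoin(table("O", 1), table("U", 1), "O", "U");
        return leftJoin(ou, table("P", 1, 2), "U", "P");
    }

    public static void main(String[] args) {
        System.out.println(paymentsFirst()); // [{O=1, P=1, U=1}, {P=2}]
        System.out.println(ordersFirst());   // [{O=1, P=1, U=1}]
    }
}
```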
   
   What I'm trying to do:
   After adding support for constructing the opt rel plan for right joins, I 
added semantic tests. Then I fixed StreamExecMultiJoin to generate proper join 
conditions without relying on a specific AST structure. Now I'm working on 
adapting StreamingMultiJoinOperator to work with the described cases. I'm going 
to construct something like this:
   *Before*:
   When a record arrives, we start constructing the resulting record from the 
leftmost table.
   With the example above: MJ(P, O, U, S) ---> when a record from Users arrives, 
we merge P and O, then PO and U, then POU and S, leading to an incorrect result.
   
   *After*:
   When a record arrives, we start constructing the resulting record from the 
leftmost table at the proper level:
   Example: MJ(P, O, U, S)
   <img width="532" height="279" alt="image" src="https://github.com/user-attachments/assets/3c662f4a-9a2e-4395-b0a6-dded9f686084" />
   This way we merge O and U, then P and OU, and then POU with S. We can also 
keep the original join type for each level.
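
   Roughly, the difference between the two merge orders can be sketched like this. The per-input `levels` array is a hypothetical encoding I made up for the example (0 for P and S, 1 for O and U); it is not the operator's actual data structure, and the strings only stand in for JoinedRows.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: compares a plain left-to-right fold with a level-aware fold.
class LevelMergeSketch {
    private final List<String> inputs;
    private final int[] levels; // hypothetical nesting depth per input
    private int pos = 0;

    LevelMergeSketch(List<String> inputs, int[] levels) {
        this.inputs = inputs;
        this.levels = levels;
    }

    // "Before": always start from the leftmost input and fold to the right.
    static String mergeLeftToRight(List<String> inputs) {
        String acc = inputs.get(0);
        for (int i = 1; i < inputs.size(); i++) {
            acc = "(" + acc + " " + inputs.get(i) + ")";
        }
        return acc;
    }

    // "After": inputs on a deeper level are merged among themselves first,
    // and the merged result is then used as one operand on the outer level.
    private String mergeAt(int level) {
        String acc = null;
        while (pos < inputs.size() && levels[pos] >= level) {
            String operand = levels[pos] > level ? mergeAt(level + 1) : inputs.get(pos++);
            acc = (acc == null) ? operand : "(" + acc + " " + operand + ")";
        }
        return acc;
    }

    static String mergeByLevel(List<String> inputs, int[] levels) {
        return new LevelMergeSketch(inputs, levels).mergeAt(0);
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("P", "O", "U", "S");
        System.out.println(mergeLeftToRight(inputs));                     // (((P O) U) S)
        System.out.println(mergeByLevel(inputs, new int[] {0, 1, 1, 0})); // ((P (O U)) S)
    }
}
```

   In a real operator each level would presumably also carry its join type and condition, which is what lets us keep the original join type per level.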
   
   Also, it does not matter whether we change right joins to left ones before 
or after JoinToMultiJoinRule. Either way we will need to add a projection on 
top of the multi join and handle right-side joins in 
StreamingMultiJoinOperator. Since we already have FlinkRightToLeftJoinRule, I 
think it's OK to assume it will be applied before JoinToMultiJoinRule.
   
   @gustavodemorais let me know your thoughts on it, my friend.
   
   
   


