Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22091 )

Change subject: IMPALA-13531: Calcite CTE frontend
......................................................................


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/22091/8/fe/src/main/java/org/apache/impala/planner/CTEBufferNode.java
File fe/src/main/java/org/apache/impala/planner/CTEBufferNode.java:

http://gerrit.cloudera.org:8080/#/c/22091/8/fe/src/main/java/org/apache/impala/planner/CTEBufferNode.java@33
PS8, Line 33:
> I noticed you've introduced LocalMultiSink—does this mean you've reused exi
There's a problem with ownership I haven't figured out how to handle yet.

Impala first produces a "single-node plan", which needs to be executable on its
own. It uses a single execution fragment, so execution is essentially
single-threaded. For that case we have a single DataSink (root or table sink),
and we need a SequenceNode to ensure CTEs are produced first (and in the right
order, which is not implemented yet), plus something to execute the CTE and
save the results (currently CTEProducerNode, renamed from CTEBufferNode).
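
To make the single-node ordering concrete, here is a rough toy sketch. Every
class name in it (ToyNode, ToyCteProducer, ToySequence) is hypothetical and is
not the real planner or executor code; it also assumes the sequence returns the
rows of its last child, which is a simplification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; these are NOT Impala's real planner classes.
interface ToyNode {
  // Returns this node's rows; may read or write the shared CTE buffers.
  List<String> execute(Map<String, List<String>> cteBuffers);
}

// Runs its child and saves the rows under the CTE's name
// (the role described above for CTEProducerNode).
final class ToyCteProducer implements ToyNode {
  private final String cteName;
  private final ToyNode child;

  ToyCteProducer(String cteName, ToyNode child) {
    this.cteName = cteName;
    this.child = child;
  }

  @Override
  public List<String> execute(Map<String, List<String>> cteBuffers) {
    cteBuffers.put(cteName, child.execute(cteBuffers));
    return List.of();  // A producer emits no rows itself (Java 9+ for List.of).
  }
}

// Runs children strictly left to right, so producers finish before consumers.
// Assumption: the last child's rows are the result of the sequence.
final class ToySequence implements ToyNode {
  private final List<ToyNode> children;

  ToySequence(List<ToyNode> children) {
    this.children = children;
  }

  @Override
  public List<String> execute(Map<String, List<String>> cteBuffers) {
    List<String> result = List.of();
    for (ToyNode child : children) {
      result = child.execute(cteBuffers);
    }
    return result;
  }
}

final class SingleNodePlanSketch {
  static List<String> run() {
    ToyNode scan = buffers -> List.of("a", "b");
    // The consumer reads what the producer saved earlier in the sequence.
    ToyNode consumer = buffers -> new ArrayList<>(buffers.get("cte1"));
    ToyNode plan =
        new ToySequence(List.of(new ToyCteProducer("cte1", scan), consumer));
    return plan.execute(new HashMap<>());
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

If the consumer were placed before the producer, buffers.get("cte1") would
return null, which is the ordering problem the SequenceNode is there to prevent.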

In a distributed plan, we introduce multiple fragments. Each has a DataSink
that determines where its results go. There, the DataSink for a CTE fragment is
LocalMultiSink, which always has a CTEProducerNode as a child and does nothing
itself. In my mind it would take over the role CTEProducerNode currently
fills, but I don't want to duplicate code.

LocalMultiSink conceptually pushes results to a local queue for independent
consumers. MultiDataSink pushes results to multiple child DataSinks. So they
have similar names, but do quite different things.
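
A toy contrast of the two shapes, as I think of them. All names here
(ToySink, ToyFanOutSink, ToyLocalBufferSink, ToyCursor) are hypothetical and
are not the real sink classes; modelling the local queue as a shared buffer
with one cursor per consumer is an assumption for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; these are NOT Impala's real sink classes.
interface ToySink {
  void send(String row);
}

// Terminal sink that just records what it receives.
final class ToyCollectingSink implements ToySink {
  final List<String> rows = new ArrayList<>();

  @Override
  public void send(String row) {
    rows.add(row);
  }
}

// MultiDataSink-style: every row is forwarded to each child sink.
final class ToyFanOutSink implements ToySink {
  private final List<ToySink> children;

  ToyFanOutSink(List<ToySink> children) {
    this.children = children;
  }

  @Override
  public void send(String row) {
    for (ToySink child : children) {
      child.send(row);
    }
  }
}

// Read position owned by a single consumer.
final class ToyCursor {
  private final List<String> buffer;
  private int pos = 0;

  ToyCursor(List<String> buffer) {
    this.buffer = buffer;
  }

  // Returns the next buffered row, or null when this consumer has caught up.
  String next() {
    return pos < buffer.size() ? buffer.get(pos++) : null;
  }
}

// LocalMultiSink-style: rows are kept locally and consumers pull them
// independently, each with its own cursor.
final class ToyLocalBufferSink implements ToySink {
  private final List<String> buffer = new ArrayList<>();

  @Override
  public void send(String row) {
    buffer.add(row);
  }

  ToyCursor openConsumer() {
    return new ToyCursor(buffer);
  }
}

final class SinkComparisonSketch {
  static List<List<String>> fanOut() {
    ToyCollectingSink a = new ToyCollectingSink();
    ToyCollectingSink b = new ToyCollectingSink();
    ToySink sink = new ToyFanOutSink(List.of(a, b));
    sink.send("r1");
    sink.send("r2");
    return List.of(a.rows, b.rows);
  }

  static List<String> localBuffer() {
    ToyLocalBufferSink sink = new ToyLocalBufferSink();
    sink.send("r1");
    sink.send("r2");
    ToyCursor first = sink.openConsumer();
    ToyCursor second = sink.openConsumer();
    List<String> seen = new ArrayList<>();
    seen.add(first.next());
    seen.add(first.next());
    seen.add(second.next());  // The second consumer starts from the beginning.
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(fanOut());
    System.out.println(localBuffer());
  }
}
```

The fan-out sink drives its children (push), while the local buffer is read by
consumers on their own schedule (pull), which is why the two don't substitute
for each other despite the names.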

MultiDataSink could be useful in a slightly different configuration: if we
implemented a CTE as an ExchangeNode at each consumer and created a
DataStreamSink for each ExchangeNode under MultiDataSink. That implies
cross-network traffic, but avoids some of the node scheduling changes that are
needed today. However, MultiDataSink is currently used only with TableSinks
(which are terminal sinks) and would need additional work to function as part
of an exchange.



--
To view, visit http://gerrit.cloudera.org:8080/22091

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0840c0859d2fe25628d799a18d302cee1eb36e8
Gerrit-Change-Number: 22091
Gerrit-PatchSet: 13
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Anonymous Coward (816)
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Steve Carlin <[email protected]>
Gerrit-Comment-Date: Fri, 05 Dec 2025 16:25:50 +0000
Gerrit-HasComments: Yes
