jayzhan211 commented on code in PR #15300:
URL: https://github.com/apache/datafusion/pull/15300#discussion_r2002213580
##########
datafusion/sqllogictest/test_files/union.slt:
##########
@@ -907,11 +907,56 @@ SELECT * FROM (SELECT y FROM u1 UNION ALL SELECT y FROM u2) ORDER BY y;
 20
 40
+query TT
+explain SELECT * FROM (SELECT y FROM u1 UNION ALL SELECT y FROM u2) ORDER BY y;
+----
+logical_plan
+01)Sort: y ASC NULLS LAST
+02)--Union
+03)----Projection: CAST(u1.y AS Int64) AS y
+04)------TableScan: u1 projection=[y]
+05)----TableScan: u2 projection=[y]
+physical_plan
+01)SortPreservingMergeExec: [y@0 ASC NULLS LAST]
+02)--UnionExec
+03)----SortExec: expr=[y@0 ASC NULLS LAST], preserve_partitioning=[true]
+04)------ProjectionExec: expr=[CAST(y@0 AS Int64) as y]
+05)--------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1
+06)----------DataSourceExec: partitions=1, partition_sizes=[1]
+07)----SortExec: expr=[y@0 ASC NULLS LAST], preserve_partitioning=[false]
+08)------DataSourceExec: partitions=1, partition_sizes=[1]
+
+# optimize_subquery_sort in create_relation removes Sort so the result is not sorted.

Review Comment:
   Before #15201: this plan did not meet the condition of the optimization rule, because the view table had not been inlined at this point, so the `Sort` was not removed.
   Now: the view table is inlined, so the `Sort` is removed by this function.
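   To make the behavior described above concrete, here is a minimal toy sketch. It is not DataFusion's code: the `Plan` enum and `strip_subquery_sorts` are hypothetical names, and the sketch assumes the rule drops a `Sort` found inside a subquery unless a limit depends on its ordering. The real `optimize_subquery_sort` may differ in its details.

```rust
// Toy plan tree. These are NOT DataFusion's `LogicalPlan` types; every name
// here is hypothetical and exists only for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(&'static str),
    Sort { input: Box<Plan>, fetch: Option<usize> },
    Limit { input: Box<Plan>, n: usize },
    Union(Vec<Plan>),
}

/// Assumed rule: inside a subquery, a Sort is dropped unless a Limit sits
/// directly above it or the Sort carries its own fetch.
fn strip_subquery_sorts(plan: Plan) -> Plan {
    strip(plan, false)
}

fn strip(plan: Plan, under_limit: bool) -> Plan {
    match plan {
        Plan::Scan(name) => Plan::Scan(name),
        // The ordering of the child decides which rows the Limit keeps.
        Plan::Limit { input, n } => Plan::Limit {
            input: Box::new(strip(*input, true)),
            n,
        },
        Plan::Sort { input, fetch } => {
            if under_limit || fetch.is_some() {
                // The ordering is observable, so the Sort stays.
                Plan::Sort {
                    input: Box::new(strip(*input, false)),
                    fetch,
                }
            } else {
                // The ordering is not observable: replace the Sort by its input.
                strip(*input, false)
            }
        }
        // Simplification: a Limit above a Union does not protect Sorts below it.
        Plan::Union(children) => {
            Plan::Union(children.into_iter().map(|c| strip(c, false)).collect())
        }
    }
}
```

   In this toy model, a subquery that becomes visible to the rule only after a view is inlined loses its `Sort`, which matches the before/after difference described in the comment.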
##########
datafusion/sqllogictest/test_files/union.slt:
##########
+# optimize_subquery_sort in create_relation removes Sort so the result is not sorted.

Review Comment:
   I have no idea why the plan optimization rule `optimize_subquery_sort` lives in `create_relation`. I think we should move such rules to the optimizer 🤔