xiedeyantu opened a new pull request, #21075:
URL: https://github.com/apache/datafusion/pull/21075

   ## Which issue does this PR close?
   
   - Closes #.
   
   ## Rationale for this change
   
   This change introduces a dedicated optimizer option for a conservative 
`UNION DISTINCT` rewrite that can reduce redundant scans when union branches 
read from the same source and differ only by filter predicates.
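
As a rough illustration of the idea (not the `UnionsToFilter` implementation in this PR), the rewrite can be sketched on a toy plan model. The `Branch` type and the `unions_to_filter` function below are hypothetical names, and the sketch assumes every branch projects the same columns and that a `DISTINCT` is still applied on top of the merged branch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Branch:
    """Hypothetical model of one UNION input: a scan of `source` filtered by `predicate`."""
    source: str
    predicate: str  # SQL text of the branch's filter


def unions_to_filter(branches: list[Branch]) -> Optional[Branch]:
    """Merge UNION DISTINCT branches into one filtered scan, or return None.

    Conservative by design: the merge only fires when there are at least
    two branches and all of them read the same source. The caller is
    assumed to keep a DISTINCT above the merged branch.
    """
    if len(branches) < 2:
        return None
    if len({b.source for b in branches}) != 1:
        return None
    # Parenthesize each predicate so operator precedence is preserved.
    combined = " OR ".join(f"({b.predicate})" for b in branches)
    return Branch(source=branches[0].source, predicate=combined)
```

Returning `None` when the branches do not qualify leaves the original plan untouched, which mirrors the conservative behavior described above.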
   
   Gating the rule behind an opt-in flag lets the optimization land safely, 
and the option's documentation describes its behavior and the expected plan 
changes.
   
   ## What changes are included in this PR?
   
   - Adds a new optimizer config option: 
`datafusion.optimizer.enable_unions_to_filter`, disabled by default.
   - Registers the `UnionsToFilter` optimizer rule in the logical optimizer 
pipeline.
   - Documents the new option in the config definitions, including before/after 
plan examples.
   - Adds sqllogictest coverage in 
`datafusion/sqllogictest/test_files/union.slt` to verify both behaviors:
     - the original `UNION DISTINCT` plan shape is preserved when the option is 
disabled
     - the plan is rewritten to a single branch with a combined `OR` filter 
when the option is enabled
   
   ## Are these changes tested?
   
   Yes.
   
   This PR adds sqllogictest coverage for the new option in 
`datafusion/sqllogictest/test_files/union.slt`, including expected logical and 
physical plans for both the disabled and enabled configurations.
   
   ## Are there any user-facing changes?
   
   Yes.
   
   A new user-facing configuration option is added:
   
   - `datafusion.optimizer.enable_unions_to_filter`
   
   When enabled, eligible `UNION DISTINCT` queries may produce different 
optimized logical and physical plans, though query results are unchanged.
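
The claim that results are unchanged can be spot-checked outside DataFusion. The sketch below uses Python's built-in `sqlite3` purely as a stand-in SQL engine; the table `t`, its columns, and the sample rows (including `NULL`s) are made up for illustration. It compares a `UNION` of two filtered scans with a single `SELECT DISTINCT` using the combined `OR` predicate.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in engine (not DataFusion).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (1, 10), (2, 20), (3, None), (None, 30), (4, 5)],
)

# Original shape: two scans of the same table, deduplicated by UNION.
union_sql = """
    SELECT a, b FROM t WHERE a = 1
    UNION
    SELECT a, b FROM t WHERE b > 15
"""
# Rewritten shape: one scan with the predicates combined by OR.
filter_sql = "SELECT DISTINCT a, b FROM t WHERE a = 1 OR b > 15"

union_rows = set(conn.execute(union_sql).fetchall())
filter_rows = set(conn.execute(filter_sql).fetchall())
print(union_rows == filter_rows)
```

Rows where one predicate evaluates to `NULL` and the other to true are kept by both forms, and rows where neither predicate is true are dropped by both, so the two result sets agree on this sample.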


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

