alamb commented on code in PR #14273:
URL: https://github.com/apache/datafusion/pull/14273#discussion_r1929724090


##########
datafusion/sqllogictest/test_files/tpch/plans/q6.slt.part:
##########
@@ -31,13 +31,13 @@ logical_plan
 01)Projection: sum(lineitem.l_extendedprice * lineitem.l_discount) AS revenue
 02)--Aggregate: groupBy=[[]], aggr=[[sum(lineitem.l_extendedprice * lineitem.l_discount)]]
 03)----Projection: lineitem.l_extendedprice, lineitem.l_discount
-04)------Filter: lineitem.l_shipdate >= Date32("1994-01-01") AND lineitem.l_shipdate < Date32("1995-01-01") AND lineitem.l_discount >= Decimal128(Some(5),15,2) AND lineitem.l_discount <= Decimal128(Some(7),15,2) AND lineitem.l_quantity < Decimal128(Some(2400),15,2)
-05)--------TableScan: lineitem projection=[l_quantity, l_extendedprice, l_discount, l_shipdate], partial_filters=[lineitem.l_shipdate >= Date32("1994-01-01"), lineitem.l_shipdate < Date32("1995-01-01"), lineitem.l_discount >= Decimal128(Some(5),15,2), lineitem.l_discount <= Decimal128(Some(7),15,2), lineitem.l_quantity < Decimal128(Some(2400),15,2)]
+04)------Filter: lineitem.l_shipdate >= Date32("1994-01-01") AND lineitem.l_shipdate < Date32("1995-01-01") AND CAST(lineitem.l_discount AS Float64) >= Float64(0.049999999999999996) AND CAST(lineitem.l_discount AS Float64) <= Float64(0.06999999999999999) AND lineitem.l_quantity < Decimal128(Some(2400),15,2)

Review Comment:
   This will likely cause a performance regression, since it casts the entire 
`lineitem.l_discount` column to Float64 before the comparison, whereas previously 
the column could be compared directly against a Decimal128 constant. 
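
A rough sketch of the two plan shapes, to make the cost difference concrete. This is not DataFusion's actual kernel code: it assumes a `Decimal128(15, 2)` column is held as unscaled `i128` values (so 0.05 is stored as 5), and the function names `count_decimal_const` and `count_cast_to_float` are hypothetical.

```rust
/// Assumed layout: Decimal128(15, 2) values stored as unscaled i128 (0.05 -> 5).
const SCALE_DIVISOR: f64 = 100.0;

/// Old plan shape: the literal is converted to a decimal once, so the
/// per-row work is a plain integer comparison.
fn count_decimal_const(discounts: &[i128], lower: i128, upper: i128) -> usize {
    discounts
        .iter()
        .filter(|&&d| d >= lower && d <= upper)
        .count()
}

/// New plan shape: CAST(l_discount AS Float64) runs for every row
/// before the comparison can happen.
fn count_cast_to_float(discounts: &[i128], lower: f64, upper: f64) -> usize {
    discounts
        .iter()
        .map(|&d| d as f64 / SCALE_DIVISOR) // per-row conversion
        .filter(|&d| d >= lower && d <= upper)
        .count()
}
```

In the second function the conversion is paid once per row; in the first it is paid once per query, when the literal is turned into a decimal.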



##########
datafusion/sqllogictest/test_files/tpch/plans/q11.slt.part:
##########
@@ -49,7 +49,7 @@ limit 10;
 logical_plan
 01)Sort: value DESC NULLS FIRST, fetch=10
 02)--Projection: partsupp.ps_partkey, sum(partsupp.ps_supplycost * partsupp.ps_availqty) AS value
-03)----Inner Join:  Filter: CAST(sum(partsupp.ps_supplycost * partsupp.ps_availqty) AS Decimal128(38, 15)) > __scalar_sq_1.sum(partsupp.ps_supplycost * partsupp.ps_availqty) * Float64(0.0001)
+03)----Inner Join:  Filter: CAST(sum(partsupp.ps_supplycost * partsupp.ps_availqty) AS Float64) > __scalar_sq_1.sum(partsupp.ps_supplycost * partsupp.ps_availqty) * Float64(0.0001)

Review Comment:
   I vaguely remember the use of Decimal here was important for TPCH results 
(maybe for correctness or something 🤔 )
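
One concrete way decimal versus float can change results, using the Q6 constants from the plan earlier in this review (whether this is the issue recalled for Q11 is not established here). The sketch assumes scale-2 decimals stored as unscaled `i128` values and a decimal-to-float cast of the form `value / 100.0`; the function names are hypothetical.

```rust
/// Decimal arithmetic on unscaled integers with scale 2: 0.06 + 0.01 -> 6 + 1.
fn upper_bound_decimal() -> i128 {
    6 + 1
}

/// The same expression evaluated in Float64. The plan above shows this
/// folding to 0.06999999999999999 rather than 0.07.
fn upper_bound_float() -> f64 {
    0.06_f64 + 0.01_f64
}

/// `l_discount <= 0.06 + 0.01` evaluated on the decimal value directly.
fn matches_decimal(unscaled: i128) -> bool {
    unscaled <= upper_bound_decimal()
}

/// The same predicate after casting the decimal value to Float64.
fn matches_float(unscaled: i128) -> bool {
    (unscaled as f64 / 100.0) <= upper_bound_float()
}
```

Under these assumptions a row with a discount of exactly 0.07 passes the decimal predicate but fails the float one, because the folded float bound lands just below 0.07.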



##########
datafusion/core/tests/parquet/mod.rs:
##########
@@ -184,7 +184,13 @@ impl TestOutput {
 /// and the appropriate scenario
 impl ContextWithParquet {
     async fn new(scenario: Scenario, unit: Unit) -> Self {
-        Self::with_config(scenario, unit, SessionConfig::new()).await
+        let mut session_config = SessionConfig::new();
+        // TODO (https://github.com/apache/datafusion/issues/12817) once this is the default behavior, remove from here

Review Comment:
   Does this mean that DataFusion will no longer prune predicates like 
`decimal_col = 5.0`?
   
   If so, this seems like a significant regression / issue for anyone who relies on 
decimal types (like Comet, for example)
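
To illustrate the concern, here is a toy min/max pruner. It is a simplified sketch and not DataFusion's pruning implementation, which may handle some casts: the types `RowGroupStats` and `Predicate` and the function `must_read` are hypothetical, and the column is assumed to be a scale-2 decimal stored as unscaled `i128` values.

```rust
/// Hypothetical min/max statistics for one row group of a scale-2 decimal
/// column, stored as unscaled i128 values (5.00 -> 500).
struct RowGroupStats {
    min: i128,
    max: i128,
}

/// Simplified predicate shapes a pruner might be handed.
enum Predicate {
    /// `decimal_col = <decimal literal>`
    ColEqDecimal(i128),
    /// `CAST(decimal_col AS Float64) = <float literal>`
    CastColEqFloat(f64),
}

/// Returns true if the row group may contain matches and must be read.
/// This toy pruner only understands comparisons on the bare column.
fn must_read(stats: &RowGroupStats, predicate: &Predicate) -> bool {
    match predicate {
        // Literal and statistics share a type, so min/max can rule the group out.
        Predicate::ColEqDecimal(v) => stats.min <= *v && *v <= stats.max,
        // The column is wrapped in a cast: keep the group conservatively.
        Predicate::CastColEqFloat(_) => true,
    }
}
```

With statistics of min 1.00 and max 4.00, the decimal form of `decimal_col = 5.0` lets this pruner skip the row group, while the cast form forces it to be read.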



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

